Title 中文裁判書之要旨擷取:以最高法院裁判書為例
Automatic Extraction of Gist of Chinese Judgments of the Supreme Court
Author 陳冠群
Chen, Kuan-Chun
Contributors 劉昭麟
Liu, Chao-Lin
陳冠群
Chen, Kuan-Chun
Keywords Legal informatics
Automatic summarization
Natural language processing
Date 2018
Upload time 27-Jul-2018 12:21:16 (UTC+8)
References [1] Jerome H. Friedman. 2002. Stochastic gradient boosting. Computational Statistics & Data Analysis 38, 4 (February 2002), 367-378. DOI: http://dx.doi.org/10.1016/S0167-9473(01)00065-2
[2] Ethem Alpaydin. 2010. Introduction to Machine Learning (2nd ed.). The MIT Press. 489-493.
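[3] 謝淳達. 2005. 利用詞組檢索中文訴訟文書之研究. Master's thesis. National Chengchi University, Taipei, Taiwan.
[4] 藍家樑. 2009. 中文訴訟文書檢索系統雛形實作. Master's thesis. National Chengchi University, Taipei, Taiwan.
[5] 廖鼎銘. 2004. 觸犯多款法條之賭博與竊盜案件的法院文書的分類與分析. Master's thesis. National Chengchi University, Taipei, Taiwan.
[6] 何君豪. 2006. 階層式分群法在民事裁判要旨分群上之應用. Master's thesis. National Chengchi University, Taipei, Taiwan.
[7] 鄭人豪. 2006. 中文詞彙集的來源與權重對中文裁判書分類成效的影響. Master's thesis. National Chengchi University, Taipei, Taiwan.
[8] 林琬真. 2012. 機器學習於中文法律文件之標記與分類. Master's thesis. National Taiwan University, Taipei, Taiwan.
[9] 黃玉婷. 2012. 以文字探勘技術產製求量刑因子之研究—以我國智慧財產權法律為中心探討. Master's thesis. Soochow University, Taipei, Taiwan.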
[10] Stephen Cole Kleene. 1956. Representation of events in nerve nets and finite automata. Automata Studies. Princeton University Press, Princeton, New Jersey, 3-42.
[11] 林筱瓴. 2013. 文字探勘在判決書上之應用—以著作權法民事賠償為中心. Master's thesis. National Taiwan University, Taipei, Taiwan.
[12] 陳政瑜. 2015. 基於文字探勘技術探討司法裁判書之撰寫一致性:以刑事訴訟停止羈押聲請裁定書為例. Master's thesis. National Taiwan University, Taipei, Taiwan.
[13] 黃詩淳 and 邵軒磊. 2017. 運用機器學習預測法院裁判─法資訊學之實踐. 月旦法學雜誌, No. 270, 86-96. DOI: http://doi.org/10.3966/102559312017110270006
[14] 黃詩淳 and 邵軒磊. 2018. 酌定子女親權之重要因素:以決策樹方法分析相關裁判. 臺大法學論叢, Vol. 47, No. 1, 299-344.
[15] Dou Shen, Jian-Tao Sun, Hua Li, Qiang Yang, and Zheng Chen. 2007. Document summarization using conditional random fields. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI '07), Rajeev Sangal, Harish Mehta, and R. K. Bagga (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2862-2867.
[16] Kam-Fai Wong, Mingli Wu, and Wenjie Li. 2008. Extractive summarization using supervised and semi-supervised learning. In Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1 (COLING '08), Vol. 1. Association for Computational Linguistics, Stroudsburg, PA, USA, 985-992.
[17] Jianpeng Cheng and Mirella Lapata. 2016. Neural summarization by extracting sentences and words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Berlin, Germany, 484-494.
[18] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278-2324. DOI: http://dx.doi.org/10.1109/5.726791
[19] John J. Hopfield. 1988. Neural networks and physical systems with emergent collective computational abilities. In Neurocomputing: foundations of research, James A. Anderson and Edward Rosenfeld (Eds.). MIT Press, Cambridge, MA, USA, 457-464.
[20] Wei-Yun Ma and Keh-Jiann Chen. 2003. Introduction to CKIP Chinese word segmentation system for the first international Chinese Word Segmentation Bakeoff. In Proceedings of the Second SIGHAN Workshop on Chinese Language Processing - Volume 17 (SIGHAN '03), Vol. 17. Association for Computational Linguistics, Stroudsburg, PA, USA, 168-171.
[21] Jeffrey D. Ullman, Alfred V. Aho, and Daniel S. Hirschberg. 1976. Bounds on the complexity of the longest common subsequence problem. Journal of the ACM 23, 1 (January 1976), 1-12. DOI: http://dx.doi.org/10.1145/321921.321922
[22] Lee R. Dice. 1945. Measures of the amount of ecologic association between species. Ecology 26, 3 (1945), 297-302. DOI: http://dx.doi.org/10.2307/1932409
[23] David Harris and Sarah Harris. 2012. Digital design and computer architecture (2nd ed.). Morgan Kaufmann, San Francisco, CA, USA, 129.
[24] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research 3 (March 2003), 1137-1155.
[25] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv:1301.3781. Retrieved from https://arxiv.org/abs/1301.3781
[26] Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. arXiv:1607.04606. Retrieved from https://arxiv.org/abs/1607.04606
[27] Radim Rehurek and Petr Sojka. 2010. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. European Language Resources Association, Valletta, Malta, 45-50.
[28] Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research 12 (November 2011), 2493-2537.
[29] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (November 1997), 1735-1780. DOI: http://dx.doi.org/10.1162/neco.1997.9.8.1735
[30] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems 30 (NIPS 2017). 3149-3157.
[31] Ethem Alpaydin. 2010. Introduction to Machine Learning (2nd ed.). The MIT Press. 220-223.
[32] Diederik P. Kingma and Jimmy Ba. 2016. Adam: A method for stochastic optimization. arXiv:1412.6980. Retrieved from https://arxiv.org/abs/1412.6980
[33] Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45, 11 (1997), 2673-2681.
[34] Tin Kam Ho. 1995. Random decision forests. In Proceedings of the Third International Conference on Document Analysis and Recognition (ICDAR '95), Vol. 1. IEEE Computer Society, Washington, DC, USA, 278-282.
Description Master's thesis
National Chengchi University
Department of Computer Science
105753014
Source http://thesis.lib.nccu.edu.tw/record/#G0105753014
Type thesis
dc.contributor.advisor 劉昭麟 (Liu, Chao-Lin)
dc.identifier (Other Identifiers) G0105753014
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/118964
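dc.description.tableofcontents 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Main Contributions
1.4 Thesis Organization
2 Literature Review
2.1 Applying Natural Language Processing to Chinese Judgments
2.1.1 Judgment Retrieval Systems
2.1.2 Case Classification or Clustering
2.1.3 Analysis of Judgment Factors and Outcome Prediction
2.2 Machine-Learning-Based Automatic Summarization
3 Corpus Sources and System Architecture
3.1 Corpus Sources
3.1.1 Supreme Court Judgments
3.1.2 Judicial Yuan Judgments from Courts at All Levels
3.2 System Architecture
4 Corpus Preprocessing
4.1 Parsing the Raw Data
4.1.1 Extracting HTML Text Blocks
4.1.2 Splitting the Full Judgment Text into Paragraphs
4.1.3 Removing Annotations from Judgment Gists
4.1.4 Splitting Judgment Gists and Judgment Reasons into Clauses
4.1.5 Conversion to JSON and Storage
4.2 Word Segmentation and Part-of-Speech Tagging
4.2.1 Automatic Word Segmentation System
4.2.2 Word Segmentation Results
4.2.3 Part-of-Speech Tagging Results
4.3 Aligning Judgment Gists with Judgment Reasons
4.3.1 Method for Computing Clause Correspondence
4.3.2 Improved Method for Judging Whether Clauses Match
5 Feature Extraction
5.1 Basic Features
5.2 Judgment Features
5.2.1 Judgment Type
5.2.2 Judgment Nature
5.2.3 Case Designation
5.2.4 Judgment Date
5.3 Clause Tagging Features
5.3.1 Rule-Based Clause Tagging
5.3.2 Statute Name Tagging
5.4 Lexical Features
5.4.1 Word Embedding Models
5.4.2 Feature Extraction Method
5.5 Part-of-Speech Features
5.6 Sentence-Initial Word Features
5.7 Feature Types
6 Gist Extraction Models
6.1 Building the Gist Extraction Models
6.2 Deep Learning Models
6.2.1 Fully-connected neural networks
6.2.2 Recurrent neural networks
6.2.3 Hybrid Models
6.3 Gradient Boosting Models
6.4 Two-Stage Learning
6.5 Semi-Supervised Learning
6.6 Ensemble Learning
7 Experimental Design and Analysis of Results
7.1 Experimental Corpus
7.2 Evaluation Method
7.3 Experimental Parameters
7.4 Experiments on Basic Features and Judgment Features
7.4.1 Experimental Design
7.4.2 Analysis of Results
7.5 Experiments on Clause Tagging Features
7.5.1 Experimental Design
7.5.2 Analysis of Results
7.6 Experiments on Lexical Features
7.6.1 Experimental Design
7.6.2 Analysis of Results
7.7 Experiments on Part-of-Speech Features
7.7.1 Experimental Design
7.7.2 Analysis of Results
7.8 Experiments on Sentence-Initial Word Features
7.8.1 Experimental Design
7.8.2 Analysis of Results
7.9 Experiments on Classification Models
7.9.1 Experimental Design
7.9.2 Analysis of Results
7.10 Experiments on Two-Stage Learning
7.10.1 Experimental Design
7.10.2 Analysis of Results
7.11 Experiments on Semi-Supervised Learning
7.11.1 Experimental Design
7.11.2 Analysis of Results
7.12 Experiments on Ensemble Learning
7.12.1 Experimental Design
7.12.2 Analysis of Results
8 Conclusions and Future Work
8.1 Conclusions
8.2 Future Work
References
Appendix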
dc.format.extent 2227790 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/THE.NCCU.CS.003.2018.B02