Using Indictments to Assist Judges in Judging – A Case Study with Offenses of Larceny (透過起訴書輔助法院判決－以竊盜罪為例)

李右元; Lee, Yu-Yuan

Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/134084

Title:	Using Indictments to Assist Judges in Judging – A Case Study with Offenses of Larceny (透過起訴書輔助法院判決－以竊盜罪為例)
Author:	李右元 Lee, Yu-Yuan
Contributors:	劉昭麟 Liu, Chao-Lin; 李右元 Lee, Yu-Yuan
Keywords:	judgement prediction; similar case recommendation; LegalTech applications; judgement assistance; deep learning
Date:	2021
Upload date:	2-Mar-2021
Abstract:	Advances in technology in recent years have carried natural language processing into many fields, including law. In Taiwan, applications that combine law and technology are still in their infancy, although some community events, such as the LegalTech Hackathon, have begun to focus on this area.
In a criminal proceeding in Taiwan, a case is first investigated by a prosecutor; only if the defendant is indicted is the case transferred to a judge for trial and judgement. Proceedings are often lengthy, and one possible reason is that the prosecutor or the defendant disagrees with the judgement and appeals. Furthermore, with the introduction of the citizen judge system, Taiwan may gradually move toward lay participation in trials. Unlike professional judges, citizen judges may have no legal knowledge or experience in judging, which may make their influence on the final judgement less objective.
The goal of this experiment is therefore to assist judgement. Its intended users include the general public, defendants, citizen judges, and even professional judges and lawyers. The experiment combines two tasks: judgement prediction and similar case recommendation. Besides giving users a likely judgement, it provides two types of similar cases, those "consistent with the prediction" and those "not consistent with the prediction", so that users can compare and consult cases from different angles.
Previous experiments on judgement prediction mostly used court judgements as the corpus. We move one step earlier in the proceeding and use indictments as our main corpus, aiming to provide assistance before the judgement is decided. Even with a small number of indictments and imbalanced judgement classes, judgement prediction reaches a best average macro-F1 score of 0.665, and the similar case recommendation experiments show that similar cases can indeed be found from the content of an indictment.
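The abstract reports its prediction result as a macro-F1 score under imbalanced judgement classes. The following is a minimal sketch of what that metric measures, not the thesis's own evaluation code: macro-F1 averages the per-class F1 scores with equal weight, so a model that only predicts the majority class scores poorly even when its accuracy looks acceptable. The class labels (`fine`, `detention`, `imprisonment`) and the toy data are hypothetical and do not come from the thesis.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy data: four "fine" cases and one of each other class.
y_true = ["fine", "fine", "fine", "fine", "detention", "imprisonment"]
y_pred = ["fine"] * 6  # a model that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}, macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the majority-class predictor gets four of six cases right, yet two of the three classes have an F1 of zero, which pulls the macro average well below the accuracy. That is why macro-F1 is a common choice when class sizes are uneven.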
Description:	Master's thesis; National Chengchi University; Department of Computer Science; 107753027
Source:	http://thesis.lib.nccu.edu.tw/record/#G0107753027
Type:	thesis
Appears in Collections:	Theses (學位論文)

Files in This Item:

File	Description	Size	Format
302701.pdf		12.17 MB	Adobe PDF
