Analysis and Document Classification of the High Court’s Financial Appeal Judgment (高等法院金融上訴裁判書之分析與文件分類) | Publication

Title	Analysis and Document Classification of the High Court’s Financial Appeal Judgment (高等法院金融上訴裁判書之分析與文件分類)
Author	Chang, San-Mei (張三妹)
Contributors	Liu, Chao-Lin (劉昭麟); Chang, San-Mei (張三妹)
Keywords	Machine Learning; Automatic Document Classification; Unsupervised Learning; Model Training; Model Evaluation
Date	2024
Upload time	1-Mar-2024 14:11:29 (UTC+8)
Abstract	Financial fraud cases have reached record highs in recent years, and many proceed beyond the first instance. This study analyzes statistics on appeal cases in the financial field to give an in-depth picture of financial appeal cases and appeal trends over the past 10 years. No existing work examines or classifies financial appeal (金上訴) cases in depth, so the study aims to offer deeper insight and analytical information to researchers in law and finance. The study collects the High Court's second-instance financial appeal judgments from the past 10 years, parses each judgment to extract the case outcome, groups similar outcomes into categories, and tallies them to show the trend of appeal cases over the years and the number of cases in each outcome category. It then applies machine learning for automatic document classification: using the category directories and the corresponding parsed judgments, a multinomial naive Bayes classifier is trained. The training and prediction results are used to analyze the causes of misclassification, correct errors in the manual classification, and raise classification accuracy. In addition, the unsupervised k-means clustering algorithm is applied, and the 20 highest-weighted words in each resulting cluster serve as a reference for adjusting the categories. Classifying financial appeal cases by manual inspection takes a great deal of time and labor; model training and model evaluation help find better categories and improve classification correctness. The predicted classification directories and analysis information produced by this process should help those who later study financial judgments with a preliminary classification, saving the time spent reviewing and manually sorting judgments one by one and improving the efficiency of research and analysis.
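The supervised step described in the abstract (train a multinomial naive Bayes classifier on manually categorized judgments, then use its predictions to find manual labels worth rechecking) can be illustrated with a minimal sketch. This is not the thesis's code: the thesis's reference list cites scikit-learn and the CKIP word segmenter, while this sketch uses only the Python standard library, assumes documents are already tokenized, and uses made-up English tokens and category names ("dismissed", "reversed"). The function names and the smoothing parameter `alpha` are also illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Collect class priors and per-class token counts from tokenized docs."""
    class_doc_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "priors": {c: n / len(labels) for c, n in class_doc_counts.items()},
        "token_counts": token_counts,
        "totals": {c: sum(token_counts[c].values()) for c in class_doc_counts},
        "vocab": vocab,
        "alpha": alpha,  # additive (Laplace) smoothing strength
    }


def predict(model, tokens):
    """Return the class with the highest smoothed log-probability."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for c, prior in model["priors"].items():
        score = math.log(prior)
        denom = model["totals"][c] + alpha * vocab_size
        for t in tokens:
            if t in model["vocab"]:  # tokens never seen in training are skipped
                score += math.log((model["token_counts"][c][t] + alpha) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def flag_disagreements(model, docs, manual_labels):
    """List (index, manual label, predicted label) where the two differ."""
    flagged = []
    for i, (tokens, manual) in enumerate(zip(docs, manual_labels)):
        predicted = predict(model, tokens)
        if predicted != manual:
            flagged.append((i, manual, predicted))
    return flagged


# Hypothetical toy data standing in for segmented judgment text.
train_docs = [
    ["appeal", "dismissed", "original", "judgment", "upheld"],
    ["appeal", "dismissed", "upheld"],
    ["original", "judgment", "reversed", "sentence", "reduced"],
    ["reversed", "acquitted"],
]
train_labels = ["dismissed", "dismissed", "reversed", "reversed"]
model = train_multinomial_nb(train_docs, train_labels)

new_prediction = predict(model, ["appeal", "dismissed"])

# A manually labeled document whose label the model disagrees with.
review_docs = [["reversed", "sentence", "reduced"]]
review_labels = ["dismissed"]
flagged = flag_disagreements(model, review_docs, review_labels)
```

Documents listed in `flagged` are candidates for a second look at the manual label, which matches the abstract's use of prediction errors to correct manual classification. The k-means step is not covered by this sketch.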
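Description	Master's; National Chengchi University (國立政治大學); In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班); 108971020
Source	http://thesis.lib.nccu.edu.tw/record/#G0108971020
Type	thesis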

dc.contributor.advisor	劉昭麟	zh_TW
dc.contributor.advisor	Liu, Chao-Lin	en_US
dc.contributor.author (Authors)	張三妹	zh_TW
dc.contributor.author (Authors)	Chang, San-Mei	en_US
dc.creator (Author)	張三妹	zh_TW
dc.creator (Author)	Chang, San-Mei	en_US
dc.date (Date)	2024	en_US
dc.date.accessioned	1-Mar-2024 14:11:29 (UTC+8)	-
dc.date.available	1-Mar-2024 14:11:29 (UTC+8)	-
dc.date.issued (Upload time)	1-Mar-2024 14:11:29 (UTC+8)	-
dc.identifier (Other Identifiers)	G0108971020	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/150258	-
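dc.description (Description)	Master's (碩士)	zh_TW
dc.description (Description)	National Chengchi University (國立政治大學)	zh_TW
dc.description (Description)	In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班)	zh_TW
dc.description (Description)	108971020	zh_TW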
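dc.description.tableofcontents	Chapter 1 Introduction; 1.1 Research Background and Motivation; 1.2 Research Objectives; 1.3 Main Contributions; 1.4 Thesis Organization; Chapter 2 Literature Review; 2.1 Studies Applying Natural Language Processing to Chinese Court Judgments; 2.2 Studies on Types of Financial Crime; 2.3 Studies on Applications of Classification Techniques; Chapter 3 Related Work and Technical Background; 3.1 Dataset Characteristics; 3.2 Tools Used; 3.3 Evaluation Metrics; 3.3.1 Model Evaluation (classification report); 3.3.2 Confusion Matrix; Chapter 4 Methodology; 4.1 Research Framework; 4.2 Experimental Procedure: File Preprocessing; 4.3 Experimental Procedure; 4.3.1 Supervised Workflow: 2018 to 2022 Dataset; 4.3.2 Supervised Workflow: 2013 to 2022 Dataset; 4.3.3 Unsupervised Workflow; Chapter 5 Experimental Design and Results Analysis; 5.1 Description of the Research Design; 5.1.1 Notes on Category Adjustments; 5.2 Analysis of Experimental Results; Chapter 6 Conclusion; 6.1 Conclusions from the Experiments and Analysis; 6.2 Future Work; References; Appendix A Oral Examination Committee Questions and Related Revisions	zh_TW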
dc.format.extent	4837063 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0108971020	en_US
dc.subject (Keywords)	Machine Learning (機器學習)	en_US
dc.subject (Keywords)	Automatic Document Classification (自動文件分類)	en_US
dc.subject (Keywords)	Unsupervised Learning (非監督式學習)	en_US
dc.subject (Keywords)	Model Training (模型訓練)	en_US
dc.subject (Keywords)	Model Evaluation (模型評價)	en_US
dc.title (Title)	高等法院金融上訴裁判書之分析與文件分類	zh_TW
dc.title (Title)	Analysis and Document Classification of the High Court’s Financial Appeal Judgment	en_US
dc.type (Type)	thesis	en_US
