Title 巨量資料下的二位元矩陣分解在財稅資料上的應用
The Application of Binary Matrix Factorization in Large Amount of Taxation Data
Author 曾正男
Contributor Department of Applied Mathematics (應數系)
Keywords Taxation data; big data; small samples; binary matrix factorization; supervised learning; unsupervised learning
Date 2019-11
Upload time 16-Apr-2025 14:28:26 (UTC+8)
Abstract The National Taxation Bureau of the Ministry of Finance receives about 800,000 tax filings each year, and only a relatively small number of them undergo a desk audit to check whether the taxpayer is suspected of tax evasion. We will design a method that combines unsupervised and supervised learning and propose a new binary matrix factorization. The factors can be read directly: under a hierarchical classification of the tax data, they show which variables chiefly determine each category, and within each category they indicate which filings are normal and which may involve tax evasion. This framework applies not only to taxation data but also to small-sample auditing of big data in general.
Relation Ministry of Science and Technology (科技部), MOST107-2115-M004-002, 107.08-108.07 (ROC calendar, i.e. 2018.08-2019.07)
Type report
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/156620
dc.format.extent 116 bytes
dc.format.mimetype text/html