Title: Federated Learning: Predicting Lung Cancer Survival Rates (聯邦學習:肺癌生存率預測)
Author: Liu, Yuan (劉源)
Contributors: Hsieh, Ming-hua (謝明華), advisor; Liu, Yuan (劉源), author
Keywords: Federated learning; Lung cancer; Data island
Date: 2022
Uploaded: 10-Feb-2022 12:56:26 (UTC+8)
Abstract: As data protection rules grow stricter, insurance companies need ways to use more data, while still complying with data security requirements, to better predict claim incidence and premium rates for cancer insurance. This thesis examines one approach to the problem that data cannot be transferred between enterprises: federated learning. Using lung cancer survival prediction as the task, it compares the performance of federated learning with that of traditional machine learning. The results show that federated learning achieves performance comparable to traditional machine learning even when the data must remain local. The thesis therefore argues that federated learning offers insurance companies a new approach to predicting premium rates and claim rates, helping them overcome insufficient data and regulatory restrictions.
Description: Master's
National Chengchi University
Department of Risk Management and Insurance
108358029
Source: http://thesis.lib.nccu.edu.tw/record/#G0108358029
Type: thesis
Other identifier: G0108358029
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/138895
DOI: 10.6814/NCCU202200065
Format: application/pdf, 2275263 bytes
Table of contents:
Abstract
Table of Contents
List of Tables
Chapter 1 Introduction
  Section 1 Research Motivation
  Section 2 Research Structure
Chapter 2 Overview of Machine Learning Methods
  Section 1 Traditional Machine Learning
  Section 2 Federated Learning
Chapter 3 Empirical Study
  Section 1 Data Preprocessing
  Section 2 Empirical Study of Traditional Machine Learning
  Section 3 Empirical Study of Federated Learning
  Section 4 Comparison of Federated Learning and Traditional Machine Learning
Chapter 4 Conclusion
References
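
Illustrative note: the abstract's central claim is that a model trained while data stays at each site can perform about as well as one trained on pooled data. The sketch below shows one common way this can work, federated averaging, and is not taken from the thesis. The record does not state which algorithm, model, or dataset the thesis uses, so everything here is an assumption for illustration: the logistic regression model, the synthetic two-feature data standing in for each site's private records, and all function names and hyperparameters.

```python
import math
import random


def make_client(rng, n):
    """Synthetic stand-in for one site's private records (not thesis data)."""
    xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    ys = [1 if x0 + x1 > 0 else 0 for x0, x1 in xs]
    return xs, ys


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def local_train(weights, data, lr=0.5, epochs=5):
    """Full-batch gradient descent for logistic regression on one dataset."""
    w0, w1, b = weights
    xs, ys = data
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in zip(xs, ys):
            err = sigmoid(w0 * x0 + w1 * x1 + b) - y
            g0 += err * x0
            g1 += err * x1
            gb += err
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return [w0, w1, b]


def fed_avg(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train_federated(clients, rounds=20):
    """Only weight vectors leave a client; the raw records never do."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        updates = [local_train(weights, data) for data in clients]
        weights = fed_avg(updates, [len(data[0]) for data in clients])
    return weights


def train_pooled(clients, epochs=100):
    """Baseline: merge every client's records and train centrally."""
    xs = [x for data in clients for x in data[0]]
    ys = [y for data in clients for y in data[1]]
    return local_train([0.0, 0.0, 0.0], (xs, ys), epochs=epochs)


def accuracy(weights, data):
    w0, w1, b = weights
    xs, ys = data
    hits = sum(
        (w0 * x0 + w1 * x1 + b > 0) == (y == 1) for (x0, x1), y in zip(xs, ys)
    )
    return hits / len(xs)


def demo(seed=0):
    """Return (federated accuracy, pooled accuracy) on held-out synthetic data."""
    rng = random.Random(seed)
    clients = [make_client(rng, 100) for _ in range(3)]
    held_out = make_client(rng, 500)
    return (
        accuracy(train_federated(clients), held_out),
        accuracy(train_pooled(clients), held_out),
    )


if __name__ == "__main__":
    fed_acc, pooled_acc = demo()
    print(f"federated: {fed_acc:.3f}  pooled: {pooled_acc:.3f}")
```

In this toy setting the three clients hold similarly distributed data of equal size, which is the easy case for federated averaging. Real multi-institution data is often imbalanced or distributed differently across sites, so the gap between the two approaches may be larger in practice.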