Title 運用機器學習方法推廣綜合券商大財管業務
Expand Integrated Securities Firm's Business Using Machine Learning
Author 許仲廷
Xu, Zhong-Ting
Contributors 鄭宇庭
許仲廷
Xu, Zhong-Ting
Keywords K-means
Logistic regression
Securities
Customer grouping
Classification
Machine learning
Date 2020
Upload time 3-Aug-2020 18:44:03 (UTC+8)
Abstract This study uses statistical and machine learning methods to identify the customer characteristics of President Securities' three major financial management businesses (insurance agency, overseas markets, and wealth management) and to pin down the key differences between customers who have and have not bought these products. The aim is to let salespeople identify the product preferences of newly opened accounts more efficiently and systematically, raising account-opening and transaction rates and, in turn, revenue.
The research proceeds in two steps. First, k-means clustering is applied to the customers of each of the three businesses to build customer profiles and to study how customer characteristics changed over the two years of data (2018 and 2019). Second, securities customers who have never bought these products are pooled with those who have, and a logistic regression classifier is trained to separate buyers (1) from non-buyers (0). The statistically significant variables in the model are then examined for their effect on purchasing behavior, so that the model can estimate the probability that a newly opened account will buy these products.
In the clustering of wealth management customers, the groups show distinct risk and product preferences when selecting products, but these patterns are highly susceptible to market conditions (bull or bear markets) and to salesperson promotion; in 2019, for example, customers increased their investments during the bull market. In the clustering of insurance agency customers, the high-purchase-amount group is driven by high-priced policies such as whole life insurance, medical insurance, or annuity insurance, with buyers aged between 40 and 70. The low-purchase-amount group tends to buy low-priced policies such as term life insurance or insurance riders, supplemented as needed by accident insurance and a small amount of medical insurance. The 2019 results are roughly the same as those for 2018, apart from a minor effect of some product suspensions, which suggests that insurance purchases are most affected by age, need, and the salesperson.
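As a rough illustration of the clustering step, the sketch below runs k-means (Lloyd's algorithm) on standardized customer features using only the Python standard library. The feature pair (age, purchase amount), the toy values, and the choice of k = 2 are assumptions made for this example; the record does not list the thesis's actual features, software, or number of clusters.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sigmas)) for r in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return centroids, labels


# Hypothetical toy data: (age, yearly purchase amount in thousands).
customers = [(48, 1000), (55, 1200), (60, 1100), (66, 1300),
             (25, 40), (30, 60), (28, 50), (35, 80)]
centroids, labels = kmeans(standardize(customers), k=2)
for row, lab in zip(customers, labels):
    print(lab, row)
```

Standardizing first matters here because purchase amounts are numerically much larger than ages and would otherwise dominate the Euclidean distance.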
In the logistic regression classification, the two models share the same significant variables: Age, Gender, Days_of_registration, Registered_before_2008, and Registered_at_bonus_release. Surprisingly, Total_buy_amount is not significant. The classification results can be used to screen out users with a low probability of purchase and focus on potential customers, while the clustering results can be used to tailor services to different customer groups, raising transactions and revenue.
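To make the classification step concrete, here is a minimal sketch of a logistic regression fitted by batch gradient descent and then used to score a new account, again using only the Python standard library. It is not the thesis's model: only two of the five reported variables are used (a centered Age and Registered_before_2008), the training rows are invented toy data, and the learning rate and epoch count are arbitrary assumptions. Significance testing of the coefficients is not shown and would normally be done with a statistics package.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """Estimated probability of class 1 (buyer); w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns [intercept, w1, ...]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi  # gradient of log-loss w.r.t. the logit
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


# Hypothetical toy rows: ((age - 45) / 10, Registered_before_2008), label 1 = buyer.
X = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 1), (0.0, 0),
     (0.5, 1), (1.0, 1), (1.5, 1), (-1.3, 1), (1.3, 0)]
y = [0, 0, 0, 1, 0, 1, 1, 1, 0, 1]

weights = fit_logistic(X, y)
p_old = predict_proba(weights, (1.0, 1))     # age 55, registered before 2008
p_young = predict_proba(weights, (-1.5, 0))  # age 30, registered after 2008
print(f"older, pre-2008 account: {p_old:.3f}")
print(f"younger, recent account: {p_young:.3f}")
```

Accounts whose estimated probability falls below a chosen cut-off could then be deprioritized, which is the screening use the abstract describes; the cut-off itself is a business decision and is not given in this record.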
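References I. Chinese-language references
1. Huang, D., (2017), 你可能不知道的邏輯迴歸(Logistic Regression) [Logistic regression you may not know], https://taweihuang.hpd.io/2017/12/22/logreg101/.
2. Huang, J., (2019), k means分群演算法與Silhouette輪廓分析 [The k-means clustering algorithm and silhouette analysis], https://ppt.cc/f8lhyx.
3. 袁道寬, (2017), thesis, Graduate Institute of Statistics, National Taipei University, 貝氏羅吉斯模型應用於海上運輸保險可能危險因子之研究 [A study of possible risk factors in marine transport insurance using a Bayesian logistic model].
4. 陳信宏, (2004), thesis, Graduate Institute of Business Administration, National Taiwan University, 利用非動態資料庫之銀行顧客分群研究 [A study of bank customer segmentation using a non-dynamic database].
5. 陸昀泓, (2014), thesis, Graduate Institute of International Business, National Taiwan University, 層級貝氏模型應用於消費者之社群與網路購物行為分析之研究 [A hierarchical Bayesian model applied to the analysis of consumers' social community and online shopping behavior].
6. 謝育儒, (2016), thesis, In-service Master's Program in Financial Information, Department of Finance, National Kaohsiung University of Applied Sciences, 銀行房貸審核逾期因子之探討 [An investigation of delinquency factors in bank mortgage approval].
7. 證券暨期貨月刊 [Securities and Futures Monthly], (2011), 我國證券商之發展與當前監理方向 [The development of Taiwan's securities firms and current supervisory directions], Vol. 29, No. 9.

II. English-language references
1. Big Data Science Practice, (2015), Impact of target class proportions on accuracy of classification, https://kumaranpm.blogspot.com/2015/03/impact-of-target-class-proportions-on.html.
2. Jordan, J., (2017), Customer Segmentation Based on Purchasing Behavior, https://github.com/jeremyjordan/customer-segmentation/blob/master/customer_segments.ipynb.
3. Li, S., (2017), Building A Logistic Regression in Python, Step by Step, https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8.
4. Prasad, G., (2019), Notes on K-prototype for clustering mixed typed data, https://medium.com/@guruprasad0o_o0/notes-on-k-prototype-for-clustering-mixed-typed-data-e80eb526b226.
5. Stack Exchange, (2016), User segmentation by clustering with sparse data, https://stats.stackexchange.com/questions/199501/user-segmentation-by-clustering-with-sparse-data.
6. Mohit Gupta_OMG, (2020), ML | Logistic Regression using Python, https://www.geeksforgeeks.org/ml-logistic-regression-using-python/.
7. Wikipedia - k-means clustering, https://en.wikipedia.org/wiki/K-means_clustering.
8. Wikipedia - Market segmentation, https://en.wikipedia.org/wiki/Market_segmentation.
9. Wikipedia - Statistical classification, https://en.wikipedia.org/wiki/Statistical_classification.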
Description Master's degree
National Chengchi University
Graduate Institute of Business Administration (MBA Degree Program)
107363095
Source http://thesis.lib.nccu.edu.tw/record/#G0107363095
Type thesis
dc.contributor.advisor 鄭宇庭
dc.identifier (Other Identifiers) G0107363095
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/131359
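dc.description.tableofcontents Table of Contents
Acknowledgments
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Research Objectives
Section 3 Research Process
Chapter 2 Literature Review
Section 1 Overview of Clustering and Classification
Section 2 Review of the Literature on Customer Clustering and Classification
Chapter 3 Empirical Analysis
Section 1 Data Overview
Section 2 Clustering Analysis of Financial Management Customers (Clustering)
Section 3 Classification of Securities Customers (Classification)
Chapter 4 Conclusions and Recommendations
Section 1 Research Conclusions
Section 2 Recommendations
References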
dc.format.extent 8576302 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/NCCU202000646