小林眼鏡客戶資料的統計分析 (Statistical Analysis of Customer Data for Kobayashi Glasses) | Publication

Title	小林眼鏡客戶資料的統計分析 (Statistical Analysis of Customer Data for Kobayashi Glasses)
Author	陳冠全
Contributors	翁久幸 (advisor); 陳冠全
Keywords	Kobayashi; data mining; association rule; logistic regression; classification and regression tree
Date	2012
Upload time	1-Jul-2013 17:01:42 (UTC+8)
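Abstract	As technology advances, people spend more and more time with electronic devices, from televisions and computers to smartphones, and the share of Taiwan's population with poor eyesight is rising. According to 2013 data from the National Statistics website of the Republic of China, 49.36% of elementary school students and 73.66% of junior high school students have poor eyesight. This has made eyeglasses increasingly important as a corrective product, enlarged the eyewear consumer base, and intensified competition in the eyewear industry. This thesis studies the customers of Kobayashi Glasses (小林眼鏡), a well-known Taiwanese eyewear brand, using statistical techniques and the data mining software Clementine 12.0 to extract useful information from customer transaction data.

The study first analyzes the yearly trends in sales amount and number of sales for all Kobayashi products, using time series plots and basic statistics, and identifies products whose sales grow or decline steadily. It then describes the structure of the whole customer base with basic statistical charts and applies RFM analysis, where R (Recency) is the most recent purchase date, F (Frequency) is the purchase frequency, and M (Monetary Amount) is the purchase amount, to select the important customers. The structure of the important customers is analyzed and compared with that of all customers.

For the selected important customers, the Apriori algorithm for association rules is used to mine rules linking customer traits to products. Two kinds of purchase rules are reported: customer traits leading to the product "disposable contact lenses" and customer traits leading to the product "solution". The study also indirectly examines how the results differ depending on whether the rule thresholds, minimum support α and minimum confidence β, are set.

Finally, two model types, logistic regression and classification and regression trees (CART), are used to predict which customers will buy disposable contact lenses in the future. Separate models are built according to whether important variables are selected beforehand and according to which years of data are used as predictors. Comparing accuracy on the test set shows that:

1. The logistic model that additionally uses data from ROC year 98 (2009) has better explanatory power, but its accuracy is not clearly different.
2. The CART model that additionally uses data from ROC year 98 has better accuracy.
3. Selecting important variables beforehand makes no clear difference to the accuracy of either the logistic or the CART models.
4. The most important and second most important variables are "number of D3 items purchased in ROC year 99 (2010)" and "number of E1 items purchased in ROC year 99".
5. The CART models are more accurate than the logistic regression models.

The study also examines the effect of balancing the customer data before building the models. With balancing, the models predict more customers who will not buy contact lens products; without balancing, they predict more customers who will buy. All findings are provided to Kobayashi Glasses as a reference, in the hope that the company can use them to design marketing strategies, extend the customer life cycle, and sustain its business.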
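The RFM selection described in the abstract can be illustrated with a minimal sketch. The record does not say how the thesis scored R, F and M in Clementine 12.0, so the rank-based equal-size bins, the function name `rfm_scores`, and the sample transactions below are all assumptions made for illustration.

```python
from datetime import date


def rfm_scores(transactions, today, bins=3):
    """Score each customer from 1 (worst) to `bins` (best) on R, F and M.

    transactions: iterable of (customer_id, purchase_date, amount).
    The rank-based binning is an illustrative assumption, not the
    scoring scheme used in the thesis.
    """
    # Summarize each customer: last purchase date, purchase count, total spend.
    summary = {}
    for cust, day, amount in transactions:
        last, freq, money = summary.get(cust, (day, 0, 0.0))
        summary[cust] = (max(last, day), freq + 1, money + amount)

    def rank_score(values, higher_is_better=True):
        # Order customers from worst to best, then cut the ranks into
        # equal-sized bins. Ties are broken by input order.
        order = sorted(values, key=values.get, reverse=not higher_is_better)
        n = len(order)
        return {c: 1 + (i * bins) // n for i, c in enumerate(order)}

    # Recency is measured in days since the last purchase (fewer is better).
    recency = {c: (today - s[0]).days for c, s in summary.items()}
    frequency = {c: s[1] for c, s in summary.items()}
    monetary = {c: s[2] for c, s in summary.items()}

    r = rank_score(recency, higher_is_better=False)
    f = rank_score(frequency)
    m = rank_score(monetary)
    return {c: (r[c], f[c], m[c]) for c in summary}


# Hypothetical transactions, not taken from the Kobayashi data.
sample = [
    ("A", date(2012, 1, 5), 100), ("A", date(2012, 5, 1), 50),
    ("A", date(2012, 6, 1), 300), ("B", date(2011, 3, 1), 50),
    ("C", date(2012, 2, 1), 120), ("C", date(2012, 3, 1), 80),
]
scores = rfm_scores(sample, today=date(2012, 7, 1))
```

Under this scheme a customer with the top score on all three dimensions, such as `A` in the sample, would be a candidate "important customer"; the cut-off actually used in the thesis is not given in this record.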
Description	Master's; National Chengchi University; Graduate Institute of Statistics; 99354022; 101
Source	http://thesis.lib.nccu.edu.tw/record/#G0993540221
Type	thesis
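The abstract's thresholds for association rules, minimum support α and minimum confidence β, can be illustrated with a small sketch. It only computes support and confidence for rules with one item on each side and filters them by the two thresholds; it is not the full Apriori candidate generation, and the function name `mine_rules` and the sample baskets are hypothetical.

```python
from itertools import combinations


def mine_rules(baskets, min_support=0.0, min_confidence=0.0):
    """Return (lhs, rhs, support, confidence) for single-item rules.

    baskets: list of item collections; items may mix customer traits
    and products, as in rules of the form trait -> product.
    """
    n = len(baskets)
    sets = [frozenset(b) for b in baskets]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of baskets that contain every item in `itemset`.
        return sum(itemset <= s for s in sets) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset((a, b)))
        # Skip pairs that never co-occur or fall below minimum support.
        if pair_support == 0 or pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support(frozenset((lhs,)))
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical baskets, not taken from the Kobayashi data.
baskets = [
    {"female", "disposable lenses"},
    {"female", "disposable lenses", "solution"},
    {"male", "solution"},
    {"female", "solution"},
]
all_rules = mine_rules(baskets)                      # no thresholds set
strong_rules = mine_rules(baskets, min_support=0.5,  # alpha
                          min_confidence=0.9)        # beta
```

Leaving both thresholds at zero returns every rule whose items co-occur at least once, while setting α and β keeps only the frequent, reliable ones, which is the contrast the abstract refers to.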

dc.contributor.advisor	翁久幸	zh_TW
dc.contributor.author (Authors)	陳冠全	zh_TW
dc.creator (Author)	陳冠全	zh_TW
dc.date (Date)	2012	en_US
dc.date.accessioned	1-Jul-2013 17:01:42 (UTC+8)	-
dc.date.available	1-Jul-2013 17:01:42 (UTC+8)	-
dc.date.issued (Upload time)	1-Jul-2013 17:01:42 (UTC+8)	-
dc.identifier (Other Identifiers)	G0993540221	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/58669	-
dc.description (Description)	碩士 (Master's)	zh_TW
dc.description (Description)	國立政治大學 (National Chengchi University)	zh_TW
dc.description (Description)	統計研究所 (Graduate Institute of Statistics)	zh_TW
dc.description (Description)	99354022	zh_TW
dc.description (Description)	101	zh_TW
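dc.description.tableofcontents	Chapter 1 Introduction: 1.1 Research background and motivation; 1.2 Research objectives; 1.3 Research framework. Chapter 2 Literature review. Chapter 3 Research methods: 3.1 RFM analysis; 3.2 Association rules; 3.3 Classification methods; 3.4 Clementine 12.0. Chapter 4 Empirical study: 4.1 Data description; 4.2 Descriptive statistics and RFM analysis; 4.3 Association rule analysis; 4.4 Predictive analysis. Chapter 5 Conclusions and recommendations: 5.1 Conclusions; 5.2 Recommendations. References. Appendix.	zh_TW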
dc.format.extent	1708861 bytes	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en_US	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0993540221	en_US
dc.subject (Keywords)	小林眼鏡 (Kobayashi Glasses)	zh_TW
dc.subject (Keywords)	資料採礦 (data mining)	zh_TW
dc.subject (Keywords)	關聯規則 (association rules)	zh_TW
dc.subject (Keywords)	羅吉斯迴歸 (logistic regression)	zh_TW
dc.subject (Keywords)	分類與決策樹 (classification and decision trees)	zh_TW
dc.subject (Keywords)	kobayashi	en_US
dc.subject (Keywords)	data mining	en_US
dc.subject (Keywords)	association rule	en_US
dc.subject (Keywords)	logistic regression	en_US
dc.subject (Keywords)	classification and regression tree	en_US
dc.title (Title)	小林眼鏡客戶資料的統計分析	zh_TW
dc.title (Title)	Statistical analysis of customer data for Kobayashi glasses	en_US
dc.type (Type)	thesis	en
