Title: Applying Multiple Imputation to the Discrete Choice Model with Missing Data (original title: 應用多重插補法在包含遺漏資料的離散選擇模型)
Author: Jian, Ting Han (簡廷翰)
Contributors: Cheng, Tsung Chi (鄭宗記); Jian, Ting Han (簡廷翰)
Keywords: Multiple imputation; Discrete choice model
Date: 2017
Upload time: 13-Sep-2017 14:11:41 (UTC+8)
Abstract: This thesis examines parameter estimation for the logit model, a discrete choice model, when the data contain missing values. It compares two approaches: complete case analysis, which removes every observation that has a missing value, and multiple imputation.
The multiple imputation method used is Multiple Imputation by Chained Equations (MICE), proposed by van Buuren et al. (2007), and the estimates from the imputed data sets are pooled with Rubin's (1987) method. Box plots of the parameter bias in the simulation show that the estimates obtained after imputation differ little from the true parameter values, and that the number of imputations has little effect on the estimates. When the missing percentage is large, the estimates from the imputed data are more stable than those from complete case analysis.
In the real data analysis, the standard deviations of the estimates from complete case analysis tend to be larger than those from the imputed data, which agrees with the simulation results.
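As an illustration of the pooling step mentioned in the abstract, the sketch below applies Rubin's rules to a single scalar coefficient estimated on m imputed data sets. It is not code from the thesis: the function name `pool_rubin` and the numbers in the usage lines are hypothetical, and the sketch assumes each imputed data set has already been fitted separately so that one estimate and one standard error per data set are available.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one scalar parameter across m imputed data sets (Rubin's rules).

    estimates  -- point estimates, one per imputed data set
    std_errors -- matching standard errors, one per imputed data set
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates, each with a standard error")

    pooled = mean(estimates)                      # average of the m estimates
    within = mean([se ** 2 for se in std_errors]) # average within-imputation variance
    between = variance(estimates)                 # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between        # total variance of the pooled estimate
    return pooled, math.sqrt(total)


# Hypothetical values for one logit coefficient from m = 3 imputed data sets.
pooled_estimate, pooled_se = pool_rubin([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
print(pooled_estimate, pooled_se)
```

The pooled standard error is larger than the per-data-set standard errors whenever the estimates differ across imputations, because the between-imputation variance carries the extra uncertainty introduced by the missing values.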
References:
Atkinson, A. C. and T.-C. Cheng (2000). On Robust Linear Regression with Incomplete Data. Computational Statistics and Data Analysis, 33, 361-380.
Azur, M. J., E. A. Stuart, C. Frangakis, and P. J. Leaf (2011). Multiple Imputation by Chained Equations: What Is It and How Does It Work? International Journal of Methods in Psychiatric Research, 20(1), 40-49.
Hensher, D. A., J. M. Rose, and W. H. Greene (2015). Applied Choice Analysis. Cambridge University Press.
Vink, G., L. E. Frank, J. Pannekoek, and S. van Buuren (2014). Predictive Mean Matching Imputation of Semicontinuous Variables. Statistica Neerlandica, 68(1), 61-90.
Train, K. E. (2003). Discrete Choice Methods with Simulation. Cambridge University Press.
Train, K. E. and Y. Croissant. Kenneth Train's Exercises Using the mlogit Package for R. Also available at https://cran.r-project.org/web/packages/mlogit/vignettes/Exercises.pdf
Carpenter, J. R. and M. G. Kenward (2013). Multiple Imputation and Its Application. Wiley.
Graham, J. W. (2012). Missing Data: Analysis and Design. Springer.
Louviere, J. J., D. A. Hensher, and J. D. Swait (2000). Stated Choice Methods: Analysis and Application. Cambridge University Press.
Boxall, P. C. and W. L. Adamowicz (2002). Understanding Heterogeneous Preferences in Random Utility Models: A Latent Class Approach. Environmental and Resource Economics, 23, 421-446.
Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons.
Little, R. J. A. and D. B. Rubin (2002). Statistical Analysis with Missing Data, Second Edition. Wiley.
Van Buuren, S. and K. Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. Also available at http://www.jstatsoft.org/v45/i03/
Croissant, Y. (2013). Estimation of Multinomial Logit Models in R: The mlogit Packages. Also available at https://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf
Description: Master's thesis
National Chengchi University
Department of Statistics
104354001
Source: http://thesis.lib.nccu.edu.tw/record/#G1043540012
Type: thesis
dc.contributor.advisor: Cheng, Tsung Chi (鄭宗記)
dc.contributor.author (Authors): Jian, Ting Han (簡廷翰)
dc.creator (Author): Jian, Ting Han (簡廷翰)
dc.date (Date): 2017
dc.date.accessioned: 13-Sep-2017 14:11:41 (UTC+8)
dc.date.available: 13-Sep-2017 14:11:41 (UTC+8)
dc.date.issued (Upload time): 13-Sep-2017 14:11:41 (UTC+8)
dc.identifier (Other Identifiers): G1043540012
dc.identifier.uri (URI): http://nccur.lib.nccu.edu.tw/handle/140.119/112616
dc.description (Description): Master's thesis
dc.description (Description): National Chengchi University
dc.description (Description): Department of Statistics
dc.description (Description): 104354001
dc.description.tableofcontents:
Chapter 1. Introduction
Chapter 2. Discrete Choice Models
Section 1. Choice Sets
Section 2. Utility Theory and Derivation of Choice Probabilities
Section 3. Normalization of Discrete Choice Model Parameters
Section 4. Overview of Different Discrete Choice Models
Chapter 3. The Missing Data Problem
Section 1. Missingness Mechanisms
Section 2. Methods for Handling Missing Values
Section 3. Multiple Imputation by Chained Equations
Section 4. Predictive Mean Matching
Chapter 4. Simulation Method and Conclusions
Section 1. Simulation Method
Section 2. Simulation Conclusions
Chapter 5. Empirical Analysis
Section 1. Introduction to the Empirical Data
Section 2. Variable Descriptions
Section 3. Application of the Discrete Choice Model
Chapter 6. Conclusion
References
Appendix
Appendix 1. Performing Arts Consumption Behavior Survey Questionnaire
dc.format.extent: 2263206 bytes
dc.format.mimetype: application/pdf
dc.source.uri (Source): http://thesis.lib.nccu.edu.tw/record/#G1043540012
dc.subject (Keywords): Multiple imputation
dc.subject (Keywords): Discrete choice model
dc.title (Title): Applying Multiple Imputation to the Discrete Choice Model with Missing Data
dc.type (Type): thesis