A Study of Refusal Imputation for Guttman Scales (葛特曼量表之拒答插補研究) | Publication

Publications-Theses


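Title	A Study of Refusal Imputation for Guttman Scales (葛特曼量表之拒答插補研究)
Author	左宗光
Contributors	江振東 (advisor); 左宗光
Keywords	refusal; Guttman scale; simple imputation; multiple imputation; nearest neighbor imputation
Date	2008
Upload time	8-Dec-2010 14:52:55 (UTC+8)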
Abstract	In sample surveys, respondents may refuse to answer certain items because the questions are unclear, sensitive, or touch on personal privacy. Analyses based on data containing refusals may be biased, so how nonresponse is handled is often key to whether a study's results are credible. A common approach is to impute the refused items. However, there has been no established criterion for judging the quality of an imputation, and analysis results are often questioned for this reason. This study focuses on survey items that form a Guttman scale and uses imputation accuracy to compare three approaches: the simple imputation methods commonly used in social science research, multiple imputation, and nearest neighbor imputation. It also infers when each approach is appropriate. The data are the sexual attitude items from the 2002 Taiwan Social Change Survey (TSCS; phase 4, wave 3). Records that satisfy a perfect Guttman scale are treated as the "gold standard," and refusals are generated from the gold standard according to the refusal patterns observed in the original data; as the refusal rate increases, the count for each refusal pattern is scaled up proportionally. The results show that the accuracy of simple imputation can be derived by formula. For this data set, no version of simple imputation achieves an accuracy above 32%, though the refusal rate varies considerably with the refusal pattern and the degree of social openness. Multiple imputation performs slightly better, with an accuracy close to 33%, but it is less convenient than simple imputation and less efficient in computer time. Nearest neighbor imputation performs best of the three, with an accuracy of up to about 47%, but it requires much more computer time than the other two methods, and its accuracy tends to decrease as the refusal rate rises.
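The evaluation procedure described in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the thesis's own R code: the function names, the toy data, the record-level definition of accuracy, and the use of Hamming distance for the nearest neighbor step are all hypothetical choices made here for illustration.

```python
def is_guttman(pattern):
    # Assumption: items are ordered from most to least widely endorsed,
    # so a perfect Guttman pattern is a run of 1s followed by a run of 0s.
    return all(a >= b for a, b in zip(pattern, pattern[1:]))


def simple_impute(masked, fill):
    # Simple imputation: replace every refusal (None) with one fixed answer.
    return tuple(fill if v is None else v for v in masked)


def nearest_neighbor_impute(masked, donors):
    # Borrow the refused answers from the complete record that disagrees
    # with the observed answers on the fewest items (Hamming distance).
    # On ties, min() keeps the first donor in the list.
    def distance(donor):
        return sum(1 for m, d in zip(masked, donor) if m is not None and m != d)

    best = min(donors, key=distance)
    return tuple(d if m is None else m for m, d in zip(masked, best))


def accuracy(truth, imputed):
    # Assumed definition: share of records whose imputed pattern matches
    # the gold-standard pattern exactly.
    return sum(t == i for t, i in zip(truth, imputed)) / len(truth)


# Toy gold standard: four perfect Guttman patterns on four items (invented).
gold = [(1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 1, 1)]
assert all(is_guttman(p) for p in gold)

# Refusals created by blanking one answer per record (None marks a refusal).
masked = [(1, None, 1, 0), (1, 1, None, 0), (None, 0, 0, 0), (1, 1, 1, None)]

# Complete records used as donors for the nearest neighbor method (invented).
donors = [(1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 1, 1)]

for name, imputed in [
    ("fill with 0", [simple_impute(m, 0) for m in masked]),
    ("fill with 1", [simple_impute(m, 1) for m in masked]),
    ("nearest neighbor", [nearest_neighbor_impute(m, donors) for m in masked]),
]:
    print(name, accuracy(gold, imputed))
```

Because the true answers are known for the gold-standard records, each method can be scored directly against them. The toy numbers printed here say nothing about the accuracies reported for the TSCS data.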
Description	Master's thesis; National Chengchi University; Graduate Institute of Statistics; 96354013; 97
Source	http://thesis.lib.nccu.edu.tw/record/#G0096354013
Type	thesis

dc.contributor.advisor	江振東	zh_TW
dc.contributor.author (Authors)	左宗光	zh_TW
dc.creator (Author)	左宗光	zh_TW
dc.date (Date)	2008	en_US
dc.date.accessioned	8-Dec-2010 14:52:55 (UTC+8)	-
dc.date.available	8-Dec-2010 14:52:55 (UTC+8)	-
dc.date.issued (Upload time)	8-Dec-2010 14:52:55 (UTC+8)	-
dc.identifier (Other Identifiers)	G0096354013	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/49599	-
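dc.description (Description)	Master's thesis	zh_TW
dc.description (Description)	National Chengchi University	zh_TW
dc.description (Description)	Graduate Institute of Statistics	zh_TW
dc.description (Description)	96354013	zh_TW
dc.description (Description)	97	zh_TW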
dc.description.tableofcontents	Chapter 1. Research Background and Motivation
Chapter 2. Literature Review
  Section 1. The Guttman Scale Model
  Section 2. Previous Related Research
  Section 3. Definition and Mechanisms of Missing Values
  Section 4. Multiple Imputation
  Section 5. Nearest Neighbor Imputation
Chapter 3. Data Analysis
  Section 1. Introduction to the Variables
  Section 2. Description of the Response Variables
  Section 3. Analysis of Demographic Variables
Chapter 4. Empirical Analysis
  Section 1. Simple Imputation
  Section 2. Multiple Imputation
  Section 3. Nearest Neighbor Imputation
Chapter 5. Conclusions and Suggested Improvements
  Section 1. Conclusions
  Section 2. Suggested Improvements
References
Appendices
  Appendix 1. Simple Imputation (Ignoring the Guttman Scale Property)
  Appendix 2. R Code for Generating Refusals
  Appendix 3. R Code for Multiple Imputation
  Appendix 4. R Code for Nearest Neighbor Imputation
  Appendix 5. Multiple Imputation under Other Models
List of Tables
  Table 2.1 Suggested distributions for performing multiple imputation
  Table 2.2 Distance calculations for five records
  Table 3.1 Proportions of response types without refusals
  Table 3.2 Calculation of the number of errors
  Table 3.3 Counts of agree and disagree responses for each question
  Table 3.4 Computed Guttman scale indices
  Table 3.5 Frequency distribution of gender
  Table 3.6 Frequency distribution of marital status
  Table 3.7 Frequency distribution of education level
  Table 3.8 Frequency distribution of years of education
  Table 3.9 Frequency distribution of average monthly income
  Table 4.1 Possible response forms for each refusal pattern (original data)
  Table 4.2 Number of each response to be generated at a 10% refusal rate
  Table 4.3 Accuracy calculation (original data)
  Table 4.4 Possible response forms for each refusal pattern (two questions)
  Table 4.5 Suggested simple imputation method when refusals occur (two questions)
  Table 4.6 Possible response forms for each refusal pattern (three questions)
  Table 4.7 Suggested simple imputation method when refusals occur (three questions)
  Table 4.8 Suggested simple imputation method when refusals occur (four questions)
  Table 4.9 Comparison of common statistics under multiple imputation, Model 1
  Table 4.10 Accuracy comparison of the four multiple imputation methods
  Table 4.11 Comparison of common statistics under nearest neighbor imputation
  Appendix Table 1 Accuracy calculation (ignoring the Guttman scale property)
  Appendix Table 2 Comparison of common statistics under multiple imputation, Model 2
  Appendix Table 3 Comparison of common statistics under multiple imputation, Model 3
  Appendix Table 4 Comparison of common statistics under multiple imputation, Model 4
List of Figures
  Figure 2.1 Illustration of multiple imputation
  Figure 3.1 Frequency distribution of gender
  Figure 3.2 Frequency distribution of age
  Figure 3.3 Frequency distribution of marital status
  Figure 3.4 Frequency distribution of education level
  Figure 3.5 Frequency distribution of years of education
  Figure 3.6 Frequency distribution of income	zh_TW
dc.format.extent	90831 bytes	-
dc.format.extent	168321 bytes	-
dc.format.extent	132865 bytes	-
dc.format.extent	120907 bytes	-
dc.format.extent	428661 bytes	-
dc.format.extent	124630 bytes	-
dc.format.extent	173039 bytes	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.format.mimetype	application/pdf	-
dc.language.iso	en_US	-
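dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0096354013	en_US
dc.subject (Keywords)	refusal	zh_TW
dc.subject (Keywords)	Guttman scale	zh_TW
dc.subject (Keywords)	simple imputation	zh_TW
dc.subject (Keywords)	multiple imputation	zh_TW
dc.subject (Keywords)	nearest neighbor imputation	zh_TW
dc.title (Title)	A Study of Refusal Imputation for Guttman Scales (葛特曼量表之拒答插補研究)	zh_TW
dc.type (Type)	thesis	en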
