Title 分類錯誤資料在母體異質下的馬可夫模型
A mixture model for heterogeneous ordinal data with misclassification
Author 李依璇
Lee, Yi-Shiuan
Contributors 黃佳慧
Huang, Chia-Hui
李依璇
Lee, Yi-Shiuan
Keywords Hidden Markov Model
Latent class
Logistic regression
Longitudinal data
Misclassification
Date 2022
Upload time 1-Aug-2022 17:15:10 (UTC+8)
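Abstract (Chinese version) This study considers longitudinal data on an ordinal variable and assumes that the population consists of two groups with distinct characteristics. Partitioning the population into groups handles the between-group heterogeneity that is common in longitudinal data, while the correlation among repeated measurements on the same subject is explained by a Markov model. The ordinal variable has three categories, each treated as a Markov state, and different groups are assumed to have different state spaces. During data collection, measurement error causes some observations to be misclassified, so the observed Markov chains are not necessarily correct. To handle both individual heterogeneity and measurement error, the study draws on mixture Markov models and hidden Markov models, using logistic regression to model group membership and, conditional on group, the initial-state and state-transition probabilities. The likelihood is written as the sum of the joint probabilities over all possible Markov chains and groups, which removes the erroneous information introduced by measurement error; the maximum likelihood estimates are then obtained by supplying the log-likelihood and score functions to the "constrOptim" function in R. Finally, computer simulations under four sets of parameter values evaluate the proposed model in terms of bias, standard deviation, standard error, and coverage rate. The results show that the distribution of the sample does not affect performance, and that the relationship between estimation bias and measurement error is as expected.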
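The summation over latent chains and groups described in this abstract can be illustrated with a short Python sketch. It is not the thesis's R code: it takes the group, initial-state, transition, and misclassification probabilities as fixed numbers (the thesis parameterizes them through logistic regressions), and every function name and numeric value below is a hypothetical choice made for illustration. A forward recursion returns the same value as enumerating every latent chain, and a group with a restricted state space could be represented by setting the corresponding probabilities to zero.

```python
import itertools
import math

def forward_likelihood(obs, init, trans, misclass):
    """P(observed sequence | group), summing over latent chains by forward recursion."""
    k = len(init)
    # alpha[s] = P(observations so far, current true state = s)
    alpha = [init[s] * misclass[s][obs[0]] for s in range(k)]
    for y in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(k)) * misclass[s][y]
            for s in range(k)
        ]
    return sum(alpha)

def mixture_likelihood(obs, group_probs, inits, transs, misclass):
    """Marginal likelihood: weight each group's likelihood by its membership probability."""
    return sum(
        w * forward_likelihood(obs, inits[g], transs[g], misclass)
        for g, w in enumerate(group_probs)
    )

def brute_force_likelihood(obs, group_probs, inits, transs, misclass):
    """Same quantity by explicit enumeration of every (group, latent chain) pair."""
    k = len(misclass)
    total = 0.0
    for g, w in enumerate(group_probs):
        for chain in itertools.product(range(k), repeat=len(obs)):
            p = inits[g][chain[0]] * misclass[chain[0]][obs[0]]
            for t in range(1, len(obs)):
                p *= transs[g][chain[t - 1]][chain[t]] * misclass[chain[t]][obs[t]]
            total += w * p
    return total

# Hypothetical parameter values: two groups, three ordinal states (0, 1, 2).
GROUP_PROBS = [0.6, 0.4]
INIT_PROBS = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
TRANS_PROBS = [
    [[0.80, 0.15, 0.05], [0.30, 0.60, 0.10], [0.10, 0.30, 0.60]],
    [[0.50, 0.30, 0.20], [0.10, 0.60, 0.30], [0.05, 0.15, 0.80]],
]
# MISCLASS[s][y] = P(observed state y | true state s); assumed shared by both groups.
MISCLASS = [[0.90, 0.10, 0.00], [0.05, 0.90, 0.05], [0.00, 0.10, 0.90]]

if __name__ == "__main__":
    observed = [0, 1, 1, 2]
    print(mixture_likelihood(observed, GROUP_PROBS, INIT_PROBS, TRANS_PROBS, MISCLASS))
```

The brute-force version visits 3^T chains per group, so it is only useful as a check on short sequences; the forward recursion does the same sum in time linear in the sequence length.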
The aim of this work is to provide a model for longitudinal data that exhibit both heterogeneity in the population and correlation within subjects. In this study, the former is explained by supposing that the population consists of several unobservable subgroups with distinct features, while the latter is captured by Markov models in which the Markov states are assumed to be ordinal variables. Furthermore, some observed states are subject to misclassification owing to measurement error; hence both the groups and the Markov states without misclassification are latent variables. To address this, a mixture Markov chain model and a hidden Markov model are used in the analysis of misclassified heterogeneous ordinal data. The models for subpopulation membership, subpopulation-specific initial states, and transition patterns are specified as logistic regression models. Simulations are conducted under four different parameter settings, and the maximum likelihood estimators are obtained with the function "constrOptim" in the software R. Our simulation results suggest that the estimates, in terms of bias, standard deviation, standard error, and coverage probability, are robust to the frequencies of the observed states. In addition, the dependence between estimation biases and measurement error rates is in line with expectations.
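The four evaluation criteria named in the abstract can be computed from replicated estimates roughly as follows. This Python sketch rests on assumptions the record does not confirm: "standard error" is taken to be the average of the estimated standard errors across replications, coverage is computed from 95% Wald intervals, and the function name and input numbers are hypothetical.

```python
import statistics

def summarize_replications(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replications for a single parameter.

    estimates  : point estimates, one per replication
    std_errors : estimated standard errors, one per replication
    true_value : parameter value used to generate the data
    z          : normal quantile for the Wald interval (1.96 gives a nominal 95%)
    """
    bias = statistics.fmean(estimates) - true_value
    sd = statistics.stdev(estimates)        # empirical spread of the estimates
    mean_se = statistics.fmean(std_errors)  # average model-based standard error
    covered = [
        abs(est - true_value) <= z * se
        for est, se in zip(estimates, std_errors)
    ]
    coverage = sum(covered) / len(covered)
    return {"bias": bias, "sd": sd, "se": mean_se, "coverage": coverage}

if __name__ == "__main__":
    # Four made-up replications of an estimator whose true value is 1.0.
    print(summarize_replications([0.9, 1.1, 1.0, 1.3], [0.1, 0.1, 0.1, 0.1], 1.0))
```

When the model-based standard errors are reliable, the averaged standard error should be close to the empirical standard deviation and the coverage close to the nominal level.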
References Albert, P. S. (1994). A Markov model for sequences of ordinal data from a relapsing-remitting disease. Biometrics, pages 51–60.

Bahl, L., Brown, P., De Souza, P., and Mercer, R. (1986). Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In ICASSP’86. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 11, pages 49–52. IEEE.

Bartolucci, F., Farcomeni, A., and Pennoni, F. (2012). Latent Markov models for longitudinal data. CRC Press.

Baum, L. E., Petrie, T., Soules, G., and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1):164–171.

Chaijareenont, K., Sirimai, K., Boriboonhirunsarn, D., and Kiriwat, O. (2004). Accuracy of Nugent’s score and each Amsel’s criteria in the diagnosis of bacterial vaginosis. J Med Assoc Thai, 87(11):1270–1274.

Cheon, K., Thoma, M. E., Kong, X., and Albert, P. S. (2014). A mixture of transition models for heterogeneous longitudinal ordinal data: with applications to longitudinal bacterial vaginosis data. Statistics in Medicine, 33(18):3204–3213.

Clark, T. S. and Linzer, D. A. (2015). Should I use fixed or random effects? Political Science Research and Methods, 3(2):399–408.

Cook, R. J. (1999). A mixed model for two-state Markov processes under panel observation. Biometrics, 55(3):915–920.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22.

Goodman, L. A. (1961). Statistical methods for the mover-stayer model. Journal of the American Statistical Association, 56(296):841–868.

Haussler, D. K. D. and Eeckman, M. G. R. F. H. (1996). A generalized hidden Markov model for the recognition of human genes in DNA. In Proc. Int. Conf. on Intelligent Systems for Molecular Biology, St. Louis, pages 134–142.

Koumans, E. H. and Kendrick, J. S. (2001). Preventing adverse sequelae of bacterial vaginosis: a public health program and research agenda. Sexually Transmitted Diseases, pages 292–297.

Krumbein, W. C. and Dacey, M. F. (1969). Markov chains and embedded Markov chains in geology. Journal of the International Association for Mathematical Geology, 1(1):79–96.

Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, pages 963–974.

Markov, A. A. (1906). Rasprostranenie zakona bol’shih chisel na velichiny, zavisyaschie drug ot druga. Izvestiya Fiziko-matematicheskogo obschestva pri Kazanskom universitete, 15(135-156):18.

Norris, J. R. (1998). Markov chains. Number 2. Cambridge University Press.

Nugent, R. P., Krohn, M. A., and Hillier, S. L. (1991). Reliability of diagnosing bacterial vaginosis is improved by a standardized method of Gram stain interpretation. Journal of Clinical Microbiology, 29(2):297–301.

Poulsen, C. S. (1983). Latent structure analysis with choice modeling applications. PhD thesis, University of Pennsylvania.

Sanders, K. L., Thoma, M. E., Yu, K., and Albert, P. S. (2011). An evaluation of the natural history of bacterial vaginosis using transition models. Sexually Transmitted Diseases, 38(12):1131.
Description Master's
National Chengchi University
Department of Statistics
109354006
Source http://thesis.lib.nccu.edu.tw/record/#G0109354006
Type thesis
dc.contributor.advisor 黃佳慧
dc.contributor.advisor Huang, Chia-Hui
dc.identifier (Other Identifiers) G0109354006
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/141005
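dc.description.tableofcontents Chapter 1 Introduction
Section 1 Preface
Section 2 Research Motivation

Chapter 2 Literature Review
Section 1 Single Markov Chain Models
Section 2 Mixture Markov Models
Section 3 Hidden Markov Models

Chapter 3 Methodology
Section 1 Model Assumptions
Section 2 Statistical Model
Section 3 Statistical Inference

Chapter 4 Simulation Analysis
Section 1 Data Generation
Section 2 Simulation Evaluation and Results

Chapter 5 Conclusion
References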
dc.format.extent 1110837 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/NCCU202200708