Title Robust estimation of the Mahalanobis distance for multivariate data mixed with continuous and discrete variables
混合連續與間斷資料之馬式距離的穩健估計
Author 任嘉珩
Jen, Chia Heng
Contributors 鄭宗記
任嘉珩
Jen, Chia Heng
Keywords mixed data
normal latent variable model
robust estimation
Mahalanobis distance
minimum covariance determinant
Date 2007
Uploaded 2009-09-14
Abstract Poon and Lee (1987) applied a normal latent variable model to parameter estimation for data that mix continuous and discrete variables, and Bedrick et al. (2000) used this idea to evaluate the Mahalanobis distance. In this thesis, we extend a similar idea to estimate robustly, under the same model, the parameters of multivariate data mixed with continuous and discrete variables, and we evaluate the resulting Mahalanobis distance, which can be used to assess similarity. The proposed method overcomes the unreliability of the maximum likelihood estimate (MLE) when outliers are present in the data.
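The sketch below illustrates why a robust estimate can matter for the Mahalanobis distance. It is not the thesis's method. It assumes purely continuous two-dimensional data (no latent variable model for discrete variables), uses hypothetical toy data with one planted outlier, and finds the raw minimum covariance determinant (MCD) estimate by exhaustive search over subsets, with no consistency correction, rather than by the fast algorithm of Rousseeuw and Van Driessen (1999). All function names are illustrative.

```python
from itertools import combinations
from math import sqrt

def mean_cov(points):
    """Sample mean and 2x2 sample covariance (divisor n - 1) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(q)

def raw_mcd(points, h):
    """Mean and covariance of the h-point subset whose covariance determinant is smallest."""
    best = None
    for subset in combinations(points, h):
        mean, cov = mean_cov(subset)
        det = cov[0] * cov[2] - cov[1] ** 2
        if best is None or det < best[0]:
            best = (det, mean, cov)
    return best[1], best[2]

# Hypothetical toy data: eight points scattered around the line y = x,
# plus one planted outlier as the last point.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0),
        (6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (2.0, 12.0)]
outlier = data[-1]

# Classical estimate: the outlier inflates the covariance and masks itself.
d_classical = mahalanobis(outlier, *mean_cov(data))
# Raw MCD estimate with h = 7 of 9 points: the outlier is left out of the fit.
d_robust = mahalanobis(outlier, *raw_mcd(data, h=7))

print(f"classical distance of the outlier: {d_classical:.2f}")
print(f"raw MCD distance of the outlier:   {d_robust:.2f}")
```

With the classical mean and covariance, the distance of any single point among n observations cannot exceed (n - 1)/sqrt(n), so a gross outlier can look unremarkable; under the MCD fit the same point stands out clearly. Exhaustive search is only practical for very small samples.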
References [1] Barnett, V. and Lewis, T. (1994), Outliers in Statistical Data, 3rd ed., New York: John Wiley and Sons.
[2] Bedrick, E. J., Lapidus, J., and Powell, J. F. (2000), Estimating the Mahalanobis distance from mixed continuous and discrete data, Biometrics, 56, 394–401.
[3] Bhattacharyya, A. (1943), On a measure of divergence between two statistical populations defined by their probability distributions, Bulletin of the Calcutta Mathematical Society, 35, 99–109.
[4] Donoho, D. L. and Huber, P. J. (1983), The notion of breakdown point, in A Festschrift for Erich L. Lehmann, Ed. P. J. Bickel, K. A. Doksum and J. L. Hodges, Jr., 157–184, Belmont, CA: Wadsworth.
[5] Hampel, F., Ronchetti, E., Rousseeuw, P., and Stahel, W. (1986), Robust Statistics: The Approach Based on Influence Functions, New York: John Wiley and Sons.
[6] Huber, P. J. (1964), Robust estimation of a location parameter, The Annals of Mathematical Statistics, 35, 73–101.
[7] Huber, P. J. (1981), Robust Statistics, New York: John Wiley and Sons.
[8] Jobson, J. D. (1992), Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods, New York: Springer-Verlag.
[9] Krzanowski, W. J. (1975), Discrimination and classification using both binary and continuous variables, Journal of the American Statistical Association, 70, 782–790.
[10] Krzanowski, W. J. (1983), Distance between populations using mixed continuous and categorical variables, Biometrika, 70, 235–243.
[11] Lehmann, E. L. and Casella, G. (1998), Theory of Point Estimation, New York: Springer.
[12] Krzanowski, W. J. and Marriott, F. H. C. (1995), Kendall’s Library of Statistics 2, Multivariate Analysis Part 2, London: Arnold.
[13] Mahalanobis, P. C. (1936), On the generalized distance in statistics, Proceedings of the National Institute of Sciences of India, 2, 49–55.
[14] Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979), Multivariate Analysis, London: Academic Press.
[15] Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006), Robust Statistics: Theory and Methods, New York: Wiley.
[16] Matusita, K. (1972), Discrimination and the affinity of distributions, in Discriminant Analysis and Applications, Ed. T. Cacoullos, 213–223, New York: Academic Press.
[17] Olkin, I. and Tate, R. F. (1961), Multivariate correlation models with mixed discrete and continuous variables, Annals of Mathematical Statistics, 32, 448–465.
[18] Poon, W. Y. and Lee, S. Y. (1986), Maximum likelihood estimation of polyserial correlations, Psychometrika, 51, 113–121.
[19] Poon, W. Y. and Lee, S. Y. (1987), Maximum likelihood estimation of multivariate polyserial and polychoric correlation coefficients, Psychometrika, 52, 409–430.
[20] Prohorov, Y. V. (1956), Convergence of random processes and limit theorems in probability theory, Theory of Probability and its Applications, 1, 157–214.
[21] Rousseeuw, P. J. (1984), Least median of squares regression, Journal of the American Statistical Association, 79, 871–880.
[22] Rousseeuw, P. J. and Leroy, A. M. (1987), Robust Regression and Outlier Detection, New York: John Wiley.
[23] Rousseeuw, P. J. and Van Driessen, K. (1999), A fast algorithm for the minimum covariance determinant estimator, Technometrics, 41, 212–223.
[24] Zaman, A., Rousseeuw, P. J., and Orhan, M. (2001), Econometric applications of high-breakdown robust regression techniques, Economics Letters, 71, 1–8.
Description Master's
National Chengchi University
Graduate Institute of Statistics
95354024
96
Source http://thesis.lib.nccu.edu.tw/record/#G0095354024
Type thesis
dc.contributor.advisor 鄭宗記
dc.contributor.author (Authors) 任嘉珩
dc.contributor.author (Authors) Jen, Chia Heng
dc.creator (Author) 任嘉珩
dc.creator (Author) Jen, Chia Heng
dc.date (Date) 2007
dc.date.accessioned 2009-09-14
dc.date.available 2009-09-14
dc.date.issued (Uploaded) 2009-09-14
dc.identifier (Other Identifiers) G0095354024
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/30926
dc.description (Description) Master's
dc.description (Description) National Chengchi University
dc.description (Description) Graduate Institute of Statistics
dc.description (Description) 95354024
dc.description (Description) 96
dc.description.tableofcontents 1 Introduction
     2 Mahalanobis Distance and Robust Estimation
     2.1 Mahalanobis Distance
     2.1.1 Mahalanobis Distance Between Individuals
     2.1.2 Mahalanobis Distance Between Populations
     2.2 Robust Estimation
     2.2.1 Breakdown Point
     2.2.2 Definition of Minimum Covariance Determinant
     3 Distance Between Populations With Mixed Continuous and Discrete Variables
     3.1 Distance Between Populations With Continuous Variables
     3.2 Distance Between Populations With Mixed Continuous and Discrete Variables
     3.3 Maximum Likelihood Estimation
     3.4 Robust Estimation
     3.5 Examples
     3.5.1 Academic Achievement Data
     3.5.2 Skull Data
     4 Robust Estimation of the Mahalanobis Distance for Multivariate Data Mixed with Continuous and Discrete Variables
     4.1 Notation
     4.2 Maximum Likelihood Estimation
     4.3 Estimation of the Mahalanobis Distance
     4.4 Robust Estimation of the Mahalanobis Distance
     4.5 Examples
     4.5.1 Academic Achievement Data
     4.5.2 Skull Data
     5 Conclusions
     References
dc.language.iso en_US
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0095354024
dc.subject (Keywords) mixed data
dc.subject (Keywords) normal latent variable model
dc.subject (Keywords) robust estimation
dc.subject (Keywords) Mahalanobis distance
dc.subject (Keywords) minimum covariance determinant
dc.title (Title) 混合連續與間斷資料之馬式距離的穩健估計
dc.title (Title) Robust estimation of the Mahalanobis distance for multivariate data mixed with continuous and discrete variables
dc.type (Type) thesis