Title: 接受者操作特徵曲線之部份線下面積之極大化
Alternative title: Maximization of the Partial Area under the ROC Curve
Authors: 薛慧敏; 張源俊
Contributor: Department of Statistics
Date: 2012
Upload time: 15-Apr-2016 09:52:31 (UTC+8)
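Abstract: In recent years, major advances in biotechnology have made it possible to collect higher-quality data at lower cost. In scientific research, investigators can obtain data on multiple variables simultaneously. How to efficiently capture the important information in the data to achieve the best performance is an important issue. We consider a medical study on the diagnosis of a disease, in which investigators simultaneously collect a set of potential disease factors. The goal of this study is to find the linear combination of these factors that has the best diagnostic ability for the disease. As the criterion for diagnostic ability, we consider the partial area under the receiver operating characteristic curve (pAUC), where the evaluation is restricted to the region of low false-positive rates. Previously, under the normality assumption and given the population parameter values, we proposed a sufficient condition for this best linear combination and developed the associated computational method. When the population parameters are unknown, we substitute the maximum likelihood estimates (MLEs) obtained from sample data into the calculation, which yields an estimator of the best linear combination. In this study, we will derive the statistical properties of this estimator and develop a statistical test of whether this set of potential factors has diagnostic ability for the disease. We will conduct computer simulations to verify the theoretical results and to evaluate the proposed test. Finally, we will apply our method to real data to assess its practicality.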
One major benefit of the remarkable recent progress in biotechnology is that high-quality data can be produced at lower cost. It is increasingly common for multiple features to be available in a study. An important issue for researchers is how to efficiently capture the vital information in the features for the best performance. Consider a medical study in which investigators simultaneously obtain a set of features potentially related to a specific disease. The goal of this study is to solve for the linear combination of the features that has the best diagnostic ability. Several measures have been proposed for assessing the diagnostic power of a diagnostic tool. One popular approach to evaluating a continuous-scale diagnostic is based on the ROC curve. Here the criterion of interest is the partial area under the ROC curve (pAUC) over a clinically important range of high specificity. Previously, we gave a sufficient condition for the best linear combination to achieve the maximal pAUC under normality and developed an algorithm for computing the solution. In this study, we aim to establish the statistical properties of the estimated best linear combination, obtained by replacing the population parameters in the calculation with their MLEs from a random sample. Next, a statistical testing procedure will be developed to determine whether the set of features has significant diagnostic power for the disease. Intensive numerical studies will be performed to validate the proposed methods. Real examples will be provided for illustration.
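The abstract gives no formulas, so the following is only a minimal sketch of the criterion it describes, not the authors' algorithm. It assumes the standard binormal setting: non-diseased features are N(mean0, cov0), diseased features are N(mean1, cov1), and a higher score a'x indicates disease. Under that assumption the ROC curve of the score is ROC(t) = Phi((a'(mean1 - mean0) + sqrt(a' cov0 a) * Phi^{-1}(t)) / sqrt(a' cov1 a)), and the pAUC is its integral over false-positive rates from 0 to t0. The function names, the midpoint-rule integration, and the `steps` parameter are illustrative choices; the search for the maximizing coefficient vector is not shown.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def mle_mean_cov(rows):
    """MLE of the mean vector and covariance matrix (divisor n, not n - 1)."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in rows) / n
            for k in range(p)] for j in range(p)]
    return mean, cov


def quad_form(a, cov):
    """Variance a' cov a of the linear score a'x."""
    p = len(a)
    return sum(a[j] * cov[j][k] * a[k] for j in range(p) for k in range(p))


def binormal_pauc(a, mean0, cov0, mean1, cov1, t0, steps=2000):
    """pAUC of the score a'x over false-positive rates in (0, t0).

    Assumes a binormal model in which group 0 is non-diseased,
    group 1 is diseased, and higher scores indicate disease.
    """
    delta = sum(a[j] * (mean1[j] - mean0[j]) for j in range(len(a)))
    s0 = quad_form(a, cov0) ** 0.5
    s1 = quad_form(a, cov1) ** 0.5
    h = t0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint rule, so t never hits 0 or 1
        total += _STD.cdf((delta + s0 * _STD.inv_cdf(t)) / s1)
    return total * h
```

A plug-in estimate in the spirit of the abstract would pass the output of `mle_mean_cov` for each group into `binormal_pauc`. As a sanity check, when both groups share the same mean and covariance the ROC curve is the diagonal, so the function should return about t0^2 / 2 (0.02 for t0 = 0.2).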
Relation: Project No. NSC101-2118-M004-004
Type: report
dc.contributor Department of Statistics
dc.creator (Authors) 薛慧敏;張源俊
dc.date (Date) 2012
dc.date.accessioned 15-Apr-2016 09:52:31 (UTC+8)
dc.date.available 15-Apr-2016 09:52:31 (UTC+8)
dc.date.issued (Upload time) 15-Apr-2016 09:52:31 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/84725
dc.relation (Relation) Project No. NSC101-2118-M004-004
dc.title (Title) 接受者操作特徵曲線之部份線下面積之極大化
dc.title.alternative (Alternative title) Maximization of the Partial Area under the ROC Curve
dc.type (Type) report