Title 高密度寡核甘酸基因陣列晶片正規化方法之研究
The Research of Normalization Methods for High Density Oligonucleotide Array
Author 薛慧芬
Hsueh, Hui-Fen
Contributors 薛慧敏
Hsueh, Huey-Miin (advisor)
薛慧芬
Hsueh, Hui-Fen
Keywords Oligonucleotide
Normalization
Date 2004
Upload date 2009-09-14
Abstract The high-density oligonucleotide array experiment is a new biotechnology that can collect data on thousands to tens of thousands of genes from a single chip. Processing these data is fairly involved: it includes background correction, normalization, probe-specific background correction, and summarization of each probe set into a single expression measure. This study first introduces each of these processing steps. The goal of normalization is to remove the systematic variation introduced by the experiment while keeping the biological variation of interest, so that subsequent statistical analysis of the cleaned data yields more accurate conclusions. Three normalization methods, Scaling, Cyclic Loess and Quantile, are then explained in detail and illustrated with a real data set. Finally, a simulation study compares the three methods using four quantitative criteria: Mean of IQR, Mean of Standard Deviation, the Diff statistic, and the number of outliers. The study examines both the performance of the three normalization methods and the suitability of the four proposed criteria.
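As a rough illustration of two of the methods named in the abstract, the sketch below applies scaling and quantile normalization to a toy data set. It is not the thesis's implementation. The function names, the toy intensities, the choice of the first array as the scaling baseline, the use of a plain (untrimmed) mean, and the way ties are broken by position are all assumptions made for this example.

```python
from statistics import mean


def scale_normalize(arrays, baseline=0):
    """Rescale each array so its mean equals the baseline array's mean.

    Assumption: a plain mean is used here; a trimmed mean could be
    substituted without changing the structure of the function.
    """
    target = mean(arrays[baseline])
    return [[x * target / mean(a) for x in a] for a in arrays]


def quantile_normalize(arrays):
    """Force every array to share one empirical distribution.

    The reference distribution is the rank-wise mean of the sorted arrays;
    each value is then replaced by the reference value at its own rank.
    Assumption: ties are broken by position rather than averaged.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # k-th entry: mean of the k-th smallest intensity across all arrays
    reference = [mean(column) for column in zip(*sorted_arrays)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices, smallest first
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = reference[rank]
        result.append(normalized)
    return result


# Hypothetical toy data: three arrays (chips), four probes each.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
scaled = scale_normalize(arrays)
quantiled = quantile_normalize(arrays)
```

After scaling, every array has the same mean as the baseline but keeps its own shape. After quantile normalization, all arrays contain exactly the same set of values, and only the order of those values differs from array to array.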
References 1. Affymetrix (1999). Affymetrix Microarray Suite User Guide. Affymetrix, Santa Clara, CA, version 4 edition.
2. Affymetrix (2002). Statistical Algorithms Description Document. Technical report, Affymetrix.
3. Astrand, M. (2003). Contrast normalization of oligonucleotide arrays. Journal of Computational Biology, 10(1), 95-102.
4. Bolstad, B., Irizarry, R., Astrand, M., and Speed, T. (2003). A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics, 19(2), 185-193.
5. Cleveland, W. S. and Devlin, S. J. (1988). Locally-weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 83, 596-610.
6. Irizarry, R., Hobbs, B., Collin, F., Beazer-Barclay, Y., Antonellis, K., Scherf, U., and Speed, T. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 4, 249-264.
7. Lazaridis, E., Sinibaldi, D., Bloom, G., Mane, S., and Jove, R. (2002). A simple method to improve probe set estimates from oligonucleotide arrays. Mathematical Biosciences, 176(1), 53-58.
8. Li, C. and Wong, W. (2001a). Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proceedings of the National Academy of Sciences USA, 98, 31-36.
9. Li, C. and Wong, W. (2001b). Model-based analysis of oligonucleotide arrays: Model validation, design issues and standard error application. Genome Biology, 2, 1-11.
10. Naef, F., Lim, D. A., Patil, N., and Magnasco, M. O. (2001). From features to expression: High density oligonucleotide array analysis revisited. Tech Report 1, 1-9.
11. Workman, C., Saxild, J., Nielsen, C., Brunak, S., and Knudsen, S. (2002). A new non-linear normalization method for reducing variability in DNA microarray experiments. Genome Biology, 3(9), 1-16.
12. Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research, 30(4), e15.
13. http://www.affymetrix.com/index.affx
14. http://www.genelogic.com/media/studies/index.cfm
Description Master's
National Chengchi University
Graduate Institute of Statistics
91354011
93
Source http://thesis.lib.nccu.edu.tw/record/#G0913540111
Type thesis
URI https://nccur.lib.nccu.edu.tw/handle/140.119/30932
Table of contents Contents
List of figures
List of tables
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
1.2 Introduction to oligonucleotide microarray chip technology
1.3 Data types of oligonucleotide microarray chips
1.4 Processing workflow for oligonucleotide microarray data
1.5 Research motivation and objectives
1.6 Research limitations
Chapter 2 Introduction to normalization methods
2.1 Scaling method
2.2 Cyclic Loess
2.3 Quantile
Chapter 3 Data simulation and comparison
3.1 Mean of IQR
3.2 Mean of Standard Deviation (MSD)
3.3 Diff statistic
3.4 Number of outliers (Outlier)
Chapter 4 Conclusions and recommendations
References
Appendix
Language en_US