The Research of Normalization Methods for High Density Oligonucleotide Array (高密度寡核甘酸基因陣列晶片正規化方法之研究)

Title	高密度寡核甘酸基因陣列晶片正規化方法之研究 The Research of Normalization Methods for High Density Oligonucleotide Array
Author	薛慧芬 Hsueh, Hui-Fen
Contributors	薛慧敏 Hsueh, Huey-Miin (advisor); 薛慧芬 Hsueh, Hui-Fen (author)
Keywords	Oligonucleotide; Normalization
Date	2004
Upload date	2009-09-14
Abstract	High-density oligonucleotide array experiments are a new biotechnology that can measure thousands to tens of thousands of genes on a single chip. Processing the resulting data is involved and includes background correction, normalization, probe-specific background correction, and summarization of each probe set into a single expression measure. This study first introduces each of these processing steps. The goal of normalization is to remove the systematic variation introduced by the experiment while keeping the biological variation of interest, so that subsequent statistical analysis yields more accurate conclusions. Three normalization methods (Scaling, Cyclic Loess, and Quantile) are then explained in detail and illustrated with a real data set. Finally, a simulation study compares the three methods using four quantitative criteria: Mean of IQR, Mean of Standard Deviation, the Diff statistic, and the number of outliers. The study examines both the performance of the three normalization methods and the suitability of the four criteria.
Description	Master's thesis, National Chengchi University, Graduate Institute of Statistics, 91354011, 93
Source	http://thesis.lib.nccu.edu.tw/record/#G0913540111
Type	thesis
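
Illustration (not part of the thesis record). The abstract names three normalization methods without describing them. The sketch below shows the general idea of two of them, Scaling and Quantile, as they are commonly described in the microarray literature. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the use of a plain mean for scaling, the choice of the first array as baseline, and the demo matrix are all hypothetical. Cyclic Loess is omitted because it requires a local regression fitter.

```python
import numpy as np


def scale_normalize(x, baseline=0):
    """Scaling sketch: rescale each array (column) so that its mean
    intensity matches the mean of a chosen baseline array.
    Assumption: a plain mean is used; real pipelines may use a trimmed mean."""
    x = np.asarray(x, dtype=float)
    target = x[:, baseline].mean()
    return x * (target / x.mean(axis=0))


def quantile_normalize(x):
    """Quantile sketch: force every array (column) to share one empirical
    distribution, namely the mean of the sorted columns.
    Assumption: ties are not averaged; they keep argsort order."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                 # per-column sort order
    reference = np.sort(x, axis=0).mean(axis=1)   # mean intensity at each rank
    out = np.empty_like(x)
    # Put the rank-i reference value back at the position of each
    # column's i-th smallest probe.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


if __name__ == "__main__":
    # Made-up intensities: rows are probes, columns are arrays.
    demo = np.array([[2.0, 4.0, 6.0],
                     [1.0, 8.0, 3.0],
                     [3.0, 6.0, 9.0]])
    print(scale_normalize(demo))
    print(quantile_normalize(demo))
```

After scaling, every column has the same mean as the baseline column. After quantile normalization, every column contains the same set of values, and each probe keeps its within-array rank.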

dc.contributor.advisor	薛慧敏	zh_TW
dc.contributor.advisor	Hsueh, Huey-Miin	en_US
dc.contributor.author (Authors)	薛慧芬	zh_TW
dc.contributor.author (Authors)	Hsueh, Hui-Fen	en_US
dc.creator (Author)	薛慧芬	zh_TW
dc.creator (Author)	Hsueh, Hui-Fen	en_US
dc.date (Date)	2004	en_US
dc.date.accessioned	2009-09-14	-
dc.date.available	2009-09-14	-
dc.date.issued (Upload date)	2009-09-14	-
dc.identifier (Other Identifiers)	G0913540111	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/30932	-
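dc.description (Description)	Master's	zh_TW
dc.description (Description)	National Chengchi University	zh_TW
dc.description (Description)	Graduate Institute of Statistics	zh_TW
dc.description (Description)	91354011	zh_TW
dc.description (Description)	93	zh_TW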
dc.description.abstract (Abstract)	High-density oligonucleotide array experiments are a new biotechnology that can measure thousands to tens of thousands of genes on a single chip. Processing the resulting data is involved and includes background correction, normalization, probe-specific background correction, and summarization of each probe set into a single expression measure. This study first introduces each of these processing steps. The goal of normalization is to remove the systematic variation introduced by the experiment while keeping the biological variation of interest, so that subsequent statistical analysis yields more accurate conclusions. Three normalization methods (Scaling, Cyclic Loess, and Quantile) are then explained in detail and illustrated with a real data set. Finally, a simulation study compares the three methods using four quantitative criteria: Mean of IQR, Mean of Standard Deviation, the Diff statistic, and the number of outliers. The study examines both the performance of the three normalization methods and the suitability of the four criteria.	en_US
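dc.description.tableofcontents	Table of Contents; List of Figures; List of Tables; Abstract (Chinese); Abstract (English); Chapter 1 Introduction; 1.2 Introduction to Oligonucleotide Microarray Gene Chip Technology; 1.3 Data Types of Oligonucleotide Microarray Gene Chips; 1.4 Processing Workflow for Oligonucleotide Microarray Gene Data; 1.5 Research Motivation and Objectives; 1.6 Research Limitations; Chapter 2 Introduction to Normalization Methods; 2.1 Scaling Method; 2.2 Cyclic Loess; 2.3 Quantile; Chapter 3 Data Simulation and Comparison; 3.1 Mean of IQR; 3.2 Mean of Standard Deviation (MSD); 3.3 Diff Statistic; 3.4 Number of Outliers (Outlier); Chapter 4 Conclusions and Recommendations; References; Appendix	zh_TW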
dc.language.iso	en_US	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0913540111	en_US
dc.subject (Keywords)	寡核甘酸 (oligonucleotide)	zh_TW
dc.subject (Keywords)	正規化 (normalization)	zh_TW
dc.subject (Keywords)	Oligonucleotide	en_US
dc.subject (Keywords)	Normalization	en_US
dc.title (Title)	高密度寡核甘酸基因陣列晶片正規化方法之研究	zh_TW
dc.title (Title)	The Research of Normalization Methods for High Density Oligonucleotide Array	en_US
dc.type (Type)	thesis	en
dc.relation.reference (References)	1. Affymetrix (1999). Affymetrix Microarray Suite User Guide. Affymetrix, Santa Clara, CA, version 4 edition.	zh_TW
dc.relation.reference (References)	2. Affymetrix (2002). Statistical Algorithms Description Document. Technical report, Affymetrix.	zh_TW
dc.relation.reference (References)	3. Astrand, M. (2003). Contrast normalization of oligonucleotide arrays. Journal of Computational Biology, 10(1), 95-102.	zh_TW
dc.relation.reference (References)	4. Bolstad, B., Irizarry, R., Astrand, M., and Speed, T. (2003). A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics, 19(2), 185-193.	zh_TW
dc.relation.reference (References)	5. Cleveland, W. S. and Devlin, S. J. (1988). Locally-weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association, 83, 596-610.	zh_TW
dc.relation.reference (References)	6. Irizarry, R., Hobbs, B., Collin, F., Beazer-Barclay, Y., Antonellis, K., Scherf, U., and Speed, T. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 4, 249-264.	zh_TW
dc.relation.reference (References)	7. Lazaridis, E., Sinibaldi, D., Bloom, G., Mane, S., and Jove, R. (2002). A simple method to improve probe set estimates from oligonucleotide arrays. Mathematical Biosciences, 176(1), 53-58.	zh_TW
dc.relation.reference (References)	8. Li, C. and Wong, W. (2001a). Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proceedings of the National Academy of Sciences USA, 98, 31-36.	zh_TW
dc.relation.reference (References)	9. Li, C. and Wong, W. (2001b). Model-based analysis of oligonucleotide arrays: Model validation, design issues and standard error application. Genome Biology, 2, 1-11.	zh_TW
dc.relation.reference (References)	10. Naef, F., Lim, D. A., Patil, N., and Magnasco, M. O. (2001). From features to expression: High density oligonucleotide array analysis revisited. Tech Report 1, 1-9.	zh_TW
dc.relation.reference (References)	11. Workman, C., Saxild, J., Nielsen, C., Brunak, S., and Knudsen, S. (2002). A new non-linear normalization method for reducing variability in DNA microarray experiments. Genome Biology, 3(9), 1-16.	zh_TW
dc.relation.reference (References)	12. Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research, 30(4), e15.	zh_TW
dc.relation.reference (References)	13. http://www.affymetrix.com/index.affx	zh_TW
dc.relation.reference (References)	14. http://www.genelogic.com/media/studies/index.cfm	zh_TW
