模擬高密度寡聚核甘酸微陣列矩陣資料及正規化方法之探討

題名	模擬高密度寡聚核甘酸微陣列矩陣資料及正規化方法之探討 A Simulation Study on High Density Oligonucleotide Microarray Data With Discussion of Normalization Methods
作者	吳小萍 Wu, Hsiao-Ping
貢獻者	郭訓志; 蔡紋琦 Kuo, Hsun-Chih; Tsai, Wen-Chi; 吳小萍 Wu, Hsiao-Ping
關鍵詞	微陣列矩陣; 正規化; microarray; normalization
日期	2005
上傳時間	2009-09-14
摘要	微陣列矩陣晶片是一門現今被廣泛使用在許多領域的生物醫學研究，在本文，我們主要是對寡核甘酸微陣列矩陣晶片資料的正規化感興趣。為了比較不同的正規化方法，我們致力於模擬更接近真實寡核甘酸微陣列矩陣晶片的資料。在資料的模擬上，我們主要是根據Li和Wong的模型來進行模擬，並利用階層法來設定模型的參數。最後為了判別正規化方法的好壞，我們模擬了100組資料，並且利用四個判斷準則來做比較。模擬的結果表示，我們所提出的新方法 (LOESS to Average)，一般來說都比其他的正規化方法來的好。 Microarray technology is now widely used in many areas of biomedical research. This thesis focuses on the normalization of oligonucleotide microarray data. We aimed to simulate more realistic oligonucleotide microarray data in order to compare different normalization methods. The data simulation was based on Li and Wong's model with a hierarchical setup for the parameters. To compare the normalization methods, 100 data sets were simulated, and the performance of ten normalization methods was assessed using four comparison criteria. The simulation results suggest that our newly proposed normalization method, LOESS to Average, generally performs better than the other normalization methods.
參考文獻	[1] Affymetrix (2002), Statistical algorithms description document, Technical report, Affymetrix.
[2] B. M. Bolstad, R. A. Irizarry, M. Astrand and T. P. Speed (2003), A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics, 19(2), 185-193.
[3] R. A. Irizarry, B. Hobbs, F. Collin, Y. D. Beazer-Barclay, K. J. Antonellis, U. Scherf and T. P. Speed (2003), Exploration, normalization, and summaries of high density oligonucleotide array probe level data, Biostatistics, 4(2), 249-264.
[4] C. Li and W. H. Wong (2001a), Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection, Proceedings of the National Academy of Sciences USA, 98, 31-36.
[5] C. Li and W. H. Wong (2001b), Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application, Genome Biology, 2(8), research 0032.1-0032.11.
[6] R. A. Irizarry, B. M. Bolstad, F. Collin, L. M. Cope, B. Hobbs and T. P. Speed (2003), Summaries of Affymetrix GeneChip probe level data, Nucleic Acids Research, 31(4), e15.
[7] B. Bolstad (2001), Probe level quantile normalization of high density oligonucleotide array data, Division of Biostatistics.
[8] B. Bolstad (2002), Comparing the effects of background, normalization and summarization on gene expression estimates.
[9] Affymetrix (2001), GeneChip arrays provide optimal sensitivity and specificity for microarray expression analysis, Affymetrix.
[10] B. M. Bolstad (2004), Low-level analysis of high-density oligonucleotide array data: background, normalization and summarization.
[11] D. Holder, R. F. Raubertas, V. Bill Pikounis, V. Svetnik and K. Soper, Statistical analysis of high density oligonucleotide arrays: a safer approach, Merck Research Laboratories, WP37C-305, West Point, PA 19486.
[12] F. Naef, D. A. Lim, N. Patil and M. O. Magnasco (2001), From features to expression: High-density oligonucleotide array analysis revisited, Tech Report, 1, 1-9.
[13] R. Sasik, E. Calvo and J. Corbeil (2002), Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model, Bioinformatics, 18(12), 1633-1640.
[14] dChip User's Manual (2005), http://biosun1.harvard.edu/complab/dchip
[15] 薛慧芬 (2005), The research of normalization methods for high density oligonucleotide array, Thesis at National Chengchi University.
[16] S. Dudoit, Y. H. Yang, M. J. Callow and T. P. Speed (2000), Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments.
描述	碩士; 國立政治大學; 統計研究所; 93354023; 94
資料來源	http://thesis.lib.nccu.edu.tw/record/#G0093354023
資料類型	thesis
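
The abstract describes simulating probe-level data from Li and Wong's model and then comparing normalization methods. The sketch below is only an illustration of that workflow, not the thesis's actual procedure: it draws PM-MM differences from the Li and Wong form y_ij = θ_i φ_j + ε_ij with the constraint Σ_j φ_j² = J, and then applies quantile normalization, one of the methods named in the table of contents. The distributions, parameter values, the multiplicative array effect, and the function names `simulate_li_wong` and `quantile_normalize` are all assumptions made for this example; the record does not give the thesis's hierarchical settings or the details of LOESS to Average.

```python
import numpy as np


def simulate_li_wong(n_genes=200, n_arrays=6, n_probes=11, seed=0):
    """Draw PM-MM differences y[g, i, j] = theta[g, i] * phi[g, j] + eps."""
    rng = np.random.default_rng(seed)
    # Hierarchical layer (assumed): a gene-level mean, then array-level theta around it.
    gene_mean = rng.lognormal(mean=6.0, sigma=1.0, size=n_genes)
    theta = gene_mean[:, None] * rng.lognormal(0.0, 0.1, size=(n_genes, n_arrays))
    # Probe affinities, rescaled so that sum_j phi_j^2 = J within each probe set.
    phi = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_probes))
    phi *= np.sqrt(n_probes / (phi ** 2).sum(axis=1, keepdims=True))
    # Additive noise with an assumed standard deviation.
    eps = rng.normal(0.0, 20.0, size=(n_genes, n_arrays, n_probes))
    y = theta[:, :, None] * phi[:, None, :] + eps
    # Assumed array-level distortion that normalization is meant to remove.
    array_scale = rng.uniform(0.7, 1.4, size=n_arrays)
    return y * array_scale[None, :, None], phi


def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution."""
    order = np.argsort(x, axis=0)
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean across arrays at each rank.
    mean_quantiles = sorted_x.mean(axis=1)
    out = np.empty_like(x)
    # Put the reference value for each rank back at the original positions.
    np.put_along_axis(out, order, mean_quantiles[:, None], axis=0)
    return out


y, phi = simulate_li_wong()
# Rearrange to a (probes x arrays) matrix, the usual layout for normalization.
probes_by_arrays = y.transpose(1, 0, 2).reshape(y.shape[1], -1).T
normalized = quantile_normalize(probes_by_arrays)
```

After normalization every array shares the same sorted values, so any spread-based criterion computed per array (for example an interquartile range) becomes identical across arrays. Ties are not handled specially in this sketch.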

dc.contributor.advisor	郭訓志; 蔡紋琦	zh_TW
dc.contributor.advisor	Kuo, Hsun-Chih; Tsai, Wen-Chi	en_US
dc.contributor.author (Authors)	吳小萍	zh_TW
dc.contributor.author (Authors)	Wu, Hsiao-Ping	en_US
dc.creator (作者)	吳小萍	zh_TW
dc.creator (作者)	Wu, Hsiao-Ping	en_US
dc.date (日期)	2005	en_US
dc.date.accessioned	2009-09-14	-
dc.date.available	2009-09-14	-
dc.date.issued (上傳時間)	2009-09-14	-
dc.identifier (Other Identifiers)	G0093354023	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/30906	-
dc.description (描述)	碩士	zh_TW
dc.description (描述)	國立政治大學	zh_TW
dc.description (描述)	統計研究所	zh_TW
dc.description (描述)	93354023	zh_TW
dc.description (描述)	94	zh_TW
dc.description.abstract (摘要)	微陣列矩陣晶片是一門現今被廣泛使用在許多領域的生物醫學研究，在本文，我們主要是對寡核甘酸微陣列矩陣晶片資料的正規化感興趣。為了比較不同的正規化方法，我們致力於模擬更接近真實寡核甘酸微陣列矩陣晶片的資料。在資料的模擬上，我們主要是根據Li和Wong的模型來進行模擬，並利用階層法來設定模型的參數。最後為了判別正規化方法的好壞，我們模擬了100組資料，並且利用四個判斷準則來做比較。模擬的結果表示，我們所提出的新方法 (LOESS to Average)，一般來說都比其他的正規化方法來的好。	zh_TW
dc.description.abstract (摘要)	Microarray technology is now widely used in many areas of biomedical research. This thesis focuses on the normalization of oligonucleotide microarray data. We aimed to simulate more realistic oligonucleotide microarray data in order to compare different normalization methods. The data simulation was based on Li and Wong's model with a hierarchical setup for the parameters. To compare the normalization methods, 100 data sets were simulated, and the performance of ten normalization methods was assessed using four comparison criteria. The simulation results suggest that our newly proposed normalization method, LOESS to Average, generally performs better than the other normalization methods.	en_US
dc.description.tableofcontents	Acknowledgements
Abstract
Chinese Abstract
1 Introduction
2 Literature Review
2.1 Affymetrix Gene Chip Technologies
2.2 Li and Wong's Model
2.3 DNA-Chip (dChip)
2.3.1 Invariant Normalization
2.4 Robust Multi-Array Average (RMA)
2.4.1 Background Correction in RMA
2.4.2 Quantile Normalization in RMA
2.4.3 Summarization in RMA: Median Polish
2.5 Microarray Analysis Suite Software (MAS 5.0)
2.5.1 Background Correction in MAS 5.0
2.5.2 The Ideal Mismatch Value (IM)
2.5.3 The Adjusted Log-Transformed PM Intensities
2.5.4 One Step Tukey Biweight Algorithm
2.5.5 Scaling Normalization
2.6 Comparisons of Normalization Methods
3 Methodology
3.1 Scaling Method
3.2 Median Centered
3.3 Hybrid Scaling-Median Centered Methods
3.4 Z^* Scores
3.5 Quantile Normalization
3.6 Cyclic LOESS
3.7 New Proposed Normalization Method: LOESS to Average
4 Real Data
4.1 Real Data
4.2 The Perfect Match (PM) Value
4.3 The Mismatch (MM) Value
4.4 The Theta (θ)
4.5 The Phi (Φ)
5 Simulation
5.1 Common Simulation Setting
5.2 Simulation Settings for Differentially Expressed Genes
5.3 Simulated Data
6 Comparisons of Normalization Methods
6.1 Interquartile Range (IQR)
6.2 Diff-statistics
6.3 Mean Standard Deviation (MSD)
6.3.1 Overall MSD
6.3.2 Diff-MSD
6.4 Ratio
7 Discussion and Future Work
7.1 Discussion of Comparison Criteria
7.2 Summary of Comparisons for Normalization Methods
7.3 Discussion of Simulation Settings
7.3.1 Simulation Setting 1
7.3.2 Simulation Setting 2 and Setting 3
7.4 Future Work
References
Appendix	zh_TW
dc.language.iso	en_US	-
dc.source.uri (資料來源)	http://thesis.lib.nccu.edu.tw/record/#G0093354023	en_US
dc.subject (關鍵詞)	微陣列矩陣	zh_TW
dc.subject (關鍵詞)	正規化	zh_TW
dc.subject (關鍵詞)	microarray	en_US
dc.subject (關鍵詞)	normalization	en_US
dc.title (題名)	模擬高密度寡聚核甘酸微陣列矩陣資料及正規化方法之探討	zh_TW
dc.title (題名)	A Simulation Study on High Density Oligonucleotide Microarray Data With Discussion of Normalization Methods	en_US
dc.type (資料類型)	thesis	en
