Title 模擬高密度寡聚核甘酸微陣列矩陣資料及正規化方法之探討
A Simulation Study on High Density Oligonucleotide Microarray Data With Discussion of Normalization Methods
Author 吳小萍
Wu, Hsiao-Ping
Advisors 郭訓志; 蔡紋琦
Kuo, Hsun-Chih; Tsai, Wen-Chi
Keywords microarray (微陣列矩陣)
normalization (正規化)
Date 2005
Upload date 2009-09-14
Abstract Microarray technology is now widely used in many areas of biomedical research. This thesis focuses on the normalization of oligonucleotide microarray data. To compare different normalization methods, we aimed to simulate data that more closely resemble real oligonucleotide microarray data. The simulation was based on Li and Wong's model, with a hierarchical setup for the model parameters. We simulated 100 data sets and assessed the performance of ten normalization methods using four comparison criteria. The simulation results suggest that our newly proposed normalization method, LOESS to Average, generally performs better than the other normalization methods.
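As a rough illustration of the kind of workflow the abstract describes, the sketch below draws probe-level data from the multiplicative form of Li and Wong's model (signal = θ_i · φ_j + noise, with the φ_j scaled so that their squares sum to the number of probes) and then applies quantile normalization, which the thesis lists among its methods. It is not the thesis's simulation: the gamma and normal distributions, their parameters, the array sizes, and the function names `simulate_li_wong` and `quantile_normalize` are all assumptions made for this example, and the thesis's hierarchical parameter setup and its LOESS to Average method are not reproduced here.

```python
import numpy as np


def simulate_li_wong(n_arrays=4, n_probes=16, seed=0):
    """Simulate one probe set as y[i, j] = theta[i] * phi[j] + eps[i, j].

    The distributions below are illustrative assumptions, not the
    hierarchical setup used in the thesis.
    """
    rng = np.random.default_rng(seed)
    # theta[i]: expression index of the gene on array i (assumed gamma).
    theta = rng.gamma(shape=2.0, scale=500.0, size=n_arrays)
    # phi[j]: probe affinity of probe j (assumed gamma), rescaled so that
    # sum(phi ** 2) equals the number of probes, the model's constraint.
    phi = rng.gamma(shape=2.0, scale=1.0, size=n_probes)
    phi *= np.sqrt(n_probes / np.sum(phi ** 2))
    # eps[i, j]: additive noise (assumed normal).
    eps = rng.normal(loc=0.0, scale=50.0, size=(n_arrays, n_probes))
    return np.outer(theta, phi) + eps, theta, phi


def quantile_normalize(x):
    """Quantile-normalize the columns of x (rows = probes, columns = arrays).

    Each value is replaced by the mean, across arrays, of the values that
    share its within-array rank. Ties are not handled specially.
    """
    order = np.argsort(x, axis=0)        # indices that sort each column
    ranks = np.argsort(order, axis=0)    # rank of each entry in its column
    mean_sorted = np.sort(x, axis=0).mean(axis=1)  # reference distribution
    return mean_sorted[ranks]


if __name__ == "__main__":
    y, theta, phi = simulate_li_wong()
    normalized = quantile_normalize(y.T)  # transpose: probes x arrays
    print(normalized.shape)
```

After quantile normalization every array has exactly the same empirical distribution, which is a convenient property to check when comparing methods on simulated data.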
References
[1] Affymetrix (2002), Statistical algorithms description document, Technical report, Affymetrix.
[2] B. M. Bolstad, R. A. Irizarry, M. Astrand and T. P. Speed (2003), A comparison of normalization methods for high density oligonucleotide array data based on variance and bias, Bioinformatics, 19(2), 185-193.
[3] R. A. Irizarry, B. Hobbs, F. Collin, Y. D. Beazer-Barclay, K. J. Antonellis, U. Scherf and T. P. Speed (2003), Exploration, normalization, and summaries of high density oligonucleotide array probe level data, Biostatistics, 4(2), 249-264.
[4] C. Li and W. H. Wong (2001a), Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection, Proceedings of the National Academy of Sciences USA, 98, 31-36.
[5] C. Li and W. H. Wong (2001b), Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application, Genome Biology, 2(8), research0032.1-0032.11.
[6] R. A. Irizarry, B. M. Bolstad, F. Collin, L. M. Cope, B. Hobbs and T. P. Speed (2003), Summaries of Affymetrix GeneChip probe level data, Nucleic Acids Research, 31(4), e15.
[7] B. Bolstad (2001), Probe level quantile normalization of high density oligonucleotide array data, Division of Biostatistics.
[8] B. Bolstad (2002), Comparing the effects of background, normalization and summarization on gene expression estimates.
[9] Affymetrix (2001), GeneChip arrays provide optimal sensitivity and specificity for microarray expression analysis, Affymetrix.
[10] B. M. Bolstad (2004), Low-level analysis of high-density oligonucleotide array data: background, normalization and summarization.
[11] D. Holder, R. F. Raubertas, V. Bill Pikounis, V. Svetnik and K. Soper, Statistical analysis of high density oligonucleotide arrays: a safer approach, Merck Research Laboratories, WP37C-305, West Point, PA 19486.
[12] F. Naef, D. A. Lim, N. Patil and M. O. Magnasco (2001), From features to expression: High-density oligonucleotide array analysis revisited, Tech Report, 1, 1-9.
[13] R. Sasik, E. Calvo and J. Corbeil (2002), Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model, Bioinformatics, 18(12), 1633-1640.
[14] dChip User's Manual (2005), http://biosun1.harvard.edu/complab/dchip
[15] 薛慧芬 (2005), The research of normalization methods for high density oligonucleotide array, Thesis at National Chengchi University.
[16] S. Dudoit, Y. H. Yang, M. J. Callow and T. P. Speed (2000), Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments.
Description Master's
National Chengchi University
Graduate Institute of Statistics
93354023
94
Source http://thesis.lib.nccu.edu.tw/record/#G0093354023
Type thesis
URI https://nccur.lib.nccu.edu.tw/handle/140.119/30906
Table of contents
Acknowledgements
Abstract
Chinese Abstract
1 Introduction
2 Literature Review
  2.1 Affymetrix Gene Chip Technologies
  2.2 Li and Wong's Model
  2.3 DNA-Chip (dChip)
    2.3.1 Invariant Normalization
  2.4 Robust Multi-Array Average (RMA)
    2.4.1 Background Correction in RMA
    2.4.2 Quantile Normalization in RMA
    2.4.3 Summarization in RMA: Median Polish
  2.5 Microarray Analysis Suite Software (MAS 5.0)
    2.5.1 Background Correction in MAS 5.0
    2.5.2 The Ideal Mismatch Value (IM)
    2.5.3 The Adjusted Log-Transformed PM Intensities
    2.5.4 One Step Tukey Biweight Algorithm
    2.5.5 Scaling Normalization
  2.6 Comparisons of Normalization Methods
3 Methodology
  3.1 Scaling Method
  3.2 Median Centered
  3.3 Hybrid Scaling-Median Centered Methods
  3.4 Z^* Scores
  3.5 Quantile Normalization
  3.6 Cyclic LOESS
  3.7 New Proposed Normalization Method: LOESS to Average
4 Real Data
  4.1 Real Data
  4.2 The Perfect Match (PM) Value
  4.3 The Mismatch (MM) Value
  4.4 The Theta (θ)
  4.5 The Phi (Φ)
5 Simulation
  5.1 Common Simulation Setting
  5.2 Simulation Settings for Differentially Expressed Genes
  5.3 Simulated Data
6 Comparisons of Normalization Methods
  6.1 Interquartile Range (IQR)
  6.2 Diff-statistics
  6.3 Mean Standard Deviation (MSD)
    6.3.1 Overall MSD
    6.3.2 Diff-MSD
  6.4 Ratio
7 Discussion and Future Work
  7.1 Discussion of Comparison Criteria
  7.2 Summary of Comparisons for Normalization Methods
  7.3 Discussion of Simulation Settings
    7.3.1 Simulation Setting 1
    7.3.2 Simulation Setting 2 and Setting 3
  7.4 Future Work
References
Appendix
Language en_US