Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/110783
Title: Hi-C實驗資料正規化
Hi-C data normalization
Author: 魏孝全
Contributors: 薛慧敏
魏孝全
Keywords: Chromosome conformation capture
Hi-C data
Normalization
Genome feature bias
Date: 2017
Uploaded: 11-Jul-2017
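Abstract: This study examines normalization methods for the contact matrix data produced by high-throughput chromosome conformation capture (Hi-C) experiments. Such experiments are mainly used to measure spatial distances between chromosomes, and the purpose of normalization is to remove systematic biases from the data; this thesis focuses on the biases caused by genome features. Unlike the local genome feature normalization (LGF) method proposed by Hu et al. (2012), our proposed quadratic function normalization (QF) method is built on a more general quadratic log model and a negative binomial distribution assumption. We evaluate the performance of the QF method through simulation experiments and human lymphoblastoid cell data (GSE18199), and compare it with other methods. In the simulations, we find that the QF method removes the bias effectively when the model is correctly specified. In the real-data example, once the genome feature bias is removed, the relative distances between chromosomes are more consistent across replicate experiments. On the other hand, we find that the restriction enzyme used in the experiment affects the resulting contact matrix, and that applying these normalization methods does not effectively remove the bias caused by the restriction enzyme.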
The high-throughput chromosome conformation capture (Hi-C) experiment was recently developed to explore the three-dimensional structure of the genome. To assess chromosomal interactions, a contact matrix is produced from a Hi-C experiment. Systematic technical biases often appear in the contact matrix and can lead to unreliable conclusions. Consequently, data normalization to remove these biases is essential prior to further inference. In this research, we propose the quadratic function normalization method, which modifies the local genome feature normalization (Hu et al., 2012) by considering a more general model. Simulation studies are conducted to evaluate the proposed method. When the model assumption holds, the proposed method performs adequately. Further, a Hi-C data set from a human lymphoblastoid cell line (GSE18199) is used to compare our method with two existing methods. We observe that normalization improves the reproducibility between experimental replicates. However, normalization has only a limited effect in eliminating the bias due to restriction enzymes.
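To illustrate the general idea of regression-based normalization described in the abstract, the following is a minimal sketch, not the thesis's QF method. It fits a Poisson log-linear model (in the spirit of the Poisson regression approach of Hu et al., 2012) with an added squared log-feature term, then divides observed counts by fitted values. The record does not give the exact form of the quadratic log model, and the thesis assumes a negative binomial distribution rather than Poisson. The single `feature` covariate, the simulated data, the coefficient values, and all function names are assumptions made for illustration.

```python
import numpy as np

def design_matrix(feature):
    """Columns: intercept, log-feature, squared log-feature (illustrative choice)."""
    x = np.log(feature)
    return np.column_stack([np.ones_like(x), x, x ** 2])

def fit_poisson_loglinear(X, y, n_iter=100, tol=1e-10):
    """Fit log E[y] = X @ beta by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())           # start from the overall mean (column 0 is the intercept)
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu           # working response for the log link
        WX = X * mu[:, None]              # Poisson working weights equal mu
        new_beta = np.linalg.solve(X.T @ WX, X.T @ (mu * z))
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

def normalize(counts, feature):
    """Return observed/expected ratios and the fitted coefficients."""
    X = design_matrix(feature)
    beta = fit_poisson_loglinear(X, counts)
    expected = np.exp(X @ beta)
    return counts / expected, beta

# Simulated bin-pair data: a hypothetical feature value per pair of bins
# (for example a product of GC contents), with made-up coefficients.
rng = np.random.default_rng(0)
feature = rng.uniform(0.5, 2.0, size=5000)
true_beta = np.array([3.0, 0.8, -0.3])
counts = rng.poisson(np.exp(design_matrix(feature) @ true_beta)).astype(float)

ratio, beta_hat = normalize(counts, feature)
print("estimated coefficients:", beta_hat)
print("mean normalized ratio:", ratio.mean())
```

When the simulated model matches the fitted one, the estimated coefficients should be close to the simulated ones and the normalized ratios should centre near 1, which mirrors the abstract's remark that the method works when the model assumption holds.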
References:
Agard DA, Hiraoka Y, Shaw P, Sedat JW (1989). Fluorescence microscopy in three dimensions, Methods Cell Biol., 30, 353-377.
Dekker J, Rippe K, Dekker M, Kleckner N (2002). Capturing chromosome conformation, Science, 295, 1306-1311.
Dostie J, Richmond TA, Arnaout RA, Selzer RR, Lee WL, Honan TA, Rubio ED, Krumm A, Lamb J, Nusbaum C, Green RD, Dekker J (2006). Chromosome Conformation Capture Carbon Copy (5C): A massively parallel solution for mapping interactions between genomic elements, Genome Res., 16, 1299-1309.
Dudchenko O, Batra SS, Omer AD, Nyquist SK, Hoeger M, Durand NC, Shamim MS, Machol I, Lander ES, Aiden AP, Aiden EL (2017). De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds, Science, 356, 92-95.
Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R, Min R, Alves P, Abyzov A, Addleman N, Bhardwaj N, Boyle AP, Cayting P, Charos A, Chen DZ, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski M, Lacroute P, Leng J, Lian J, Monahan H, O'Geen H, Ouyang Z, Partridge EC, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy TE, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip KY, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham PJ, Myers RM, Weissman SM, Snyder M (2012). Architecture of the human regulatory network derived from ENCODE data, Nature, 489, 91-100.
Hu M, Deng K, Selvaraj S, Qin Z, Ren B, Liu JS (2012). HiCNorm: removing biases in Hi-C data via Poisson regression, Bioinformatics, 28, 3131-3133.
Imakaev M, Fudenberg G, McCord RP, Naumova N, Goloborodko A, Lajoie BR, Dekker J, Mirny LA (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization, Nature Methods, 9, 999-1003.
Li H, Ruan J, Durbin R (2008). Mapping short DNA sequencing reads and calling variants using mapping quality scores, Genome Res., 18, 1851-1858.
Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, Sandstrom R, Bernstein B, Bender MA, Groudine M, Gnirke A, Stamatoyannopoulos J, Mirny LA, Lander ES, Dekker J (2009). Comprehensive mapping of long range interactions reveals folding principles of the human genome, Science, 326, 289-293.
Lupiáñez DG, Kraft K, Heinrich V, Krawitz P, Brancati F, Klopocki E, Horn D, Kayserili H, Opitz JM, Laxova R, Santos-Simarro F, Gilbert-Dussardier B, Wittler L, Borschiwer M, Haas SA, Osterwalder M, Franke M, Timmermann B, Hecht J, Spielmann M, Visel A, Mundlos S (2015). Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions, Cell, 161, 1012-1025.
Simonis M, Klous P, Splinter E, Moshkin Y, Willemsen R, de Wit E, van Steensel B, de Laat W (2006). Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C), Nature Genetics, 38, 1348-1354.
Sexton T, Cavalli G (2015). The role of chromosome domains in shaping the functional genome, Cell, 160, 1049-1059.
Yaffe E, Tanay A (2011). Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture, Nature Genetics, 43, 1059-1065.
Description: Master's thesis
National Chengchi University
Department of Statistics
104354025
Source: http://thesis.lib.nccu.edu.tw/record/#G0104354025
Type: thesis
Appears in Collections: Theses and Dissertations

Files in This Item:
File: 402501.pdf, Size: 1.21 MB, Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.