Title: Apply graph theory to visualizing and analyzing Hi-C contact network (基於圖論之高通量染色體結構捕獲連結網路視覺化與分析)
Author: Kan, Tai-Chun (甘岱珺)
Contributors: Chang, Jia-Ming (張家銘); Kan, Tai-Chun (甘岱珺)
Keywords:
Hi-C
Contact map
Contact network
Graph theory
Network embedding
Information visualization
Shiny
Date: 2018
Uploaded: 1-Oct-2018 12:11:00 (UTC+8)
Abstract:
In this work, we explore long-range interactions between gene regions and use network topology to analyze their expression patterns and biological functions. Network features efficiently measure the importance of individual nodes and the relationships among nodes, which helps identify the central elements of a biological system. We apply several network topological measures to the Hi-C contact network, then combine t-SNE with network embedding for clustering. In addition, we developed HiCONET, a web server that visualizes Hi-C data as a contact map and as a network. The graphical interface lets users search visually for patterns in Hi-C data while linking related genomic regions between the contact map and the network. Built on the R Shiny platform, it lets users interactively explore regions of interest by clicking on the visualizations and adjusting parameters. The server is freely available at https://changlab.shinyapps.io/hiconet/.
Description: Master's
National Chengchi University
Department of Computer Science
105753026
Source: http://thesis.lib.nccu.edu.tw/record/#G0105753026
Type: thesis
Advisor: Chang, Jia-Ming (張家銘)
Identifier: G0105753026
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/120260
Format: application/pdf, 1461288 bytes
DOI: 10.6814/THE.NCCU.CS.014.2018.B02
Table of contents:
1. Introduction
2. Background
2.1. Chromosome Conformation Capture (3C)
2.2. High-throughput Chromatin Conformation Capture (Hi-C)
2.3. Topologically Associating Domain (TAD)
2.4. Biological Contact Network
3. Related Works
3.1. Hi-C Data Visualization
3.2. Hi-C Data Network Analysis
4. Methods
4.1. Hi-C Data Processing
4.2. Epigenetic TADs
4.3. Hi-C Contact Matrix
4.4. Hi-C Contact Network
4.5. Network Properties
4.6. Network Models
4.7. Network Centrality Measures
4.8. Epigenetic TADs Clustering
4.8.1. Network Embedding
4.8.2. t-Distributed Stochastic Neighbor Embedding (t-SNE)
5. Visualization
5.1. System Structure
5.2. Implementation
6. Results and Discussion
6.1. Network Connectivity
6.2. Network Centrality Distribution per TADs
7. Conclusions
8. Reference
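The abstract describes treating a Hi-C contact matrix as a network and ranking nodes with topological measures. The following is a minimal sketch of that idea, not the thesis's implementation: the function names, the fixed `threshold` rule for creating edges, and the toy contact counts are all assumptions made for illustration.

```python
from collections import deque


def contact_network(matrix, threshold):
    """Build an undirected adjacency list: bins i and j are linked when
    their contact count is at least `threshold` (self-contacts ignored)."""
    n = len(matrix)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj, source):
    """Reachable nodes divided by the sum of their shortest-path lengths,
    computed with a breadth-first search from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical symmetric contact matrix for four genomic bins (made-up counts).
toy = [
    [0, 9, 7, 1],
    [9, 0, 2, 0],
    [7, 2, 0, 6],
    [1, 0, 6, 0],
]
net = contact_network(toy, threshold=5)
deg = degree_centrality(net)
hub = max(deg, key=deg.get)  # a bin with the highest degree centrality
```

A hard threshold is only one way to turn contact counts into edges; a weighted graph that keeps the counts as edge weights would be an equally reasonable starting point.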