Title: 穿透NAT的同儕網路資料發佈與檢索平台之研究
Study on a Cross-NAT P2P Network Platform for Data Publishing and Searching
Author: 謝育霖 (Hsieh, Yu-Lin)
Contributors: 蔡子傑 (Tsai, Tzu-Chieh); 謝育霖 (Hsieh, Yu-Lin)
Keywords: Big Data; P2P Data Publishing and Retrieval; AI; Confidentiality; NAT; Cross Domain; Federated Domain; Kademlia
Date: 2020
Upload time: 4-Jan-2021 11:31:20 (UTC+8)
Abstract:
Big Data has been a hot topic in recent years and can create greater value for the public. However, for both data publishing and retrieval, existing data management systems still face many limitations and shortcomings. We propose an innovative P2P data management platform and illustrate its design with a health insurance spillover policy scenario. We expect the platform to let participants upload data while ensuring data confidentiality, and to let big data and artificial intelligence algorithms run on the platform, so that the data can be used to its greatest effectiveness and influence. Because NAT is a barrier to P2P applications, we also develop a new NAT-Relay Protocol network architecture, so that nodes in different domains (Cross-Domain) can store, obtain, and search data with each other and thus form the Federated Domain of the platform. In addition, data encryption is used to increase the security of the data stored on the platform and to protect the ownership of the data uploaders. We adopt the Kademlia algorithm and modify it by adding a data filter method integrated with computing, to make data operations more comprehensive and efficient. A pricing business model mechanism can also be included to make the platform more sustainable and to expand its scope of applications. The prototype of the platform validates our design, and we expect it to be put into practice in the near future.

References:
[1] 李振武, 白英彩, "影響 Internet 未來的對等網路" [Peer-to-peer networks that will shape the future of the Internet], Shanghai Jiao Tong University Jinqiao Network Engineering Center, technical report, 2003.
[2] Sylvia Paul Ratnasamy, "A Scalable Content-Addressable Network," PhD dissertation, U.C. Berkeley, 2002.
[3] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications," SIGCOMM'01, August 27-31, 2001, CA, USA.
[4] A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems," in Proc. IFIP/ACM Middleware 2001, Heidelberg, Germany, Nov. 2001. https://udn.com/news/story/7239/3586060
[5] Keith W. Ross, Dan Rubenstein, "P2P Systems".
[6] J. Benet, "IPFS - Content Addressed, Versioned, P2P File System," arXiv preprint arXiv:1407.3561, 2014.
[7] Y. Zhao, K. Kim, N. Venkatasubramanian, "DYNATOPS: a dynamic topic-based publish/subscribe architecture," DEBS '13: Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, June 2013, pp. 75-86.
[8] D. Zhang, C. Zheng, H. Zhang, et al., "Identification and analysis of Skype peer-to-peer traffic," in Proceedings of the Fifth International Conference on Internet and Web Applications and Services, 2010, pp. 200-206.
[9] M. Mauch, S. Dixon, "Simultaneous estimation of chords and musical context from audio," IEEE Transactions on Audio, Speech, and Language Processing, 18(6), 2010, pp. 1280-1289.
[10] MIT, "P2PSim: a simulator for Peer-to-Peer protocol," http://pdos.csail.mit.edu/P2Psim/ [EB/OL], 2010.
[11] Adán G. Medrano-Chávez, Elizabeth Pérez-Cortés, Miguel Lopez-Guerrero, "A performance comparison of Chord and Kademlia DHTs in high churn scenarios," Springer, 2015.
[12] Petar Maymounkov and David Mazières, "Kademlia: A Peer-to-Peer Information System Based on the XOR Metric," New York University, 2002.
[13] Tonglin Li, Raman Verma, Xi Duan, Hui Jin, Ioan Raicu, "Exploring distributed hash tables in High End computing," ACM SIGMETRICS Performance Evaluation Review, December 2011.
[14] "An Overview of Cryptography," available: https://www.garykessler.net/library/crypto.html
[15] "Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN)," RFC 5766, 2010.
[16] S. Guha, Y. Takeda and P. Francis, "NUTSS: A SIP-based approach to UDP and TCP network connectivity," in Proceedings of ACM SIGCOMM Workshops, pp. 43-48, 2004.
[17] "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols," RFC 5245, 2010.
[18] Rone-Yan Huang, "Design and Implementation of a Data Publishing, Subscription and Computing Platform on the P2P Network," MCLAB, Department of Computer Science, National Chengchi University, 2021.
[19] "Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN)," RFC 5766, 2010.
[20] S. Guha, Y. Takeda and P. Francis, "NUTSS: A SIP-based approach to UDP and TCP network connectivity," in Proceedings of ACM SIGCOMM Workshops, pp. 43-48, 2004.
[21] "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols," RFC 5245, 2010.
[22] Weixiong Rao, Roman Vitenberg, Lei Chen, Sasu Tarkoma, "MTAF: An Adaptive Design for Keyword-Based Content Dissemination on DHT Networks," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 4, April 2015.
[23] Bassirou Ngom, Mesaac Makpangou, "Summary Prefix Tree: An over DHT Indexing Data Structure for Efficient Superset Search," 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA).
[24] Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," googleusercontent.com, 2004.
[25] Praveen Khethavath, Johnson Thomas, Eric Chan-Tin, Hong Liu, "Introducing a Distributed Cloud Architecture with Efficient Resource Discovery and Optimal Resource Allocation," IEEE Ninth World Congress on Services, 2013.

Description: Master's thesis
National Chengchi University (國立政治大學)
In-service Master's Program, Department of Computer Science
107971021
Source: http://thesis.lib.nccu.edu.tw/record/#G0107971021
Type: thesis
dc.contributor.advisor: 蔡子傑 (Tsai, Tzu-Chieh)
dc.date.accessioned: 4-Jan-2021 11:31:20 (UTC+8)
dc.date.available: 4-Jan-2021 11:31:20 (UTC+8)
dc.identifier (Other Identifiers): G0107971021
dc.identifier.uri (URI): http://nccur.lib.nccu.edu.tw/handle/140.119/133458
dc.description.tableofcontents:
Table of Contents
Abstract (Chinese)
Abstract
Acknowledgements
Chapter 1 Introduction
1.1 Background and Motivation
1.2 Platform and Scenario Description
1.3 Goals and Challenges
1.4 Contributions
Chapter 2 Related Work
2.1 P2P DHT Network Overview
2.2 Kademlia Network Algorithm
2.2.1 What Is Kademlia
2.2.2 Node Distance
2.2.3 Node State
2.2.4 What Is a K-Bucket
2.2.5 K-Bucket Update Mechanism
2.2.6 Kademlia Commands
2.2.7 Routing Lookup Mechanism
2.2.8 [...] Lookup
2.2.9 Data Storage
2.2.10 Node Join and Leave
2.2.11 K-Bucket and Binary Tree
2.3 Data ENC & DEC Algorithm
2.4 NAT Network Overview
Chapter 3 Platform System Architecture and Design
3.1 System Login and Registration [18]
3.2 NCCU-Kademlia-NAT-Relay Protocol
3.2.1 Single NAT-Relay Node
3.2.2 Multiple NAT-Relay Nodes
3.3 NCCU-Kademlia-ENC/DEC Protocol
3.4 NCCU-Kademlia-Set Protocol & NCCU-Kademlia-Get Protocol
3.5 NCCU-Kademlia-Filter Protocol
3.6 NCCU-Kademlia Network Protocol
3.7 Computation [18] and Rewards
3.8 Term Definition
Chapter 4 Platform System Implementation
4.1 Implementation Code
4.2 Registration and Login
4.3 Main Screen, Upload, and Download
Chapter 5 Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
dc.format.extent: 5628194 bytes
dc.format.mimetype: application/pdf
dc.identifier.doi (DOI): 10.6814/NCCU202001817
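
Note on the Kademlia XOR metric: the abstract says the platform adopts and modifies the Kademlia algorithm [12], and the table of contents lists sections on node distance and K-Buckets. As background only, the sketch below shows how standard Kademlia measures the distance between two node IDs with XOR and picks a k-bucket from the highest differing bit. It is not the thesis's NCCU-Kademlia implementation. The function names, the SHA-1 based `node_id` helper, and the test labels are assumptions made for illustration; the 160-bit ID width and k = 20 follow the example values in the original Kademlia paper.

```python
import hashlib
from typing import List

ID_BITS = 160  # ID width used in the original Kademlia paper (assumed here)


def node_id(label: str) -> int:
    """Hypothetical helper: derive a 160-bit ID by hashing a label with SHA-1."""
    return int.from_bytes(hashlib.sha1(label.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: bitwise XOR of the two IDs, read as an integer."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the k-bucket for other_id: position of the highest differing bit."""
    d = xor_distance(own_id, other_id)
    if d == 0:
        raise ValueError("a node does not place itself in a k-bucket")
    return d.bit_length() - 1


def closest_nodes(target: int, known: List[int], k: int = 20) -> List[int]:
    """Return up to k known IDs, ordered by increasing XOR distance to target."""
    return sorted(known, key=lambda n: xor_distance(n, target))[:k]
```

With small IDs the behavior is easy to check by hand: 0b1010 XOR 0b0110 is 0b1100, so the two nodes differ first at bit 3 and the peer falls into bucket 3. A lookup repeatedly asks the closest known nodes for nodes that are closer still, which is what `closest_nodes` would be used for in a fuller implementation.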