Title 基於MapReduce框架進行有效的天際線查詢處理
Efficient Skyline Query Processing with MapReduce
Author 詹智渝
Chan, Chih Yu
Contributors 陳良弼
Chen, Arbee L.P.
詹智渝
Chan, Chih Yu
Keywords Skyline query (天際線查詢)
Big data (巨量資料)
Distributed computing (分散式運算)
Date 2013
Uploaded 1-Nov-2013 11:44:16 (UTC+8)
Abstract As demand for database systems grows, the ways users query data have diversified, and preference queries have become a popular research topic in recent years. Among them, the skyline query is one of the most actively studied problems in databases and information retrieval. At the same time, advances in technology have sharply increased the amount of data that can be collected and used, making large-scale data processing an urgent problem. The MapReduce programming framework, which Google described in a paper published in 2004, offers an effective way around many of the obstacles that queries previously faced on big data.
Skyline query processing has high time complexity and is even harder at big-data scale, so research on skyline queries over big data has grown in recent years. This study designs a more efficient MapReduce algorithm for skyline query processing, describes the algorithm in detail, and implements it on the Hadoop platform to verify its efficiency and effectiveness.
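For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of what a skyline query computes and how it can be split into a map step and a reduce step. It is not the algorithm proposed in the thesis: the round-robin partitioning, the function names (`dominates`, `local_skyline`, `map_phase`, `reduce_phase`) and the sample `hotels` data are illustrative assumptions, and smaller values are assumed to be better in every dimension.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_skyline(points: Iterable[Point]) -> List[Point]:
    """Nested-loop skyline: keep only points not dominated by any other."""
    skyline: List[Point] = []
    for p in points:
        if any(dominates(s, p) for s in skyline):
            continue  # p is dominated, so it cannot be a skyline point
        # p survives: drop any current candidates that p dominates
        skyline = [s for s in skyline if not dominates(p, s)]
        skyline.append(p)
    return skyline


def map_phase(points: Sequence[Point], num_partitions: int) -> List[List[Point]]:
    """Hypothetical map step: round-robin partitioning, one local skyline each."""
    partitions = [points[i::num_partitions] for i in range(num_partitions)]
    return [local_skyline(part) for part in partitions]


def reduce_phase(local_skylines: Iterable[List[Point]]) -> List[Point]:
    """Reduce step: the global skyline is the skyline of all local skylines."""
    merged = [p for sky in local_skylines for p in sky]
    return local_skyline(merged)


# Illustrative data: (price, distance to the beach), both to be minimized.
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 1), (55, 9), (60, 4)]
result = sorted(reduce_phase(map_phase(hotels, 3)))
print(result)  # [(50, 8), (60, 3), (80, 2), (90, 1)]
```

Merging local skylines is safe because a point dominated inside its own partition is also dominated globally. The partitioning scheme mainly affects how many points each partition can discard early and how evenly the work is spread.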
References [1] J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” in Proceedings of the Operating Systems Design and Implementation, 2004.
[2] S. Borzsonyi, D. Kossmann, and K. Stocker, “The Skyline Operator,” in Proceedings of the International Conference on Data Engineering, 2001.
[3] B. L. Zhang, S. G. Zhou, and J. H. Guan, “Adapting Skyline computation to the MapReduce Framework: Algorithms and Experiments,” in Proceedings of the Database Systems for Advanced Applications workshop, 2011.
[4] L. L. Ding, J. C. Xin, G. R. Wang, and S. Huang, “Efficient Skyline Query Processing of Massive Data Based on Map-Reduce,” in Chinese Journal of Computers, 2012.
[5] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with presorting,” in Proceedings of the International Conference on Data Engineering, 2003.
[6] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with presorting: Theory and optimizations,” in Journal of Intelligent Information Systems, 2005.
[7] P. Godfrey, R. Shipley, and J. Gryz, “Maximal vector computation in large data sets,” in Proceedings of the Very Large Databases, 2005.
[8] I. Bartolini, P. Ciaccia, and M. Patella, “SaLSa: Computing the Skyline without Scanning the Whole Sky,” in Proceedings of the Conference on Information and Knowledge Management, 2006.
[9] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “An Optimal and Progressive Algorithm for Skyline Queries,” in Proceedings of the ACM International Conference on Management of Data, 2003.
[10] D. Kossmann, F. Ramsak, and S. Rost, “Shooting stars in the sky: an online algorithm for Skyline queries,” in Proceedings of the Very Large Databases, 2002.
[11] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive Skyline computation in database systems,” in ACM Transactions on Database Systems, 2005.
[12] S. M. Zhang, N. Mamoulis, and D. W. Cheung, “Scalable Skyline Computation Using Object-based Space Partitioning,” in Proceedings of the ACM International Conference on Management of Data, SIGMOD, 2009.
[13] B. Cui, H. Lu, Q. Xu, L. Chen, Y. Dai, and Y. Zhou, “Parallel distributed processing of constrained Skyline queries by filtering,” in Proceedings of the International Conference on Data Engineering, 2008.
[14] J.B. Rocha-Junior, A. Vlachou, C. Doulkeridis, and K. Nørvåg, “Efficient execution plans for distributed Skyline query processing,” in Proceedings of the Extending Database Technology, 2011.
[15] A. Vlachou, C. Doulkeridis, and Y. Kotidis, “Angle-based space partitioning for efficient parallel Skyline computation,” in Proceedings of the ACM International Conference on Management of Data, SIGMOD, 2008.
[16] H. Köhler, J. Yang, and X. Zhou, “Efficient Parallel Skyline Processing using Hyperplane Projections,” in Proceedings of the ACM International Conference on Management of Data, SIGMOD, 2011.
Description Master's
National Chengchi University
Department of Computer Science
100753037
102
Source http://thesis.lib.nccu.edu.tw/record/#G0100753037
Type thesis
dc.identifier (Other identifier) G0100753037
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/61492
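dc.description.tableofcontents Chapter 1 Introduction
Chapter 2 Related Work
2.1 Skyline algorithms
2.2 Skyline query processing in highly distributed environments
2.3 The MapReduce framework
2.4 Skyline queries on the MapReduce framework
Chapter 3 Problem and Definitions
3.1 Problem
3.2 Naive algorithm
3.3 Problem definition
Chapter 4 Data Partitioning and Algorithms
4.1 Algorithm overview
4.2 Grid partitioning and filtering
4.3 Angle partitioning and filtering
4.4 Performance analysis of the two partitioning methods
Chapter 5 Experiments and Results
5.1 Response time
5.2 Computation balance across partitions
5.3 Comparison of each partition's contribution to the global skyline
Chapter 6 Conclusion
References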
dc.format.extent 1369104 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US