Title Design and Implementation of a Facebook Fan Page Data Crawler (設計與實作一個臉書粉絲頁資料抓取器)
Author Cheng, Po Yuan (鄭博元)
Contributors Hsu, Kuo Wei (徐國偉) (advisor)
Cheng, Po Yuan (鄭博元)
Keywords Facebook
Web Crawler
Parallel Processing
Date 2015
Uploaded 2-Nov-2015 14:50:44 (UTC+8)
Abstract With the growing popularity of social networking services in recent years, Facebook has become a major social tool in people's daily lives. Many celebrities and companies have followed this trend and established fan pages on Facebook to interact with their fans. The mutual influence between the virtual world and the real world has given rise to many emerging research topics, and using information technology to collect data from the virtual world can help humanities scholars and social scientists explore new phenomena at the intersection of digital technology and society.
In this thesis, we focus on Facebook fan pages. We design and build a fan page crawler to help scholars collect and analyze fan page news feed data. The crawler lets researchers search for relevant fan pages and lists the results by number of likes, which helps them select popular pages. Researchers can then crawl the data of specific fan pages; the crawled data is parsed into post messages, comment messages, and like messages, and the results are stored in a database. For fan pages that have already been crawled, the crawler automatically updates the stored data on a timer so that researchers always have the latest information.
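The workflow described in the abstract (rank candidate pages by likes, split a crawled feed into post, comment, and like records, store them, and refresh them later) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the thesis's table of contents points to the Graph API and the Facebook SDK for C#, whereas the dictionary layout, the function names `rank_pages`, `parse_feed`, and `store`, and the SQLite schema below are all assumptions made for illustration. No network calls are made; the feed is hard-coded sample data.

```python
import sqlite3

def rank_pages(pages):
    """Return candidate fan pages sorted by like count, most liked first."""
    return sorted(pages, key=lambda p: p["likes"], reverse=True)

def parse_feed(page_id, feed):
    """Split a feed (a list of post dicts in an assumed layout) into
    post rows, comment rows, and like rows."""
    posts, comments, likes = [], [], []
    for post in feed:
        posts.append((post["id"], page_id, post.get("message", "")))
        for comment in post.get("comments", []):
            comments.append((comment["id"], post["id"], comment.get("message", "")))
        for user_id in post.get("likes", []):
            likes.append((post["id"], user_id))
    return posts, comments, likes

def store(conn, posts, comments, likes):
    """Write the parsed rows to SQLite. INSERT OR REPLACE keeps a repeated
    (update) run from creating duplicate rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS posts "
                 "(id TEXT PRIMARY KEY, page_id TEXT, message TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS comments "
                 "(id TEXT PRIMARY KEY, post_id TEXT, message TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS likes "
                 "(post_id TEXT, user_id TEXT, PRIMARY KEY (post_id, user_id))")
    conn.executemany("INSERT OR REPLACE INTO posts VALUES (?, ?, ?)", posts)
    conn.executemany("INSERT OR REPLACE INTO comments VALUES (?, ?, ?)", comments)
    conn.executemany("INSERT OR REPLACE INTO likes VALUES (?, ?)", likes)
    conn.commit()

# Hypothetical sample data standing in for search results and a crawled feed.
pages = [{"id": "p1", "name": "Page A", "likes": 120},
         {"id": "p2", "name": "Page B", "likes": 4500}]
feed = [
    {"id": "p2_1", "message": "Hello fans",
     "comments": [{"id": "c1", "message": "Nice"}],
     "likes": ["u1", "u2"]},
    {"id": "p2_2", "message": "Second post"},
]

ranked = rank_pages(pages)                      # Page B comes first
conn = sqlite3.connect(":memory:")
posts, comments, likes = parse_feed(ranked[0]["id"], feed)
store(conn, posts, comments, likes)
store(conn, posts, comments, likes)             # simulate a later update run
counts = [conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
          for t in ("posts", "comments", "likes")]
print(counts)                                   # prints [2, 1, 2]
```

Keying every table on a stable identifier is one simple way to make the periodic update idempotent: re-crawling a page overwrites existing rows and adds only what is new.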
Description Master's thesis
National Chengchi University
Department of Computer Science
102753030
Source http://thesis.lib.nccu.edu.tw/record/#G0102753030
Type thesis
dc.identifier (Other Identifiers) G0102753030
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/79209
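dc.description.tableofcontents Chapter 1 Introduction
1.1 Research Background
1.1.1 History of Facebook
1.1.2 Facebook Fan Pages
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Digital Humanities
2.2 Research on Social Networking Sites
2.3 Research on Web Crawlers
2.3.1 Parallel Crawler
2.3.2 Focused Web Crawler
Chapter 3 System Design
3.1 System Design Concepts
3.2 System Architecture Design
3.3 System Module Design
3.3.1 Data Search Module
3.3.2 Data Crawling Module
3.3.3 Data Parsing Module
3.3.4 Data Update Module
3.3.5 Content Presentation Module
Chapter 4 System Implementation
4.1 Facebook API
4.1.1 Graph API
4.2 Facebook SDK for C#
4.3 Fan Page Wall Data Structure
4.4 Parallel Focused Crawling and Parsing
4.5 System Implementation Results
4.5.1 Operation Interface
4.5.2 Data Display Interface
Chapter 5 Experiments
5.1 System Test Environment
5.2 Comparison of Non-Parallel and Parallel Computation Results
5.3 Crawl Test Results
5.3.1 Game Fan Pages
5.3.2 Politician Fan Pages
5.3.3 Social Issue Fan Pages
5.3.4 Fast-Food Restaurant Fan Pages
5.3.5 Internet Forum Fan Pages
5.4 Summary of Results
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References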
dc.format.extent 1582909 bytes
dc.format.mimetype application/pdf