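Developing a Fast Screening Method for Visual Images in Social Media Events: A Case Study of Twitter Data during the 2016 Meinong Earthquake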

Title	發展社群媒體事件之圖像快篩方法：以2016年美濃地震之Twitter資料為例 Developing a Fast Screening Method for Visual Images in Social Media Events: A Case Study of Twitter Data during 2016 Meinong Earthquake
Author	馮書昭 FENG, SHU-CHAO
Contributors	陳恭 Chen, Kung; 馮書昭 FENG, SHU-CHAO
Keywords	2016 Meinong earthquake; Twitter; Social media analysis; Image analysis; Text analysis; Graph theory
Date	2018
Upload time	1-Aug-2018 16:38:54 (UTC+8)
Abstract	In recent years, the spread of the internet and the rise of social networking sites have let people express their thoughts on social networks wherever they are and at any time. Whether for spreading information, sharing experiences, or exchanging opinions, interaction on social platforms has become part of daily life, and the real-time information filling these platforms makes social media a rich store of up-to-date news. As the platforms have evolved and gained features, people can convey content through multimedia as well as text. When a disaster occurs, access to information is critical for the responsible agencies to assess the situation and plan follow-up actions, and the real-time information on social networks can serve as a fast source of data. By analyzing it, experts can obtain first-hand information promptly, learn about on-site conditions from photos, videos, and what the text implies, and prepare a more complete response. This study takes the 2016 Meinong earthquake in Kaohsiung, Taiwan, as a case study and explores the information exchanged on Twitter during the earthquake, including tweet text, attached images, and metadata such as author, posting time, language, and retweets. First, it shows that among earthquake-related tweets, those with images are more likely to be retweeted than those without. Second, focusing on the attached images, it uses computer vision services to assign object labels to the images, then attempts to cluster earthquake-related images by label co-occurrence, and reports the results and limitations of this approach. Finally, it analyzes the identities of the publishers of popular earthquake-related tweets by crawling Twitter user information. Besides Taiwanese media, media from Japan, China, the United States, the United Kingdom, Russia, and other countries reported on the earthquake. Most individual publishers of popular tweets use Japanese, possibly because Japanese users form the second-largest user group on Twitter and Japan is closer to Taiwan than Western countries are.
Description	Master's; National Chengchi University; Department of Computer Science; 104753005
Source	http://thesis.lib.nccu.edu.tw/record/#G0104753005
Type	thesis
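
The label co-occurrence clustering mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the image names and labels are hypothetical stand-ins for the output of an image labeling service, the function names and the `min_count` threshold are assumptions, and plain connected components are swapped in here for the graph community detection methods the thesis cites.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_counts(image_labels):
    """Count how often each unordered pair of labels appears on the same image."""
    counts = Counter()
    for labels in image_labels.values():
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def label_groups(counts, min_count=2):
    """Group labels by connected components of the thresholded co-occurrence graph.

    Assumption: a simple stand-in for a community detection algorithm.
    """
    adjacency = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_count:
            adjacency[a].add(b)
            adjacency[b].add(a)
    groups, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


def assign_images(image_labels, groups):
    """Assign each image to the label group covering most of its labels (None if no overlap)."""
    assignment = {}
    for image, labels in image_labels.items():
        overlaps = [len(set(labels) & group) for group in groups]
        best = max(range(len(groups)), key=overlaps.__getitem__, default=None)
        assignment[image] = best if best is not None and overlaps[best] > 0 else None
    return assignment


# Hypothetical labels; real ones would come from an image labeling service.
image_labels = {
    "img1": ["building", "rubble", "rescue"],
    "img2": ["building", "rubble"],
    "img3": ["rubble", "rescue", "building"],
    "img4": ["map", "text"],
    "img5": ["map", "text", "diagram"],
    "img6": ["cat"],
}
counts = cooccurrence_counts(image_labels)
groups = label_groups(counts, min_count=2)
clusters = assign_images(image_labels, groups)
```

In this toy input, labels that appear together on at least two images end up in the same group, and an image whose labels never co-occur often enough (`img6`) is left unassigned. That also hints at one limit of the approach: rare labels carry no clustering signal.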

dc.contributor.advisor	陳恭	zh_TW
dc.contributor.advisor	Chen, Kung	en_US
dc.contributor.author (Authors)	馮書昭	zh_TW
dc.contributor.author (Authors)	FENG, SHU-CHAO	en_US
dc.creator (Author)	馮書昭	zh_TW
dc.creator (Author)	FENG, SHU-CHAO	en_US
dc.date (Date)	2018	en_US
dc.date.accessioned	1-Aug-2018 16:38:54 (UTC+8)	-
dc.date.available	1-Aug-2018 16:38:54 (UTC+8)	-
dc.date.issued (Upload time)	1-Aug-2018 16:38:54 (UTC+8)	-
dc.identifier (Other Identifiers)	G0104753005	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/119141	-
dc.description (Description)	Master's	en_US
dc.description (Description)	National Chengchi University	en_US
dc.description (Description)	Department of Computer Science	en_US
dc.description (Description)	104753005	en_US
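dc.description.tableofcontents	Chapter 1 Introduction; 1.1 Research Background and Motivation; 1.2 Research Objectives and Questions; 1.3 Research Structure; Chapter 2 Related Work; 2.1 Social Media Event Research; 2.2 Text Mining; 2.2.1 Bag-of-Words Model; 2.2.2 Chinese Word Segmentation; 2.3 Image Analysis; 2.3.1 Image Labeling Services; 2.3.2 Duplicate Image Detection; 2.4 Graph Theory; 2.4.1 Node Clustering Methods; 2.4.2 Evaluation of Clustering Methods; Chapter 3 Research Data; 3.1 Research Data; 3.2 Data Cleaning; 3.2.1 Time-Interval Filtering; 3.2.2 Text Filtering; 3.3 Automatic Image Labeling; 3.3.2 Label Analysis across Services; 3.3.3 Service Selection; Chapter 4 Research Process and Results; 4.1 Comparison of Tweets with and without Images; 4.2 Unsupervised Image Categories; 4.2.1 Label Analysis; 4.2.2 Clustering and Results; 4.2.3 Analysis of Results; 4.3 Identity of Publishers of Highly Retweeted Tweets; Chapter 5 Conclusion and Future Work	en_US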
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0104753005	en_US
dc.subject (Keywords)	2016 Meinong earthquake	en_US
dc.subject (Keywords)	Twitter	en_US
dc.subject (Keywords)	Social media analysis	en_US
dc.subject (Keywords)	Image analysis	en_US
dc.subject (Keywords)	Text analysis	en_US
dc.subject (Keywords)	Graph theory	en_US
dc.title (Title)	發展社群媒體事件之圖像快篩方法：以2016年美濃地震之Twitter資料為例	zh_TW
dc.title (Title)	Developing a Fast Screening Method for Visual Images in Social Media Events: A Case Study of Twitter Data during 2016 Meinong Earthquake	en_US
dc.type (Type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/THE.NCCU.CS.004.2018.B02	-
