Title 以社群媒體為考量之選民政治傾向探索
Predicting the Political Preference of Plurk Users
Author 江家榕
Chiang, Chia Jung
Contributors 陳良弼 (advisor)
Chen, L. P.
江家榕 (author)
Chiang, Chia Jung
Keywords Social media
Political preference
Date 2014
Upload time 2-Mar-2015 10:13:50 (UTC+8)
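Abstract In recent years, the widespread use of social media has made it easy for people to post their thoughts or receive information of interest on social networking sites, prompting many studies to explore the predictive power of this large volume of personalized information.
     This study takes social media as its starting point and the 2012 Taiwan presidential election as its setting. It collects voter data from the six months before election day and analyzes voters' political preference using text messages and interaction-structure features. Experiments show that the political preference of politically active users can be predicted with up to 94.08% accuracy.
     In addition, because swing voters are often the key to winning an election, this study does not merely divide voters between the two parties. It further subdivides them into five groups (deep blue, light blue, neutral, light green, deep green) according to how their stance shifts across popular political discussion topics before the election, so that the results can be applied in practice, for example to campaign strategy. Popular political topics are selected by extracting keywords from political news on a daily basis and measuring how much they are discussed on Plurk. In the end, 275 users are divided into the five groups, and the 208 users in the light-blue, neutral, and light-green groups are selected as the main campaign targets to improve the effectiveness of campaign strategy.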
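The abstract does not spell out the rule that maps stance changes to the five groups. As a minimal sketch under an assumed rule (the share of events on which a user is predicted blue, with hypothetical cut-offs at 0, 0.5, and 1), the split and the selection of the three middle groups might look like this in Python. The names `segment_user` and `swing_targets` and the toy data are illustrative and do not come from the thesis.

```python
SWING_GROUPS = {"light blue", "neutral", "light green"}


def segment_user(event_labels):
    """Map one user's per-event predictions ("blue" or "green") to a group.

    Assumed rule, not the thesis's: the group depends only on the share of
    events on which the user was predicted "blue".
    """
    if not event_labels:
        return "neutral"  # assumption: no evidence is treated as neutral
    blue = sum(1 for label in event_labels if label == "blue")
    blue_share = blue / len(event_labels)
    if blue_share == 1.0:
        return "deep blue"   # never changed side
    if blue_share == 0.0:
        return "deep green"  # never changed side
    if blue_share > 0.5:
        return "light blue"
    if blue_share < 0.5:
        return "light green"
    return "neutral"         # evenly split between the two sides


def swing_targets(predictions):
    """Return the users (sorted) who fall into one of the three middle groups."""
    return sorted(
        user
        for user, labels in predictions.items()
        if segment_user(labels) in SWING_GROUPS
    )


if __name__ == "__main__":
    # Toy per-event predictions for three hypothetical users.
    toy = {
        "user_a": ["blue", "blue", "blue"],
        "user_b": ["blue", "green", "blue"],
        "user_c": ["green", "blue"],
    }
    for user, labels in toy.items():
        print(user, segment_user(labels))
    print(swing_targets(toy))
```

With the figures reported in the abstract, the three middle groups hold 208 of the 275 users, about 76% of those analyzed.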
Social media use is increasingly popular all over the world. People can easily express their thoughts or receive information that interests them via social media, and many studies have explored the predictive power of the large amount of data it generates.
     In this thesis, we address the problem of predicting the political preference of social media users from their past activity on Plurk, and we evaluate our approach on the 2012 Taiwan presidential election. We first collected Plurk messages posted in the six months before election day. By building predictive models on a variety of contextual and behavioral features, we find that the political preference of politically active users can be predicted with up to 94.08% classification accuracy. To extend the usability of our work, we further use our models to analyze how users' political preference changes in response to political events that occurred before the election. Identifying people who change their political preference frequently or remain neutral allows a candidate to design strategies aimed at these voters. The political events are selected automatically according to the popularity on Plurk of political keywords, which are extracted from daily political news. In the end, we identify 208 swing voters among 275 voters, and they become the main targets for improving the effectiveness of campaign strategy.
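The event-selection step described above (keywords drawn from daily political news, ranked by how much they are discussed on Plurk) can be illustrated with a small Python sketch. It assumes that "popularity" means the number of messages containing the keyword and uses plain substring matching. Both choices, the function name `rank_keywords`, and the sample data are assumptions for illustration rather than the thesis's actual procedure.

```python
from collections import Counter


def rank_keywords(news_keywords, plurk_messages, top_k=3):
    """Rank one day's news keywords by how many Plurk messages mention them.

    Substring matching is used so that no word segmentation is needed;
    this is an assumption, not the method reported in the thesis.
    """
    counts = Counter()
    for keyword in news_keywords:
        counts[keyword] = sum(
            1 for message in plurk_messages if keyword in message
        )
    # Keep only keywords that were actually discussed, most popular first.
    return [(kw, n) for kw, n in counts.most_common(top_k) if n > 0]


if __name__ == "__main__":
    # Hypothetical keywords and messages for a single day.
    keywords = ["debate", "tax", "subsidy"]
    messages = [
        "election debate tonight",
        "debate was heated",
        "tax reform news",
        "lunch",
    ]
    print(rank_keywords(keywords, messages))
```

Keywords whose daily count stands out could then be treated as candidate political events; the threshold for "stands out" is left open here because the abstract does not state one.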
Description Master's thesis
National Chengchi University
Department of Computer Science
101753008
103
Source http://thesis.lib.nccu.edu.tw/record/#G1017530081
Type thesis
URI http://nccur.lib.nccu.edu.tw/handle/140.119/73574
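Table of Contents Chapter 1 Introduction
     1.1 Research Motivation and Objectives
     1.2 Research Subjects
     1.3 Contributions
     1.4 Thesis Organization
     Chapter 2 Literature Review
     2.1 Research on Microblogs
     2.2 Political Research on Microblogs
     2.3 Microblogs and Opinion Analysis
     Chapter 3 Methodology
     3.1 System Architecture
     3.2 Data Preprocessing
     3.3 Manual Annotation of Voters' Political Preference
     3.4 Feature Extraction
     3.5 Political Preference Classification
     3.6 Popular Political Discussion Topics and Changes in Political Preference
     3.6.1 Popular Political Discussion Topics
     3.6.2 Manual Annotation of Stance on Political Topics
     3.6.3 Feature Extraction and Classification for Political Topics
     3.6.4 Changes in Political Preference
     Chapter 4 Experiments and Validation
     4.1 Experimental Data
     4.1.1 Storage Format
     4.1.2 Data Volume
     4.2 Experimental Environment
     4.3 Annotation of Experimental Data
     4.4 Results and Validation
     4.4.1 Evaluation Metrics
     4.4.2 User Political Preference
     4.4.4 Changes in User Political Preference
     4.4.5 Discussion
     Chapter 5 Conclusion
     References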