Citation Information

Title 基於Word2Vec台灣媒體政治傾向探討
A Study of word2vec: Application of Media Bias Investigation
Author 黃中廷
Huang, Chung-Ting
Contributors 薛慧敏 (advisor)
黃中廷
Huang, Chung-Ting
Keywords Media bias
Political news
Word2Vec
Correlation
Date 2019
Uploaded 4-Mar-2019 19:12:14 (UTC+8)
Abstract   News media play an important role in transmitting information and holding the government to account, but the large volume of reporting also brings the problem of media bias, especially in political news. Word2Vec has recently been used to map categorical words into a real-valued vector space; once words are quantified in this way, the correlation between them can be measured. This thesis applies Word2Vec to news data from Taiwanese electronic media and analyzes extracted keywords to investigate whether media bias exists. We also examine the similarities and differences between the models in Word2Vec and traditional statistical viewpoints. The study uses political news from Taiwan in 2014 and compares two electronic media outlets' coverage of the Taipei mayoral candidates; it also tests how sensitive the results are to the Word2Vec window size. We find that the outlets differ clearly in word choice and writing style, and that they may also show party preferences.
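The two mechanics the abstract mentions, the context window and a similarity measure between word vectors, can be illustrated with a small sketch. This is not the thesis's code: the function names, the toy sentence, and the use of cosine similarity are assumptions made for illustration, since the record does not say which measure or implementation was used.

```python
import math


def skipgram_pairs(tokens, window):
    """Return (center, context) pairs for a symmetric window of the given size."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical, already-segmented toy sentence (not from the thesis data).
tokens = ["candidate", "visits", "market", "before", "election"]
narrow = skipgram_pairs(tokens, window=1)
wide = skipgram_pairs(tokens, window=2)
```

A wider window pairs each word with more distant neighbours, so the training pairs, and therefore the learned vectors and their similarities, can change with the window size. This is the kind of sensitivity the abstract refers to.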
References [1] 台北市媒體服務代理商協會 (2017). 2017年台灣媒體白皮書.
https://maataipei.org/download/2017媒體白皮書/
[2] D'Alessio D., Allen M. (2000). Media Bias in Presidential Elections: A Meta-Analysis, Journal of Communication, 50, 133-156.
[3] 媒體改造學社、台灣媒體觀察教育基金會、優質新聞發展協會 (2018). 【聯合聲明】選舉新聞嚴重失衡,媒體自律形同具文,媒體觀察組織發表嚴厲譴責暨申訴行動聲明.
http://www.mediawatch.org.tw/news/9787
[4] Eberl J.-M., Boomgaarden H.G., Wagner M. (2017). One Bias Fits All? Three Types of Media Bias and Their Effects on Party Preferences, Communication Research, 44, 1125-1148.
[5] Sun J.Y., 结巴中文分词.
https://github.com/fxsjy/jieba
[6] Cavnar W.B., Trenkle J.M. (1994). N-Gram-Based Text Categorization, Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, 161-175.
[7] Pennington J., Socher R., Manning C.D. (2014). GloVe: Global Vectors for Word Representation, EMNLP 2014, 1532-1543.
[8] Selivanov D. (2015). GloVe vs word2vec revisited.
http://dsnotes.com/post/glove-enwiki/
[9] McCormick C. (2016). Word2Vec Tutorial-The Skip-Gram Model.
http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
[10] Mikolov T., Chen K., Corrado G.S., Dean J. (2013). Efficient Estimation of Word Representations in Vector Space, arXiv:1301.3781v3.
[11] Mikolov T., Sutskever I., Chen K., Corrado G.S., Dean J. (2013). Distributed Representations of Words and Phrases and their Compositionality, NIPS 2013, 3111-3119.
Description Master's
National Chengchi University
Department of Statistics
105354013
Source http://thesis.lib.nccu.edu.tw/record/#G0105354013
Type thesis
URI http://nccur.lib.nccu.edu.tw/handle/140.119/122378
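Table of contents Chapter 1 Introduction
Chapter 2 Methodology
Section 1 Introduction to Word2Vec
Section 2 Formulas and Derivations
Chapter 3 Empirical Analysis
Section 1 Data Processing
Section 2 Keywords Highly Correlated with "Ko Wen-je" (柯文哲)
Section 3 Keywords Highly Correlated with "Lien Sheng-wen" (連勝文)
Section 4 Effect of the Window Size
Chapter 4 Conclusions and Suggestions
References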
Format extent 1450579 bytes
Format mimetype application/pdf
DOI 10.6814/THE.NCCU.STAT.003.2019.B03