Title 巨量資料環境下之新聞主題暨輿情與股價關係之研究
A Study of the Relevance between News Topics & Public Opinion and Stock Prices in Big Data
Author 張良杰
Chang, Liang Chieh
Contributors 楊建民 (advisor)
張良杰
Chang, Liang Chieh
Keywords Big data
Text mining
News topic detection and tracking
Link analysis
Sentiment analysis
Date 2013
Uploaded 29-Jul-2014 16:03:34 (UTC+8)
Abstract In recent years, advances in technology, networks, and storage media have caused the volume of generated data to grow explosively, ushering in the era of big data. With big data, researchers no longer need to rely on traditional sampling to collect data, and analysis is no longer limited by samples too small to represent the population. Once these traditional limits are removed, the essence of big data lies in finding the valuable information within it.
Social networking sites (SNS), for example, hold a large amount of public opinion and interpersonal interaction, and scholars have found that sentiment expressed on SNS correlates positively with stock prices. This thesis therefore turns to online news, which shares the characteristics of big data: a web crawler collected 30,879 economic news articles published by the Central News Agency between July 2013 and May 2014. Topic detection and tracking (TDT) and sentiment analysis were then applied to these articles. Finally, using the similarity between news events, the events were linked into a network, and the relationship between news sentiment and the stock price index was analyzed.
The results show that news events can be linked into specific news topics, that distinct news topics can be identified within a large network, and that linking news topics together yields a news topic context. This offers a new way to grasp the content of a huge volume of news quickly and to trace news topics and news events back effectively.
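The linking step described above can be illustrated with a minimal sketch. This record does not state the thesis's term weighting, similarity measure, threshold, or clustering procedure, so everything here is an assumption: TF-IDF weights, cosine similarity, a hypothetical threshold of 0.2, and connected components standing in for news topics. The token lists are invented toy data, not articles from the thesis corpus.

```python
import math
from collections import Counter, defaultdict


def tf_idf_vectors(docs):
    """Turn token lists into sparse {term: weight} vectors (assumed TF-IDF weighting)."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def link_articles(docs, threshold=0.2):
    """Return index pairs (i, j) whose similarity reaches the (hypothetical) threshold."""
    vectors = tf_idf_vectors(docs)
    return [
        (i, j)
        for i in range(len(vectors))
        for j in range(i + 1, len(vectors))
        if cosine(vectors[i], vectors[j]) >= threshold
    ]


def connected_topics(n, edges):
    """Group article indices into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Invented toy articles, already segmented into tokens.
demo_docs = [
    ["rate", "hike", "bank"],
    ["bank", "rate", "cut"],
    ["oil", "price", "export"],
    ["oil", "export", "tariff"],
]
demo_edges = link_articles(demo_docs)
demo_topics = connected_topics(len(demo_docs), demo_edges)
print(demo_edges, demo_topics)
```

Connected components are the simplest way to read topics off a similarity graph; on a real corpus a single weak link can merge unrelated topics, so a stricter threshold or a dedicated clustering step would likely be needed.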
Regarding news sentiment and the stock price index, the study finds that news sentiment affects fluctuations in the stock price index, with a correlation coefficient of 0.733562. A comparison of sentiment with the psychological line and the trading-willingness indicator shows that news sentiment can, to a certain degree, serve as a reference for judging stock prices.
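For readers who want to see how a correlation coefficient of this kind is computed, here is a minimal sketch. The record does not say which coefficient the thesis uses or how the sentiment and index series were aligned; the sketch assumes Pearson's r over two equal-length series. The numbers are invented for illustration and are not thesis data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Invented values: a sentiment score and an index level per period.
toy_sentiment = [0.10, 0.25, 0.05, 0.40, 0.30]
toy_index = [8000.0, 8120.0, 7990.0, 8300.0, 8210.0]
print(round(pearson(toy_sentiment, toy_index), 4))
```

A high coefficient shows only that the two series move together; on its own it does not establish that sentiment drives the index.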
Description Master's thesis
National Chengchi University
Graduate Institute of Information Management
101356002
102
Source http://thesis.lib.nccu.edu.tw/record/#G0101356002
Type thesis
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/67862
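dc.description.tableofcontents Acknowledgments
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
Section 1 Research Motivation and Purpose
Chapter 2 Literature Review
Section 1 Big Data
2.1.1 Definition and Characteristics of Big Data
2.1.2 Applications of Big Data
Section 2 Topic Detection and Tracking (TDT)
2.2.1 News Topic Networks
Section 3 Sentiment Analysis
2.3.1 Definition of Sentiment Analysis
2.3.2 Sentiment Lexicons
2.3.3 Relationship between Sentiment Analysis and Stock Prices
Section 4 Summary
Chapter 3 Research Method and Design
Section 1 Data Collection
Section 2 Data Preprocessing
3.2.1 CKIP Word Segmentation
3.2.2 Vocabulary Reduction
3.2.3 Economic Term Identification
3.2.4 Feature Term Extraction
3.2.5 Vector Space Model
Section 3 News Event Processing
3.3.1 News Event Detection and Tracking
3.3.2 Parameter Settings for News Event Detection and Tracking
Section 4 Generating the News Topic Network
3.4.1 Computing Relationships between News Events
Section 5 News Sentiment Computation
3.5.1 Sentiment Term Identification
3.5.2 Sentiment Term Scoring
3.5.3 Sentiment at the News Level
3.5.4 Relationship between News Sentiment and the Stock Price Index
Chapter 4 Results
Section 1 Network-Based News Topics
4.1.1 Building Cluster Links
4.1.2 Key Nodes in the Network
4.1.3 News Topic Context
Section 2 Sentiment and the Stock Price Index
4.2.1 Relationship between Sentiment and the Stock Price Index
4.2.2 Relationship between Sentiment and the Psychological Line
4.2.3 Relationship between Sentiment and the Trading-Willingness Indicator
Chapter 5 Conclusions and Future Directions
Section 1 Conclusions
Section 2 Future Directions
References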
dc.format.extent 2587803 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US