Title: Impact of Text Mining on Quantitative Trading Strategy: CNY/USD Foreign Exchange Margin Commodity
Author: Yu, Jia-Ho (俞家禾)
Advisors: Lin, Shih-Kuei (林士貴); Tsai, Ming-Feng (蔡銘峰)
Keywords:
Web crawlers
Text mining
Quantitative trading
Sentiment indicators
Foreign exchange margin
Ckiptagger
Date: 2020
Uploaded: 1-Jul-2020 13:41:53 (UTC+8)
Abstract: This thesis uses quotes for the CNY/USD foreign exchange margin product from 2015/09/14 to 2018/12/31, together with Chinese- and English-language news crawled over the same period. News sources include Juheng.com (鉅亨網), FXWeek, Currency News, Daily Beast, and Reuters. The articles are segmented into words with Python packages such as Ckiptagger and NLTK, and quantitative methods such as TF-IDF and rolling linear regression are then used to construct several language sentiment indicators. The sentiment indicators bring news-side information into the trading strategies. Multiple strategy models are built by varying the news source and the strategy, and their performance is compared using metrics other than profit alone. The empirical results show that strategies built on the Chinese sentiment indicator outperform those built on the English sentiment indicator, and that among the English-indicator strategies, using multiple news sources outperforms using a single source. Applying the sentiment indicator to the exit strategy generally reduces the maximum strategy loss, while applying it to the entry strategy generally improves the win rate.
References:
I. Chinese-language references
1. 李昱穎. (2019). Application of news sentiment analysis to the Taiwan stock market: Text-to-vector and momentum strategies [in Chinese]. Thesis, Department of Money and Banking, National Chengchi University, 1-40.
2. 黃于珊. (2017). Application of text mining to macroeconomics: The case of U.S. Federal Reserve meeting minutes [in Chinese] (Doctoral dissertation).
3. 張邑齊. (2019). A study of the efficiency of the New Taiwan dollar exchange rate market using news data quantified by sentiment analysis [in Chinese].
4. 鄭雅竹. (2013). Profitability of the Commodity Channel Index and the Williams indicator in the foreign exchange market: The case of TWD/USD [in Chinese] (Doctoral dissertation).
II. English-language references
1. Azar, P. D., & Lo, A. W. (2016). The wisdom of Twitter crowds: Predicting stock market reactions to FOMC meetings via Twitter feeds. The Journal of Portfolio Management, 42(5), 123-134.
2. Balakrishnan, V., & Lloyd-Yemoh, E. (2014). Stemming and lemmatization: a comparison of retrieval performances.
3. Chiang, J. K., & Chen, C. C. (2016, February). Sentimental analysis on Big Data–on case of financial document text mining to predict sub-index trend. In 2015 5th International Conference on Computer Sciences and Automation Engineering (ICCSAE 2015). Atlantis Press.
4. Gayo-Avello, D., Metaxas, P. T., Mustafaraj, E., Strohmaier, M., Schoen, H., Gloor, P., ... & Tarabanis, K. (2013). Understanding the predictive power of social media. Internet Research.
5. Heston, S. L., & Sinha, N. R. (2017). News vs. sentiment: Predicting stock returns from news stories. Financial Analysts Journal, 73(3), 67-83.
6. Jegadeesh, N., & Wu, D. (2013). Word power: A new approach for content analysis. Journal of Financial Economics, 110(3), 712-729.
7. Loper, E., & Bird, S. (2002). NLTK: the natural language toolkit. arXiv preprint cs/0205028.
8. Miao, D., & Lang, F. (2017, October). A recommendation system based on text mining. In 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) (pp. 318-321). IEEE.
9. Szakmary, A. C., Shen, Q., & Sharma, S. C. (2010). Trend-following trading strategies in commodity futures: A re-examination. Journal of Banking & Finance, 34(2), 409-426.
10. Yao, Y., & Huang, Z. (2016, October). Bi-directional LSTM recurrent neural network for Chinese word segmentation. In International Conference on Neural Information Processing (pp. 345-353). Springer, Cham.
Description: Master's
National Chengchi University
Department of Money and Banking
107352032
Source: http://thesis.lib.nccu.edu.tw/record/#G0107352032
Type: thesis
Other identifier: G0107352032
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/130545
Table of contents:
Chapter 1 Introduction
  Section 1 Research motivation
  Section 2 Research objectives
  Section 3 Research structure
Chapter 2 Literature review
  Section 1 Applications of word segmentation models
  Section 2 Applications of text mining in finance
  Section 3 Channel strategies and text quantification applications
Chapter 3 Research methods
  Section 1 Chinese text mining methods
  Section 2 English text mining methods
  Section 3 Text quantification workflow
Chapter 4 Empirical analysis
  Section 1 Product selection
  Section 2 Data sources
  Section 3 Experimental design
  Section 4 Experimental results
Chapter 5 Conclusions and recommendations
  Section 1 Conclusions
  Section 2 Future research and recommendations
References
Format: application/pdf, 3462264 bytes
DOI: 10.6814/NCCU202000487
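Illustrative sketch: the abstract names TF-IDF and rolling linear regression as the building blocks of the sentiment indicators, but this record does not say how the thesis combines them. The Python below is one possible arrangement, not the author's method. It assumes articles are already segmented into tokens (the thesis uses Ckiptagger and NLTK for that step), and the word lists `POSITIVE` and `NEGATIVE`, the helper names, and the toy data are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical polarity word lists; the thesis's actual lexicon is not given in this record.
POSITIVE = {"rally", "gain"}
NEGATIVE = {"slump", "loss"}


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document (tf * log(N / df))."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def daily_score(weights):
    """Assumed scoring rule: positive-term weight minus negative-term weight."""
    pos = sum(w for t, w in weights.items() if t in POSITIVE)
    neg = sum(w for t, w in weights.items() if t in NEGATIVE)
    return pos - neg


def rolling_slope(x, y, window):
    """OLS slope of y on x over each trailing window; None until the window fills."""
    out = []
    for i in range(len(x)):
        if i + 1 < window:
            out.append(None)
            continue
        xs = x[i + 1 - window:i + 1]
        ys = y[i + 1 - window:i + 1]
        mx, my = sum(xs) / window, sum(ys) / window
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        out.append(sxy / sxx if sxx else 0.0)
    return out


# Toy data: one tokenized article per day and the following day's return (made up).
docs = [["yuan", "rally", "gain"], ["yuan", "slump"],
        ["yuan", "loss", "slump"], ["yuan", "gain"]]
next_returns = [0.002, -0.001, -0.003, 0.001]

scores = [daily_score(w) for w in tfidf(docs)]
slopes = rolling_slope(scores, next_returns, window=3)
# Assumed indicator: the rolling slope applied to the same day's text score.
indicator = [None if b is None else b * s for b, s in zip(slopes, scores)]
```

Under these assumptions, a positive `indicator` value could gate an entry rule and a negative one could trigger an exit rule, which mirrors the entry and exit uses described in the abstract. The thresholds and the regression target would need to follow the thesis itself.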
