Applying Reinforcement Learning to Stock Investment–Taiwan Stock Market as an Example (應用強化學習於股票的投資選擇-以台灣股市為例)

彭志偉; Phang, Chee-Wai

Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/136570

Title:	應用強化學習於股票的投資選擇-以台灣股市為例; Applying Reinforcement Learning to Stock Investment–Taiwan Stock Market as an Example
Author:	彭志偉 (Phang, Chee-Wai)
Contributors:	蕭明福; 蔡瑞煌; 彭志偉 (Phang, Chee-Wai)
Keywords:	Finance; Stock Market; Machine Learning; Reinforcement Learning; Neural Networks; Stock Selection
Date:	2021
Upload date:	4-Aug-2021
Abstract:	Reinforcement learning is an indispensable discipline in many fields. In finance, its practical applications already include credit lending and default assessment, risk control, AI customer service, and stock market forecasting, while financial technology uses mathematical models to solve problems in the financial environment. This research applies the learning framework of reinforcement learning algorithms to the Taiwan stock market: it designs a learning environment for stock investment and simulates an investor in that environment through a series of experiments that tune the algorithm's hyperparameters. The agent's ultimate goal is to maximize investment returns while keeping investment risk under control. The training targets are the 15 largest stocks by total market capitalization on the Taiwan stock market that have been listed for 21 years. Historical stock data from 2000 to 2016 serve as the training data set, and data from 2017 to 2021 serve as the test data set. Finally, the research evaluates the experimental results and compares their investment returns with those of other investment strategies.
During simulated training in this environment, the intelligent agent trained within the reinforcement learning framework captured, to a certain extent, the movement of stock prices in the market and achieved effective self-improvement through training; the experimental test results presented later in the thesis describe this in detail. The results show that some of the experiments (experiments two, five, and ten) outperformed both the weighted stock price index and a random-allocation investment strategy. After hyperparameter tuning, experiment two remained the best choice. The test results also indicate that, during training, the agent effectively learned to earn investment profits while controlling investment risk.
Description:	Master's thesis, National Chengchi University, Department of Economics, 108258044
Source:	http://thesis.lib.nccu.edu.tw/record/#G0108258044
Type:	thesis
Appears in Collections:	Theses (學位論文)

Files in This Item:

File	Description	Size	Format
804401.pdf		2.72 MB	Adobe PDF
