Please use this identifier to cite or link to this item: https://ah.nccu.edu.tw/handle/140.119/89054


Title: EPSO-GHSOM股票巨量資料選擇交易策略
EPSO-GHSOM Stock Selecting and Trading Strategy with Big Data Analytics
Authors: 陳婷妤
Chen, Ting Yu
Contributors: 劉文卿
Liu, Wenqing
陳婷妤
Chen, Ting Yu
Keywords: 粒子群最佳化演算法
增長層級式自我組織映射圖演算法
網路探勘暨情緒分析
股票策略
巨量資料分析
Particle Swarm Optimization
Growing Hierarchical Self-Organizing Map
Web Mining with Sentiment Analysis
Stock Strategy
Big Data Analytics
Date: 2016
Issue Date: 2016-05-02 13:49:44 (UTC+8)
Abstract: 巨量資料分析(Big Data Analytics)是以資料觀點來進行分析研究、探討問題,找出未發現的知識與態樣。巨量資料分析包括三個部份,分别為資料存取運算、資料隱私與領域知識、巨量資料探勘。資料存取運算部份為處理巨量資料與分析的分散式平台與技術(technology),資料隱私及領域知識部份為解決問題的專業領域知識(domain);巨量資料探勘部份則為分析巨量資料所引用的方法(technique)等。巨量資料分析的特點,在平台架構部份,使用分散式運算處理儲存架構,鬆綁了電腦資源的限制;而在分析部份,巨量的歷史執行資料常常蘊含著大量有價值的潛存資訊和知識,讓資料說話之新思維的解決問題方法,能夠忠實客觀呈現問題或事實真象,對於問題獲得解決及知識發現將帶來重要的助益。
因此,本研究提出一個基於巨量資料分析的觀點為主軸之創新的EPSO-GHSOM股票選擇交易策略。在資料存取運算部份,本研究採Hadoop分散式運算架構、HBase分散儲存資料庫、Elastic Search技術,以及撰寫資料分析應用程式,建構股票選擇交易策略分散式運算平台;在資料隱私及領域知識部份,則以基本分析之價值投資理論、變動天數移動平均線技術指標與其黃金死亡交叉決策準則,作為鑑別股票優劣與買賣交易點的領域知識探討核心;分析資料來源部份,以股票交易資料、公司財務報告重要資訊、網頁財經新聞訊息等各類來源資料作為分析標的;在巨量資料探勘部份,本研究提出改良PSO演算法之EPSO(Elite Particle Swarm Optimization)最佳化演算法,以及資料驅動點概念,並運用增長層級式自我組織映射圖(Growing Hierarchical Self-Organizing Map,簡稱GHSOM)演算法及網路探勘暨情緒分析(Web Mining with Sentiment Analysis)等方法,處理結構化與非結構化資料,作為資料探勘與知識發掘的分析核心,建構由來源資料自動探索並決定的股票選擇交易策略模型,從中進行知識挖掘,透過資料的角度發現股票選擇交易策略態樣、準則存在,提供以資料觀點的新方法給予投資人進行股票選擇交易決策建議。更明確地說,整體EPSO-GHSOM股票選擇交易策略,先經由股票選擇策略後挑選出優質的股票,再依股票交易策略決定最佳的買賣點提供給投資人進行決策。
本研究依所提方法進行實證結果發現,(1)在投資報酬表現部份,績效優於長期持有交易策略、MMPSO策略、KennedyPSO策略。(2)在股票選擇策略部份發現,屬量分析以稅後淨利、股東權益報酬率、每股盈餘等指標鑑別公司經營獲利能力最強,而且從5年財務指標趨勢發現有優質成長明星股(straight up)、經營不善之地雷股(straight down)及混合型股(U- or W-shaped)等三種型態,由於混合型股仍包含獲利能力好及表現不佳之公司,針對混合型股分群結果特徵分析後建立filter過濾股票規則,有效提升股票鑑別力。而在屬質之網路財經新聞資料情緒分析部份,發現實驗來源媒體報導多以每股盈餘、稅後淨利等財務指標作為評價依據,與屬量分析的分析結果相依性高,並考慮網路財經新聞資料情緒積分高低,篩選出情緒積分表現高之股票,更強化股票選擇策略能力。(3)在股票交易策略部份發現,多數的股票不符合移動平均線SMA技術指標的黃金死亡交叉決策準則,而且實驗所得最佳決策之SMA天數型態多屬中長期天數。另外,ROI績效受買賣高低價的影響,當價差愈大,ROI獲利或損失則愈大。(4)本研究與其他研究比較,測試期間ROI獲利表現較於其他方法相對偏高,整體策略考量涵蓋層面較其他方法廣泛。
由於Big Data Analytics屬於近年來新興發展學科,實際研究案例尚少,本研究所提出之整體研究方法、系統架構與建置步驟,除能作為股票選擇交易決策的參考外,並可套用至其他巨量資料分析研究案例進行建構,及以本研究為基礎發展理財智慧代理人等輔助決策模型。 
Big data analytics analyzes and examines problems from a data perspective in order to identify previously unknown knowledge and patterns. It comprises three parts: data accessing and computing, data privacy and domain knowledge, and big data mining algorithms. Data accessing and computing refers to the distributed platforms and technologies that process and analyze big data, which can be further divided into real-time and batch-processing frameworks. Data privacy and domain knowledge refers to the specialized domain knowledge required to solve a problem. Big data mining algorithms refers to the techniques used to analyze big data. On the platform side, distributed processing and storage architectures relax the constraints of computing resources. On the analysis side, large volumes of historical data often contain a great deal of valuable latent information and knowledge. The new problem-solving mindset of letting the data speak presents problems and facts faithfully and objectively, which greatly helps in solving problems and discovering knowledge.
Hence, this study proposes an innovative elite particle swarm optimization (EPSO)–growing hierarchical self-organizing map (GHSOM) stock selection and trading strategy based on big data analytics. For data accessing and computing, this study used the Hadoop distributed computing framework, the HBase distributed storage database, Elasticsearch, and purpose-written data analysis applications to construct a distributed computing platform for the stock selection and trading strategy. For data privacy and domain knowledge, the value investing theory of fundamental analysis, the variable-length moving average (VLMA) technical indicator, and its golden cross and death cross decision rules served as the core domain knowledge for distinguishing good stocks from poor ones and for identifying buy and sell points. Stock trading data, key information from company financial reports, and financial news from websites were collected as the data sources for the analysis. For big data mining, this study proposes EPSO, an improved particle swarm optimization (PSO) algorithm, together with the concept of data-driven points, and applies GHSOM and web mining with sentiment analysis to process structured and unstructured data. These methods form the core of the data mining and knowledge discovery analysis and are used to build a stock selection and trading strategy model that is explored and determined automatically from the source data. The model uncovers knowledge from the data and identifies stock selection and trading patterns and rules from a data perspective, giving investors a new, data-driven method for making stock selection and trading decisions. Specifically, the EPSO–GHSOM strategy first selects high-quality stocks using the stock selection strategy, and the stock trading strategy then determines the optimal buy and sell points for investors to act on.
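As a point of reference for the optimization step described above, the following is a minimal sketch of a plain global-best PSO in one dimension. It is not the EPSO variant proposed in the thesis, whose elite mechanism is not specified in this record. The function name `pso_minimize`, the parameter defaults, and the toy objective are all illustrative assumptions; in a trading setting the objective could, for example, be the negative ROI obtained for a given moving-average window.

```python
import random
from typing import Callable, Tuple


def pso_minimize(
    objective: Callable[[float], float],
    bounds: Tuple[float, float],
    n_particles: int = 20,
    n_iters: int = 100,
    w: float = 0.7,    # inertia weight (assumed value)
    c1: float = 1.5,   # cognitive coefficient (assumed value)
    c2: float = 1.5,   # social coefficient (assumed value)
    seed: int = 0,
) -> Tuple[float, float]:
    """Plain global-best PSO for a 1-D objective; returns (best_x, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]                                # personal best positions
    pbest_val = [objective(x) for x in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]     # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity blends inertia, pull toward own best, pull toward swarm best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            # Move and clamp to the search bounds.
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i], val
    return gbest, gbest_val
```

Calling `pso_minimize(lambda x: (x - 3.0) ** 2, (-10.0, 10.0))` searches for the minimum of a simple quadratic whose true minimizer is 3.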
Applying the proposed method, we found the following. (1) In terms of investment return, the strategy outperforms the buy-and-hold, MMPSO, and Kennedy PSO strategies. (2) In terms of the stock selection strategy, quantitative analysis showed that net profit after tax, return on equity (ROE), and earnings per share (EPS) are the indicators that best distinguish company profitability. Five-year trends in the financial indicators fall into three patterns: high-quality growth stocks (straight up), poorly managed stocks (straight down), and mixed stocks (U- or W-shaped). Because the mixed pattern contains both profitable and poorly performing companies, we analyzed the characteristics of the clusters within it and built filter rules, which effectively improved the ability to distinguish among stocks. Sentiment analysis of the qualitative online financial news showed that the media sources in the experiment mostly evaluate companies using financial indicators such as EPS and net profit after tax, which agrees closely with the quantitative results; selecting stocks with high news sentiment scores further strengthens the stock selection strategy. (3) In terms of the stock trading strategy, most stocks did not follow the golden cross and death cross decision rules of the simple moving average (SMA), and the optimal SMA window lengths found in the experiments were mostly mid-term to long-term. In addition, ROI is affected by the buy and sell prices: the larger the price spread, the larger the ROI gain or loss. (4) Compared with other studies, the ROI over the test period was relatively high, and the overall strategy covers a wider range of considerations than other methods.
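The golden cross and death cross rules and the ROI measure mentioned in finding (3) can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: the window lengths are fixed inputs rather than values chosen by EPSO, ROI ignores transaction costs, and the names `sma`, `crossover_signals`, and `roi` are hypothetical.

```python
from typing import List, Optional, Tuple


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(
    prices: List[float], short_window: int, long_window: int
) -> List[Tuple[int, str]]:
    """Return (index, 'buy' | 'sell') pairs.

    'buy'  = golden cross: short SMA rises above the long SMA.
    'sell' = death cross: short SMA falls below the long SMA.
    """
    short = sma(prices, short_window)
    long_ = sma(prices, long_window)
    signals: List[Tuple[int, str]] = []
    for i in range(1, len(prices)):
        values = (short[i - 1], long_[i - 1], short[i], long_[i])
        if any(v is None for v in values):
            continue  # both averages must exist on both days
        prev_diff = short[i - 1] - long_[i - 1]
        diff = short[i] - long_[i]
        if prev_diff <= 0 and diff > 0:
            signals.append((i, "buy"))
        elif prev_diff >= 0 and diff < 0:
            signals.append((i, "sell"))
    return signals


def roi(buy_price: float, sell_price: float) -> float:
    """Return on investment for one round trip, ignoring fees (an assumption)."""
    if buy_price <= 0:
        raise ValueError("buy_price must be positive")
    return (sell_price - buy_price) / buy_price
```

With a buy price of 100, `roi(100, 130)` gives a gain of 0.3 and `roi(100, 70)` gives a loss of 0.3, which illustrates the observation that a wider spread between buy and sell prices magnifies the ROI in either direction.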
Big data analytics is an emerging discipline, and practical case studies are still scarce. In addition to serving as a reference for stock selection and trading decisions, the research method, system architecture, and construction steps proposed in this study can be applied to other big data analytics cases. The results can also form the basis for decision-support models such as an intelligent financial management agent.
Description: Doctoral
National Chengchi University
Department of Management Information Systems
96356509
Source URI: http://thesis.lib.nccu.edu.tw/record/#G0096356509
Data Type: thesis
Appears in Collections: [Department of Management Information Systems] Theses

Files in This Item:

File        Size    Format
650901.pdf  3473Kb  Adobe PDF


All items in 學術集成 are protected by copyright, with all rights reserved.

