Please use this identifier to cite or link to this item: https://ah.nccu.edu.tw/handle/140.119/101249


Title: 透過時間序列的波動特徵分群協助資料分類 -以公司危機事件為例
Achieve Efficient Data Classification by Time Series Wave Decomposition Pattern Clustering: Financial Distress as an Example
Authors: 陳郁婷
Chen, Yu Ting
Contributors: 胡毓忠
Hu,Yuh Jong
陳郁婷
Chen,Yu Ting
Keywords: R
Time Series
Spark
financial distress
Date: 2016
Issue Date: 2016-09-02 01:32:04 (UTC+8)
Abstract: This study decomposes the time series of stock return factors, extracts the wave features of the trend component, and clusters companies on those features; the resulting cluster assignments are then used as features for further data classification. Time series wave features are usually used to forecast the future trend of a series; this study instead uses the trend waveform as a clustering feature to support classification. The case study covers companies in financial distress: the goal is to distinguish substantive from non-substantive financial distress while integrating the companies' other financial and non-financial analysis factors. Trend waveforms are extracted and clustered with time series decomposition tools in R. Spark processes the data in parallel in standalone cluster mode, relying on node scalability, cluster fault tolerance, and Resilient Distributed Datasets (RDD) to improve computing performance. A random forest ensemble learning algorithm is used in experiments on a system that classifies and predicts the type of company distress.
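The pipeline described in the abstract (extract a trend, cluster the trend shapes, reuse the cluster label as a classification feature) can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which uses R decomposition tools and Spark: here a centered moving average stands in for the decomposition step, a plain k-means stands in for the clustering step, the return series are made-up numbers, and the final random forest classifier is omitted.

```python
from statistics import mean

def moving_average_trend(series, window=3):
    """Centered moving average: a simple stand-in for a decomposition's trend component."""
    half = window // 2
    return [mean(series[max(0, i - half): i + half + 1]) for i in range(len(series))]

def shape_only(trend):
    """Shift the trend to start at zero so clustering compares shape, not level."""
    return [value - trend[0] for value in trend]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids to keep the run deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [mean(column) for column in zip(*members)]
    return labels

# Hypothetical return series for four companies (illustrative values only).
returns = {
    "A": [0.01, 0.02, 0.03, 0.05, 0.06, 0.08],
    "B": [0.05, 0.03, 0.02, 0.00, -0.02, -0.04],
    "C": [0.00, 0.02, 0.03, 0.04, 0.07, 0.09],
    "D": [0.04, 0.03, 0.01, -0.01, -0.02, -0.05],
}

names = list(returns)
trends = [shape_only(moving_average_trend(returns[name])) for name in names]
labels = kmeans(trends, k=2)

# The cluster label becomes one more column in each company's feature row,
# alongside whatever financial and non-financial factors a classifier would use.
cluster_feature = dict(zip(names, labels))
```

In this toy data the two rising series share one cluster and the two falling series share the other, so `cluster_feature` carries the trend shape into the classification step as a single categorical value.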
Description: Master's thesis
National Chengchi University
In-service Master's Program, Department of Computer Science
98971006
Source URI: http://thesis.lib.nccu.edu.tw/record/#G0098971006
Data Type: thesis
Appears in Collections: [In-service Master's Program, Department of Computer Science] Theses

Files in This Item:

File: 100601.pdf
Size: 3080Kb
Format: Adobe PDF


All items in 學術集成 are protected by copyright, with all rights reserved.

