Title: 於系統日誌使用語言模型的異常分析
Anomaly Detection on System Log with Language Modeling
Author: 曾志中 (Tseng, Chih-Chung)
Contributors: 蕭舜文 (Hsiao, Shun-Wen)
曾志中 (Tseng, Chih-Chung)
Keywords: log data analysis
anomaly detection
deep learning
Date: 2022
Upload time: 2-Sep-2022 14:48:21 (UTC+8)
Abstract:
System logs are ubiquitous in software applications and help operators manage their services. Misbehavior and bugs in a system can cause vulnerabilities and put services in danger, so operators adopt anomaly detection to discover anomalous events in system logs promptly. With the development of deep learning models in Natural Language Processing (NLP), recent research uses language representation models to take the semantics behind the logs into account, which strengthens the adaptability of an anomaly detection model to log events whose formats change. We propose BERT-based One-class classification with an explicit Reconstruction Gate (BORG) to learn the benign session behavior of system logs at different levels. Our method integrates the anomaly detection objective with language representation and computes a composite malicious score in the detection phase to reflect subtle abnormality in sequences of seemingly trivial events. We evaluate the method on two log data sets with contrasting statistical properties. The results show that the method is robust on challenging log data, and we present experiments and analysis to explain these outcomes.
Description: Master's thesis
National Chengchi University
Department of Management Information Systems
109356019
Source: http://thesis.lib.nccu.edu.tw/record/#G0109356019
Type: thesis
Advisor: 蕭舜文 (Hsiao, Shun-Wen)
Other identifier: G0109356019
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/141558
DOI: 10.6814/NCCU202201200
Format: application/pdf, 1010399 bytes
Table of contents:
Abstract
摘要 (Chinese abstract)
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
2.1 Background: Natural Language Processing
2.2 Related Work: Log Analysis
2.3 Comparison
3 Design of Our Method
3.1 Overview
3.2 Domain Preprocessing
3.3 Event Reconstruction Model
3.4 BERT-based One-class Classification Model
3.5 Composite Malicious Score
4 Experiment
4.1 Data set
4.1.1 Hadoop Distributed File System
4.1.2 BlueGene/L Supercomputer System
4.2 Implementation
4.3 Model Selection
4.4 Evaluation
4.4.1 Performance Comparison
4.4.2 Performance Analysis
5 Discussion
6 Conclusion
Reference
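The abstract describes a composite malicious score that combines a one-class detection objective with event reconstruction, but this record does not give the formula. The following is only a minimal sketch of how such a score could be formed, under stated assumptions: the one-class signal is taken to be the squared distance of a session embedding from the mean of benign embeddings (an SVDD-style choice), the reconstruction signal is the mean negative log-likelihood of the true events, and the two are mixed with a weight `alpha`. All function names, the weight, and the toy numbers are hypothetical and are not taken from the thesis.

```python
import math
from typing import List, Sequence


def center_of(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Mean of benign session embeddings (assumed one-class center)."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def one_class_distance(embedding: Sequence[float], center: Sequence[float]) -> float:
    """Squared Euclidean distance from a session embedding to the center."""
    return sum((x - c) ** 2 for x, c in zip(embedding, center))


def reconstruction_loss(event_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood the model assigned to the true events."""
    return -sum(math.log(p) for p in event_probs) / len(event_probs)


def composite_score(
    embedding: Sequence[float],
    center: Sequence[float],
    event_probs: Sequence[float],
    alpha: float = 0.5,  # assumed mixing weight, not from the thesis
) -> float:
    """Weighted sum of the one-class and reconstruction signals."""
    distance = one_class_distance(embedding, center)
    loss = reconstruction_loss(event_probs)
    return alpha * distance + (1.0 - alpha) * loss


# Toy data: three benign sessions define the center.
center = center_of([[0.0, 0.1], [0.2, -0.1], [0.1, 0.0]])

# A session close to the center whose events were all well predicted.
normal_score = composite_score([0.1, 0.0], center, [0.9, 0.8, 0.95])

# A session far from the center with one poorly predicted event.
odd_score = composite_score([1.5, 1.2], center, [0.9, 0.05, 0.95])
```

In this sketch a session would be flagged when its score exceeds a threshold chosen on held-out benign data; how the thesis sets its threshold, and how its reconstruction gate enters the score, is not stated in this record.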