Title Anomaly Detection on System Log with Language Modeling
於系統日誌使用語言模型的異常分析
Author Tseng, Chih-Chung (曾志中)
Contributors Hsiao, Shun-Wen (蕭舜文)
Tseng, Chih-Chung (曾志中)
Keywords log data analysis
anomaly detection
deep learning
Date 2022
Upload time 2-Sep-2022 14:48:21 (UTC+8)
Abstract System logs are found throughout software applications, where they help operators manage service quality. Misbehavior and bugs in a system can introduce vulnerabilities and put services at risk, so operators adopt anomaly detection to discover unusual events in system logs in a timely manner. With the development of deep learning models in Natural Language Processing (NLP), recent research uses language representation models so that the detector can take the semantics behind the log into account. This approach makes an anomaly detection model more adaptable to log events whose formats keep changing. We propose the BERT-based One-class classification with an explicit Reconstruction Gate (BORG), which learns the benign session behavior of system logs at different levels. Our method integrates the anomaly detection objective with language representation, and in the detection phase it uses a composite malicious score to reflect subtle abnormality within a sequence of events. We evaluate the approach on two log data sets with contrasting statistical properties. The results show that the method is robust on challenging log data, and we present experiments and statistics from sequence analysis to explain the outcomes.
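This record does not give the formula behind the composite malicious score, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes the score is a weighted sum of a one-class term (squared distance of a session embedding to a learned center, in the style of Deep SVDD) and a reconstruction term (mean negative log-likelihood of masked log tokens). The function names, the weight `alpha`, and the input values are all hypothetical.

```python
import math
from typing import Sequence


def one_class_distance(embedding: Sequence[float], center: Sequence[float]) -> float:
    """Squared Euclidean distance from a session embedding to the center of
    the benign class (Deep SVDD-style; assumed, not taken from the thesis)."""
    return sum((e - c) ** 2 for e, c in zip(embedding, center))


def reconstruction_error(token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood the model assigned to the true masked tokens.
    Probabilities are clamped to avoid log(0)."""
    return -sum(math.log(max(p, 1e-12)) for p in token_probs) / len(token_probs)


def composite_score(
    embedding: Sequence[float],
    center: Sequence[float],
    token_probs: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight between the two terms
) -> float:
    """Weighted sum of the session-level and event-level anomaly terms."""
    return (alpha * one_class_distance(embedding, center)
            + (1.0 - alpha) * reconstruction_error(token_probs))


# A session sitting on the center with perfectly reconstructed tokens scores 0;
# a session that drifts away and reconstructs poorly scores higher.
benign = composite_score([0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
suspicious = composite_score([1.0, 0.0], [0.0, 0.0], [0.5, 0.5])
```

In such a scheme a session would be flagged when its score exceeds a threshold chosen on held-out benign data; how the thesis sets that threshold is not stated in this record.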
References K. Yamanishi and Y. Maruyama, “Dynamic syslog mining for network failure monitoring,” in Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, 2005, pp. 499–508.
W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, “Detecting large-scale system problems by mining console logs,” in Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, 2009, pp. 117–132.
J.-G. Lou, Q. Fu, S. Yang, Y. Xu, and J. Li, “Mining invariants from console logs for system problem detection,” in 2010 USENIX Annual Technical Conference (USENIX ATC 10), 2010.
K. Zhang, J. Xu, M. R. Min, G. Jiang, K. Pelechrinis, and H. Zhang, “Automated it system failure prediction: A deep learning approach,” in 2016 IEEE International Conference on Big Data (Big Data). IEEE, 2016, pp. 1291–1300.
M. Du, F. Li, G. Zheng, and V. Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, 2017, pp. 1285–1298.
H. Guo, S. Yuan, and X. Wu, “Logbert: Log anomaly detection via bert,” in 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021, pp. 1–8.
Z. Wang, Z. Chen, J. Ni, H. Liu, H. Chen, and J. Tang, “Multi-scale one-class recurrent neural networks for discrete event sequence anomaly detection,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 3726–3734.
W. Meng, Y. Liu, Y. Zhu, S. Zhang, D. Pei, Y. Liu, Y. Chen, R. Zhang, S. Tao, P. Sun et al., “Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs,” in IJCAI, vol. 19, no. 7, 2019, pp. 4739–4745.
X. Zhang, Y. Xu, Q. Lin, B. Qiao, H. Zhang, Y. Dang, C. Xie, X. Yang, Q. Cheng, Z. Li et al., “Robust log-based anomaly detection on unstable log data,” in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019, pp. 807–817.
S. Huang, Y. Liu, C. Fung, R. He, Y. Zhao, H. Yang, and Z. Luan, “Hitanomaly: Hierarchical transformers for anomaly detection in system log,” IEEE Transactions on Network and Service Management, vol. 17, no. 4, pp. 2064–2076, 2020.
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.
A. R. Tuor, R. Baerwolf, N. Knowles, B. Hutchinson, N. Nichols, and R. Jasper, “Recurrent neural network language models for open vocabulary event-level cyber anomaly detection,” in Workshops at the thirty-second AAAI conference on artificial intelligence, 2018.
A. Brown, A. Tuor, B. Hutchinson, and N. Nichols, “Recurrent neural network attention mechanisms for interpretable system log anomaly detection,” in Proceedings of the First Workshop on Machine Learning for Computing Systems, 2018, pp. 1–8.
S. Nedelkoski, J. Bogatinovski, A. Acker, J. Cardoso, and O. Kao, “Self-attentive classification-based anomaly detection in unstructured logs,” in 2020 IEEE International Conference on Data Mining (ICDM). IEEE, 2020, pp. 1196–1201.
Y. Lee, J. Kim, and P. Kang, “Lanobert: System log anomaly detection based on bert masked language model,” arXiv preprint arXiv:2111.09564, 2021.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017.
W. L. Taylor, “‘Cloze procedure’: A new tool for measuring readability,” Journalism quarterly, vol. 30, no. 4, pp. 415–433, 1953.
L. Ruff, R. Vandermeulen, N. Goernitz, L. Deecke, S. A. Siddiqui, A. Binder, E. Müller, and M. Kloft, “Deep one-class classification,” in International conference on machine learning. PMLR, 2018, pp. 4393–4402.
Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey et al., “Google’s neural machine translation system: Bridging the gap between human and machine translation,” arXiv preprint arXiv:1609.08144, 2016.
R. Collobert and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in Proceedings of the 25th international conference on Machine learning, 2008, pp. 160–167.
T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, and T. Mikolov, “FastText.zip: Compressing text classification models,” arXiv preprint arXiv:1612.03651, 2016.
S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
K. Cho, B. Van Merriënboer, D. Bahdanau, and Y. Bengio, “On the properties of neural machine translation: Encoder-decoder approaches,” arXiv preprint arXiv:1409.1259, 2014.
M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). New Orleans, Louisiana: Association for Computational Linguistics, Jun. 2018, pp. 2227–2237. [Online]. Available: https://aclanthology.org/N18-1202
M. Du and F. Li, “Spell: Streaming parsing of system event logs,” in 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 2016, pp. 859–864.
P. He, J. Zhu, Z. Zheng, and M. R. Lyu, “Drain: An online log parsing approach with fixed depth tree,” in 2017 IEEE international conference on web services (ICWS). IEEE, 2017, pp. 33–40.
E. Loper and S. Bird, “Nltk: The natural language toolkit,” arXiv preprint cs/0205028, 2002.
R. Vaarandi, “A data clustering algorithm for mining patterns from event logs,” in Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003) (IEEE Cat. No. 03EX764). IEEE, 2003, pp. 119–126.
D. M. Tax and R. P. Duin, “Support vector data description,” Machine learning, vol. 54, no. 1, pp. 45–66, 2004.
P. Lippe, “Tutorial 5: Transformers and multi-head attention,” Apr 2022. [Online]. Available: https://pytorch-lightning.readthedocs.io/en/latest/notebooks/course_UvA-DL/05-transformers-and-MH-attention.html
I. Turc, M. Chang, K. Lee, and K. Toutanova, “Well-read students learn better: The impact of student initialization on knowledge distillation,” CoRR, vol. abs/1908.08962, 2019. [Online]. Available: http://arxiv.org/abs/1908.08962
S. He, J. Zhu, P. He, and M. R. Lyu, “Loghub: a large collection of system log datasets towards automated log analytics,” arXiv preprint arXiv:2008.06448, 2020.
A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in 37th annual IEEE/IFIP international conference on dependable systems and networks (DSN’07). IEEE, 2007, pp. 575–584.
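Description Master's (碩士)
National Chengchi University (國立政治大學)
Department of Management Information Systems (資訊管理學系)
109356019
Source http://thesis.lib.nccu.edu.tw/record/#G0109356019
Type thesis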
dc.contributor.advisor 蕭舜文 (zh_TW)
dc.contributor.advisor Hsiao, Shun-Wen (en_US)
dc.contributor.author (Authors) 曾志中 (zh_TW)
dc.contributor.author (Authors) Tseng, Chih-Chung (en_US)
dc.creator (Author) 曾志中 (zh_TW)
dc.creator (Author) Tseng, Chih-Chung (en_US)
dc.date (Date) 2022 (en_US)
dc.date.accessioned 2-Sep-2022 14:48:21 (UTC+8)
dc.date.available 2-Sep-2022 14:48:21 (UTC+8)
dc.date.issued (Upload time) 2-Sep-2022 14:48:21 (UTC+8)
dc.identifier (Other Identifiers) G0109356019 (en_US)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/141558
dc.description (Description) Master's (碩士, zh_TW)
dc.description (Description) National Chengchi University (國立政治大學, zh_TW)
dc.description (Description) Department of Management Information Systems (資訊管理學系, zh_TW)
dc.description (Description) 109356019 (zh_TW)
dc.description.tableofcontents Abstract
Abstract (Chinese)
Contents
List of Figures
List of Tables
1 Introduction
2 Related Work
2.1 Background: Natural Language Processing
2.2 Related Work: Log Analysis
2.3 Comparison
3 Design of Our Method
3.1 Overview
3.2 Domain Preprocessing
3.3 Event Reconstruction Model
3.4 BERT-based One-class Classification Model
3.5 Composite Malicious Score
4 Experiment
4.1 Data set
4.1.1 Hadoop Distributed File System
4.1.2 BlueGene/L Supercomputer System
4.2 Implementation
4.3 Model Selection
4.4 Evaluation
4.4.1 Performance Comparison
4.4.2 Performance Analysis
5 Discussion
6 Conclusion
Reference
dc.format.extent 1010399 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0109356019 (en_US)
dc.subject (Keywords) system log analysis (系統日誌分析, zh_TW)
dc.subject (Keywords) anomaly analysis (異常分析, zh_TW)
dc.subject (Keywords) deep learning (深度學習, zh_TW)
dc.subject (Keywords) log data analysis (en_US)
dc.subject (Keywords) anomaly detection (en_US)
dc.subject (Keywords) deep learning (en_US)
dc.title (Title) 於系統日誌使用語言模型的異常分析 (zh_TW)
dc.title (Title) Anomaly Detection on System Log with Language Modeling (en_US)
dc.type (Type) thesis (en_US)
dc.identifier.doi (DOI) 10.6814/NCCU202201200 (en_US)