Publications-Theses
Article View/Open
Publication Export
-
題名 物聯網惡意軟體動態分析監控系統與其家族行為分析
IoT Malware Dynamic Analysis Profiling System and Family Behavior Analysis作者 陳呈祐
Chen, Cheng-Yu貢獻者 蕭舜文
Hsiao, Shun-Wen
陳呈祐
Chen, Cheng-Yu關鍵詞 物聯網惡意程式
虛擬機器內省
順序資料
QEMU
動態分析
圖形分析
馬可夫模型
IoT malware
Virtual Machine Introspection
Sequential Data
QEMU
Dynamic Analysis
Graph Analysis
Markov Model日期 2020 上傳時間 4-Aug-2021 14:46:07 (UTC+8) 摘要 最近不只物聯網設備的數量遽增,連帶物聯網惡意程式也大量出現。本研究希望了解物聯網惡意程式所帶來的威脅但現今缺乏方法來觀測、分析與偵測物聯網惡意程式。因此,我們設計了一個自動化的虛擬監控系統來蒐集物聯網惡意程式的行為,例如:API call invocation, system call execution等。除了傳統的監控方式 (Strace與封包側錄) 外,本研究提出一個監控系統使用虛擬機內省機制的C library hooking技術來擷取物聯網惡意程式所呼叫的C library call以避免遭到物聯網惡意程式的偵測。在所蒐集到的物聯網惡意程式行為中,本研究發現不只在各個惡意程式間有相似,在同一個惡意程式家族中也存有變異。因此,本研究認為在物聯網惡意程式中有家族並且物聯網惡意程式家族中也含有子家族。本研究提出一個家族行為分析系統透過馬可夫模型與Doc2Vec來分析物聯網惡意程式的順序資料並萃取向量化特徵、尋找子家族與子家族代表之圖形。
Not only the number of deployed IoT devices increases but also that of IoT malware. We are eager to understand the threat made by IoT malware, but we lack the tools to observe, analyze and detect them. Therefore, we design and implement an automatic, virtual machine-based profiling system to collect valuable IoT malware behavior, such as API call invocation, system call execution, etc. In addition to conventional profiling methods (e.g., Strace and packet capture), we proposed a profiling system that adapts virtual machine introspection based C library hooking technique to intercept C library call invocation by malware so that our introspection would not be detected by IoT malware. In the profiles we collected, we observe not only similarities between profiles but also variants in IoT family malware. Therefore, we anticipate that there are families in IoT malware and subfamily in the IoT malware family. We then propose a family behavior analysis system to analyze the multiple sequential data (C library calls) by the Markov model and Doc2Vec to extract vectorized malware features, discover subfamily and generate subfamily representative behavior graph.參考文獻 [1] B. Anderson, D. Quist, J. Neil, C. Storlie, and T. Lane, “Graph-based malware detection using dynamic analysis”, Journal in computer Virology, vol. 7, no. 4, pp. 247–258, 2011[2] X. Jiang, X. Wang, and D. Xu, “Stealthy malware detection through vmm-based out-of-the-box semantic view reconstruction”, in Proceedings of the 2007 ACM Conference on Computer and Communications Security, 2007, pp. 128-138.[3] A. Grover, and J. Leskovec, “node2vec: Scalable Feature Learning for Networks”, Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp.885-864.[4] A. Costin, and J. Zaddach, “IoT Malware: Comprehensive Survey, Analysis Framework and Case Studies”, BlackHat USA[5] M. Hossain, J. Wang, R. Sekar, and S. Stoller, “Dependence-Preserving Data Compaction for Scalable Forensic Analysis”, 27th USENIX Security Symposium (USENIX Security 18),2018, pp.1723-1740.[6] G. Davis, “2020: Life with 50 billion connected devices”, 2018 IEEE International Conference on Consumer Electronics, 2018, pp.1–1.[7] B. Vignau, R. Khoury, and S. Hallé, “10 Years of IoT Malware: a Feature-Based Taxonomy”, 2019 IEEE 19th International Conference on Software Quality, Reliability and SecurityCompanion, 2019, pp.458-465.[8] S. W. Hsiao, Y. S. Sun, and M. S. Chen, “Hardware-Assisted MMU Redirection for In-Guest Monitoring and API Profiling”, IEEE Transactions on Information Forensics and Security, 2020, pp.2402-2416.[9] Y. Hebbal, S. Laniepce, and J. M. Menaud, “Virtual machine introspection: Techniques and applications”, 2015 10th international conference on availability, reliability and security,2015, pp.676-685.[10] Q. Le, and T. Mikolov, “Distributed representations of sentences and documents”, International conference on machine learning, 2014, pp.1188-1196.[11] Y. Yang, L. Wu, G. Yin, and L. Li, “A survey on security and privacy issues in Internet-ofThings”, IEEE Internet of Things Journal, 2017, pp.1250-1258.[12] T. Garfinkel, and M. Rosenblum, “A Virtual Machine Introspection Based Architecture for Intrusion Detection”, Ndss, 2003, pp.191-206.[13] W. L. Hamilton, R. Ying, and J. Leskovec, “Representation learning on graphs: Methods and applications”, arXiv preprint, 2017, arXiv:1709.05584.[14] P. Goyal, and E. Ferrara, “Graph embedding techniques, applications, and performance: A survey”, Knowledge-Based Systems, 2018, pp.78-94.[15] B. PEROZZI, R. AL-RFOU, and S. SKIENA, “Deepwalk: Online learning of socialrepresentations”, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp.701-710.[16] D. Zhang, J. Yin, X. Zhu, and C. Zhang, “Network representation learning: A survey”, IEEE transactions on Big Data, 2018[17] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space”, arXiv preprint, 2013, arXiv:1301.3781.[18] S. W. Hsiao, Y. S. Sun, and M. C. Chen, “Virtual machine introspection based malware behavior profiling and family grouping”, arXiv preprint, 2017, arXiv:1705.01697.[19] S. Vogl, and C. Eckert, “Using hardware performance events for instruction-level monitoring on the X86 architecture”, Proc. Eur. Workshop Syst. Secur. (EuroSec), 2012, pp. 1–6.[20] A. Dinaburg, P. Royal, M. Sharif, and W. Lee, “Ether: Malware analysis via hardware virtualization extensions”, Proc. ACM Conf. Comput. Commun. Secur., 2008, pp. 51–62.[21] J. Pfoh, C. Schneider, and C. Eckert, “Nitro: Hardware-based system call tracing for virtual machines”, Advances in Information and Computer Security (Lecture Notes in Computer Science), 2011, pp. 96–112.[22] M. I. Sharif, W. Lee, W. Cui, and A. Lanzi, “Secure in-VM monitoring using hardware virtualization”, Proc. ACM Conf. Comput. Commun. Secur., 2009, pp. 477–487.[23] C. Willems, R. Hund, and T. Holz, “CXPInspector: Hypervisor-based, hardware-assisted system monitoring”, Ruhr-Univ. Bochum, Bochum, Germany, Tech. Rep. TR-HGI-2012-002, Nov. 26, 2012.[24] B. D. Payne, M. Carbone, M. Sharif, and W. Lee, “Lares: An architecture for secure active monitoring using virtualization”, Proc. IEEE Symp. Secur. Privacy, 2008, pp. 233–247.[25] Y. Fu and Z. Lin, “Exterior: Using a dual-VM based external shell for guest-OS introspection, configuration, and recovery”, ACM SIGPLAN Notices, 2013, pp. 97–110.[26] D. Song et al. , “BitBlaze: A new approach to computer security via binary analysis”, Information Systems Security (Lecture Notes in Computer Science), 2008, pp. 1–25.[27] CuckooSandbox,Accessed:Sep.20,2019.Online].Available:http://www.cuckoosandbox.org 描述 碩士
國立政治大學
資訊管理學系
107356035資料來源 http://thesis.lib.nccu.edu.tw/record/#G0107356035 資料類型 thesis dc.contributor.advisor 蕭舜文 zh_TW dc.contributor.advisor Hsiao, Shun-Wen en_US dc.contributor.author (Authors) 陳呈祐 zh_TW dc.contributor.author (Authors) Chen, Cheng-Yu en_US dc.creator (作者) 陳呈祐 zh_TW dc.creator (作者) Chen, Cheng-Yu en_US dc.date (日期) 2020 en_US dc.date.accessioned 4-Aug-2021 14:46:07 (UTC+8) - dc.date.available 4-Aug-2021 14:46:07 (UTC+8) - dc.date.issued (上傳時間) 4-Aug-2021 14:46:07 (UTC+8) - dc.identifier (Other Identifiers) G0107356035 en_US dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/136337 - dc.description (描述) 碩士 zh_TW dc.description (描述) 國立政治大學 zh_TW dc.description (描述) 資訊管理學系 zh_TW dc.description (描述) 107356035 zh_TW dc.description.abstract (摘要) 最近不只物聯網設備的數量遽增,連帶物聯網惡意程式也大量出現。本研究希望了解物聯網惡意程式所帶來的威脅但現今缺乏方法來觀測、分析與偵測物聯網惡意程式。因此,我們設計了一個自動化的虛擬監控系統來蒐集物聯網惡意程式的行為,例如:API call invocation, system call execution等。除了傳統的監控方式 (Strace與封包側錄) 外,本研究提出一個監控系統使用虛擬機內省機制的C library hooking技術來擷取物聯網惡意程式所呼叫的C library call以避免遭到物聯網惡意程式的偵測。在所蒐集到的物聯網惡意程式行為中,本研究發現不只在各個惡意程式間有相似,在同一個惡意程式家族中也存有變異。因此,本研究認為在物聯網惡意程式中有家族並且物聯網惡意程式家族中也含有子家族。本研究提出一個家族行為分析系統透過馬可夫模型與Doc2Vec來分析物聯網惡意程式的順序資料並萃取向量化特徵、尋找子家族與子家族代表之圖形。 zh_TW dc.description.abstract (摘要) Not only the number of deployed IoT devices increases but also that of IoT malware. We are eager to understand the threat made by IoT malware, but we lack the tools to observe, analyze and detect them. Therefore, we design and implement an automatic, virtual machine-based profiling system to collect valuable IoT malware behavior, such as API call invocation, system call execution, etc. In addition to conventional profiling methods (e.g., Strace and packet capture), we proposed a profiling system that adapts virtual machine introspection based C library hooking technique to intercept C library call invocation by malware so that our introspection would not be detected by IoT malware. In the profiles we collected, we observe not only similarities between profiles but also variants in IoT family malware. Therefore, we anticipate that there are families in IoT malware and subfamily in the IoT malware family. We then propose a family behavior analysis system to analyze the multiple sequential data (C library calls) by the Markov model and Doc2Vec to extract vectorized malware features, discover subfamily and generate subfamily representative behavior graph. en_US dc.description.tableofcontents 1 INTRODUCTION 92 RELATED WORK 112.1 IoT Malware 112.2 Virtual Machine Introspection 122.3 Markov Model 132.4 Graph Embedding 133 PROPOSED VMI PROFILING SYSTEM 163.1 Overview 163.2 IoT dataset 163.3 System Platform 173.4 Traditional profiling 173.5 QEMU + Proposed VMI Hook Plugin 184 PROPOSED FAMILY BEHAVIOR ANALYSIS 194.1 Overview 194.2 Notation 204.3 Markov Model PVDM Module 214.4 Family Classifying Module 234.5 Behavior Clustering Per Family Module 244.6 Subfamily Representative Graph Generator 245 EVALUATION 265.1 QEMU + Proposed VMI Hook Plugin 265.2 Dataset 275.3 Preprocess 295.4 Feature Engineering 335.5 Model representation Comparison 335.6 Family Classification Comparison 385.7 Subfamily Clustering 405.8 Subfamily Representing 476 CONCLUSION 71Reference 73 zh_TW dc.format.extent 10732513 bytes - dc.format.mimetype application/pdf - dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0107356035 en_US dc.subject (關鍵詞) 物聯網惡意程式 zh_TW dc.subject (關鍵詞) 虛擬機器內省 zh_TW dc.subject (關鍵詞) 順序資料 zh_TW dc.subject (關鍵詞) QEMU zh_TW dc.subject (關鍵詞) 動態分析 zh_TW dc.subject (關鍵詞) 圖形分析 zh_TW dc.subject (關鍵詞) 馬可夫模型 zh_TW dc.subject (關鍵詞) IoT malware en_US dc.subject (關鍵詞) Virtual Machine Introspection en_US dc.subject (關鍵詞) Sequential Data en_US dc.subject (關鍵詞) QEMU en_US dc.subject (關鍵詞) Dynamic Analysis en_US dc.subject (關鍵詞) Graph Analysis en_US dc.subject (關鍵詞) Markov Model en_US dc.title (題名) 物聯網惡意軟體動態分析監控系統與其家族行為分析 zh_TW dc.title (題名) IoT Malware Dynamic Analysis Profiling System and Family Behavior Analysis en_US dc.type (資料類型) thesis en_US dc.relation.reference (參考文獻) [1] B. Anderson, D. Quist, J. Neil, C. Storlie, and T. Lane, “Graph-based malware detection using dynamic analysis”, Journal in computer Virology, vol. 7, no. 4, pp. 247–258, 2011[2] X. Jiang, X. Wang, and D. Xu, “Stealthy malware detection through vmm-based out-of-the-box semantic view reconstruction”, in Proceedings of the 2007 ACM Conference on Computer and Communications Security, 2007, pp. 128-138.[3] A. Grover, and J. Leskovec, “node2vec: Scalable Feature Learning for Networks”, Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp.885-864.[4] A. Costin, and J. Zaddach, “IoT Malware: Comprehensive Survey, Analysis Framework and Case Studies”, BlackHat USA[5] M. Hossain, J. Wang, R. Sekar, and S. Stoller, “Dependence-Preserving Data Compaction for Scalable Forensic Analysis”, 27th USENIX Security Symposium (USENIX Security 18),2018, pp.1723-1740.[6] G. Davis, “2020: Life with 50 billion connected devices”, 2018 IEEE International Conference on Consumer Electronics, 2018, pp.1–1.[7] B. Vignau, R. Khoury, and S. Hallé, “10 Years of IoT Malware: a Feature-Based Taxonomy”, 2019 IEEE 19th International Conference on Software Quality, Reliability and SecurityCompanion, 2019, pp.458-465.[8] S. W. Hsiao, Y. S. Sun, and M. S. Chen, “Hardware-Assisted MMU Redirection for In-Guest Monitoring and API Profiling”, IEEE Transactions on Information Forensics and Security, 2020, pp.2402-2416.[9] Y. Hebbal, S. Laniepce, and J. M. Menaud, “Virtual machine introspection: Techniques and applications”, 2015 10th international conference on availability, reliability and security,2015, pp.676-685.[10] Q. Le, and T. Mikolov, “Distributed representations of sentences and documents”, International conference on machine learning, 2014, pp.1188-1196.[11] Y. Yang, L. Wu, G. Yin, and L. Li, “A survey on security and privacy issues in Internet-ofThings”, IEEE Internet of Things Journal, 2017, pp.1250-1258.[12] T. Garfinkel, and M. Rosenblum, “A Virtual Machine Introspection Based Architecture for Intrusion Detection”, Ndss, 2003, pp.191-206.[13] W. L. Hamilton, R. Ying, and J. Leskovec, “Representation learning on graphs: Methods and applications”, arXiv preprint, 2017, arXiv:1709.05584.[14] P. Goyal, and E. Ferrara, “Graph embedding techniques, applications, and performance: A survey”, Knowledge-Based Systems, 2018, pp.78-94.[15] B. PEROZZI, R. AL-RFOU, and S. SKIENA, “Deepwalk: Online learning of socialrepresentations”, Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp.701-710.[16] D. Zhang, J. Yin, X. Zhu, and C. Zhang, “Network representation learning: A survey”, IEEE transactions on Big Data, 2018[17] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space”, arXiv preprint, 2013, arXiv:1301.3781.[18] S. W. Hsiao, Y. S. Sun, and M. C. Chen, “Virtual machine introspection based malware behavior profiling and family grouping”, arXiv preprint, 2017, arXiv:1705.01697.[19] S. Vogl, and C. Eckert, “Using hardware performance events for instruction-level monitoring on the X86 architecture”, Proc. Eur. Workshop Syst. Secur. (EuroSec), 2012, pp. 1–6.[20] A. Dinaburg, P. Royal, M. Sharif, and W. Lee, “Ether: Malware analysis via hardware virtualization extensions”, Proc. ACM Conf. Comput. Commun. Secur., 2008, pp. 51–62.[21] J. Pfoh, C. Schneider, and C. Eckert, “Nitro: Hardware-based system call tracing for virtual machines”, Advances in Information and Computer Security (Lecture Notes in Computer Science), 2011, pp. 96–112.[22] M. I. Sharif, W. Lee, W. Cui, and A. Lanzi, “Secure in-VM monitoring using hardware virtualization”, Proc. ACM Conf. Comput. Commun. Secur., 2009, pp. 477–487.[23] C. Willems, R. Hund, and T. Holz, “CXPInspector: Hypervisor-based, hardware-assisted system monitoring”, Ruhr-Univ. Bochum, Bochum, Germany, Tech. Rep. TR-HGI-2012-002, Nov. 26, 2012.[24] B. D. Payne, M. Carbone, M. Sharif, and W. Lee, “Lares: An architecture for secure active monitoring using virtualization”, Proc. IEEE Symp. Secur. Privacy, 2008, pp. 233–247.[25] Y. Fu and Z. Lin, “Exterior: Using a dual-VM based external shell for guest-OS introspection, configuration, and recovery”, ACM SIGPLAN Notices, 2013, pp. 97–110.[26] D. Song et al. , “BitBlaze: A new approach to computer security via binary analysis”, Information Systems Security (Lecture Notes in Computer Science), 2008, pp. 1–25.[27] CuckooSandbox,Accessed:Sep.20,2019.Online].Available:http://www.cuckoosandbox.org zh_TW dc.identifier.doi (DOI) 10.6814/NCCU202101066 en_US