Title IoT Malware Dynamic Analysis Profiling System and Family Behavior Analysis
Author Chen, Cheng-Yu (陳呈祐)
Contributor Hsiao, Shun-Wen (蕭舜文)
Chen, Cheng-Yu (陳呈祐)
Keywords IoT malware
Virtual Machine Introspection
Sequential Data
QEMU
Dynamic Analysis
Graph Analysis
Markov Model
Date 2020
Upload time 4-Aug-2021 14:46:07 (UTC+8)
Abstract The number of deployed IoT devices is growing rapidly, and so is the volume of IoT malware. We want to understand the threat posed by IoT malware, but tools to observe, analyze, and detect it are lacking. We therefore design and implement an automated, virtual-machine-based profiling system that collects IoT malware behavior, such as API call invocations and system call executions. In addition to conventional profiling methods (e.g., strace and packet capture), the system applies a C library hooking technique based on virtual machine introspection to intercept the C library calls made by malware, so that the monitoring itself is not detected by the malware. In the collected profiles, we observe not only similarities across malware samples but also variation within a single malware family. We therefore hypothesize that IoT malware forms families and that each family contains subfamilies. We then propose a family behavior analysis system that applies a Markov model and Doc2Vec to the sequential data (C library calls) to extract vectorized malware features, discover subfamilies, and generate a representative behavior graph for each subfamily.
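As a rough illustration of how a Markov model can summarize sequential C library call data, the sketch below estimates first-order transition probabilities from call traces. It is not taken from the thesis: the function name `transition_probabilities` and the two sample traces are hypothetical, and the actual feature extraction in the thesis may differ.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities.

    `sequences` is an iterable of call-name lists (hypothetical traces).
    Returns {current_call: {next_call: probability}}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent pair (current call, next call) in the trace.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts so each row sums to 1.
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical C library call traces, invented for illustration only.
traces = [
    ["socket", "connect", "send", "recv", "close"],
    ["socket", "connect", "send", "send", "close"],
]
model = transition_probabilities(traces)
for call, followers in sorted(model.items()):
    print(call, followers)
```

Each row of the resulting table can be read as the outgoing edge weights of one node in a behavior graph, which is one plausible way to relate a Markov model to a graph-based representation of a trace.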
Description Master's
National Chengchi University
Department of Management Information Systems
107356035
Source http://thesis.lib.nccu.edu.tw/record/#G0107356035
Type thesis
dc.contributor.advisor Hsiao, Shun-Wen (蕭舜文)
dc.contributor.author (Authors) Chen, Cheng-Yu (陳呈祐)
dc.creator (Author) Chen, Cheng-Yu (陳呈祐)
dc.date (Date) 2020
dc.date.accessioned 4-Aug-2021 14:46:07 (UTC+8)
dc.date.available 4-Aug-2021 14:46:07 (UTC+8)
dc.date.issued (Upload time) 4-Aug-2021 14:46:07 (UTC+8)
dc.identifier (Other Identifiers) G0107356035
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/136337
dc.description (Description) Master's
dc.description (Description) National Chengchi University
dc.description (Description) Department of Management Information Systems
dc.description (Description) 107356035
dc.description.tableofcontents 1 INTRODUCTION
2 RELATED WORK
2.1 IoT Malware
2.2 Virtual Machine Introspection
2.3 Markov Model
2.4 Graph Embedding
3 PROPOSED VMI PROFILING SYSTEM
3.1 Overview
3.2 IoT dataset
3.3 System Platform
3.4 Traditional profiling
3.5 QEMU + Proposed VMI Hook Plugin
4 PROPOSED FAMILY BEHAVIOR ANALYSIS
4.1 Overview
4.2 Notation
4.3 Markov Model PVDM Module
4.4 Family Classifying Module
4.5 Behavior Clustering Per Family Module
4.6 Subfamily Representative Graph Generator
5 EVALUATION
5.1 QEMU + Proposed VMI Hook Plugin
5.2 Dataset
5.3 Preprocess
5.4 Feature Engineering
5.5 Model representation Comparison
5.6 Family Classification Comparison
5.7 Subfamily Clustering
5.8 Subfamily Representing
6 CONCLUSION
Reference
dc.format.extent 10732513 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0107356035
dc.subject (Keywords) IoT malware
dc.subject (Keywords) Virtual Machine Introspection
dc.subject (Keywords) Sequential Data
dc.subject (Keywords) QEMU
dc.subject (Keywords) Dynamic Analysis
dc.subject (Keywords) Graph Analysis
dc.subject (Keywords) Markov Model
dc.title (Title) IoT Malware Dynamic Analysis Profiling System and Family Behavior Analysis
dc.type (Type) thesis
dc.identifier.doi (DOI) 10.6814/NCCU202101066