Secure machine learning through virtualization obfuscation of Python code (以虛擬化混淆轉換來落實 Python 程式的安全式機器學習)


Title	Secure machine learning through virtualization obfuscation of Python code (以虛擬化混淆轉換來落實 Python 程式的安全式機器學習)
Author	Chiu, Yi-Hsiang (邱怡翔)
Contributors	Hu, Yuh-Jong (胡毓忠, advisor); Chiu, Yi-Hsiang (邱怡翔)
Keywords	Code obfuscation; Virtualization obfuscation; Secure machine learning
Date	2019
Uploaded	1-Jul-2019 10:59:22 (UTC+8)
Abstract	Machine learning lets people extract useful information from data. When the volume of data to analyze is very large, a common approach is to rent computing resources from a public cloud platform provider and run the job as a cluster computation. However, a public cloud is an untrusted environment, so information about the program may leak. This study designs a code obfuscation tool for protecting programs written in Python. The tool uses a virtualization obfuscation algorithm as its main transformation; the transformed program achieves procedural abstraction, so that the computations the model performs during training and prediction cannot be easily discovered. In addition, the study applies simplicity obfuscation to rewrite how the interpreter in the virtualization obfuscation transformation operates, which hinders an attacker's static and dynamic program analysis. To evaluate the transformation, the study prepares a machine learning program based on the data set of the Kaggle competition on predicting survival in the Titanic disaster. After virtualization obfuscation, the control flow of the machine learning program is completely rewritten and its software complexity rises substantially, but its execution time also increases by 43 to 70 times.
Description	Master's thesis; National Chengchi University; Department of Computer Science; 105753027
Source	http://thesis.lib.nccu.edu.tw/record/#G0105753027
Type	thesis
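
The abstract describes virtualization obfuscation only at a high level: the program is re-expressed as instructions for a custom virtual machine and run by a custom interpreter. The following is a minimal sketch of that general idea, not the thesis's tool. The instruction names, the randomized opcode table, and the helper functions `make_opcode_table`, `virtualize`, and `interpret` are all assumptions introduced for illustration, and the sketch handles only arithmetic expressions.

```python
import ast
import operator
import random

# Hypothetical instruction set for this sketch (not the thesis's own).
INSTRUCTIONS = ["PUSH_CONST", "LOAD_VAR", "ADD", "SUB", "MUL", "DIV"]

# Map Python AST operator nodes to the hypothetical instruction names.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}

# Semantics of each arithmetic instruction.
ARITHMETIC = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
}


def make_opcode_table(seed):
    """Assign each instruction a distinct random numeric opcode, so that
    different builds can use different encodings."""
    rng = random.Random(seed)
    codes = rng.sample(range(256), len(INSTRUCTIONS))
    return dict(zip(INSTRUCTIONS, codes))


def virtualize(source, table):
    """Translate an arithmetic expression into a flat list of
    (opcode, operand) pairs for the custom stack machine."""
    program = []

    def emit(node):
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            # Post-order traversal: operands first, then the operator.
            emit(node.left)
            emit(node.right)
            program.append((table[BINARY_OPS[type(node.op)]], None))
        elif isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            program.append((table["PUSH_CONST"], node.value))
        elif isinstance(node, ast.Name):
            program.append((table["LOAD_VAR"], node.id))
        else:
            raise ValueError(f"unsupported syntax: {type(node).__name__}")

    emit(ast.parse(source, mode="eval").body)
    return program


def interpret(program, table, env):
    """Custom interpreter: a dispatch loop over the virtual program."""
    handlers = {table[name]: fn for name, fn in ARITHMETIC.items()}
    stack = []
    for opcode, operand in program:
        if opcode == table["PUSH_CONST"]:
            stack.append(operand)
        elif opcode == table["LOAD_VAR"]:
            stack.append(env[operand])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(handlers[opcode](left, right))
    return stack.pop()


if __name__ == "__main__":
    table = make_opcode_table(seed=7)
    program = virtualize("(a + 2) * b - 4 / 2", table)
    print(program)  # numeric opcodes only; the original expression is gone
    print(interpret(program, table, {"a": 3, "b": 5}))  # 23.0
```

In this sketch the original control flow is replaced by a single dispatch loop, and every instruction pays the cost of that loop, which gives some intuition for why a virtualized program runs much more slowly than the original.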

dc.contributor.advisor	胡毓忠	zh_TW
dc.contributor.advisor	Hu, Yuh-Jong	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/124196	-
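dc.description.tableofcontents	Abstract (Chinese); Abstract (English); List of Tables; List of Figures; Chapter 1 Introduction: 1.1 Research Motivation, 1.2 Research Objectives, 1.3 Research Results; Chapter 2 Research Background: 2.1 Code Obfuscation, 2.1.1 Virtualization Obfuscation Algorithm, 2.1.2 Attacks on Virtualization Obfuscation, 2.2 Python Programs; Chapter 3 Related Work; Chapter 4 Obfuscation Method and Workflow: 4.1 Analysis of the Original Program, 4.2 Virtualization Obfuscation Transformation, 4.2.1 Building the Obfuscated Bytecode, 4.2.2 Building the Custom Interpreter, 4.3 Simplicity Obfuscation Transformation; Chapter 5 Implementation: 5.1 Preparing the Program Before Obfuscation, 5.2 Testing the Program After Virtualization Obfuscation, 5.3 Potency of the Virtualization Obfuscation Transformation; Chapter 6 Conclusions and Future Work: 6.1 Conclusions and Contributions, 6.2 Research Limitations; References	en_US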
dc.format.extent	2742003 bytes	-
dc.format.mimetype	application/pdf	-
dc.identifier.doi (DOI)	10.6814/NCCU201900153	en_US
