Title 程式預測預訓練模型的攻擊與防禦研究
Natural Attack and Defense for Pre-trained Models of Code Analysis
Author 黃舜硯
Huang, Shun-Yen
Contributor 郁方
Yu, Fang
Huang, Shun-Yen
Keywords 預訓練模型 (pre-trained model)
Natural Attack
Date 2024
Upload time 1-Feb-2024 10:56:56 (UTC+8)
Abstract Pre-trained code analysis models have reshaped software engineering through applications such as malicious code detection. However, their effectiveness is threatened by adversarial attacks such as ALERT, which subtly alter inputs to manipulate model outputs. This thesis presents a profiling-based approach to identifying ALERT attacks that target CodeBERT. We use dynamic program tracing to capture the model's internal behavior while it processes both original and adversarial samples. The traces record detailed information about function calls, including call counts, return values, and execution times. By comparing these traces, we aim to identify characteristic patterns that indicate whether an ALERT attack is present. We also train a neural network on the profiled data. The training data consist of normal programs and malicious programs, and each group contains both attacked and unattacked samples. Two model attributes derived from the traces achieved accuracies of 62% and 72.2% on the normal-code dataset, 70% and 89.1% on the malicious-code dataset, and 69.3% and 71.6% on the mixed dataset. These findings indicate that profiling-based techniques have potential for detecting adversarial attacks on pre-trained code models. The work opens avenues for further exploration and refinement of such methods, and ultimately contributes to the resilience of pre-trained models in critical software engineering tasks.
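The tracing idea described in the abstract can be illustrated with a minimal sketch. The record does not say which tracing tool the thesis uses, so this example assumes Python's standard `sys.setprofile` hook. The names `CallProfiler`, `profile`, `embed`, and `classify` are hypothetical, and `classify` is a toy stand-in for a model, not CodeBERT. It records per-function call counts, cumulative execution time, and the last return value, then compares the traces for an original input and a renamed-variable variant.

```python
import sys
import time
from collections import defaultdict


class CallProfiler:
    """Records call counts, cumulative time, and last return value per function."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total_time = defaultdict(float)
        self.last_return = {}
        self._starts = {}  # id(frame) -> start timestamp of the active call

    def _hook(self, frame, event, arg):
        # Only Python-level call/return events are tracked; C calls are ignored.
        name = frame.f_code.co_name
        if event == "call":
            self.calls[name] += 1
            self._starts[id(frame)] = time.perf_counter()
        elif event == "return":
            start = self._starts.pop(id(frame), None)
            if start is not None:
                self.total_time[name] += time.perf_counter() - start
                self.last_return[name] = arg  # arg is the value being returned

    def run(self, func, *args):
        previous = sys.getprofile()  # restore any existing hook afterwards
        sys.setprofile(self._hook)
        try:
            return func(*args)
        finally:
            sys.setprofile(previous)


# Toy stand-in for a model (assumption): identifiers are split on "_" to mimic
# subword tokenization, so renaming a variable changes how often embed() runs.
def embed(piece):
    return len(piece)


def classify(code):
    score = 0
    for token in code.split():
        for piece in token.split("_"):
            score += embed(piece)
    return score % 2


def profile(code):
    profiler = CallProfiler()
    profiler.run(classify, code)
    return profiler


original = profile("int a = 0 ;")
perturbed = profile("int counter_value = 0 ;")

# Functions whose call counts differ between the two traces.
diff = {
    name: (original.calls[name], perturbed.calls[name])
    for name in set(original.calls) | set(perturbed.calls)
    if original.calls[name] != perturbed.calls[name]
}
print(diff)
```

In this toy setting the renamed identifier changes how many times `embed` is called, and that kind of difference is the signal a trace comparison would look for. Whether real ALERT perturbations leave a comparable footprint in CodeBERT is the question the thesis evaluates.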
dc.description.tableofcontents [zh_TW]
1 Introduction
2 Related Work
2.1 Code Analysis with AI tools
2.2 Attacks on AI Code Analysis Tools
2.3 Defense on Attacks
3 Methodology
3.1 Natural Attack craft altered code
3.2 Profiling CodeBERT
3.2.1 CodeBERT structure
3.2.2 Defect detection tasking on CodeBERT
3.2.3 Profiling
3.3 Information leakage
3.4 Detection rule derivation
3.4.1 Straight function approach
3.4.2 Neural Network Approach
4 Evaluation
4.1 Dataset perturbation through Natural Attacks
4.2 Profiling-driven rule discovery
4.2.1 Normal code
4.2.2 Malicious Java code
4.3 Can NN trained on a mixture of normal and malicious data detect altered examples?
5 Conclusion and Future work
6 References
dc.subject (Keywords) 預訓練模型 (pre-trained model) [zh_TW]
dc.subject (Keywords) 自然攻擊 (natural attack) [zh_TW]
dc.subject (Keywords) 側錄 (profiling) [zh_TW]
dc.subject (Keywords) CodeBERT [en_US]
dc.subject (Keywords) Natural Attack [en_US]
dc.subject (Keywords) Profile [en_US]
dc.type (Type) thesis [en_US]
