Publications-Theses
Article View/Open
Publication Export
-
Google ScholarTM
NCCU Library
Citation Infomation
Related Publications in TAIR
題名 基於多雲架構之多方數據主成份分析隱私保護技術
Privacy Protection Technology for Principal Component Analysis of Multi-Party Data Based on Multi-Cloud Architecture作者 陳幼恩
Chen, Yu-En貢獻者 左瑞麟
Tso, Ray-Lin
陳幼恩
Chen, Yu-En關鍵詞 隱私保護
秘密共享
主成分分析
多方計算
Privacy-preserving
Secret Sharing
Principal component analysis
Multi party Computation日期 2025 上傳時間 2-Oct-2025 11:11:19 (UTC+8) 摘要 隨著隱私保護需求在資料分析與機器學習領域的重要性日益提升,多方資料環境下的安全協同計算已成為關鍵課題。為此,本研究提出一項基於多雲架構之多方數據主成分分析(PCA)隱私保護技術,能於無需集中明文資料的情況下,完成分散式的特徵萃取任務。本技術整合門限式秘密共享、混淆機制與協同計算協議,有效避免任何單一雲端節點接觸原始資料或中間統計資訊。本方法以保護協方差矩陣為核心,結合多方資料提供者與多雲節點,實現可擴展且具安全保證的主成分特徵計算流程。各方僅需共享經過秘密分割的統計量,即可在不暴露任何原始內容的前提下,完成特徵值與特徵向量之協同運算。最終所輸出的主成分資訊可供資料消費者用於後續模型訓練與分析任務,兼顧隱私保護與實用性。本技術設計強調對多方數據場景的適應性,並在安全性、通訊與計算開銷之間取得良好平衡,具備高度擴展潛力,適用於醫療、金融與工業等需嚴格隱私保護之跨域應用場景。
As the demand for privacy protection in data analytics and machine-learning continues to grow, secure collaborative computation over multi-party data has become a critical research topic. To address this challenge, we propose a privacy-preserving principal component analysis (PCA) technique based on a multi-cloud architecture, which performs distributed feature extraction without aggregating plaintext data. The proposed technique combines threshold secret sharing, obfuscation mechanisms, and collaborative-computing protocols, thereby preventing any single cloud node from accessing raw data or intermediate statistical information. Focusing on the protection of the covariance matrix, the method links multiple data providers with multiple cloud nodes to create a scalable, provably secure workflow for computing principal components. Each party shares only secret-split statis tics, enabling the joint calculation of eigenvalues and eigenvectors without revealing any original content. The resulting principal-component information can be delivered to data consumers for downstream model training and analysis, ensuring both privacy and usability. Designed for multi-party data scenarios, the architecture strikes an effective balance among security, communication, and computational overhead. It offers strong scalability and is well-suited to cross-domain applications—such as healthcare, finance, and industrial settings—where stringent privacy protection is essential.參考文獻 [1] Daniel E. O’Leary. “Artificial Intelligence and Big Data.” In: IEEE Intelligent Sys- tems 28.2 (2013), pp. 96–99 (cit. p. 1). [2] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, et al. “Practical Secure Aggrega- tion for Privacy-Preserving Machine Learning.” In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. CCS ’17. Dallas, Texas, USA: Association for Computing Machinery, 2017, pp. 1175–1191 (cit. p. 1). [3] Hoon Wei Lim, Geong Sen Poh, Jia Xu, and Varsha Chittawar. “PrivateLink : Privacy- Preserving Integration and Sharing of Datasets.” In: IEEE Transactions on Informa- tion Forensics and Security 15 (2020), pp. 564–577 (cit. p. 1). [4] Arvind Narayanan and Vitaly Shmatikov. “Robust De-anonymization of Large Sparse Datasets.” In: 2008 IEEE Symposium on Security and Privacy (sp 2008). 2008, pp. 111– 125 (cit. p. 1). [5] Wayne Jansen and Timothy Grance. SP 800-144. Guidelines on Security and Privacy in Public Cloud Computing. Tech. rep. Gaithersburg, MD, USA: National Institute of Standards and Technology (NIST), 2011 (cit. p. 1). [6] Johannes Heurix, Peter Zimmermann, Thomas Neubauer, and Stefan Fenz. “A tax- onomy for privacy enhancing technologies.” In: Computers & Security 53 (2015), pp. 1–17 (cit. p. 1). [7] Inayat Ali, Eraj Khan, and Sonia Sabir. “Privacy-preserving data aggregation in resource- constrained sensor nodes in Internet of Things: A review.” In: Future Computing and Informatics Journal 3.1 (2018), pp. 41–50 (cit. p. 1). [8] Adi Shamir. “How to share a secret.” In: Commun. ACM 22.11 (Nov. 1979), pp. 612– 613 (cit. p. 1). [9] Nileshkumar Kakade and Utpalkumar Patel. “Secure Secret Sharing Using Homo- morphic Encryption.” In: 2020 11th International Conference on Computing, Com- munication and Networking Technologies (ICCCNT). 2020, pp. 1–7 (cit. p. 2). [10] Saloni Kwatra and Vicenç Torra. “Data reconstruction attack against principal com- ponent analysis.” In: International Symposium on Security and Privacy in Social Net- works and Big Data. Springer. 2023, pp. 79–92 (cit. p. 2). [11] Valeria Nikolaenko, Stratis Ioannidis, Udi Weinsberg, et al. “Privacy-preserving ma- trix factorization.” In: Proceedings of the 2013 ACM SIGSAC Conference on Com- puter & Communications Security. CCS ’13. Berlin, Germany: Association for Com- puting Machinery, 2013, pp. 801–812 (cit. p. 2). [12] Samanvaya Panda. “Principal component analysis using CKKS homomorphic scheme.” In: Cyber Security Cryptography and Machine Learning: 5th International Sympo- sium, CSCML 2021, Be’er Sheva, Israel, July 8–9, 2021, Proceedings 5. Springer. 2021, pp. 52–70 (cit. pp. 3, 7). [13] Xiaoyu Fan, Guosai Wang, Kun Chen, Xu He, and Wei Xu. “Ppca: Privacy-preserving principal component analysis using secure multiparty computation (mpc).” In: arXiv preprint arXiv:2105.07612 (2021) (cit. pp. 3, 8, 56–59). [14] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002 (cit. p. 7). [15] Xinbo Liu, Yaping Lin, Qin Liu, and Xin Yao. “A privacy-preserving principal com- ponent analysis outsourcing framework.” In: 2018 17th IEEE International Confer- ence On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE. 2018, pp. 1354–1359 (cit. p. 7). [16] Xirong Ma, Chuan Ma, Yali Jiang, and Chunpeng Ge. “Improved privacy-preserving PCA using optimized homomorphic matrix multiplication.” In: Computers & Secu- rity 138 (2024), p. 103658 (cit. p. 8). [17] Zhihua Xia, Qi Gu, Lizhi Xiong, and Wenhao Zhou. “Privacy-preserving image re- trieval based on additive secret sharing.” In: International Journal of Autonomous and Adaptive Communications Systems 17.2 (2024), pp. 99–126 (cit. p. 8). [18] Yingting Liu, Chaochao Chen, Longfei Zheng, et al. “Privacy preserving pca for multiparty modeling.” In: arXiv preprint arXiv:2002.02091 (2020) (cit. p. 8). [19] Hsin-Chan Yen. “Privacy Preserving Principal Components Analysis: A Dual-Cloud Construction Based on Secret Sharing for Protecting Covariance Matrix.” Available from the National Digital Library of Theses and Dissertations in Taiwan. Master’s Thesis. National Taiwan University of Science and Technology, 2024 (cit. pp. 3, 8, 60). [20] Donald Beaver. “Efficient multiparty protocols using circuit randomization.” In: Ad- vances in Cryptology—CRYPTO’91: Proceedings 11. Springer. 1992, pp. 420–432 (cit. p. 11). [21] Michel Journée, Yurii Nesterov, Peter Richtárik, and Rodolphe Sepulchre. “General- ized power method for sparse principal component analysis.” In: Journal of Machine Learning Research 11 (2010) (cit. p. 20). [22] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge university press, 2012 (cit. p. 17). [23] Josef Stoer, Roland Bulirsch, R Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Vol. 1993. Springer, 1980 (cit. p. 18). [24] RP Kanwal and KC Liu. “A Taylor expansion approach for solving integral equa- tions.” In: International Journal of Mathematical Education in Science and Technol- ogy 20.3 (1989), pp. 411–414 (cit. p. 18). [25] Louis W Ehrlich. “A modified Newton method for polynomials.” In: Communica- tions of the ACM (1967), pp. 107–108 (cit. p. 19). [26] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002. [27] Payman Mohassel and Yupeng Zhang. “Secureml: A system for scalable privacy- preserving machine learning.” In: 2017 IEEE symposium on security and privacy (SP). IEEE. 2017, pp. 19–38. [28] R. Canetti. “Universally composable security: a new paradigm for cryptographic pro- tocols.” In: Proceedings 42nd IEEE Symposium on Foundations of Computer Sci-ence. 2001, pp. 136–145 (cit. p. 43). [29] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. “Modeling wine preferences by data mining from physicochemical properties.” In: Decision Support Systems 47.4 (2009), pp. 547–553 (cit. p. 55). [30] Lingjun Meng, Peter van der Putten, and Haiyang Wang. “A comprehensive bench- mark of the artificial immune recognition system (AIRS).” In: International Con- ference on Advanced Data Mining and Applications. Springer, 2005, pp. 66–76 (cit. p. 55). [31] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, et al. “Scikit-learn: Ma- chine learning in Python.” In: the Journal of machine Learning research 12 (2011), pp. 2825–2830 (cit. p. 55). 描述 碩士
國立政治大學
資訊安全碩士學位學程
112791005資料來源 http://thesis.lib.nccu.edu.tw/record/#G0112791005 資料類型 thesis dc.contributor.advisor 左瑞麟 zh_TW dc.contributor.advisor Tso, Ray-Lin en_US dc.contributor.author (Authors) 陳幼恩 zh_TW dc.contributor.author (Authors) Chen, Yu-En en_US dc.creator (作者) 陳幼恩 zh_TW dc.creator (作者) Chen, Yu-En en_US dc.date (日期) 2025 en_US dc.date.accessioned 2-Oct-2025 11:11:19 (UTC+8) - dc.date.available 2-Oct-2025 11:11:19 (UTC+8) - dc.date.issued (上傳時間) 2-Oct-2025 11:11:19 (UTC+8) - dc.identifier (Other Identifiers) G0112791005 en_US dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/159701 - dc.description (描述) 碩士 zh_TW dc.description (描述) 國立政治大學 zh_TW dc.description (描述) 資訊安全碩士學位學程 zh_TW dc.description (描述) 112791005 zh_TW dc.description.abstract (摘要) 隨著隱私保護需求在資料分析與機器學習領域的重要性日益提升,多方資料環境下的安全協同計算已成為關鍵課題。為此,本研究提出一項基於多雲架構之多方數據主成分分析(PCA)隱私保護技術,能於無需集中明文資料的情況下,完成分散式的特徵萃取任務。本技術整合門限式秘密共享、混淆機制與協同計算協議,有效避免任何單一雲端節點接觸原始資料或中間統計資訊。本方法以保護協方差矩陣為核心,結合多方資料提供者與多雲節點,實現可擴展且具安全保證的主成分特徵計算流程。各方僅需共享經過秘密分割的統計量,即可在不暴露任何原始內容的前提下,完成特徵值與特徵向量之協同運算。最終所輸出的主成分資訊可供資料消費者用於後續模型訓練與分析任務,兼顧隱私保護與實用性。本技術設計強調對多方數據場景的適應性,並在安全性、通訊與計算開銷之間取得良好平衡,具備高度擴展潛力,適用於醫療、金融與工業等需嚴格隱私保護之跨域應用場景。 zh_TW dc.description.abstract (摘要) As the demand for privacy protection in data analytics and machine-learning continues to grow, secure collaborative computation over multi-party data has become a critical research topic. To address this challenge, we propose a privacy-preserving principal component analysis (PCA) technique based on a multi-cloud architecture, which performs distributed feature extraction without aggregating plaintext data. The proposed technique combines threshold secret sharing, obfuscation mechanisms, and collaborative-computing protocols, thereby preventing any single cloud node from accessing raw data or intermediate statistical information. Focusing on the protection of the covariance matrix, the method links multiple data providers with multiple cloud nodes to create a scalable, provably secure workflow for computing principal components. Each party shares only secret-split statis tics, enabling the joint calculation of eigenvalues and eigenvectors without revealing any original content. The resulting principal-component information can be delivered to data consumers for downstream model training and analysis, ensuring both privacy and usability. Designed for multi-party data scenarios, the architecture strikes an effective balance among security, communication, and computational overhead. It offers strong scalability and is well-suited to cross-domain applications—such as healthcare, finance, and industrial settings—where stringent privacy protection is essential. en_US dc.description.tableofcontents 1 Introduction 1 1.1 Motivation 2 1.2 Contributions 4 1.3 Organization 5 2 Related Works 7 2.1 Homomorphic Encryption-Based Approaches 7 2.2 Secret Sharing and MPC-Based Approaches 8 2.3 Secret Sharing-Based Dual-Cloud Approaches 8 3 Preliminaries 9 3.1 Secret Sharing 9 3.2 Beaver Triples 11 3.3 Similar Matrix. 16 3.4 Inverse of Square Root 18 3.5 Principle Components Analysis 19 4 System Architecture 23 4.1 System Model 23 4.2 Threat Model 25 5 Proposed Protocols 27 5.1 Beaver Triple 28 5.2 Covariance Matrix 30 5.3 Obfuscated Covariance Matrix 31 5.4 Provider Selection and Distribution 33 5.5 Inverse Square Root 34 5.6 Eigenvalues and Eigenvectors 35 5.7 Dimensionality Reduction 37 5.8 MSSPCA 38 6 Security Analysis 43 6.1 Security of Individual Protocols 44 7 Experiments and Results 55 7.1 Experimental Setup 55 7.2 Correctness Evaluation 55 7.3 Efficiency Evaluation 58 7.4 Scalability Evaluation 59 7.5 Parameter Settings in Secure Computation 60 7.6 Experimental Summary 61 8 Conclusion 63 References 65 zh_TW dc.format.extent 1974775 bytes - dc.format.mimetype application/pdf - dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0112791005 en_US dc.subject (關鍵詞) 隱私保護 zh_TW dc.subject (關鍵詞) 秘密共享 zh_TW dc.subject (關鍵詞) 主成分分析 zh_TW dc.subject (關鍵詞) 多方計算 zh_TW dc.subject (關鍵詞) Privacy-preserving en_US dc.subject (關鍵詞) Secret Sharing en_US dc.subject (關鍵詞) Principal component analysis en_US dc.subject (關鍵詞) Multi party Computation en_US dc.title (題名) 基於多雲架構之多方數據主成份分析隱私保護技術 zh_TW dc.title (題名) Privacy Protection Technology for Principal Component Analysis of Multi-Party Data Based on Multi-Cloud Architecture en_US dc.type (資料類型) thesis en_US dc.relation.reference (參考文獻) [1] Daniel E. O’Leary. “Artificial Intelligence and Big Data.” In: IEEE Intelligent Sys- tems 28.2 (2013), pp. 96–99 (cit. p. 1). [2] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, et al. “Practical Secure Aggrega- tion for Privacy-Preserving Machine Learning.” In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. CCS ’17. Dallas, Texas, USA: Association for Computing Machinery, 2017, pp. 1175–1191 (cit. p. 1). [3] Hoon Wei Lim, Geong Sen Poh, Jia Xu, and Varsha Chittawar. “PrivateLink : Privacy- Preserving Integration and Sharing of Datasets.” In: IEEE Transactions on Informa- tion Forensics and Security 15 (2020), pp. 564–577 (cit. p. 1). [4] Arvind Narayanan and Vitaly Shmatikov. “Robust De-anonymization of Large Sparse Datasets.” In: 2008 IEEE Symposium on Security and Privacy (sp 2008). 2008, pp. 111– 125 (cit. p. 1). [5] Wayne Jansen and Timothy Grance. SP 800-144. Guidelines on Security and Privacy in Public Cloud Computing. Tech. rep. Gaithersburg, MD, USA: National Institute of Standards and Technology (NIST), 2011 (cit. p. 1). [6] Johannes Heurix, Peter Zimmermann, Thomas Neubauer, and Stefan Fenz. “A tax- onomy for privacy enhancing technologies.” In: Computers & Security 53 (2015), pp. 1–17 (cit. p. 1). [7] Inayat Ali, Eraj Khan, and Sonia Sabir. “Privacy-preserving data aggregation in resource- constrained sensor nodes in Internet of Things: A review.” In: Future Computing and Informatics Journal 3.1 (2018), pp. 41–50 (cit. p. 1). [8] Adi Shamir. “How to share a secret.” In: Commun. ACM 22.11 (Nov. 1979), pp. 612– 613 (cit. p. 1). [9] Nileshkumar Kakade and Utpalkumar Patel. “Secure Secret Sharing Using Homo- morphic Encryption.” In: 2020 11th International Conference on Computing, Com- munication and Networking Technologies (ICCCNT). 2020, pp. 1–7 (cit. p. 2). [10] Saloni Kwatra and Vicenç Torra. “Data reconstruction attack against principal com- ponent analysis.” In: International Symposium on Security and Privacy in Social Net- works and Big Data. Springer. 2023, pp. 79–92 (cit. p. 2). [11] Valeria Nikolaenko, Stratis Ioannidis, Udi Weinsberg, et al. “Privacy-preserving ma- trix factorization.” In: Proceedings of the 2013 ACM SIGSAC Conference on Com- puter & Communications Security. CCS ’13. Berlin, Germany: Association for Com- puting Machinery, 2013, pp. 801–812 (cit. p. 2). [12] Samanvaya Panda. “Principal component analysis using CKKS homomorphic scheme.” In: Cyber Security Cryptography and Machine Learning: 5th International Sympo- sium, CSCML 2021, Be’er Sheva, Israel, July 8–9, 2021, Proceedings 5. Springer. 2021, pp. 52–70 (cit. pp. 3, 7). [13] Xiaoyu Fan, Guosai Wang, Kun Chen, Xu He, and Wei Xu. “Ppca: Privacy-preserving principal component analysis using secure multiparty computation (mpc).” In: arXiv preprint arXiv:2105.07612 (2021) (cit. pp. 3, 8, 56–59). [14] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002 (cit. p. 7). [15] Xinbo Liu, Yaping Lin, Qin Liu, and Xin Yao. “A privacy-preserving principal com- ponent analysis outsourcing framework.” In: 2018 17th IEEE International Confer- ence On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE. 2018, pp. 1354–1359 (cit. p. 7). [16] Xirong Ma, Chuan Ma, Yali Jiang, and Chunpeng Ge. “Improved privacy-preserving PCA using optimized homomorphic matrix multiplication.” In: Computers & Secu- rity 138 (2024), p. 103658 (cit. p. 8). [17] Zhihua Xia, Qi Gu, Lizhi Xiong, and Wenhao Zhou. “Privacy-preserving image re- trieval based on additive secret sharing.” In: International Journal of Autonomous and Adaptive Communications Systems 17.2 (2024), pp. 99–126 (cit. p. 8). [18] Yingting Liu, Chaochao Chen, Longfei Zheng, et al. “Privacy preserving pca for multiparty modeling.” In: arXiv preprint arXiv:2002.02091 (2020) (cit. p. 8). [19] Hsin-Chan Yen. “Privacy Preserving Principal Components Analysis: A Dual-Cloud Construction Based on Secret Sharing for Protecting Covariance Matrix.” Available from the National Digital Library of Theses and Dissertations in Taiwan. Master’s Thesis. National Taiwan University of Science and Technology, 2024 (cit. pp. 3, 8, 60). [20] Donald Beaver. “Efficient multiparty protocols using circuit randomization.” In: Ad- vances in Cryptology—CRYPTO’91: Proceedings 11. Springer. 1992, pp. 420–432 (cit. p. 11). [21] Michel Journée, Yurii Nesterov, Peter Richtárik, and Rodolphe Sepulchre. “General- ized power method for sparse principal component analysis.” In: Journal of Machine Learning Research 11 (2010) (cit. p. 20). [22] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge university press, 2012 (cit. p. 17). [23] Josef Stoer, Roland Bulirsch, R Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Vol. 1993. Springer, 1980 (cit. p. 18). [24] RP Kanwal and KC Liu. “A Taylor expansion approach for solving integral equa- tions.” In: International Journal of Mathematical Education in Science and Technol- ogy 20.3 (1989), pp. 411–414 (cit. p. 18). [25] Louis W Ehrlich. “A modified Newton method for polynomials.” In: Communica- tions of the ACM (1967), pp. 107–108 (cit. p. 19). [26] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002. [27] Payman Mohassel and Yupeng Zhang. “Secureml: A system for scalable privacy- preserving machine learning.” In: 2017 IEEE symposium on security and privacy (SP). IEEE. 2017, pp. 19–38. [28] R. Canetti. “Universally composable security: a new paradigm for cryptographic pro- tocols.” In: Proceedings 42nd IEEE Symposium on Foundations of Computer Sci-ence. 2001, pp. 136–145 (cit. p. 43). [29] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. “Modeling wine preferences by data mining from physicochemical properties.” In: Decision Support Systems 47.4 (2009), pp. 547–553 (cit. p. 55). [30] Lingjun Meng, Peter van der Putten, and Haiyang Wang. “A comprehensive bench- mark of the artificial immune recognition system (AIRS).” In: International Con- ference on Advanced Data Mining and Applications. Springer, 2005, pp. 66–76 (cit. p. 55). [31] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, et al. “Scikit-learn: Ma- chine learning in Python.” In: the Journal of machine Learning research 12 (2011), pp. 2825–2830 (cit. p. 55). zh_TW
