Publications-Theses

Article View/Open

Publication Export

Google ScholarTM

NCCU Library

Citation Infomation

Related Publications in TAIR

題名 基於多雲架構之多方數據主成份分析隱私保護技術
Privacy Protection Technology for Principal Component Analysis of Multi-Party Data Based on Multi-Cloud Architecture
作者 陳幼恩
Chen, Yu-En
貢獻者 左瑞麟
Tso, Ray-Lin
陳幼恩
Chen, Yu-En
關鍵詞 隱私保護
秘密共享
主成分分析
多方計算
Privacy-preserving
Secret Sharing
Principal component analysis
Multi party Computation
日期 2025
上傳時間 2-Oct-2025 11:11:19 (UTC+8)
摘要 隨著隱私保護需求在資料分析與機器學習領域的重要性日益提升,多方資料環境下的安全協同計算已成為關鍵課題。為此,本研究提出一項基於多雲架構之多方數據主成分分析(PCA)隱私保護技術,能於無需集中明文資料的情況下,完成分散式的特徵萃取任務。本技術整合門限式秘密共享、混淆機制與協同計算協議,有效避免任何單一雲端節點接觸原始資料或中間統計資訊。本方法以保護協方差矩陣為核心,結合多方資料提供者與多雲節點,實現可擴展且具安全保證的主成分特徵計算流程。各方僅需共享經過秘密分割的統計量,即可在不暴露任何原始內容的前提下,完成特徵值與特徵向量之協同運算。最終所輸出的主成分資訊可供資料消費者用於後續模型訓練與分析任務,兼顧隱私保護與實用性。本技術設計強調對多方數據場景的適應性,並在安全性、通訊與計算開銷之間取得良好平衡,具備高度擴展潛力,適用於醫療、金融與工業等需嚴格隱私保護之跨域應用場景。
As the demand for privacy protection in data analytics and machine-learning continues to grow, secure collaborative computation over multi-party data has become a critical research topic. To address this challenge, we propose a privacy-preserving principal component analysis (PCA) technique based on a multi-cloud architecture, which performs distributed feature extraction without aggregating plaintext data. The proposed technique combines threshold secret sharing, obfuscation mechanisms, and collaborative-computing protocols, thereby preventing any single cloud node from accessing raw data or intermediate statistical information. Focusing on the protection of the covariance matrix, the method links multiple data providers with multiple cloud nodes to create a scalable, provably secure workflow for computing principal components. Each party shares only secret-split statis tics, enabling the joint calculation of eigenvalues and eigenvectors without revealing any original content. The resulting principal-component information can be delivered to data consumers for downstream model training and analysis, ensuring both privacy and usability. Designed for multi-party data scenarios, the architecture strikes an effective balance among security, communication, and computational overhead. It offers strong scalability and is well-suited to cross-domain applications—such as healthcare, finance, and industrial settings—where stringent privacy protection is essential.
參考文獻 [1] Daniel E. O’Leary. “Artificial Intelligence and Big Data.” In: IEEE Intelligent Sys- tems 28.2 (2013), pp. 96–99 (cit. p. 1). [2] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, et al. “Practical Secure Aggrega- tion for Privacy-Preserving Machine Learning.” In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. CCS ’17. Dallas, Texas, USA: Association for Computing Machinery, 2017, pp. 1175–1191 (cit. p. 1). [3] Hoon Wei Lim, Geong Sen Poh, Jia Xu, and Varsha Chittawar. “PrivateLink : Privacy- Preserving Integration and Sharing of Datasets.” In: IEEE Transactions on Informa- tion Forensics and Security 15 (2020), pp. 564–577 (cit. p. 1). [4] Arvind Narayanan and Vitaly Shmatikov. “Robust De-anonymization of Large Sparse Datasets.” In: 2008 IEEE Symposium on Security and Privacy (sp 2008). 2008, pp. 111– 125 (cit. p. 1). [5] Wayne Jansen and Timothy Grance. SP 800-144. Guidelines on Security and Privacy in Public Cloud Computing. Tech. rep. Gaithersburg, MD, USA: National Institute of Standards and Technology (NIST), 2011 (cit. p. 1). [6] Johannes Heurix, Peter Zimmermann, Thomas Neubauer, and Stefan Fenz. “A tax- onomy for privacy enhancing technologies.” In: Computers & Security 53 (2015), pp. 1–17 (cit. p. 1). [7] Inayat Ali, Eraj Khan, and Sonia Sabir. “Privacy-preserving data aggregation in resource- constrained sensor nodes in Internet of Things: A review.” In: Future Computing and Informatics Journal 3.1 (2018), pp. 41–50 (cit. p. 1). [8] Adi Shamir. “How to share a secret.” In: Commun. ACM 22.11 (Nov. 1979), pp. 612– 613 (cit. p. 1). [9] Nileshkumar Kakade and Utpalkumar Patel. “Secure Secret Sharing Using Homo- morphic Encryption.” In: 2020 11th International Conference on Computing, Com- munication and Networking Technologies (ICCCNT). 2020, pp. 1–7 (cit. p. 2). [10] Saloni Kwatra and Vicenç Torra. “Data reconstruction attack against principal com- ponent analysis.” In: International Symposium on Security and Privacy in Social Net- works and Big Data. Springer. 2023, pp. 79–92 (cit. p. 2). [11] Valeria Nikolaenko, Stratis Ioannidis, Udi Weinsberg, et al. “Privacy-preserving ma- trix factorization.” In: Proceedings of the 2013 ACM SIGSAC Conference on Com- puter & Communications Security. CCS ’13. Berlin, Germany: Association for Com- puting Machinery, 2013, pp. 801–812 (cit. p. 2). [12] Samanvaya Panda. “Principal component analysis using CKKS homomorphic scheme.” In: Cyber Security Cryptography and Machine Learning: 5th International Sympo- sium, CSCML 2021, Be’er Sheva, Israel, July 8–9, 2021, Proceedings 5. Springer. 2021, pp. 52–70 (cit. pp. 3, 7). [13] Xiaoyu Fan, Guosai Wang, Kun Chen, Xu He, and Wei Xu. “Ppca: Privacy-preserving principal component analysis using secure multiparty computation (mpc).” In: arXiv preprint arXiv:2105.07612 (2021) (cit. pp. 3, 8, 56–59). [14] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002 (cit. p. 7). [15] Xinbo Liu, Yaping Lin, Qin Liu, and Xin Yao. “A privacy-preserving principal com- ponent analysis outsourcing framework.” In: 2018 17th IEEE International Confer- ence On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE. 2018, pp. 1354–1359 (cit. p. 7). [16] Xirong Ma, Chuan Ma, Yali Jiang, and Chunpeng Ge. “Improved privacy-preserving PCA using optimized homomorphic matrix multiplication.” In: Computers & Secu- rity 138 (2024), p. 103658 (cit. p. 8). [17] Zhihua Xia, Qi Gu, Lizhi Xiong, and Wenhao Zhou. “Privacy-preserving image re- trieval based on additive secret sharing.” In: International Journal of Autonomous and Adaptive Communications Systems 17.2 (2024), pp. 99–126 (cit. p. 8). [18] Yingting Liu, Chaochao Chen, Longfei Zheng, et al. “Privacy preserving pca for multiparty modeling.” In: arXiv preprint arXiv:2002.02091 (2020) (cit. p. 8). [19] Hsin-Chan Yen. “Privacy Preserving Principal Components Analysis: A Dual-Cloud Construction Based on Secret Sharing for Protecting Covariance Matrix.” Available from the National Digital Library of Theses and Dissertations in Taiwan. Master’s Thesis. National Taiwan University of Science and Technology, 2024 (cit. pp. 3, 8, 60). [20] Donald Beaver. “Efficient multiparty protocols using circuit randomization.” In: Ad- vances in Cryptology—CRYPTO’91: Proceedings 11. Springer. 1992, pp. 420–432 (cit. p. 11). [21] Michel Journée, Yurii Nesterov, Peter Richtárik, and Rodolphe Sepulchre. “General- ized power method for sparse principal component analysis.” In: Journal of Machine Learning Research 11 (2010) (cit. p. 20). [22] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge university press, 2012 (cit. p. 17). [23] Josef Stoer, Roland Bulirsch, R Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Vol. 1993. Springer, 1980 (cit. p. 18). [24] RP Kanwal and KC Liu. “A Taylor expansion approach for solving integral equa- tions.” In: International Journal of Mathematical Education in Science and Technol- ogy 20.3 (1989), pp. 411–414 (cit. p. 18). [25] Louis W Ehrlich. “A modified Newton method for polynomials.” In: Communica- tions of the ACM (1967), pp. 107–108 (cit. p. 19). [26] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002. [27] Payman Mohassel and Yupeng Zhang. “Secureml: A system for scalable privacy- preserving machine learning.” In: 2017 IEEE symposium on security and privacy (SP). IEEE. 2017, pp. 19–38. [28] R. Canetti. “Universally composable security: a new paradigm for cryptographic pro- tocols.” In: Proceedings 42nd IEEE Symposium on Foundations of Computer Sci-ence. 2001, pp. 136–145 (cit. p. 43). [29] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. “Modeling wine preferences by data mining from physicochemical properties.” In: Decision Support Systems 47.4 (2009), pp. 547–553 (cit. p. 55). [30] Lingjun Meng, Peter van der Putten, and Haiyang Wang. “A comprehensive bench- mark of the artificial immune recognition system (AIRS).” In: International Con- ference on Advanced Data Mining and Applications. Springer, 2005, pp. 66–76 (cit. p. 55). [31] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, et al. “Scikit-learn: Ma- chine learning in Python.” In: the Journal of machine Learning research 12 (2011), pp. 2825–2830 (cit. p. 55).
描述 碩士
國立政治大學
資訊安全碩士學位學程
112791005
資料來源 http://thesis.lib.nccu.edu.tw/record/#G0112791005
資料類型 thesis
dc.contributor.advisor 左瑞麟zh_TW
dc.contributor.advisor Tso, Ray-Linen_US
dc.contributor.author (Authors) 陳幼恩zh_TW
dc.contributor.author (Authors) Chen, Yu-Enen_US
dc.creator (作者) 陳幼恩zh_TW
dc.creator (作者) Chen, Yu-Enen_US
dc.date (日期) 2025en_US
dc.date.accessioned 2-Oct-2025 11:11:19 (UTC+8)-
dc.date.available 2-Oct-2025 11:11:19 (UTC+8)-
dc.date.issued (上傳時間) 2-Oct-2025 11:11:19 (UTC+8)-
dc.identifier (Other Identifiers) G0112791005en_US
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/159701-
dc.description (描述) 碩士zh_TW
dc.description (描述) 國立政治大學zh_TW
dc.description (描述) 資訊安全碩士學位學程zh_TW
dc.description (描述) 112791005zh_TW
dc.description.abstract (摘要) 隨著隱私保護需求在資料分析與機器學習領域的重要性日益提升,多方資料環境下的安全協同計算已成為關鍵課題。為此,本研究提出一項基於多雲架構之多方數據主成分分析(PCA)隱私保護技術,能於無需集中明文資料的情況下,完成分散式的特徵萃取任務。本技術整合門限式秘密共享、混淆機制與協同計算協議,有效避免任何單一雲端節點接觸原始資料或中間統計資訊。本方法以保護協方差矩陣為核心,結合多方資料提供者與多雲節點,實現可擴展且具安全保證的主成分特徵計算流程。各方僅需共享經過秘密分割的統計量,即可在不暴露任何原始內容的前提下,完成特徵值與特徵向量之協同運算。最終所輸出的主成分資訊可供資料消費者用於後續模型訓練與分析任務,兼顧隱私保護與實用性。本技術設計強調對多方數據場景的適應性,並在安全性、通訊與計算開銷之間取得良好平衡,具備高度擴展潛力,適用於醫療、金融與工業等需嚴格隱私保護之跨域應用場景。zh_TW
dc.description.abstract (摘要) As the demand for privacy protection in data analytics and machine-learning continues to grow, secure collaborative computation over multi-party data has become a critical research topic. To address this challenge, we propose a privacy-preserving principal component analysis (PCA) technique based on a multi-cloud architecture, which performs distributed feature extraction without aggregating plaintext data. The proposed technique combines threshold secret sharing, obfuscation mechanisms, and collaborative-computing protocols, thereby preventing any single cloud node from accessing raw data or intermediate statistical information. Focusing on the protection of the covariance matrix, the method links multiple data providers with multiple cloud nodes to create a scalable, provably secure workflow for computing principal components. Each party shares only secret-split statis tics, enabling the joint calculation of eigenvalues and eigenvectors without revealing any original content. The resulting principal-component information can be delivered to data consumers for downstream model training and analysis, ensuring both privacy and usability. Designed for multi-party data scenarios, the architecture strikes an effective balance among security, communication, and computational overhead. It offers strong scalability and is well-suited to cross-domain applications—such as healthcare, finance, and industrial settings—where stringent privacy protection is essential.en_US
dc.description.tableofcontents 1 Introduction 1 1.1 Motivation 2 1.2 Contributions 4 1.3 Organization 5 2 Related Works 7 2.1 Homomorphic Encryption-Based Approaches 7 2.2 Secret Sharing and MPC-Based Approaches 8 2.3 Secret Sharing-Based Dual-Cloud Approaches 8 3 Preliminaries 9 3.1 Secret Sharing 9 3.2 Beaver Triples 11 3.3 Similar Matrix. 16 3.4 Inverse of Square Root 18 3.5 Principle Components Analysis 19 4 System Architecture 23 4.1 System Model 23 4.2 Threat Model 25 5 Proposed Protocols 27 5.1 Beaver Triple 28 5.2 Covariance Matrix 30 5.3 Obfuscated Covariance Matrix 31 5.4 Provider Selection and Distribution 33 5.5 Inverse Square Root 34 5.6 Eigenvalues and Eigenvectors 35 5.7 Dimensionality Reduction 37 5.8 MSSPCA 38 6 Security Analysis 43 6.1 Security of Individual Protocols 44 7 Experiments and Results 55 7.1 Experimental Setup 55 7.2 Correctness Evaluation 55 7.3 Efficiency Evaluation 58 7.4 Scalability Evaluation 59 7.5 Parameter Settings in Secure Computation 60 7.6 Experimental Summary 61 8 Conclusion 63 References 65zh_TW
dc.format.extent 1974775 bytes-
dc.format.mimetype application/pdf-
dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0112791005en_US
dc.subject (關鍵詞) 隱私保護zh_TW
dc.subject (關鍵詞) 秘密共享zh_TW
dc.subject (關鍵詞) 主成分分析zh_TW
dc.subject (關鍵詞) 多方計算zh_TW
dc.subject (關鍵詞) Privacy-preservingen_US
dc.subject (關鍵詞) Secret Sharingen_US
dc.subject (關鍵詞) Principal component analysisen_US
dc.subject (關鍵詞) Multi party Computationen_US
dc.title (題名) 基於多雲架構之多方數據主成份分析隱私保護技術zh_TW
dc.title (題名) Privacy Protection Technology for Principal Component Analysis of Multi-Party Data Based on Multi-Cloud Architectureen_US
dc.type (資料類型) thesisen_US
dc.relation.reference (參考文獻) [1] Daniel E. O’Leary. “Artificial Intelligence and Big Data.” In: IEEE Intelligent Sys- tems 28.2 (2013), pp. 96–99 (cit. p. 1). [2] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, et al. “Practical Secure Aggrega- tion for Privacy-Preserving Machine Learning.” In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. CCS ’17. Dallas, Texas, USA: Association for Computing Machinery, 2017, pp. 1175–1191 (cit. p. 1). [3] Hoon Wei Lim, Geong Sen Poh, Jia Xu, and Varsha Chittawar. “PrivateLink : Privacy- Preserving Integration and Sharing of Datasets.” In: IEEE Transactions on Informa- tion Forensics and Security 15 (2020), pp. 564–577 (cit. p. 1). [4] Arvind Narayanan and Vitaly Shmatikov. “Robust De-anonymization of Large Sparse Datasets.” In: 2008 IEEE Symposium on Security and Privacy (sp 2008). 2008, pp. 111– 125 (cit. p. 1). [5] Wayne Jansen and Timothy Grance. SP 800-144. Guidelines on Security and Privacy in Public Cloud Computing. Tech. rep. Gaithersburg, MD, USA: National Institute of Standards and Technology (NIST), 2011 (cit. p. 1). [6] Johannes Heurix, Peter Zimmermann, Thomas Neubauer, and Stefan Fenz. “A tax- onomy for privacy enhancing technologies.” In: Computers & Security 53 (2015), pp. 1–17 (cit. p. 1). [7] Inayat Ali, Eraj Khan, and Sonia Sabir. “Privacy-preserving data aggregation in resource- constrained sensor nodes in Internet of Things: A review.” In: Future Computing and Informatics Journal 3.1 (2018), pp. 41–50 (cit. p. 1). [8] Adi Shamir. “How to share a secret.” In: Commun. ACM 22.11 (Nov. 1979), pp. 612– 613 (cit. p. 1). [9] Nileshkumar Kakade and Utpalkumar Patel. “Secure Secret Sharing Using Homo- morphic Encryption.” In: 2020 11th International Conference on Computing, Com- munication and Networking Technologies (ICCCNT). 2020, pp. 1–7 (cit. p. 2). [10] Saloni Kwatra and Vicenç Torra. “Data reconstruction attack against principal com- ponent analysis.” In: International Symposium on Security and Privacy in Social Net- works and Big Data. Springer. 2023, pp. 79–92 (cit. p. 2). [11] Valeria Nikolaenko, Stratis Ioannidis, Udi Weinsberg, et al. “Privacy-preserving ma- trix factorization.” In: Proceedings of the 2013 ACM SIGSAC Conference on Com- puter & Communications Security. CCS ’13. Berlin, Germany: Association for Com- puting Machinery, 2013, pp. 801–812 (cit. p. 2). [12] Samanvaya Panda. “Principal component analysis using CKKS homomorphic scheme.” In: Cyber Security Cryptography and Machine Learning: 5th International Sympo- sium, CSCML 2021, Be’er Sheva, Israel, July 8–9, 2021, Proceedings 5. Springer. 2021, pp. 52–70 (cit. pp. 3, 7). [13] Xiaoyu Fan, Guosai Wang, Kun Chen, Xu He, and Wei Xu. “Ppca: Privacy-preserving principal component analysis using secure multiparty computation (mpc).” In: arXiv preprint arXiv:2105.07612 (2021) (cit. pp. 3, 8, 56–59). [14] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002 (cit. p. 7). [15] Xinbo Liu, Yaping Lin, Qin Liu, and Xin Yao. “A privacy-preserving principal com- ponent analysis outsourcing framework.” In: 2018 17th IEEE International Confer- ence On Trust, Security And Privacy In Computing And Communications/12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). IEEE. 2018, pp. 1354–1359 (cit. p. 7). [16] Xirong Ma, Chuan Ma, Yali Jiang, and Chunpeng Ge. “Improved privacy-preserving PCA using optimized homomorphic matrix multiplication.” In: Computers & Secu- rity 138 (2024), p. 103658 (cit. p. 8). [17] Zhihua Xia, Qi Gu, Lizhi Xiong, and Wenhao Zhou. “Privacy-preserving image re- trieval based on additive secret sharing.” In: International Journal of Autonomous and Adaptive Communications Systems 17.2 (2024), pp. 99–126 (cit. p. 8). [18] Yingting Liu, Chaochao Chen, Longfei Zheng, et al. “Privacy preserving pca for multiparty modeling.” In: arXiv preprint arXiv:2002.02091 (2020) (cit. p. 8). [19] Hsin-Chan Yen. “Privacy Preserving Principal Components Analysis: A Dual-Cloud Construction Based on Secret Sharing for Protecting Covariance Matrix.” Available from the National Digital Library of Theses and Dissertations in Taiwan. Master’s Thesis. National Taiwan University of Science and Technology, 2024 (cit. pp. 3, 8, 60). [20] Donald Beaver. “Efficient multiparty protocols using circuit randomization.” In: Ad- vances in Cryptology—CRYPTO’91: Proceedings 11. Springer. 1992, pp. 420–432 (cit. p. 11). [21] Michel Journée, Yurii Nesterov, Peter Richtárik, and Rodolphe Sepulchre. “General- ized power method for sparse principal component analysis.” In: Journal of Machine Learning Research 11 (2010) (cit. p. 20). [22] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge university press, 2012 (cit. p. 17). [23] Josef Stoer, Roland Bulirsch, R Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Vol. 1993. Springer, 1980 (cit. p. 18). [24] RP Kanwal and KC Liu. “A Taylor expansion approach for solving integral equa- tions.” In: International Journal of Mathematical Education in Science and Technol- ogy 20.3 (1989), pp. 411–414 (cit. p. 18). [25] Louis W Ehrlich. “A modified Newton method for polynomials.” In: Communica- tions of the ACM (1967), pp. 107–108 (cit. p. 19). [26] Ian T Jolliffe. Principal component analysis for special types of data. Springer, 2002. [27] Payman Mohassel and Yupeng Zhang. “Secureml: A system for scalable privacy- preserving machine learning.” In: 2017 IEEE symposium on security and privacy (SP). IEEE. 2017, pp. 19–38. [28] R. Canetti. “Universally composable security: a new paradigm for cryptographic pro- tocols.” In: Proceedings 42nd IEEE Symposium on Foundations of Computer Sci-ence. 2001, pp. 136–145 (cit. p. 43). [29] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, and José Reis. “Modeling wine preferences by data mining from physicochemical properties.” In: Decision Support Systems 47.4 (2009), pp. 547–553 (cit. p. 55). [30] Lingjun Meng, Peter van der Putten, and Haiyang Wang. “A comprehensive bench- mark of the artificial immune recognition system (AIRS).” In: International Con- ference on Advanced Data Mining and Applications. Springer, 2005, pp. 66–76 (cit. p. 55). [31] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, et al. “Scikit-learn: Ma- chine learning in Python.” In: the Journal of machine Learning research 12 (2011), pp. 2825–2830 (cit. p. 55).zh_TW