Please use this identifier to cite or link to this item:

Title: 落實本地差分隱私的安全式機器學習
Applying Local Differential Privacy for Secure Machine Learning
Authors: 呂柏漢
Contributors: 胡毓忠
Keywords: 本地差分隱私
Local differential privacy
Secure machine learning
Data Protection
Date: 2019
Issue Date: 2019-03-07 12:07:44 (UTC+8)
Abstract: 隨著大數據時代的到來，各大企業與政府組織皆大量蒐集與分析用戶資訊，個人隱私也隨之面臨洩漏的風險，如何平衡資料的可用性與隱私保護成為重要的課題。本研究運用本地差分隱私技術建構安全式機器學習，在不洩漏個人敏感資訊的情形下完成資料分析的正確分類與預測。本研究使用 UCI 提供的「Bank Marketing Data Set」資料集，運用基於 AnonML 與 RAPPOR 的本地差分隱私技術擾動敏感資料完成隱私保護，允許使用者視特徵隱私程度的不同客製化隱私預算，在第三方平台還原資料完成安全式機器學習，並具體提出量化與質化的運算觀察結果。
With the arrival of the big data era, large enterprises and government organizations collect and analyze great amounts of user data, and personal privacy consequently faces the risk of leakage. Balancing data utility and privacy protection has therefore become an important task. This research applies local differential privacy to implement secure machine learning, producing correct classification and prediction results without disclosing sensitive personal information. Using the "Bank Marketing Data Set" from the UCI repository, sensitive attributes are perturbed with local differential privacy techniques based on AnonML and RAPPOR, users may customize the privacy budget according to the sensitivity of each feature, and the perturbed data are reconstructed on a third-party platform to carry out secure machine learning. Quantitative and qualitative observations of the results are presented.
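The local perturbation and reconstruction described above can be illustrated with a minimal sketch of binary randomized response, the building block underlying RAPPOR-style mechanisms. This is an illustrative assumption, not the thesis's actual AnonML/RAPPOR pipeline; the function names and the per-feature `epsilon` value are hypothetical.

```python
import math
import random

def randomized_response(bit, epsilon):
    # Report the true bit with probability p = e^eps / (e^eps + 1),
    # otherwise report the flipped bit; this satisfies eps-local DP.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p else 1 - bit

def estimate_frequency(reports, epsilon):
    # Invert the known bias of the mechanism to obtain an unbiased
    # estimate of the true fraction of 1-bits across all users.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed + p - 1.0) / (2.0 * p - 1.0)

# Each user perturbs one sensitive bit locally; the aggregator sees
# only noisy reports, yet can still recover the population frequency.
random.seed(42)
epsilon = 4.0  # per-feature privacy budget (could differ by feature)
true_bits = [1] * 300 + [0] * 700
reports = [randomized_response(b, epsilon) for b in true_bits]
print(round(estimate_frequency(reports, epsilon), 2))
```

A smaller `epsilon` flips bits more often, strengthening privacy at the cost of a noisier frequency estimate, which mirrors the utility/privacy trade-off the thesis evaluates.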
Reference: [1] B. Hitaj et al., “Deep models under the GAN: information leakage from collaborative deep learning,” Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 603-618, 2017.
[2] A. Narayanan and V. Shmatikov, “Robust de-anonymization of large sparse datasets,” 2008 IEEE Symposium on Security and Privacy (SP 2008), pp. 111-125, 2008.
[3] T. Dalenius, “Towards a methodology for statistical disclosure control,” Statistisk Tidskrift, vol. 15, pp. 429-444, 1977.
[4] C. Dwork, “Differential Privacy,” Proceedings of the 33rd International Colloquium on Automata, Languages and Programming, pp. 1-12, 2006.
[5] L. Sweeney, “k-anonymity: A model for protecting privacy,” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol. 10, no. 05, pp. 557-570, 2002.
[6] A. Machanavajjhala et al., “l-Diversity: Privacy Beyond k-Anonymity,” Proceedings of the 22nd International Conference on Data Engineering, p. 24, 2006.
[7] C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211-407, 2014.
[8] C. Dwork et al., “Calibrating noise to sensitivity in private data analysis,” Theory of Cryptography Conference, pp. 265-284, 2006.
[9] F. McSherry and K. Talwar, “Mechanism design via differential privacy,” 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07), pp. 94-103, 2007.
[10] F. D. McSherry, “Privacy integrated queries: an extensible platform for privacy-preserving data analysis,” Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp. 19-30, 2009.
[11] C. Dwork et al., “On the complexity of differentially private data release: efficient algorithms and hardness results,” Proceedings of the 41st Annual ACM Symposium on Theory of Computing, pp. 381-390, 2009.
[12] J. C. Duchi et al., “Local privacy and statistical minimax rates,” 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS), pp. 429-438, 2013.
[13] S. L. Warner, “Randomized response: A survey technique for eliminating evasive answer bias,” Journal of the American Statistical Association, vol. 60, no. 309, pp. 63-69, 1965.
[14] Ú. Erlingsson et al., “Rappor: Randomized aggregatable privacy-preserving ordinal response,” Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 1054-1067, 2014.
[15] G. Fanti et al., “Building a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries,” Proceedings on Privacy Enhancing Technologies, vol. 2016, no. 3, pp. 41, 2016.
[16] R. Tibshirani, “Regression shrinkage and selection via the lasso: a retrospective,” Journal of the Royal Statistical Society: Series B, vol. 73, no. 3, pp. 273-282, 2011.
[17] T. T. Nguyên et al., “Collecting and analyzing data from smart device users with local differential privacy,” arXiv preprint arXiv:1606.05053, 2016.
[18] B. Cyphers and K. Veeramachaneni, “AnonML: Locally private machine learning over a network of peers,” 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp. 549-560, 2017.
[19] P. Samarati and L. Sweeney, “Generalizing data to provide anonymity when disclosing information,” Proceedings of the 17th ACM Symposium on Principles of Database Systems (PODS '98), p. 188, 1998.
[20] L. Sweeney, “Achieving k-anonymity privacy protection using generalization and suppression,” International Journal of Uncertainty, Fuzziness Knowledge-Based Systems, vol. 10, no. 05, pp. 571-588, 2002.
[21] F. Prasser et al., “Lightning: Utility-Driven Anonymization of High-Dimensional Data,” Transactions on Data Privacy, vol. 9, no. 2, pp. 161-185, 2016.
[22] R. Bassily and A. Smith, “Local, private, efficient protocols for succinct histograms,” Proceedings of the 47th Annual ACM Symposium on Theory of Computing, pp. 127-135, 2015.
Description: Master's thesis
Source URI:
Data Type: thesis
Appears in Collections:[資訊科學系碩士在職專班] 學位論文

Files in This Item:

100801.pdf (3071 KB, Adobe PDF)

All items in 學術集成 are protected by copyright, with all rights reserved.
