Academic Output - Degree Theses


Title 對神經網路模型的個體公平性進行動態型式測試
Concolic Testing on Individual Fairness of Neural Network Models
Author 黃名儀
Huang, Ming-I
Contributors 郁方; 洪智鐸
Yu, Fang; Hong, Chih-Duo
黃名儀
Huang, Ming-I
Keywords 政治大學
深度神經網路
動態符號執行測試
公平性測試
NCCU
Concolic Testing
Fairness Testing
Deep Neural Networks
Date 2024
Upload time 4-Sep-2024 14:06:44 (UTC+8)
Abstract 深度神經網絡(DNNs)在刑事司法、招聘實踐和金融貸款決策等關鍵社會領域中變得越來越普遍。然而,這些應用往往無意中延續了偏見,導致對個體的歧視,從而限制了它們對社會的更廣泛利益。本研究針對深度神經網絡(DNNs)中的個體公平性進行探討。與以往研究相比,我們的研究在系統性公平性檢查方面做出了貢獻,提供了一種自動化和嚴謹的方法來識別DNN中的不公平實例。
Deep neural networks (DNNs) are becoming increasingly prevalent in crucial societal domains such as criminal justice, hiring practices, and financial lending decisions. However, these applications often unintentionally perpetuate biases that lead to individual discrimination, thus constraining their broader societal benefits. This study addresses individual fairness in deep neural networks (DNNs). Compared to previous work, our research contributes to systematic fairness checking, offering an automatic and rigorous approach to identifying instances of unfairness in DNNs.
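The notion of individual fairness targeted in the abstract can be made concrete with a small check: an input is a discriminatory instance if changing only a protected attribute (e.g., gender or race) changes the model's prediction. The sketch below illustrates that check in Python for intuition only; the function names and toy model are hypothetical and are not the thesis's implementation, which uses concolic testing (via the PyCT tester cited in the references) to search for such instances systematically rather than by naive enumeration.

```python
# Minimal sketch of an individual-fairness (discriminatory instance) check.
# Assumptions: predict_fn maps a feature vector to a class label; the protected
# attribute is a single feature with a known set of possible values.
import numpy as np

def is_discriminatory(predict_fn, x, protected_idx, protected_values):
    """Return True if changing only feature `protected_idx` can flip the prediction."""
    base_label = predict_fn(x)
    for value in protected_values:
        if value == x[protected_idx]:
            continue
        x_twin = np.array(x, dtype=float)      # counterfactual twin: identical except for
        x_twin[protected_idx] = value          # the protected attribute
        if predict_fn(x_twin) != base_label:   # unequal treatment -> individual unfairness
            return True
    return False

# Toy usage: a hand-written "model" that approves iff income + 0.5 * group > 1.0,
# so the protected feature (index 1) alone can flip the decision.
toy_model = lambda x: int(x[0] + 0.5 * x[1] > 1.0)
x = np.array([0.8, 0.0])
print(is_discriminatory(toy_model, x, protected_idx=1, protected_values=[0.0, 1.0]))  # True
```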
References
Aggarwal, A., Lohia, P., Nagar, S., Dey, K., and Saha, D. (2019). Black box fairness testing of machine learning models. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 625–635.
Albarghouthi, A., D’Antoni, L., Drews, S., and Nori, A. V. (2017). FairSquare: probabilistic verification of program fairness. Proceedings of the ACM on Programming Languages, 1(OOPSLA):1–30.
Awwad, Y., Fletcher, R., Frey, D., Gandhi, A., Najafian, M., and Teodorescu, M. (2020). Exploring fairness in machine learning for international development. Technical report, CITE MIT D-Lab.
Bastani, O., Zhang, X., and Solar-Lezama, A. (2019). Probabilistic verification of fairness properties via concentration. Proceedings of the ACM on Programming Languages, 3(OOPSLA):1–27.
Biswas, S. and Rajan, H. (2020). Do the machine learning models on a crowdsourced platform exhibit bias? An empirical study on model fairness. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 642–653.
Biswas, S. and Rajan, H. (2021). Fair preprocessing: towards understanding compositional fairness of data transformers in machine learning pipeline. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 981–993.
Biswas, S. and Rajan, H. (2023). Fairify: Fairness verification of neural networks. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pages 1546–1558. IEEE.
Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L. D., Monfort, M., Muller, U., Zhang, J., et al. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
Buolamwini, J. and Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency, pages 77–91. PMLR.
Chakraborty, J., Majumder, S., Yu, Z., and Menzies, T. (2020). Fairway: a way to build fair ML software. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 654–665.
Chen, Y.-F., Tsai, W.-L., Wu, W.-C., Yen, D.-D., and Yu, F. (2021). PyCT: A Python concolic tester. In Programming Languages and Systems: 19th Asian Symposium, APLAS 2021, Chicago, IL, USA, October 17–18, 2021, Proceedings 19, pages 38–46. Springer.
Chen, Z., Zhang, J. M., Sarro, F., and Harman, M. (2024). Fairness improvement with multiple protected attributes: How far are we? In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1–13.
Dastin, J. (2022). Amazon scraps secret AI recruiting tool that showed bias against women. In Ethics of Data and Analytics, pages 296–299. Auerbach Publications.
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012). Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214–226.
Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268.
Flores, A. W., Bechtel, K., and Lowenkamp, C. T. (2016). False positives, false negatives, and false analyses: A rejoinder to "Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks." Fed. Probation, 80:38.
Galhotra, S., Brun, Y., and Meliou, A. (2017). Fairness testing: testing software for discrimination. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pages 498–510.
Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., and Vechev, M. (2018). AI2: Safety and robustness certification of neural networks with abstract interpretation. In 2018 IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE.
Gohar, U., Biswas, S., and Rajan, H. (2023). Towards understanding fairness and its composition in ensemble machine learning. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pages 1533–1545. IEEE.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
Hort, M., Zhang, J. M., Sarro, F., and Harman, M. (2021). Fairea: A model behaviour mutation approach to benchmarking bias mitigation methods. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 994–1006.
John, P. G., Vijaykeerthy, D., and Saha, D. (2020). Verifying individual fairness in machine learning models. In Conference on Uncertainty in Artificial Intelligence, pages 749–758. PMLR.
Katz, G., Barrett, C., Dill, D. L., Julian, K., and Kochenderfer, M. J. (2017). Reluplex: An efficient SMT solver for verifying deep neural networks. In Computer Aided Verification: 29th International Conference, CAV 2017, Heidelberg, Germany, July 24–28, 2017, Proceedings, Part I 30, pages 97–117. Springer.
Katz, G., Huang, D. A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., et al. (2019). The Marabou framework for verification and analysis of deep neural networks. In Computer Aided Verification: 31st International Conference, CAV 2019, New York City, NY, USA, July 15–18, 2019, Proceedings, Part I 31, pages 443–452. Springer.
Khedr, H. and Shoukry, Y. (2023). CertiFair: A framework for certified global fairness of neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 8237–8245.
Kurakin, A., Goodfellow, I. J., and Bengio, S. (2018). Adversarial examples in the physical world. In Artificial Intelligence Safety and Security, pages 99–112. Chapman and Hall/CRC.
Lam, L. and Suen, S. (1997). Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 27(5):553–568.
Li, T., Xie, X., Wang, J., Guo, Q., Liu, A., Ma, L., and Liu, Y. (2023). Faire: Repairing fairness of neural networks via neuron condition synthesis. ACM Transactions on Software Engineering and Methodology, 33(1):1–24.
Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., Van Der Laak, J. A., Van Ginneken, B., and Sánchez, C. I. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42:60–88.
Ma, L., Juefei-Xu, F., Zhang, F., Sun, J., Xue, M., Li, B., Chen, C., Su, T., Li, L., Liu, Y., et al. (2018). DeepGauge: Multi-granularity testing criteria for deep learning systems. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 120–131.
Mohammadi, K., Sivaraman, A., and Farnadi, G. (2023). FETA: Fairness enforced verifying, training, and predicting algorithms for neural networks. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, pages 1–11.
Pei, K., Cao, Y., Yang, J., and Jana, S. (2017). DeepXplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles, pages 1–18.
Ruoss, A., Balunovic, M., Fischer, M., and Vechev, M. (2020). Learning certified individually fair representations. Advances in Neural Information Processing Systems, 33:7584–7596.
Saleiro, P., Kuester, B., Hinkson, L., London, J., Stevens, A., Anisfeld, A., Rodolfa, K. T., and Ghani, R. (2018). Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577.
Sharma, A. and Wehrheim, H. (2020). Automatic fairness testing of machine learning models. In Testing Software and Systems: 32nd IFIP WG 6.1 International Conference, ICTSS 2020, Naples, Italy, December 9–11, 2020, Proceedings 32, pages 255–271. Springer.
Singh, G., Gehr, T., Püschel, M., and Vechev, M. (2019). An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages, 3(POPL):1–30.
Sun, Y., Wu, M., Ruan, W., Huang, X., Kwiatkowska, M., and Kroening, D. (2018). Concolic testing for deep neural networks. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 109–119.
Udeshi, S., Arora, P., and Chattopadhyay, S. (2018). Automated directed fairness testing. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 98–108.
Urban, C., Christakis, M., Wüstholz, V., and Zhang, F. (2020). Perfectly parallel fairness certification of neural networks. Proceedings of the ACM on Programming Languages, 4(OOPSLA):1–30.
Wang, S., Pei, K., Whitehouse, J., Yang, J., and Jana, S. (2018). Formal security analysis of neural networks using symbolic intervals. In 27th USENIX Security Symposium (USENIX Security 18), pages 1599–1614.
Yurochkin, M., Bower, A., and Sun, Y. (2019). Training individually fair ML models with sensitive subspace robustness. arXiv preprint arXiv:1907.00020.
Zhang, J. M. and Harman, M. (2021). “Ignorance and prejudice” in software fairness. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pages 1436–1447. IEEE.
Zhang, L., Zhang, Y., and Zhang, M. (2021). Efficient white-box fairness testing through gradient search. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 103–114.
Zhang, P., Wang, J., Sun, J., Dong, G., Wang, X., Wang, X., Dong, J. S., and Dai, T. (2020). White-box fairness testing through adversarial sampling. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, pages 949–960.
Zheng, H., Chen, Z., Du, T., Zhang, X., Cheng, Y., Ji, S., Wang, J., Yu, Y., and Chen, J. (2022). NeuronFair: Interpretable white-box fairness testing through biased neuron identification. In 44th International Conference on Software Engineering, pages 1–13, New York, NY, USA. ACM.
Description Master's
National Chengchi University
Department of Management Information Systems
111356047
Source http://thesis.lib.nccu.edu.tw/record/#G0111356047
Type thesis
dc.contributor.advisor 郁方; 洪智鐸zh_TW
dc.contributor.advisor Yu, Fang; Hong, Chih-Duoen_US
dc.contributor.author (作者) 黃名儀zh_TW
dc.contributor.author (作者) Huang, Ming-Ien_US
dc.creator (作者) 黃名儀zh_TW
dc.creator (作者) Huang, Ming-Ien_US
dc.date (日期) 2024en_US
dc.date.accessioned 4-Sep-2024 14:06:44 (UTC+8)-
dc.date.available 4-Sep-2024 14:06:44 (UTC+8)-
dc.date.issued (上傳時間) 4-Sep-2024 14:06:44 (UTC+8)-
dc.identifier (其他 識別碼) G0111356047en_US
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/153165-
dc.description (描述) 碩士zh_TW
dc.description (描述) 國立政治大學zh_TW
dc.description (描述) 資訊管理學系zh_TW
dc.description (描述) 111356047zh_TW
dc.description.abstract (摘要) 深度神經網絡(DNNs)在刑事司法、招聘實踐和金融貸款決策等關鍵社會領域中變得越來越普遍。然而,這些應用往往無意中延續了偏見,導致對個體的歧視,從而限制了它們對社會的更廣泛利益。本研究針對深度神經網絡(DNNs)中的個體公平性進行探討。與以往研究相比,我們的研究在系統性公平性檢查方面做出了貢獻,提供了一種自動化和嚴謹的方法來識別DNN中的不公平實例。zh_TW
dc.description.abstract (摘要) Deep neural networks (DNNs) are becoming increasingly prevalent in crucial societal domains such as criminal justice, hiring practices, and financial lending decisions. However, these applications often unintentionally perpetuate biases that lead to individual discrimination, thus constraining their broader societal benefits. This study addresses individual fairness in deep neural networks (DNNs). Compared to previous work, our research contributes to systematic fairness checking, offering an automatic and rigorous approach to identifying instances of unfairness in DNNs.en_US
dc.description.tableofcontents 1. Introduction (P.1) 2. Related Work (P.5) 2.1 Fairness Testing (P.5) 2.2 Fairness Verification (P.7) 2.3 Concolic Testing (P.8) 3. Discriminatory Instance Checking (P.11) 3.1 Definition (P.11) 3.2 Neural Network Concolic Testing (P.12) 3.3 A Running Example (P.13) 3.4 Workflow (P.16) 3.5 Algorithm (P.17) 4. Model Fairness Checking (P.19) 4.1 Definition (P.19) 4.2 Double DNN Construction (P.20) 4.3 A Running Example of Unfair Model (P.24) 4.4 A Running Example of Fair Model (P.27) 4.5 Workflow (P.30) 4.6 Algorithm (P.30) 5. Evaluation (P.32) 5.1 Model Description (P.32) 5.2 Research Questions (P.35) 5.3 Results (P.36) 5.3.1 Answer to RQ1 (P.37) 5.3.2 Answer to RQ2 (P.40) 5.3.3 Answer to RQ3 (P.46) 5.3.4 Answer to RQ4 (P.47) 6. Discussion (P.50) 7. Conclusions (P.51) Reference (P.52)zh_TW
dc.format.extent 1636132 bytes-
dc.format.mimetype application/pdf-
dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0111356047en_US
dc.subject (關鍵詞) 政治大學zh_TW
dc.subject (關鍵詞) 深度神經網路zh_TW
dc.subject (關鍵詞) 動態符號執行測試zh_TW
dc.subject (關鍵詞) 公平性測試zh_TW
dc.subject (關鍵詞) NCCUen_US
dc.subject (關鍵詞) Concolic Testingen_US
dc.subject (關鍵詞) Fairness Testingen_US
dc.subject (關鍵詞) Deep Neural Networksen_US
dc.title (題名) 對神經網路模型的個體公平性進行動態型式測試zh_TW
dc.title (題名) Concolic Testing on Individual Fairness of Neural Network Modelsen_US
dc.type (資料類型) thesisen_US
dc.relation.reference (參考文獻)
Aggarwal, A., Lohia, P., Nagar, S., Dey, K., and Saha, D. (2019). Black box fairness testing of machine learning models. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 625–635.
Albarghouthi, A., D’Antoni, L., Drews, S., and Nori, A. V. (2017). FairSquare: probabilistic verification of program fairness. Proceedings of the ACM on Programming Languages, 1(OOPSLA):1–30.
Awwad, Y., Fletcher, R., Frey, D., Gandhi, A., Najafian, M., and Teodorescu, M. (2020). Exploring fairness in machine learning for international development. Technical report, CITE MIT D-Lab.
Bastani, O., Zhang, X., and Solar-Lezama, A. (2019). Probabilistic verification of fairness properties via concentration. Proceedings of the ACM on Programming Languages, 3(OOPSLA):1–27.
Biswas, S. and Rajan, H. (2020). Do the machine learning models on a crowdsourced platform exhibit bias? An empirical study on model fairness. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 642–653.
Biswas, S. and Rajan, H. (2021). Fair preprocessing: towards understanding compositional fairness of data transformers in machine learning pipeline. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 981–993.
Biswas, S. and Rajan, H. (2023). Fairify: Fairness verification of neural networks. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pages 1546–1558. IEEE.
Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L. D., Monfort, M., Muller, U., Zhang, J., et al. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
Buolamwini, J. and Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency, pages 77–91. PMLR.
Chakraborty, J., Majumder, S., Yu, Z., and Menzies, T. (2020). Fairway: a way to build fair ML software. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 654–665.
Chen, Y.-F., Tsai, W.-L., Wu, W.-C., Yen, D.-D., and Yu, F. (2021). PyCT: A Python concolic tester. In Programming Languages and Systems: 19th Asian Symposium, APLAS 2021, Chicago, IL, USA, October 17–18, 2021, Proceedings 19, pages 38–46. Springer.
Chen, Z., Zhang, J. M., Sarro, F., and Harman, M. (2024). Fairness improvement with multiple protected attributes: How far are we? In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1–13.
Dastin, J. (2022). Amazon scraps secret AI recruiting tool that showed bias against women. In Ethics of Data and Analytics, pages 296–299. Auerbach Publications.
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012). Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214–226.
Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268.
Flores, A. W., Bechtel, K., and Lowenkamp, C. T. (2016). False positives, false negatives, and false analyses: A rejoinder to "Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks." Fed. Probation, 80:38.
Galhotra, S., Brun, Y., and Meliou, A. (2017). Fairness testing: testing software for discrimination. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pages 498–510.
Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., and Vechev, M. (2018). AI2: Safety and robustness certification of neural networks with abstract interpretation. In 2018 IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE.
Gohar, U., Biswas, S., and Rajan, H. (2023). Towards understanding fairness and its composition in ensemble machine learning. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), pages 1533–1545. IEEE.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
Hort, M., Zhang, J. M., Sarro, F., and Harman, M. (2021). Fairea: A model behaviour mutation approach to benchmarking bias mitigation methods. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 994–1006.
John, P. G., Vijaykeerthy, D., and Saha, D. (2020). Verifying individual fairness in machine learning models. In Conference on Uncertainty in Artificial Intelligence, pages 749–758. PMLR.
Katz, G., Barrett, C., Dill, D. L., Julian, K., and Kochenderfer, M. J. (2017). Reluplex: An efficient SMT solver for verifying deep neural networks. In Computer Aided Verification: 29th International Conference, CAV 2017, Heidelberg, Germany, July 24–28, 2017, Proceedings, Part I 30, pages 97–117. Springer.
Katz, G., Huang, D. A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., et al. (2019). The Marabou framework for verification and analysis of deep neural networks. In Computer Aided Verification: 31st International Conference, CAV 2019, New York City, NY, USA, July 15–18, 2019, Proceedings, Part I 31, pages 443–452. Springer.
Khedr, H. and Shoukry, Y. (2023). CertiFair: A framework for certified global fairness of neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 8237–8245.
Kurakin, A., Goodfellow, I. J., and Bengio, S. (2018). Adversarial examples in the physical world. In Artificial Intelligence Safety and Security, pages 99–112. Chapman and Hall/CRC.
Lam, L. and Suen, S. (1997). Application of majority voting to pattern recognition: an analysis of its behavior and performance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 27(5):553–568.
Li, T., Xie, X., Wang, J., Guo, Q., Liu, A., Ma, L., and Liu, Y. (2023). Faire: Repairing fairness of neural networks via neuron condition synthesis. ACM Transactions on Software Engineering and Methodology, 33(1):1–24.
Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., Van Der Laak, J. A., Van Ginneken, B., and Sánchez, C. I. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42:60–88.
Ma, L., Juefei-Xu, F., Zhang, F., Sun, J., Xue, M., Li, B., Chen, C., Su, T., Li, L., Liu, Y., et al. (2018). DeepGauge: Multi-granularity testing criteria for deep learning systems. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 120–131.
Mohammadi, K., Sivaraman, A., and Farnadi, G. (2023). FETA: Fairness enforced verifying, training, and predicting algorithms for neural networks. In Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, pages 1–11.
Pei, K., Cao, Y., Yang, J., and Jana, S. (2017). DeepXplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles, pages 1–18.
Ruoss, A., Balunovic, M., Fischer, M., and Vechev, M. (2020). Learning certified individually fair representations. Advances in Neural Information Processing Systems, 33:7584–7596.
Saleiro, P., Kuester, B., Hinkson, L., London, J., Stevens, A., Anisfeld, A., Rodolfa, K. T., and Ghani, R. (2018). Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577.
Sharma, A. and Wehrheim, H. (2020). Automatic fairness testing of machine learning models. In Testing Software and Systems: 32nd IFIP WG 6.1 International Conference, ICTSS 2020, Naples, Italy, December 9–11, 2020, Proceedings 32, pages 255–271. Springer.
Singh, G., Gehr, T., Püschel, M., and Vechev, M. (2019). An abstract domain for certifying neural networks. Proceedings of the ACM on Programming Languages, 3(POPL):1–30.
Sun, Y., Wu, M., Ruan, W., Huang, X., Kwiatkowska, M., and Kroening, D. (2018). Concolic testing for deep neural networks. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 109–119.
Udeshi, S., Arora, P., and Chattopadhyay, S. (2018). Automated directed fairness testing. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pages 98–108.
Urban, C., Christakis, M., Wüstholz, V., and Zhang, F. (2020). Perfectly parallel fairness certification of neural networks. Proceedings of the ACM on Programming Languages, 4(OOPSLA):1–30.
Wang, S., Pei, K., Whitehouse, J., Yang, J., and Jana, S. (2018). Formal security analysis of neural networks using symbolic intervals. In 27th USENIX Security Symposium (USENIX Security 18), pages 1599–1614.
Yurochkin, M., Bower, A., and Sun, Y. (2019). Training individually fair ML models with sensitive subspace robustness. arXiv preprint arXiv:1907.00020.
Zhang, J. M. and Harman, M. (2021). “Ignorance and prejudice” in software fairness. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pages 1436–1447. IEEE.
Zhang, L., Zhang, Y., and Zhang, M. (2021). Efficient white-box fairness testing through gradient search. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 103–114.
Zhang, P., Wang, J., Sun, J., Dong, G., Wang, X., Wang, X., Dong, J. S., and Dai, T. (2020). White-box fairness testing through adversarial sampling. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, pages 949–960.
Zheng, H., Chen, Z., Du, T., Zhang, X., Cheng, Y., Ji, S., Wang, J., Yu, Y., and Chen, J. (2022). NeuronFair: Interpretable white-box fairness testing through biased neuron identification. In 44th International Conference on Software Engineering, pages 1–13, New York, NY, USA. ACM.zh_TW