Please use this identifier to cite or link to this item: https://ah.nccu.edu.tw/handle/140.119/131902


Title: 自然語言推理之後設可解釋性建模
Modeling Meta-Explainability of Natural Language Inference
Authors: 蔡鎮宇
Tsai, Chen-Yu
Contributors: 黃瀚萱
陳宜秀

Huang, Hen-Hsen
Chen, Yi-Hsiu

蔡鎮宇
Tsai, Chen-Yu
Keywords: natural language processing (NLP)
natural language understanding (NLU)
natural language inference (NLI)
explainability
interpretability
explainable AI
attention mechanism
trust evaluation
Date: 2020
Issue Date: 2020-09-02 13:08:22 (UTC+8)
Abstract: The explainability of artificial intelligence (AI) models has recently attracted considerable interest from researchers. Explainability gives developers, stakeholders, and end users a better understanding of how a model works, which in turn supports the design of more complete systems and of applications that better fit users' needs. The attention mechanism, as an intrinsic explanation method, is considered suitable for providing explanations that are both faithful and plausible. Most research on the explainability of attention, however, visualizes attention weights to explain model decisions, whereas in everyday interaction explanations are usually expressed in natural language. Furthermore, evaluations of explainability rarely involve the end users, that is, humans; where they do, the evaluations are seldom thorough enough to offer useful insights for developing and deploying AI applications.

This research uses the attention mechanism of Transformer-based models on the natural language inference task to present explanations in natural language. It also examines how supplying knowledge from different tasks affects the explanations produced by this method. Finally, it evaluates human trust in, and preference for, the explanations provided by different models. The experiments show that, on the natural language inference task, model performance is indeed related to the span that attention focuses on; that after task knowledge with different characteristics is added, the model's explanations faithfully reflect the characteristics of the tasks it was trained on; and that, although people's preferences among explanation styles vary, long, information-rich explanations have an advantage over short, precise ones in gaining human trust.
Description: Master's degree
National Chengchi University
Master's Program in Digital Content
107462009
Source URI: http://thesis.lib.nccu.edu.tw/record/#G0107462009
Data Type: thesis
Appears in Collections: [Master's Program in Digital Content] Theses

Files in This Item:

File: 200901.pdf
Size: 3942Kb
Format: Adobe PDF


All items in 學術集成 are protected by copyright, with all rights reserved.

