2024
Presentation
Title
Evaluating Local Explanations for Survey Variable Detection in Scientific Publications
Title Supplement
Exploring the utility and faithfulness of the model’s explanations
Paper presented at the Workshop on Scientific Document Understanding co-located with the 38th AAAI Conference on Artificial Intelligence, February 26, 2024, Vancouver, Canada
Abstract
Deep learning methods are increasingly integrated into scientific discovery and document understanding applications, helping scientists access high-quality information. In the Social Sciences, this includes identifying sentences in scientific publications that contain mentions of survey variables belonging to a dataset (Variable Detection). Existing methods for the task rely on (generative) pre-trained transformer models and achieve high performance; however, their lack of interpretability and of user trust is a major concern. This work investigates the capabilities of gpt3.5-turbo (chatGPT) and various BERT models for identifying survey variables in publications and generating an explanation for their decisions. Regarding performance, we find that the fine-tuned, supervised BERT-based classifier outperforms the large language model (LLM), with an accuracy of 94.43% (versus 67.50% for chatGPT in a zero-shot setting). We observe that prompting LLMs to provide an explanation along with the prediction increases accuracy. In terms of interpretability, we apply explainability techniques (i.e., LIME and SHAP) post-hoc to BERT, producing local explanations based on feature attributions. These techniques are constrained to supervised models and cannot be applied to generative pre-trained LLMs in the same way. Faithfulness metrics show that LIME and SciBERT are best suited to reveal the model's decision-making process. In contrast, natural language explanations (NLE) that justify a model's prediction are generated by chatGPT as self-explanations. We conduct a human evaluation of the generated free-text rationales to assess their quality. In the evaluation setup, three annotators judge explanations for their utility based on the criteria well-formedness, consistency, factual groundedness, and plausibility. Our experiments were run on an open-access dataset, allowing full reproducibility.
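The abstract mentions applying LIME post-hoc to a BERT-based classifier to obtain local, feature-attribution explanations. The following is only a minimal illustrative sketch of that general setup, not the authors' code: the checkpoint name, the binary label set, and the example sentence are assumptions, and the model is loaded without the paper's fine-tuned weights.

```python
# Minimal sketch: post-hoc LIME explanation for a BERT-style sentence classifier.
# Assumptions (not from the paper): SciBERT checkpoint name, binary label names,
# example sentence; a fine-tuned classification head would be used in practice.
import torch
from lime.lime_text import LimeTextExplainer
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "allenai/scibert_scivocab_uncased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def predict_proba(texts):
    """Return class probabilities for a list of sentences (LIME's expected interface)."""
    enc = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=["no-variable", "variable"])
sentence = "Respondents rated their general life satisfaction on an 11-point scale."
explanation = explainer.explain_instance(sentence, predict_proba, num_features=8)
print(explanation.as_list())  # token-level attribution weights for the predicted class
```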
Open Access
Rights
CC BY 4.0: Creative Commons Attribution
Language
English