Skip to content

TalkToModel gives anyone with the powers of XAI through natural language conversations πŸ’¬!

License

Notifications You must be signed in to change notification settings

dylan-slack/TalkToModel

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

63 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

drawing

TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations

Python application arXiv

Welcome to the TalkToModel paper page! The goal of this project is to enable anyone to understand the predictions of a trained machine learning model through a natural language conversation. Ultimately, it is a platform for conversational XAI!

Β Β Β  drawing

Links

If you found this work useful, please cite us!

@Article{Slack2023,
author={Slack, Dylan
and Krishna, Satyapriya
and Lakkaraju, Himabindu
and Singh, Sameer},
title={Explaining machine learning models with interactive natural language conversations using TalkToModel},
journal={Nature Machine Intelligence},
year={2023},
month={Jul},
day={27},
abstract={Practitioners increasingly use machine learning (ML) models, yet models have become more complex and harder to understand. To understand complex models, researchers have proposed techniques to explain model predictions. However, practitioners struggle to use explainability methods because they do not know which explanation to choose and how to interpret the explanation. Here we address the challenge of using explainability methods by proposing TalkToModel: an interactive dialogue system that explains ML models through natural language conversations. TalkToModel consists of three components: an adaptive dialogue engine that interprets natural language and generates meaningful responses; an execution component that constructs the explanations used in the conversation; and a conversational interface. In real-world evaluations, 73{\%} of healthcare workers agreed they would use TalkToModel over existing systems for understanding a disease prediction model, and 85{\%} of ML professionals agreed TalkToModel was easier to use, demonstrating that TalkToModel is highly effective for model explainability.},
issn={2522-5839},
doi={10.1038/s42256-023-00692-8},
url={https://doi.org/10.1038/s42256-023-00692-8}
}

[UPDATE] This work won an honorable mention outstanding paper at the TSRML Workshop at NeurIPS πŸŽ‰

We additionally wrote a precursor paper about domain experts needs for understanding models, that helped inspire this work. It's called Rethinking Explainability as a Dialogue: A Practitioner's Perspective. Check that out as well!

Table of Contents

Overview

Here follows a brief overview of the purpose and scope of the system.

Purpose

As machine learning models are being increasingly integrated in our day-to-day lives, it becomes important that anyone can interact with and understand them. TalkToModel helps realize this goal and enables anyone to chat with a machine learning model to understand the model's predictions.

Please read our paper for more motivation and details about how the system works.

Scope

TalkToModel supports tabular models and datasets. For example, you could use the system to chat with a random forest trained on a loan prediction task but not BERT trained on a sentiment analysis task, in the system's current form.

Installation

To run TalkToModel, you can either setup a conda environment or use Docker to directly run the Flask App.

Note, for GPU inference, the environment requires CUDA 11.3. If you do not have cuda 11.3 and try and run the docker application, it will crash! One way around this is to switch to cpu inference. It will be a bit slower (instructions), but should work.

Docker

If you want to use Docker, you can skip this setup step ⏭️

Conda

Create the environment and install dependencies.

conda create -n ttm python=3.9
conda activate ttm

Install the requirements

pip install -r requirements.txt

Nice work πŸ‘

Running A Pre-Configured TalkToModel Demo

Next, we discuss how to run the Flask Application on one of the datasets and models from the paper. These include a diabetes, crime, and credit prediction task.

Configuration

Here, we talk about choosing a demo + parsing model. If you just want to run the demo on the diabetes dataset, you can skip to Running With Conda or Running With Docker.

Choosing a Demo & Parsing Model

This software is configured using gin-config. Global parameters are stored in ./global_config.gin and demo specific parameters are stored in the ./configs directory, e.g., ./configs/diabetes-config.gin for the diabetes prediction demo.

This repo comes configured using a fine-tuned t5-small model for the diabetes prediction task, but this can be changed by modifying ./global_config.gin:

GlobalArgs.config = "./configs/{demo}-config.gin"

and changing {demo} to one of diabetes, compas, or german respectively. Further, we provide our best fine-tuned t5-small and t5-large parsing models on the huggingface hub. These can be selected by modifying ./configs/{demo}-config.gin:

# For the t5 small model
ExplainBot.parsing_model_name = "ucinlp/{demo}-t5-small"
# For the t5 large model
ExplainBot.parsing_model_name = "ucinlp/{demo}-t5-large"

and the respective parsing model will be downloaded from the hub automatically.

GPU Availability

By default, the system will try to push the models to a cuda device and will crash if one isn't available. To switch to CPU inference or to use a different cuda device, modify ./parsing/t5/gin_configs/t5-large.gin

load_t5_params.device = "{device}"

where {device} is the device you want (e.g., cpu).

Running With Conda

If you installed the conda environment, to launch the Flask web app you can run

python flask_app.py

Running With Docker

If you want to run with Docker, you can build the docker app

sudo docker build -t ttm .

And then run the image

sudo docker run -d -p 4000:4000 ttm

Use It!

It might take a minute or two to build the application. The reason is that we cache a fair number of computations (mostly explanations) beforehand to improve the realtime user experience. For reference, on my M1 macbook pro, building from scratch takes about 5 minutes. However, after your first time running a demo, these computations will be stored in the ./cache folder, and it is not necessary to compute them again, so startup should be very quick.

That's it! The app should be running πŸ’₯

Experiments

Here, we discuss running experiments from the paper. For these experiments, make sure to set

ExplainBot.skip_prompts = False

in each of the diabetes, german, and compas gin config files, so that the system generates the prompts for each dataset. We set this to True for the fine-tuned demo parsing models, because it is unnecessary in this case and speeds up the startup time significantly.

Fine-tuning Models

To fine-tune a parsing model, run

python parsing/t5/start_fine_tuning.py --gin parsing/t5/gin_configs/t5-{small, base, large}.gin --dataset {diabetes, german, compas}

where {small, base, large} and {diabetes, german, compas} are one of the values in the set. Note, these experiments require use Weights & Biases to track training and the best validation model.

For simplicity, we also provide all the best validation pre-trained models for download on huggingface: https://huggingface.co/dslack/all-finetuned-ttm-models. You can download these models from the provided zip file, and unzip the models to ./parsing/t5/models.

Evaluating Gold Parsing Accuracy

With all the models downloaded, you can compute the parsing accuracies by running

python experiments/generate_parsing_results.py

The results will be deposited in the ./experiments/results_store directory.

Running on Your Own Model & Dataset

Please see the tutorial ./tutorials/running-on-your-own-model.ipynb for a step-by-step walk-through about how to do this, and the different options you have for setting up the conversation.

Development

TalkToModel can be extended to include new functionality pretty easily. Please see ./tutorials/extending-ttm.md for a walk-through on how to extend the system.

Testing

You can run the tests by running pytest from the base directory.

Citation

Cite us 🫢

@Article{Slack2023,
author={Slack, Dylan
and Krishna, Satyapriya
and Lakkaraju, Himabindu
and Singh, Sameer},
title={Explaining machine learning models with interactive natural language conversations using TalkToModel},
journal={Nature Machine Intelligence},
year={2023},
month={Jul},
day={27},
abstract={Practitioners increasingly use machine learning (ML) models, yet models have become more complex and harder to understand. To understand complex models, researchers have proposed techniques to explain model predictions. However, practitioners struggle to use explainability methods because they do not know which explanation to choose and how to interpret the explanation. Here we address the challenge of using explainability methods by proposing TalkToModel: an interactive dialogue system that explains ML models through natural language conversations. TalkToModel consists of three components: an adaptive dialogue engine that interprets natural language and generates meaningful responses; an execution component that constructs the explanations used in the conversation; and a conversational interface. In real-world evaluations, 73{\%} of healthcare workers agreed they would use TalkToModel over existing systems for understanding a disease prediction model, and 85{\%} of ML professionals agreed TalkToModel was easier to use, demonstrating that TalkToModel is highly effective for model explainability.},
issn={2522-5839},
doi={10.1038/s42256-023-00692-8},
url={https://doi.org/10.1038/s42256-023-00692-8}
}

Contact

You can reach out to dslack@uci.edu with any questions or issues you're running into.

Please do reach out or submit an issue, because I would love to help you get it running, especially on your own models and data ❀️!