This is the code release for the paper [The Power of the Noisy Channel: Unsupervised End-to-End Task-Oriented Dialogue with LLMs](https://arxiv.org/abs/2404.15219) by Brendan King and Jeffrey Flanigan.
- We provide a Docker image with a virtual environment at `/root/venv` in which all dependencies are installed. See `./k8s/Dockerfile` for details.
- Alternatively, for a local installation, we use conda:
```bash
# Create environment with Python 3.10
conda create python=3.10 --prefix venv
# Add in torch/cuda and gxx, nvcc
conda install --yes pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c anaconda gxx_linux-64 nvidia::cuda-nvcc
# Install dependencies and the repo itself in editable mode (most dependencies come via setup.cfg).
# NOTE: we found it important to install flash-attention last, with this specific version of ninja.
pip install pyzmq faiss-cpu faiss-gpu
pip install packaging ninja==1.10.2
pip install --user -e .
pip install flash-attn --no-build-isolation
```
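As a quick sanity check of either setup, you can verify that the CUDA build of PyTorch works and that flash-attention imports cleanly (a convenience snippet, not part of the release):

```python
import torch

# Confirm the CUDA build of PyTorch is active.
print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")

# flash-attention is installed last above; make sure it actually imports.
try:
    import flash_attn
    print(f"flash-attn {flash_attn.__version__} imported successfully")
except ImportError as err:
    print(f"flash-attn not importable: {err}")
```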
- We use the MultiWOZ 2.2 dataset, available in its original form here: [link]
- We share our processed version on Huggingface at `Brendan/multiwoz_turns_v22` (see the loading snippet below).
- We release our final model, trained with 2 steps of our EM process, on Huggingface: [link]
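Both artifacts load with the standard Hugging Face APIs. A quick sketch: the model repo id below is a placeholder for the actual link above, and the `train` split name is an assumption about the processed dataset.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load our processed MultiWOZ 2.2 turns dataset from the Hugging Face Hub.
dataset = load_dataset("Brendan/multiwoz_turns_v22")
print(dataset)              # lists the available splits and their sizes
print(dataset["train"][0])  # inspect one example turn ("train" split assumed)

# Load the released final model. NOTE: "Brendan/<final-model>" is a
# placeholder; substitute the actual Hugging Face repo id linked above.
model_id = "Brendan/<final-model>"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```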
Below are the steps for reproducing our experiments, along with the outputs of each step.
Many experiments depend on Weights & Biases for artifact storage; apologies for any inconvenience. You should be able to set the entity these artifacts are logged to with the environment variable `WANDB_ENTITY` and/or a function argument.
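For example, to direct artifact logging to your own entity (the entity name below is a placeholder):

```python
import os

# Set the W&B entity before any code imports wandb; the value is a placeholder.
os.environ["WANDB_ENTITY"] = "your-wandb-entity"
```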
In this step, we create an initial self-labelling of the MultiWOZ dataset using `bigcode/starcoder` (15B), following the procedure described in Sections 4.1-4.4 of the paper.
Inputs:
- The unlabelled MultiWOZ corpus (train split), partitioned into 50 dialogue chunks [link]
- StarCoder 15B [link]
Outputs:
- A self-labelled MultiWOZ dataset. Here is an example: [link]
Further Details & Reproduction Steps: runs/offline_labelling_experiment/initial_labelling/README.md
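As a rough sketch of the partitioning above, assuming chunks of 50 dialogues each and a `dialogue_id` column in our processed dataset (the actual scripts live under `runs/offline_labelling_experiment/`):

```python
from datasets import load_dataset

# Partition the unlabelled train split into chunks of 50 dialogues each so
# the labelling jobs can run in parallel. The column name "dialogue_id" is
# an assumption about the processed dataset's schema.
dataset = load_dataset("Brendan/multiwoz_turns_v22", split="train")
dialogue_ids = sorted(set(dataset["dialogue_id"]))
chunks = [dialogue_ids[i:i + 50] for i in range(0, len(dialogue_ids), 50)]
print(f"{len(dialogue_ids)} dialogues -> {len(chunks)} chunks")
```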
In this step, we fine-tune a pre-trained base model on the self-labelled corpus to obtain a Dialogue State Tracker and Dialogue Act Tagger.
Inputs:
- A self-labelled corpus
- A pre-trained base model (we use StarCoder 3B)
Outputs:
- A fine-tuned Dialogue State Tracker and Dialogue Act Tagger, which can be used to re-label the corpus
Further Details & Reproduction Steps: runs/finetune_multitask/starcoder_3b/offline_label/README.md
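The fine-tuning itself is configured in the README linked above; the sketch below only illustrates the shape of this step. The model id, serialization format, and hyperparameters are assumptions, not our exact configuration.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Illustrative sketch: fine-tune a StarCoder 3B base model as a causal LM on
# serialized self-labelled examples. "bigcode/starcoderbase-3b" is an assumed
# HF id, and the example text format below is hypothetical.
model_id = "bigcode/starcoderbase-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # StarCoder has no pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

examples = [  # stand-ins for the self-labelled corpus
    "User: I need a cheap hotel.\nState: hotel-pricerange=cheap",
]
train = Dataset.from_dict({"text": examples}).map(
    lambda ex: tokenizer(ex["text"], truncation=True), remove_columns=["text"]
)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="outputs/dst_act_tagger",
                           per_device_train_batch_size=1, num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```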
In this step, we re-label the MultiWOZ corpus with the Dialogue State Tracker and Dialogue Act Tagger fine-tuned in the previous step.
Inputs:
- The unlabelled MultiWOZ corpus (train split), partitioned into 50 dialogue chunks [link]
- A StarCoder 3B model fine-tuned as a Dialogue State Tracker and Dialogue Act Tagger
Outputs:
- An improved self-labelled MultiWOZ dataset. Here is an example: [link]
Further Details & Reproduction Steps: runs/offline_labelling_experiment/second_labelling/README.md
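Re-labelling amounts to greedy decoding from the fine-tuned model; a minimal sketch, where the checkpoint path and prompt format are hypothetical stand-ins (the real prompts are in the linked README):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of re-labelling one turn with the fine-tuned tracker/tagger.
ckpt = "outputs/dst_act_tagger"  # hypothetical checkpoint path
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, device_map="auto")

prompt = "User: I need a cheap hotel in the north.\nState:"  # hypothetical format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```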
In this step, we fine-tune a pre-trained base model on the re-labelled corpus to produce an end-to-end dialogue agent.
Inputs:
- A self-labelled corpus
- A pre-trained base model (we use StarCoder 3B)
Outputs:
- A model which can be used as an end-to-end dialogue agent
Further Details & Reproduction Steps: runs/finetune_multitask/starcoder_3b/online_e2e/README.md
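Once fine-tuned, the agent can be queried for a system response given the dialogue history. A minimal sketch, with a hypothetical checkpoint path and prompt format:

```python
from transformers import pipeline

# Sketch of querying the end-to-end agent for a system response.
agent = pipeline("text-generation", model="outputs/e2e_agent",  # hypothetical path
                 device_map="auto")
history = "User: I need a cheap hotel in the north.\nSystem:"  # hypothetical format
result = agent(history, max_new_tokens=64, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])
```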
In this step, we evaluate the end-to-end dialogue agent on the MultiWOZ test set.
Inputs:
- A model which can be used as an end-to-end dialogue agent
- The MultiWOZ corpus (test split)
Outputs:
- Predictions and evaluation scores
Further Details & Reproduction Steps: runs/online_e2e_experiment/test_set/README.md
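Evaluation details are in the linked README. For intuition, joint goal accuracy, the standard MultiWOZ state-tracking metric, is computed roughly as below; the state representation is an assumption, and our full evaluation covers more than this one metric:

```python
# Sketch: joint goal accuracy over (predicted, gold) belief-state pairs,
# where each state is a dict like {"hotel-pricerange": "cheap"}. A turn
# counts as correct only if the full predicted state matches the gold state.
def joint_goal_accuracy(predicted_states, gold_states):
    correct = sum(p == g for p, g in zip(predicted_states, gold_states))
    return correct / len(gold_states)

preds = [{"hotel-pricerange": "cheap"},
         {"hotel-pricerange": "cheap", "hotel-area": "north"}]
golds = [{"hotel-pricerange": "cheap"},
         {"hotel-area": "north"}]
print(joint_goal_accuracy(preds, golds))  # 0.5
```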
If you find this work useful, please cite:

```bibtex
@misc{king2024power,
      title={The Power of the Noisy Channel: Unsupervised End-to-End Task-Oriented Dialogue with LLMs},
      author={Brendan King and Jeffrey Flanigan},
      year={2024},
      eprint={2404.15219},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```