This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Open source TOD Models #4438

Merged
merged 208 commits into from
Mar 24, 2022

Conversation

moyapchen
Contributor

Thought I'd put this up ages ago but maybe not? (Or it got pruned lol)

Moya Chen added 30 commits November 15, 2021 20:15
See documentation block in `tod_agents.py`
As noted in the README, this agent takes data generated from `tod_world_script.py` and dumps it out to a teacher.

(Note that I tried setting up a regression test for this teacher, but I ran into issues getting it to save to an output directory whose path did not include my local homedir name.)
See documentation block in `tod_agents.py`

(I'm not 100% sure if `conftest.py` is the right file to change, though I did notice that `pytest.ini` was necessary to get pytest to run.)
See documentation in `tod_world_script.py` for usage.
Refactor Google SGD away from old format into TOD Conversations format.

Datasets added in this substack:
* *Google SGD*
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Code for processing Google SGD into In-domain and Out-domain data via `build.py`, for use via agents.

Datasets added in this substack:
* Google SGD
   * **Google SGD Simulation Splits (In-domain, Out-domain)**
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
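
For readers unfamiliar with the idea of in-domain vs. out-domain splits, here is a minimal sketch. It is not the logic in `build.py`; the commit message does not say how the split is defined. The sketch assumes an episode counts as out-domain if it touches any held-out domain, and the `domains` field, the `split_by_domain` helper, and the example domain names are all hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical episode shape: a dict with an "id" and a list of "domains".
Episode = Dict[str, Any]


def split_by_domain(
    episodes: List[Episode], held_out: Set[str]
) -> Tuple[List[Episode], List[Episode]]:
    """Partition episodes into (in_domain, out_domain).

    Assumption: an episode is out-domain if ANY of its domains is held out.
    """
    in_domain: List[Episode] = []
    out_domain: List[Episode] = []
    for episode in episodes:
        if set(episode["domains"]) & held_out:
            out_domain.append(episode)
        else:
            in_domain.append(episode)
    return in_domain, out_domain


# Toy data, made up for illustration only.
episodes = [
    {"id": "d1", "domains": ["Restaurants"]},
    {"id": "d2", "domains": ["Hotels", "Flights"]},
    {"id": "d3", "domains": ["Restaurants", "Flights"]},
]
in_domain, out_domain = split_by_domain(episodes, held_out={"Flights"})
in_ids = [e["id"] for e in in_domain]
out_ids = [e["id"] for e in out_domain]
```

With this rule a multi-domain dialogue goes to the out-domain side as soon as one of its domains is held out. A stricter rule (all domains held out) would be an equally plausible reading.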
Code for processing MetalWoz into System + User Simulator teachers.

Getting it to be in the Conversations format is a pain, so I don't even try here. (It's documented this way in the paper as well.)

----------------------------
Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* **MetalWoz**
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
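
As a rough illustration of what "System + User Simulator teachers" means, the sketch below turns one dialogue into two sets of (input, label) pairs: one where the system's turn is the label, one where the user's turn is the label. This is a simplified stand-in, not the teacher code from this PR. The `Turn` tuple, the `teacher_examples` helper, and the sample dialogue are hypothetical.

```python
from typing import List, Tuple

# Hypothetical turn shape: (speaker, text), where speaker is "user" or "system".
Turn = Tuple[str, str]


def teacher_examples(turns: List[Turn], target_speaker: str) -> List[Tuple[str, str]]:
    """Build (input_text, label_text) pairs where the label is spoken by target_speaker.

    Each pair uses the immediately preceding turn from the other speaker as input.
    """
    examples: List[Tuple[str, str]] = []
    for previous, current in zip(turns, turns[1:]):
        if current[0] == target_speaker and previous[0] != target_speaker:
            examples.append((previous[1], current[1]))
    return examples


# Toy dialogue, made up for illustration only.
dialogue: List[Turn] = [
    ("user", "I need a table for two."),
    ("system", "What time?"),
    ("user", "7pm."),
    ("system", "Booked."),
]

system_view = teacher_examples(dialogue, "system")  # model learns to play the system
user_sim_view = teacher_examples(dialogue, "user")  # model learns to play the user
```

The same dialogue serves both views, so a dataset only needs to be parsed once to feed either kind of teacher.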
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* **MSR_E2E**
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* **Multidogo**
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

There are so many versions of MultiWoz, but this one is closest to our simulator.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* **MultiWoz V2.2**
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

Non-FB contributors made edits to the original version of Taskmaster, so keep those teachers around too.

---------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* **Taskmaster**
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

There's a legacy implementation of Taskmaster2 here, but no one seems to be using it, so this clobbers it.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* **Taskmaster2**
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include System + UserSimulator Teachers here since that's all we need right now from dataset.

Sooo many datasets.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* **Taskmaster3 (TicketTalk)**

Test plan:
Regression test, `parlai dd` of dataset
Base automatically changed from simpler_tod_6_project to main March 24, 2022 04:48
@moyapchen moyapchen merged commit d567eb8 into main Mar 24, 2022
@moyapchen moyapchen deleted the simpler_tod_7_open_source_models branch March 24, 2022 16:28