Add ability to switch output languages for multilingual models

### Context:
Currently, when loading pipelines for multilingual models, we set the target language of the translation output through a runtime CLI option.  
This is emulated in the recently added dummy model:
https://github.com/facebookresearch/SimulEval/blob/main/examples/speech_to_text/counter_in_tgt_lang_agent.py#L22-L24
When the pipeline is used in the demo in its current state, `tgt_lang` must be set in `vad_main.yaml` and is read from there when the model is built.

The problem with passing the target language this way is that every change of target language requires reloading the pipeline/model, which adds a lot of unnecessary overhead.

### To Do:
To avoid the reloading overhead and to showcase the capabilities of multilingual models effectively, we would like to pass the target language to the models dynamically.  
Refactor the current target-language passing mechanism so that the language can change at runtime.

Hint: passing it through the input [segment](https://github.com/facebookresearch/SimulEval/blob/e6e8ea3981546e791911766ce835db6a995b7033/simuleval/data/segments.py#L12) and then through the agent states could work.
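
A minimal sketch of what the hint describes, using stand-in classes rather than SimulEval's real ones. `SketchSegment`, `SketchStates`, and `SketchCounterAgent` are hypothetical names, and the `tgt_lang` field on the segment is the assumed addition; the actual `Segment` and agent state classes in SimulEval may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchSegment:
    """Stand-in for an input segment; `tgt_lang` is the assumed new field."""
    content: str = ""
    finished: bool = False
    tgt_lang: Optional[str] = None


@dataclass
class SketchStates:
    """Stand-in for agent states that remember the latest requested language."""
    source: List[str] = field(default_factory=list)
    source_finished: bool = False
    tgt_lang: Optional[str] = None

    def update_source(self, segment: SketchSegment) -> None:
        self.source.append(segment.content)
        self.source_finished = segment.finished
        # Only overwrite when the segment actually carries a language tag,
        # so untagged segments keep the previously requested language.
        if segment.tgt_lang is not None:
            self.tgt_lang = segment.tgt_lang


class SketchCounterAgent:
    """Toy agent that reports how many segments it has seen, per language."""

    def __init__(self, default_tgt_lang: str = "en") -> None:
        self.default_tgt_lang = default_tgt_lang
        self.states = SketchStates()

    def push(self, segment: SketchSegment) -> None:
        self.states.update_source(segment)

    def pop(self) -> str:
        # The language is read from the states at output time, not at build time.
        lang = self.states.tgt_lang or self.default_tgt_lang
        return f"{len(self.states.source)} in {lang}"
```

With this shape, switching language is just a matter of tagging the next segment differently; the agent object (and, in a real pipeline, the loaded model) is never rebuilt.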


Additionally, we would like to allow specifying multiple target languages (as opposed to the single one shown in the dummy model). This would eventually help us design pipelines that translate into multiple languages simultaneously and that also produce ASR output.
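
One possible way to accept either a single language or several is to normalize whatever arrives into a list before it reaches the agent. The sketch below is only an illustration: `normalize_tgt_langs` and `translate_all` are hypothetical helpers, the comma-separated string format is an assumption, and `translate_fn` stands in for whatever the real model call would be.

```python
from typing import Callable, Dict, Iterable, List, Optional, Union


def normalize_tgt_langs(value: Union[None, str, Iterable[str]]) -> List[str]:
    """Turn None, "es", "es,fr", or ["es", "fr"] into a de-duplicated list."""
    if value is None:
        return []
    if isinstance(value, str):
        parts = value.split(",")  # assumed format: comma-separated tags
    else:
        parts = list(value)
    seen: List[str] = []
    for part in parts:
        tag = part.strip()
        if tag and tag not in seen:  # keep first occurrence, preserve order
            seen.append(tag)
    return seen


def translate_all(
    text: str,
    tgt_langs: Union[None, str, Iterable[str]],
    translate_fn: Callable[[str, str], str],
) -> Dict[str, str]:
    """Call the (hypothetical) per-language translate function once per tag."""
    return {lang: translate_fn(text, lang) for lang in normalize_tgt_langs(tgt_langs)}
```

For example, `translate_all("hello", "es,fr", lambda text, lang: f"[{lang}] {text}")` returns one entry per requested language.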


There are two parts to this issue:
1. Changes in the SimulEval repo to enable passing the target language to the pipeline as part of the instances/segments, and then passing it on to the agents in the pipeline through their states.
2. Changes in the Seamless-Experiences repo to hook the updated SimulEval pipeline into the demo.


### Hints/Pointers:
- For SimulEval side changes:
  - You can use the [example pipeline](https://github.com/facebookresearch/SimulEval/blob/main/examples/speech_to_text_demo/counter_in_tgt_lang_pipeline.py) we added.
  - You can run it locally: 
    - `cd SimulEval/examples/speech_to_text`
    - `simuleval --agent counter_in_tgt_lang_agent.py --user-dir . --agent-class agents.CounterInTargetLanguageAgent --source-segment-size 1000 --source source.txt --target reference/en.txt --output <path to output folder> --tgt-lang es`
  - We suggest stepping through the code with a debugger while the pipeline runs to see how the language tag is set, how the [instances are loaded from the source dataset](https://github.com/facebookresearch/SimulEval/blob/e6e8ea3981546e791911766ce835db6a995b7033/simuleval/evaluator/evaluator.py#L154-L156), and how the [segments are constructed from the instance and pushed through the pipeline](https://github.com/facebookresearch/SimulEval/blob/e6e8ea3981546e791911766ce835db6a995b7033/simuleval/evaluator/evaluator.py#L218-L222).
  - After running the pipeline, you should see an `instances.log` file in the output folder. The pipeline output appears there under "predictions". Try changing `--tgt-lang` and observe how the output changes accordingly.
  - Using this knowledge, we would like you to make changes so that `tgt_lang` can be inferred from the source dataset and passed along, via the input segment and the states, to the agent that ultimately uses it.
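
As a rough illustration of the last point, the sketch below attaches a per-instance language to every segment cut from that instance. Everything here is an assumption made for the example: the function names, the dict-based instance/segment shape, and the idea that languages arrive as a list parallel to the source lines. The real SimulEval dataloader and evaluator are structured differently.

```python
from typing import Dict, Iterator, List, Optional


def load_instances(
    sources: List[str], tgt_langs: Optional[List[str]] = None
) -> List[Dict]:
    """Pair each source line with its (assumed) per-line target language."""
    if tgt_langs is not None and len(tgt_langs) != len(sources):
        raise ValueError("tgt_langs must have one entry per source line")
    return [
        {
            "index": i,
            "source": src,
            "tgt_lang": tgt_langs[i] if tgt_langs is not None else None,
        }
        for i, src in enumerate(sources)
    ]


def iter_segments(instance: Dict, words_per_segment: int = 2) -> Iterator[Dict]:
    """Cut an instance into word chunks, copying its language onto each one."""
    words = instance["source"].split()
    for start in range(0, len(words), words_per_segment):
        chunk = words[start:start + words_per_segment]
        yield {
            "content": " ".join(chunk),
            # The last chunk is the one that reaches the end of the word list.
            "finished": start + words_per_segment >= len(words),
            "tgt_lang": instance["tgt_lang"],
        }
```

Each segment produced this way already carries its language, so downstream agents can read it from their states instead of from a build-time option.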
 
