Query regarding Code switch Wav2Vec2 #37

Open
@Anuj-Mishraa

Description

I have read your paper, "Code Switched and Code Mixed Speech Recognition for Indic languages". It is a very interesting paper, and I am trying to run Wav2Vec2 on Hindi-English code-switched data. I have a few questions:

  1. If we fine-tune on code-switched data, which tokenizer and processor should we use?
  2. Which pretrained model should we start from (English or Hindi)?
  3. How will the model identify code-switched segments in order to generate an accurate transcription?
  4. In what form will the final predicted output appear: in Hindi (when fine-tuning Hindi-4200), in English (when fine-tuning XLSR-53), or in the same code-switched Hindi-English form (with either of the above models)?
