Translator layer to cut down on vocab size #588
What do you think about adding a translator layer that preprocesses all the training data and translates every document into a smaller vocabulary? For example, the translator would itself be an LLM, but one that only performs the task of translation. It would then be used again during inference, for both input and output.

What do you think about this idea? How impactful could this layer be in reducing vocab size, and how would that affect performance?
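A rough sketch of what that pipeline could look like, purely to make the data flow concrete. Every name here (`translate_to_small_vocab`, `preprocess_corpus`, `main_model.generate`) is hypothetical, not an existing API; the translators are identity stand-ins so the sketch runs:

```python
# Hypothetical sketch of the proposed translator-layer pipeline.
# All components are stand-ins; only the wiring matters.

def translate_to_small_vocab(text: str) -> str:
    # In the proposal this would be a separate small LLM trained
    # only to rewrite text into a restricted vocabulary.
    return text  # identity placeholder so the sketch runs

def translate_from_small_vocab(text: str) -> str:
    # Inverse direction: restore natural phrasing from the small vocab.
    return text  # identity placeholder

def preprocess_corpus(documents: list[str]) -> list[str]:
    # Training-time step: rewrite every document into the small
    # vocabulary, so the main model only ever sees the reduced token set.
    return [translate_to_small_vocab(doc) for doc in documents]

def generate(main_model, user_prompt: str) -> str:
    # Inference-time step: the translator wraps the main model
    # on both the input side and the output side.
    small_prompt = translate_to_small_vocab(user_prompt)
    small_output = main_model.generate(small_prompt)
    return translate_from_small_vocab(small_output)
```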
Replies: 1 comment

@IlyaGazman I was also thinking along similar lines. We could have a strong model trained only on English, and then, for each other language, a language-specific encoder and decoder that translates that language to English and back. That way the strong model supports other languages as well (see the sketch below).

@karpathy I was also curious to explore whether we could create a new language that has a bigger vocab size but is independent of grammar (no rules about which word comes first, or how word order changes with tense). The training process would then focus more on which word to output than on where to place it. I'm not sure how well or badly such a model would perform.
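A minimal sketch of that wrapper, under the same caveat: `detect_language`, `TranslatorPair`, and `core_model.generate` are hypothetical stand-ins for the per-language encoder/decoder idea, not real components:

```python
# Hypothetical sketch: an English-only core model wrapped by
# language-specific translator pairs. All components are stand-ins.

ENGLISH = "en"

def detect_language(text: str) -> str:
    # Stand-in for a real language-identification model.
    return ENGLISH

class TranslatorPair:
    """Encoder/decoder for one language: X -> English and English -> X."""
    def to_english(self, text: str) -> str:
        return text  # identity placeholder
    def from_english(self, text: str) -> str:
        return text  # identity placeholder

# One translator pair per supported language besides English.
TRANSLATORS: dict[str, TranslatorPair] = {
    "fr": TranslatorPair(),
    "de": TranslatorPair(),
}

def answer(core_model, prompt: str) -> str:
    lang = detect_language(prompt)
    if lang == ENGLISH:
        return core_model.generate(prompt)
    pair = TRANSLATORS[lang]
    english_prompt = pair.to_english(prompt)            # encode into English
    english_reply = core_model.generate(english_prompt)  # English-only core
    return pair.from_english(english_reply)              # decode back out
```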