Dear all,
I am new to NLP and have some possibly naive questions; I will try to explain them clearly.
My goal is to fine-tune the t5-base model on a specific corpus with a causal language modeling (CLM) objective. I found this document, which uses AutoModelForCausalLM, but that class simply does not cover the T5 family of models.
So my question is:
How should I fine-tune a T5 model with the CLM objective? In my understanding, CLM is the process of predicting token_2 from token_1, then token_3 from token_1, token_2, and so on until the end of the input sequence, so I am confused about how to set this process up myself.
I tried to split one of my training examples into something like this (ti == token_i, 1 == eos_token):
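The prediction pattern just described can be sketched in plain Python (illustrative only; the token names are placeholders, not real ids):

```python
def clm_targets(tokens):
    """List every next-token prediction a causal LM makes for one sequence:
    token i is predicted from all tokens that come before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

# t2 is predicted from [t1], t3 from [t1, t2], t4 from [t1, t2, t3]
pairs = clm_targets(["t1", "t2", "t3", "t4"])
```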
input_ids                      labels
[t1, 1, 1, 1, 1, 1, ...]       [t1, t2, 1, 1, 1, 1, ...]
[t1, t2, 1, 1, 1, 1, ...]      [t1, t2, t3, 1, 1, 1, ...]
[t1, t2, t3, 1, 1, 1, ...]     [t1, t2, t3, t4, 1, 1, ...]
[t1, t2, t3, t4, 1, 1, ...]    [t1, t2, t3, t4, t5, 1, ...]
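To make the expansion concrete, here is a plain-Python sketch of the per-prefix scheme described above (EOS = 1 follows the "1 == eos_token" convention; the integer ids 10..14 are stand-ins for t1..t5):

```python
EOS = 1  # eos/pad token id, following the "1 == eos_token" convention above

def expand_example(tokens, max_len):
    """Expand one tokenized sequence into per-prefix (input_ids, labels)
    pairs, padding every row out to max_len with the eos token."""
    def pad(seq):
        return seq + [EOS] * (max_len - len(seq))
    return [(pad(tokens[:i]), pad(tokens[:i + 1]))
            for i in range(1, len(tokens))]

# a 5-token sequence expands into 4 training rows
rows = expand_example([10, 11, 12, 13, 14], max_len=8)
```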
The first problem is obvious: the expanded dataset is much larger and requires much more time to fine-tune on. The second problem is that this setup seems strange, and I don't know whether it actually fulfills the CLM objective. This is the only idea I could come up with to solve the problem; does it work?
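For context on the dataset-size concern: in standard causal LM training the per-prefix expansion is usually unnecessary, because a causal attention mask already restricts each position to attending only to earlier tokens, so a single example trains all prefixes in one forward pass; the labels are simply the inputs shifted left by one. A minimal sketch of that label shifting (the -100 ignore index is a common convention for positions with no target, not something from the question above):

```python
IGNORE_INDEX = -100  # common convention for loss positions to skip

def shifted_labels(input_ids):
    """Single-pass CLM labels: each position's target is the next token;
    the final position has nothing to predict, so it is masked out."""
    return input_ids[1:] + [IGNORE_INDEX]

# position 0 targets token 11, position 1 targets 12, ..., last is masked
labels = shifted_labels([10, 11, 12, 13, 14])
```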
Thanks!!