add arg padding_free to DataCollatorForCompletionOnlyLM #1887

Conversation
cc @kashif this is linked to huggingface/transformers#31629 which will be in the next transformers release!

thanks @ArthurZucker

Gentle ping and info, this post covers the support in transformers: https://huggingface.co/blog/packing-with-FA2
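A minimal sketch of the padding-free idea discussed in that post (this is an illustration, not the actual TRL or transformers implementation): instead of padding every example to a common length and carrying an attention_mask, the examples are concatenated into one long sequence and position_ids restart at each example boundary, which FlashAttention-2 can use to keep the packed examples from attending to one another.

```python
# Illustrative sketch only: pack examples without padding by concatenating
# input_ids and restarting position_ids at every example boundary.
def collate_padding_free(examples):
    """examples: list of dicts, each with an "input_ids" list."""
    input_ids, position_ids = [], []
    for ex in examples:
        ids = ex["input_ids"]
        input_ids.extend(ids)
        # positions restart at 0 for every packed example
        position_ids.extend(range(len(ids)))
    # no attention_mask key: the position_ids boundaries replace it
    return {"input_ids": input_ids, "position_ids": position_ids}


batch = collate_padding_free([
    {"input_ids": [5, 6, 7]},
    {"input_ids": [8, 9]},
])
print(batch["position_ids"])  # [0, 1, 2, 0, 1]
```

The restart to 0 in position_ids is what marks the start of each packed example, so the batch needs no padding tokens at all.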
Thanks for adding this neat extension to the collator! It LGTM, barring the unit tests, which should use the assert methods from unittest.

    batch = collator(tokenized_instruction)
    batch_paddingfree = collator_paddingfree(tokenized_instruction)

    assert "attention_mask" not in batch_paddingfree
Can you please replace these assert statements with assert methods from unittest, e.g. self.assertTrue() for this case?
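The requested style looks like this; the batch_paddingfree dict below is a dummy stand-in for the real collator output, used only to make the example self-contained:

```python
import unittest


class TestCollatorPaddingFree(unittest.TestCase):
    def test_no_attention_mask(self):
        # Dummy stand-in for the real collator output: a padding-free
        # batch carries position_ids instead of an attention_mask.
        batch_paddingfree = {"input_ids": [1, 2, 3], "position_ids": [0, 1, 2]}
        # unittest assert methods report what was compared on failure,
        # unlike a bare `assert` statement.
        self.assertNotIn("attention_mask", batch_paddingfree)


if __name__ == "__main__":
    unittest.main()
```

On failure, self.assertNotIn prints the offending key and container, which is why reviewers prefer it over bare asserts in unittest-based suites.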
Suggested change:

    - assert "attention_mask" not in batch_paddingfree
    + self.assertNotIn("attention_mask", batch_paddingfree)
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
add argument DataCollatorForCompletionOnlyLM(padding_free=True) to utilize huggingface/transformers#31629