deepspeed-chat: add end-of-text special token #775

Merged
tjruwase merged 2 commits into microsoft:master from 11_add_eot_special_token on Oct 17, 2023


mosheisland
Contributor

Stages 1 and 2 append the '<|endoftext|>' text marker to every sample. However, some tokenizers (e.g. OPT, Bloom) encode this marker as a sequence of subword tokens rather than as a single special token.

This commit adds optional support for registering the EOT marker as a special token, which forces the tokenizer to encode it as a single token.

Note that using an EOT special token may change the dynamics of stage 3 training. Therefore, to remain backward compatible, this commit makes it optional.
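
To illustrate the behavior described above, here is a minimal, self-contained sketch. It does not use the real OPT or Bloom tokenizers: `ToyTokenizer`, its vocabulary, and `add_special_token` are hypothetical stand-ins invented for this example. It only shows why a marker that is not registered as a special token gets split into subword pieces, and how registering it yields a single token id.

```python
import re

EOT = "<|endoftext|>"


class ToyTokenizer:
    """Toy greedy longest-match tokenizer (hypothetical, for illustration only)."""

    def __init__(self, vocab):
        self.vocab = {piece: idx for idx, piece in enumerate(vocab)}
        self.special_tokens = []

    def add_special_token(self, token):
        """Register `token` as atomic. Returns the number of new vocab entries."""
        if token in self.special_tokens:
            return 0
        self.special_tokens.append(token)
        if token not in self.vocab:
            self.vocab[token] = len(self.vocab)
            return 1
        return 0

    def _encode_plain(self, text):
        # Greedy longest-match over the subword vocabulary.
        ids, pos = [], 0
        while pos < len(text):
            for end in range(len(text), pos, -1):
                piece = text[pos:end]
                if piece in self.vocab:
                    ids.append(self.vocab[piece])
                    pos = end
                    break
            else:
                raise ValueError(f"cannot encode text at position {pos}")
        return ids

    def encode(self, text):
        if not self.special_tokens:
            return self._encode_plain(text)
        # Split on special tokens first so they are never broken into subwords.
        pattern = "(" + "|".join(re.escape(t) for t in self.special_tokens) + ")"
        ids = []
        for chunk in re.split(pattern, text):
            if not chunk:
                continue
            if chunk in self.special_tokens:
                ids.append(self.vocab[chunk])
            else:
                ids.extend(self._encode_plain(chunk))
        return ids


# Assumed toy vocabulary: it contains the pieces of the marker, but not the marker itself.
tok = ToyTokenizer(["Hello", " ", "world", "<", "|", "end", "of", "text", ">"])
sample = "Hello world" + EOT

before = tok.encode(sample)        # marker is split into several subword ids
added = tok.add_special_token(EOT)
after = tok.encode(sample)         # marker is now a single id

print(len(before), len(after), added)
```

With Hugging Face tokenizers, the analogous step is usually `tokenizer.add_special_tokens({"additional_special_tokens": ["<|endoftext|>"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, since the new token needs an embedding row. Whether this commit uses exactly those calls is an assumption here. The new embedding is not covered by the pretrained weights, which is one plausible reason the option can affect later training stages.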

Change-Id: If98d348fcaa7d6685e755aabe305e23e7649c367

Signed-off-by: Moshe Island <misland@habana.ai>
Contributor

@lekurile lekurile left a comment

LGTM

@tjruwase tjruwase merged commit e8d879e into microsoft:master Oct 17, 2023
@mosheisland mosheisland deleted the 11_add_eot_special_token branch November 22, 2023 07:52