Numerical tokenization seems amiss #233

Open
dmikey opened this issue Aug 17, 2023 · 0 comments
dmikey commented Aug 17, 2023

Groups of digits do not appear to be tokenized as a single token.

For example, 22 is parsed as 2 and 2.

Here's the evidence. I haven't had a chance to look through the tokenizer code yet.

[Screenshots attached: Screenshot_2023-08-17_at_4 07 58_PM, Screenshot_2023-08-17_at_4 13 03_PM, Screenshot 2023-08-17 at 4 14 53 PM]
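
To make the report easier to reason about without the screenshots, here is a toy sketch of the difference between per-digit and grouped-digit splitting. Everything in it is an assumption for illustration: the two regular expressions and the `split_pieces` helper are hypothetical and are not taken from this project's tokenizer, whose code has not been inspected yet.

```python
import re

# Hypothetical pre-tokenization patterns (illustration only, not the
# project's actual rules).
PER_DIGIT = re.compile(r"\d|[^\d\s]+")  # every digit becomes its own piece
GROUPED = re.compile(r"\d+|[^\d\s]+")   # a run of digits stays together


def split_pieces(text: str, pattern: re.Pattern) -> list[str]:
    """Return the non-whitespace pieces that `pattern` finds in `text`."""
    return pattern.findall(text)


print(split_pieces("22", PER_DIGIT))  # ['2', '2']
print(split_pieces("22", GROUPED))    # ['22']
```

The first pattern reproduces the reported symptom (22 becomes 2 and 2), while the second shows the grouping the report expects. Whether the real cause is a splitting rule like this or the tokenizer's vocabulary is not known from the information above.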