
feat: abstract away tokenizer as a json file #12

Merged
ayeganov merged 15 commits into main from feat/tokenizer_saving on Sep 5, 2025

@ayeganov ayeganov commented Sep 3, 2025

This PR stops pickling tokenizers and saves them as JSON files instead. It also adds optional support for HF tokenizers.
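
For readers unfamiliar with the pattern, here is a minimal sketch of what replacing a pickled tokenizer with a JSON file can look like. It is not taken from this PR's diff: the `CharTokenizer` class, the `"type"` and `"vocab"` keys, and the file layout are all hypothetical and only illustrate the idea of storing plain data rather than a pickled Python object.

```python
import json
from pathlib import Path


class CharTokenizer:
    """Hypothetical character-level tokenizer, used only for illustration."""

    def __init__(self, vocab):
        # vocab maps each character to an integer id
        self.stoi = dict(vocab)
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @classmethod
    def from_text(cls, text):
        # Build a deterministic vocabulary from the sorted set of characters
        return cls({ch: i for i, ch in enumerate(sorted(set(text)))})

    def encode(self, text):
        return [self.stoi[ch] for ch in text]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

    def save(self, path):
        # Only plain data is written, so the file stays readable and
        # loading it does not execute arbitrary code the way unpickling can.
        payload = {"type": "char", "vocab": self.stoi}
        Path(path).write_text(
            json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
        )

    @classmethod
    def load(cls, path):
        payload = json.loads(Path(path).read_text(encoding="utf-8"))
        # The "type" field is an assumed convention for telling formats apart
        if payload.get("type") != "char":
            raise ValueError(f"unsupported tokenizer type: {payload.get('type')!r}")
        return cls(payload["vocab"])
```

A JSON file like this can be inspected and diffed by hand and does not depend on the class definition that produced it. For Hugging Face tokenizers, the `tokenizers` library already provides JSON serialization through `Tokenizer.save(path)` and `Tokenizer.from_file(path)`; whether this PR relies on those calls is not stated in the description.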

@ayeganov ayeganov self-assigned this Sep 3, 2025
@ayeganov ayeganov added the enhancement New feature or request label Sep 3, 2025
@ayeganov ayeganov changed the base branch from feat/train_with_bin_files to main September 5, 2025 14:14
@ayeganov ayeganov merged commit e2a971a into main Sep 5, 2025
3 checks passed
@ayeganov ayeganov deleted the feat/tokenizer_saving branch September 5, 2025 18:02
@ayeganov ayeganov mentioned this pull request Sep 5, 2025


2 participants