Migrate Tokenizer Components to utilize pytorch-labs/tokenizers #1440

Closed
@Jack-Khuu

Description

🚀 The feature, motivation and pitch

@larryliu0820 has created a new shared repository for hosting tokenizer definitions.

The initial migration attempt was reverted in #1414 because of a tokenizer issue flagged in #1413, but it should be straightforward to debug and reland.

Task: Reattempt this migration, taking inspiration from #1401.

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

Metadata

Assignees

No one assigned

    Labels

    actionable: Items in the backlog waiting for an appropriate impl/fix
    enhancement: New feature or request
    good first issue: Good for newcomers
    triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
