Closed
Support is almost complete. One issue with the pre-tokenizer is still open: #7036
A useful related discussion is here: #7144
Everything below is outdated
Creating this issue for more visibility
The main problem is around tokenization support, since the models use some variation of the BPE pre-processing regex. There are also some issues with the conversion scripts.
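To illustrate why the regex matters, here is a rough sketch of how two variations of a BPE pre-processing regex split the same text differently. Both patterns are simplified, ASCII-oriented approximations written for this example with Python's standard `re` module. They are not the exact patterns of any particular model, and the names `GPT2_LIKE`, `DIGIT_GROUPED` and `pre_tokenize` are hypothetical.

```python
import re

# Assumption: simplified GPT-2-style pattern. [^\W\d_] stands in for
# "letter" and \d for "digit"; real patterns use Unicode properties.
GPT2_LIKE = re.compile(
    r"'s|'t|'re|'ve|'m|'ll|'d"   # common English contractions
    r"| ?[^\W\d_]+"              # optional space + run of letters
    r"| ?\d+"                    # optional space + run of digits
    r"| ?(?:[^\s\w]|_)+"         # optional space + run of punctuation
    r"|\s+(?!\S)"                # whitespace not followed by non-space
    r"|\s+"                      # any remaining whitespace
)

# Assumption: a hypothetical variation that splits digits into groups
# of at most three and does not attach a leading space to them.
DIGIT_GROUPED = re.compile(
    r"'s|'t|'re|'ve|'m|'ll|'d"
    r"| ?[^\W\d_]+"
    r"|\d{1,3}"
    r"| ?(?:[^\s\w]|_)+"
    r"|\s+(?!\S)"
    r"|\s+"
)


def pre_tokenize(pattern: re.Pattern, text: str) -> list[str]:
    """Split text into the chunks that BPE merges would then run on."""
    return pattern.findall(text)


text = "Hello world 12345!"
print(pre_tokenize(GPT2_LIKE, text))
# ['Hello', ' world', ' 12345', '!']
print(pre_tokenize(DIGIT_GROUPED, text))
# ['Hello', ' world', ' ', '123', '45', '!']
```

Since BPE merges are applied within each chunk, a different split can lead to different token ids for the same input, which is why the pre-tokenizer has to match the one the model was trained with.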
Anyway, looking for contributions to help with this.
Previous unfinished work:
Possible implementation plan: #5464 (comment)