Slow tokenizer is used by default

First of all, thanks for a useful metric along with the code!

I've used to evaluate a big set of internal documents and it takes a while. Upon a review of your code, I've noticed that by default slow tokenizers (python version) [is used](https://github.com/Tiiiger/bert_score/blob/2e78e66d9bfec047b2c6c08acc23b5b3d3a0dcd0/bert_score/utils.py#L280) (`use_fast=False`).
It would be great either to switch the default to fast tokenizers (written in RUST) or at least parametrize it to avoid hacking lib code to make it work for a massive set of documents (currently it's a bottleneck, especially on GPU-powered machines).

Thanks!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Slow tokenizer is used by default #105

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Slow tokenizer is used by default #105

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions