Add low accuracy mode #17

Closed
pemistahl opened this issue Nov 5, 2022 · 0 comments

Comments

@pemistahl
Owner

Lingua's high detection accuracy comes at the cost of being noticeably slower than other language detectors. Its large language models also consume a significant amount of memory. These costs may be prohibitive for systems running low on resources.

For users who want to classify mostly long texts or need to save resources, a so-called low accuracy mode will be implemented that loads only a small subset of the language models into memory. The API will be as follows:

lingua.NewLanguageDetectorBuilder().FromAllLanguages().WithLowAccuracyMode().Build()

The downside of this approach is that detection accuracy for short texts of fewer than 120 characters will drop significantly. However, detection accuracy for texts longer than 120 characters will remain mostly unaffected.
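
To make the trade-off concrete, here is a minimal sketch of how a caller might decide, per text, whether low accuracy mode is a reasonable choice. It uses only the Go standard library and does not call Lingua itself. The helper `canUseLowAccuracyMode` and the constant `lowAccuracyThreshold` are hypothetical names invented for this example, and treating the 120-character figure as a hard cutoff measured in Unicode code points is an assumption, not something the library prescribes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lowAccuracyThreshold mirrors the 120-character figure quoted above.
// Using it as a hard cutoff is an assumption made for this sketch.
const lowAccuracyThreshold = 120

// canUseLowAccuracyMode is a hypothetical helper, not part of Lingua.
// It reports whether text is longer than the threshold, counting
// Unicode code points rather than bytes.
func canUseLowAccuracyMode(text string) bool {
	return utf8.RuneCountInString(text) > lowAccuracyThreshold
}

func main() {
	short := "Guten Morgen"

	// Build a 300-character sample by repeating a 30-character phrase.
	long := ""
	for i := 0; i < 10; i++ {
		long += "languages differ in many ways "
	}

	fmt.Println(canUseLowAccuracyMode(short)) // false
	fmt.Println(canUseLowAccuracyMode(long))  // true
}
```

An application that handles both kinds of input could, for example, send texts for which this check returns true to a detector built with `WithLowAccuracyMode()` and keep a default detector for shorter texts, provided holding both detectors in memory is acceptable.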

@pemistahl pemistahl added this to the Lingua 1.2.0 milestone Nov 5, 2022
pemistahl added a commit that referenced this issue Nov 14, 2022