Compile-time language inclusion #63

Closed
gudvinr opened this issue Apr 7, 2024 · 1 comment

gudvinr commented Apr 7, 2024

If the generated data is split into per-language files, it becomes possible to strip down the set of bundled languages immensely.

Tag handling

Each file can contain a //go:build directive that controls whether its language is included.

By default, every generated file can carry //go:build !lingua_ignore, which means "include this file unless built with -tags lingua_ignore". That matches the current behaviour.

Then, a file with the build constraint //go:build (!lingua_ignore && !lingua_no<language>) || lingua_<language> is built when either -tags lingua_<language> is specified, or neither -tags lingua_ignore nor -tags lingua_no<language> is specified.

Thus, if you want all languages to be included, you simply do nothing. When you want to reduce the language set to the minimum, you use build tags like -tags lingua_ignore,lingua_en,lingua_es and so on.

If you only want to exclude a few languages, you add -tags lingua_noge without adding lingua_ignore.
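
The tag combinations above can be checked mechanically. The following is a minimal sketch that evaluates the proposed constraint with the standard go/build/constraint package. The language suffix "en" and the helper name included are assumptions made for illustration, not anything that exists in the library.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// enConstraint is the proposed constraint with <language> replaced by the
// hypothetical suffix "en".
const enConstraint = "//go:build (!lingua_ignore && !lingua_noen) || lingua_en"

// included reports whether a file carrying enConstraint would be compiled
// when exactly the given build tags are set.
func included(tags ...string) bool {
	expr, err := constraint.Parse(enConstraint)
	if err != nil {
		panic(err)
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] })
}

func main() {
	fmt.Println(included())                           // true: default build keeps every language
	fmt.Println(included("lingua_ignore"))            // false: everything is dropped
	fmt.Println(included("lingua_ignore", "lingua_en")) // true: opt back in to one language
	fmt.Println(included("lingua_noen"))              // false: opt out of one language
}
```

Note that the explicit lingua_<language> tag wins over both lingua_ignore and lingua_no<language>, because it sits on its own side of the ||.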

Model loading

For now, models are loaded from a single point in detector.go through embed.FS.
Instead, each language-model/<language> directory could contain a .go file that carries the aforementioned build constraints.

This file can also load all *.zip files into its own embed.FS value, which can then be passed to the "main" filesystem in the language-model package.

The language-model package can then implement the fs.SubFS interface.
It could be as simple as a generated file that has a switch/case for all available languages and includes all language-model/* packages.
Or, if you don't want to use code generation, it should be simple enough to add a Register method that the init function of each language-model/<language>/ package can call. It won't be called if the language package is ignored.
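
A rough sketch of the Register variant, collapsed into one file so it can be run on its own. All names here (modelFS, Models, ReadModel, the language "english", the file "model.zip") are hypothetical. An in-memory fstest.MapFS stands in for the per-language embed.FS, since //go:embed *.zip needs real files on disk. In the proposed layout, the init function would live in language-model/<language>/ behind the build constraint.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"sync"
	"testing/fstest"
)

// modelFS is a hypothetical "main" filesystem for the language-model package.
// It routes "<language>/<file>" to whatever filesystem that language registered.
type modelFS struct {
	mu    sync.RWMutex
	langs map[string]fs.FS
}

// Compile-time check that modelFS satisfies fs.SubFS (and therefore fs.FS).
var _ fs.SubFS = (*modelFS)(nil)

// Models is the single registry that per-language packages add themselves to.
var Models = &modelFS{langs: map[string]fs.FS{}}

// Register attaches the model files of one language.
func (m *modelFS) Register(language string, models fs.FS) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.langs[language] = models
}

// Sub implements fs.SubFS: it returns the filesystem of a registered language.
func (m *modelFS) Sub(dir string) (fs.FS, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	sub, ok := m.langs[dir]
	if !ok {
		return nil, &fs.PathError{Op: "sub", Path: dir, Err: fs.ErrNotExist}
	}
	return sub, nil
}

// Open implements fs.FS for paths of the form "<language>/<file>".
// Directory listing of the root is deliberately left out of this sketch.
func (m *modelFS) Open(name string) (fs.File, error) {
	language, file, ok := strings.Cut(name, "/")
	if !ok {
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrNotExist}
	}
	sub, err := m.Sub(language)
	if err != nil {
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrNotExist}
	}
	return sub.Open(file)
}

// ReadModel reads one model file of one language through the registry.
func ReadModel(language, file string) ([]byte, error) {
	return fs.ReadFile(Models, language+"/"+file)
}

// In the proposed layout this init would sit in its own package, guarded by
// the language's //go:build line, and would pass an embed.FS instead of MapFS.
func init() {
	Models.Register("english", fstest.MapFS{
		"model.zip": &fstest.MapFile{Data: []byte("placeholder")},
	})
}

func main() {
	data, err := ReadModel("english", "model.zip")
	fmt.Println(len(data), err)

	// A language whose package was not compiled in never registered itself.
	_, err = ReadModel("klingon", "model.zip")
	fmt.Println(err)
}
```

With this shape, the detector only needs to handle the not-found error for languages that were left out at build time. Something still has to import each per-language package for its init to run, which is where the generated file mentioned above could come in.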

@pemistahl
Owner

Closed in favor of #68.
