If the generated language data is split into per-language units, it becomes possible to strip down the bundled languages immensely.
Tag handling
Each file can contain a //go:build directive that controls which languages are included.
By default, all generated files can include //go:build !lingua_ignore, which means "unless built with -tags lingua_ignore, include this file". That is the same behaviour as now.
Then, a file with the build constraint //go:build (!lingua_ignore && !lingua_no<language>) || lingua_<language> will be built when either -tags lingua_<language> is specified, or neither lingua_ignore nor lingua_no<language> is specified.
Thus, if you want all languages to be included, you simply do nothing, and when you want to reduce the language set to the minimum, you use build tags like -tags lingua_ignore,lingua_en,lingua_es and so on.
If you want to exclude only a few languages, you add -tags lingua_noge without adding lingua_ignore.
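To check the tag logic above, here is a small sketch that evaluates the proposed constraint with the standard go/build/constraint package. The helper name included and the tag names lingua_en and lingua_noen are illustrative assumptions; only the shape of the constraint comes from the proposal.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// included reports whether a file carrying the given //go:build line
// would be selected when exactly the listed tags are set.
// It is a hypothetical helper, not part of any existing library.
func included(line string, tags ...string) bool {
	expr, err := constraint.Parse(line)
	if err != nil {
		panic(err)
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] })
}

func main() {
	// Constraint for a hypothetical English model file.
	const en = "//go:build (!lingua_ignore && !lingua_noen) || lingua_en"

	fmt.Println(included(en))                               // true: no tags, everything is included
	fmt.Println(included(en, "lingua_ignore"))              // false: everything is dropped
	fmt.Println(included(en, "lingua_ignore", "lingua_en")) // true: explicitly re-enabled
	fmt.Println(included(en, "lingua_noen"))                // false: only this language is dropped
	fmt.Println(included(en, "lingua_ignore", "lingua_es")) // false: another language was selected
}
```

The same evaluation is what the go tool performs for each file, with the tags passed via -tags added to the ones it sets itself.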
Model loading
For now, models are loaded from a single point in detector.go through embed.FS.
Instead, each language-model/<language> directory could contain a .go file that carries the aforementioned build constraints.
This file can also embed all *.zip files into a separate embed.FS, which can then be passed to the "main" filesystem in the language-model package.
The language-model package can then implement the fs.SubFS interface.
It could be as simple as a generated file with a switch/case over all available languages that pulls in all language-model/* packages.
Alternatively, if you don't want to use code generation, it should be simple enough to add a Register function that the init function of each language-model/<language>/ package can call. It won't be called if the language package is excluded from the build.