There are some changes that I think should be addressed before the package's Cheese Shop release.
Result ranking
SQLite has no out-of-the-box function for ranking full-text search results. I think it makes sense to extend the scope of the package to address ranking, as it is squarely a topic of both "sqlite" and "fts".
All the code is already out there. There's the article; although it is about peewee, an MIT-licensed package, the code can be easily extracted. Here's a gist with a module and a test case for it.
Because BM25 is a general, language-independent ranking function, its presence would make the package more complete.
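To make the idea concrete, here is a minimal sketch of what such a ranking function might look like. It is not the gist's code: the name bm25_rank is hypothetical, the constants k1=1.2 and b=0.75 are the commonly used defaults rather than anything this package prescribes, and it assumes the blob comes from an FTS4 table via matchinfo(tbl, 'pcnalx').

```python
import math
import struct


def bm25_rank(raw_matchinfo, k1=1.2, b=0.75):
    """Score one row from an FTS4 matchinfo(tbl, 'pcnalx') blob (higher is better)."""
    # matchinfo returns an array of unsigned 32-bit ints in native byte order.
    info = struct.unpack("=%dI" % (len(raw_matchinfo) // 4), raw_matchinfo)
    phrase_count, col_count, total_docs = info[0], info[1], info[2]
    avg_offset = 3                        # 'a': average token count per column
    len_offset = avg_offset + col_count   # 'l': token count of this row per column
    hit_offset = len_offset + col_count   # 'x': 3 ints per (phrase, column)

    score = 0.0
    for phrase in range(phrase_count):
        for col in range(col_count):
            x = hit_offset + 3 * (phrase * col_count + col)
            term_freq = info[x]           # hits in this row
            docs_with_term = info[x + 2]  # rows containing the phrase
            if term_freq == 0:
                continue
            idf = math.log((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
            idf = max(idf, 1e-6)          # keep very common terms from going negative
            avg_len = info[avg_offset + col] or 1
            norm = 1 - b + b * info[len_offset + col] / avg_len
            score += idf * term_freq * (k1 + 1) / (term_freq + k1 * norm)
    return score
```

With the standard sqlite3 module this could be registered as conn.create_function("bm25_rank", 1, bm25_rank) and used in a query such as ORDER BY bm25_rank(matchinfo(docs, 'pcnalx')) DESC, where docs stands for a hypothetical FTS4 table.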
Minimum documentation
The README should give an overview and cover the basics. I can assist with it.
Also, recipes for integrating tokenizers for major domains (CJK, Cyrillic, etc.) would be a good idea.
Minor
An underscore is undesirable in a Python module name. I suggest renaming sqlite_tokenizer.py; the "sqlite" part is obvious from context. tokenizer.py is better, but still not informative, since the module doesn't provide a real tokenizer per se, only a binding to register one. binding.py may be a better name, though you may be able to coin a better one.
Make user-facing symbols available from __init__.py so that import sqlitefts is sufficient.
setup.py: url points to another package. Also, "Operating System :: POSIX :: Linux" seems redundant with "Operating System :: OS Independent".
Ranking: I'll merge your implementation from the gist. I'll also add some test cases for CJK and other scoring functions.
Documentation: yes, I know I need to write it. I'll finish it.
Minor: You're right. sqlitefts.sqlite_tokenizer is redundant. Let me think about it.
The URL was copied from another package; I totally forgot to change it.