Description
This post gives the change log for PyThaiNLP 4.0. The first version of PyThaiNLP, 0.0.4, was published to PyPI 6 years ago, so PyThaiNLP 4.0 gets a special codename. The codename for PyThaiNLP 4.0 is PyThaiNLP 4.0 (Real).
Schedule
- Beta release: 1 April 2023
- Production release: 14 April 2023
See 4.0 Milestone.
What is new?
Deprecation and other API changes
- Delete all LST20 models #728
- Change pythainlp.tools.misspell to pythainlp.tools.misspell.misspell (947c7be)
Improve
- Reduce import time #719
- Fix broken numeric data format (Mistake in word tokenization for text containing digit related time and finance #652) #723
Tokenizer
- Add blackboard cls #732
- Add <Karan> rule to TCC and change TCC rule for newmm #741
Tag
- Add blackboard pos_tag #733
- Add Thai NER 2.0 #781
Util
- Add pythainlp.util.count_thai_chars #748
- Add thai_strptime and convert_years #767
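For readers unfamiliar with the problem a year-conversion helper such as convert_years addresses, here is a minimal standalone sketch of converting between Buddhist Era (BE) and Anno Domini (AD) years. It does not call PyThaiNLP and is not the library's implementation; the names `be_to_ad`, `ad_to_be`, and `BE_OFFSET` are hypothetical, and the sketch assumes the 543-year offset used by the modern Thai solar calendar.

```python
# Hypothetical helpers for illustration only; not PyThaiNLP's API.
# Assumption: modern Thai solar calendar, where BE year = AD year + 543.
BE_OFFSET = 543


def be_to_ad(year_be: int) -> int:
    """Convert a Buddhist Era year to an Anno Domini year."""
    return year_be - BE_OFFSET


def ad_to_be(year_ad: int) -> int:
    """Convert an Anno Domini year to a Buddhist Era year."""
    return year_ad + BE_OFFSET


if __name__ == "__main__":
    print(be_to_ad(2566))  # 2023
    print(ad_to_be(2023))  # 2566
```

The actual signatures of thai_strptime and convert_years are documented in the PyThaiNLP API reference and in the pull request listed above.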
Transliterate
- Add Thai2Rom ONNX model #743
Khavee
- Add khavee to pythainlp #777
- Add aek/too checker function to khavee #779
Parse
- Add ud_goeswith #757
Corpus
- Add new science words #763