|
1 | 1 |  |
2 | 2 |
|
3 | | -# PyThaiNLP 1.7 |
| 3 | +# PyThaiNLP 2.0 |
4 | 4 |
|
5 | 5 | [](https://www.codacy.com/app/pythainlp/pythainlp_2?utm_source=github.com&utm_medium=referral&utm_content=PyThaiNLP/pythainlp&utm_campaign=Badge_Grade)[](https://pypi.python.org/pypi/pythainlp) |
6 | 6 | [](https://travis-ci.org/PyThaiNLP/pythainlp) |
|
10 | 10 |
|
11 | 11 | PyThaiNLP is a Python library for natural language processing (NLP) of the Thai language. |
12 | 12 |
|
13 | | -PyThaiNLP features include Thai word and subword segmentations, soundex, romanization, part-of-speech taggers, and spelling corrections. |
| 13 | +PyThaiNLP includes Thai word tokenizers, transliterators, soundex converters, part-of-speech taggers, and spell checkers. |
14 | 14 |
|
15 | | -## What's new in version 1.7 ? |
| 15 | +📖 For details on upgrading from PyThaiNLP 1.7 to PyThaiNLP 2.0, see [From PyThaiNLP 1.7 to PyThaiNLP 2.0](https://thainlp.org/pythainlp/docs/2.0/notes/pythainlp-1_7-2_0.html) |
16 | 16 |
|
17 | | -- Deprecate Python 2 support |
18 | | -- Refactor pythainlp.tokenize.pyicu for readability |
19 | | -- Add Thai NER model to pythainlp.ner |
20 | | -- thai2vec v0.2 - larger vocab, benchmarking results on Wongnai dataset |
21 | | -- Sentiment classifier based on ULMFit and various product review datasets |
22 | | -- Add ULMFit utility to PyThaiNLP |
23 | | -- Add Thai romanization model thai2rom |
24 | | -- Retrain POS-tagging model |
| 17 | +📖 For ThaiNER users upgrading from PyThaiNLP 1.7 to PyThaiNLP 2.0, see [Upgrade ThaiNER from PyThaiNLP 1.7 to PyThaiNLP 2.0](https://github.com/PyThaiNLP/pythainlp/wiki/Upgrade-ThaiNER-from-PyThaiNLP-1.7-to-PyThaiNLP-2.0) |
| 18 | + |
| 19 | +📫 Follow us on Facebook: [Pythainlp](https://www.facebook.com/pythainlp/) |
| 20 | + |
| 21 | +## What's new in version 2.0? |
| 22 | + |
| 23 | +- New NorvigSpellChecker spell checker class, which can be initialized with a custom dictionary. |
| 24 | +- Terminate Python 2 support. Remove all Python 2 compatibility code. |
| 25 | +- Remove old, obsolete, deprecated, and experimental code. |
| 26 | +- Thai2fit (upgrade ULMFiT-related code to fastai 1.0) |
| 27 | +- ThaiNER 1.0 |
| 28 | +- Remove sentiment analysis |
25 | 29 | - Improved word_tokenize (newmm, mm) and dict_word_tokenize |
26 | | -- Documentation added |
| 30 | +- Improved POS-tagging |
| 31 | +- More and improved examples |
| 32 | +- See the [PyThaiNLP 2.0 change log](https://github.com/PyThaiNLP/pythainlp/issues/118) |
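
To illustrate what "initialized with a custom dictionary" means for a Norvig-style spell checker, here is a minimal standalone sketch. It is not PyThaiNLP's `NorvigSpellChecker`: the class name `TinyNorvigChecker`, its `custom_dict` parameter, and the sample word frequencies are all hypothetical, and it only considers candidates at edit distance 1.

```python
from collections import Counter


class TinyNorvigChecker:
    """Hypothetical stand-in for illustration; not PyThaiNLP's class."""

    def __init__(self, custom_dict):
        # custom_dict: iterable of (word, frequency) pairs (assumed shape).
        self.freq = Counter(dict(custom_dict))
        # Alphabet is derived from the dictionary itself.
        self.letters = sorted({ch for word in self.freq for ch in word})

    def _edits1(self, word):
        # All strings one delete, transpose, replace, or insert away.
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in self.letters]
        inserts = [a + c + b for a, b in splits for c in self.letters]
        return set(deletes + transposes + replaces + inserts)

    def correct(self, word):
        # Known words are returned unchanged.
        if word in self.freq:
            return word
        # Otherwise pick the most frequent known word one edit away.
        candidates = [w for w in self._edits1(word) if w in self.freq]
        if not candidates:
            return word
        return max(candidates, key=lambda w: self.freq[w])


# Made-up frequencies, for demonstration only.
checker = TinyNorvigChecker([("แมว", 10), ("แมง", 3), ("หมา", 8)])
print(checker.correct("แม"))
```

Because the candidate ranking depends entirely on the supplied frequencies, swapping in a domain-specific dictionary changes which corrections are preferred.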
27 | 33 |
|
28 | 34 | ## Install |
29 | 35 |
|
| 36 | +For the stable version: |
| 37 | + |
30 | 38 | ```sh |
31 | 39 | pip install pythainlp |
32 | 40 | ``` |
33 | 41 |
|
| 42 | +Some advanced features, such as word vectors, need extra packages. Install them by listing the extras you want in the `pip install` command: |
| 43 | + |
| 44 | +```sh |
| 45 | +pip install pythainlp[extra1,extra2,...] |
| 46 | +``` |
| 47 | + |
| 48 | +where each extra can be one of: |
| 49 | + |
| 50 | +- `artagger` (to support artagger part-of-speech tagger)* |
| 51 | +- `deepcut` (to support deepcut machine-learnt tokenizer) |
| 52 | +- `icu` (for ICU support in transliteration and tokenization) |
| 53 | +- `ipa` (for International Phonetic Alphabet support in transliteration) |
| 54 | +- `ml` (to support fastai 1.0.22 ULMFiT models) |
| 55 | +- `ner` (for named-entity recognizer) |
| 56 | +- `thai2fit` (for Thai word vector) |
| 57 | +- `thai2rom` (for machine-learnt romanization) |
| 58 | +- `full` (install everything) |
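
As a small illustration of the `pythainlp[extra1,extra2,...]` syntax, the sketch below builds the requirement string to pass to `pip install`. The helper `build_requirement` and the `KNOWN_EXTRAS` set are hypothetical names introduced here; the set is copied from the list above, not queried from the package.

```python
# Extras copied from the list in this README (an assumption, not read from PyPI).
KNOWN_EXTRAS = {
    "artagger", "deepcut", "icu", "ipa", "ml",
    "ner", "thai2fit", "thai2rom", "full",
}


def build_requirement(extras):
    """Return a pip requirement string such as 'pythainlp[icu,ner]'."""
    unknown = [e for e in extras if e not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    if not extras:
        return "pythainlp"
    return "pythainlp[" + ",".join(extras) + "]"


# Pass the printed string to `pip install`; quote it in shells that expand [].
print(build_requirement(["icu", "ner"]))
```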
| 59 | + |
34 | 60 | **Note for Windows**: `marisa-trie` wheels can be obtained from https://www.lfd.uci.edu/~gohlke/pythonlibs/#marisa-trie |
35 | 61 | Install it with pip, for example: `pip install marisa_trie-0.7.5-cp36-cp36m-win32.whl` |
36 | 62 |
|
37 | 63 | ## Links |
38 | 64 |
|
39 | | -- Docs: https://thainlp.org/pythainlp/docs/1.7/ |
| 65 | +- User guide: [English](https://colab.research.google.com/drive/1MQ10D1mJC5r1vQAHcj4ShoRS14vz8ZF-), [Thai (ภาษาไทย)](https://colab.research.google.com/drive/1rEkB2Dcr1UAKPqz4bCghZV7pXx2qxf89) |
| 66 | +- Docs: https://thainlp.org/pythainlp/docs/2.0/ |
40 | 67 | - GitHub: https://github.com/PyThaiNLP/pythainlp |
41 | 68 | - Issues: https://github.com/PyThaiNLP/pythainlp/issues |
| 69 | +- Facebook: [Pythainlp](https://www.facebook.com/pythainlp/) |