Please follow our [PyThaiNLP Facebook page](https://www.facebook.com/pythainlp/) for more updates.
+
## Getting Started with PyThaiNLP
We provide [PyThaiNLP Get Started Tutorial](https://www.thainlp.org/pythainlp/tutorials/notebooks/pythainlp_get_started.html) for exploring features in PyThaiNLP; We also have tutorials for specific tasks. Please visit [our tutorial page](https://www.thainlp.org/pythainlp/tutorials).
@@ -37,27 +37,29 @@ Latest document is available at [https://thainlp.org/pythainlp/docs/2.2/](https:
We try to make the package easy to use as much as possible; therefore, some additional data (like word lists and language models) may get automatically download during runtime. PyThaiNLP caches additional data under the directory `~/pythainlp-data` by default, but the user can change the value by specifying the environment variable `PYTHAINLP_DATA_DIR`. See corpus catalog at [PyThaiNLP/pythainlp-corpus](https://github.com/PyThaiNLP/pythainlp-corpus).
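The lookup order described above can be sketched with the standard library alone. This is a minimal illustration, not PyThaiNLP's actual implementation: the helper name `resolve_data_dir` and its `env` parameter are hypothetical, and only the `PYTHAINLP_DATA_DIR` variable and the `~/pythainlp-data` default come from the text.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None) -> Path:
    """Return the directory where downloaded data would be cached.

    Hypothetical helper: prefers the PYTHAINLP_DATA_DIR environment
    variable and falls back to ~/pythainlp-data when it is unset or empty.
    """
    # Allow a mapping to be passed in so the logic can be checked
    # without touching the real process environment.
    env = os.environ if env is None else env
    configured = env.get("PYTHAINLP_DATA_DIR")
    if configured:
        return Path(configured).expanduser()
    return Path.home() / "pythainlp-data"


if __name__ == "__main__":
    print(resolve_data_dir())
```

Setting the variable before starting Python (for example `export PYTHAINLP_DATA_DIR=/data/thai` in a POSIX shell) is the usual way to redirect such a cache.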
+
## Capabilities
-
PyThaiNLP provides standard NLP functions for Thai, for example part-of-speec tagging, linguistic unit segmentation (syllable, word, or sentence). Some of these functions are also available via command-line interface.
+
PyThaiNLP provides standard NLP functions for Thai, for example part-of-speech tagging, linguistic unit segmentation (syllable, word, or sentence). Some of these functions are also available via command-line interface.
<details>
<summary>List of Features</summary>
- Convenient character and word classes, like Thai consonants (`pythainlp.thai_consonants`), vowels (`pythainlp.thai_vowels`), digits (`pythainlp.thai_digits`), and stop words (`pythainlp.corpus.thai_stopwords`) -- comparable to constants like `string.letters`, `string.digits`, and `string.punctuation`
- Thai linguistic unit segmentation/tokenization, including sentence (`sent_tokenize`), word (`word_tokenize`), and subword segmentations based on Thai Character Cluster (`subword_tokenize`)
-
- Thai part-of-speech taggers (`pos_tag`)
+
- Thai part-of-speech tagging (`pos_tag`)
- Thai spelling suggestion and correction (`spell` and `correct`)
- Thai transliteration (`transliterate`)
- Thai soundex (`soundex`) with three engines (`lk82`, `udom83`, `metasound`)
-
- Thai collation (sort by dictionoary order) (`collate`)
+
- Thai collation (sort by dictionary order) (`collate`)
- Read out number to Thai words (`bahttext`, `num_to_thaiword`)
- Command-line interface for basic functions, like tokenization and pos tagging (run `thainlp` in your shell)
</details>
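
As a rough illustration of how a character-class constant is used, the sketch below builds a consonant string from the Unicode Thai block instead of importing it from PyThaiNLP, so it runs without the package installed. The names `THAI_CONSONANTS` and `count_consonants` are hypothetical, and the library's own `pythainlp.thai_consonants` may differ in detail.

```python
import string

# Assumption for this sketch: Thai consonants occupy U+0E01..U+0E2E in
# Unicode. U+0E24 (ฤ) and U+0E26 (ฦ) fall in that range but are vowel
# letters, so they are left out here.
THAI_CONSONANTS = "".join(
    chr(cp) for cp in range(0x0E01, 0x0E2F) if cp not in (0x0E24, 0x0E26)
)


def count_consonants(text: str) -> int:
    """Count characters of `text` that are Thai consonants."""
    return sum(1 for ch in text if ch in THAI_CONSONANTS)


def count_ascii_digits(text: str) -> int:
    """Same membership pattern, using the standard `string.digits`."""
    return sum(1 for ch in text if ch in string.digits)


if __name__ == "__main__":
    print(count_consonants("ภาษาไทย"))
    print(count_ascii_digits("PyThaiNLP 2.2"))
```

The membership test is the same one you would write against `string.digits` or `string.punctuation`, which is the comparison the feature list draws.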
-
Please see [our tutorials](https://www.thainlp.org/pythainlp/tutorials) on how to apply these functions to ML problems.
+
Please see [our tutorials](https://www.thainlp.org/pythainlp/tutorials) on how to apply these functions to machine-learning problems.
+
## Installation
@@ -66,7 +68,7 @@ pip install --upgrade pythainlp
```
This will install the latest stable release of PyThaiNLP.
-
PyThaiNLP uses pip as its package manger and PyPI as its main distribution channel, see [https://pypi.org/project/pythainlp/](https://pypi.org/project/pythainlp/)
+
PyThaiNLP uses pip as its package manager and PyPI as its main distribution channel, see [https://pypi.org/project/pythainlp/](https://pypi.org/project/pythainlp/)
For dependency details, look at `extras` variable in [`setup.py`](https://github.com/PyThaiNLP/pythainlp/blob/dev/setup.py).
-
## Command-line
+
## Command-Line Interface
-
Some of PyThaiNLP functionalities can be used at command line, using `thainlp`
+
Some of PyThaiNLP functionalities can be used at command line, using `thainlp` command.
For example, displaying a catalog of datasets:
```sh
@@ -121,6 +123,7 @@ thainlp help
-[Upgrade ThaiNER from 1.7](https://github.com/PyThaiNLP/pythainlp/wiki/Upgrade-ThaiNER-from-PyThaiNLP-1.7-to-PyThaiNLP-2.0)
- Python 2.7 users can use PyThaiNLP 1.6
+
## Citations
If you use `PyThaiNLP` in your project or publication, please cite the library as follows
@@ -148,6 +151,7 @@ or BibTeX entry:
- Please do fork and create a pull request :)
- For style guide and other information, including references to algorithms we use, please refer to our [contributing](https://github.com/PyThaiNLP/pythainlp/blob/dev/CONTRIBUTING.md) page.
+
## Licenses
|| License |
@@ -157,6 +161,7 @@ or BibTeX entry:
| Language models created by PyThaiNLP |[Creative Commons Attribution 4.0 International Public License (CC-by)](https://creativecommons.org/licenses/by/4.0/)|
| Other corpora and models that may included with PyThaiNLP | See [Corpus License](https://github.com/PyThaiNLP/pythainlp/blob/dev/pythainlp/corpus/corpus_license.md)|
+
## Sponsors
[](https://airesearch.in.th/)