Commit 694dbf7

Merge pull request #153 from bact/dev
Manage extra requires + merge g2p and romanization to one transliterate module
2 parents bcf7980 + a47d297 commit 694dbf7

31 files changed: 184 additions and 273 deletions

.travis.yml

Lines changed: 2 additions & 3 deletions
@@ -3,12 +3,11 @@

  language: python
  python:
- - "3.4"
- - "3.5"
  - "3.6"
  # command to install dependencies, e.g. pip install -r requirements.txt --use-mirrors
  install:
- - pip install -r requirements-travis.txt
+ - pip install -r requirements.txt
+ - pip install .[icu,ner,pos,tokenize,transliterate]
  - pip install coveralls

  os:
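Note on the extras syntax: `pip install .[icu,ner,pos,tokenize,transliterate]` installs the package together with the optional dependency groups named in the brackets. Once dependencies move into extras, code that relies on one of them usually needs to fail with a clear message when that group was not installed. The following is a minimal sketch of such a guard. `require_extra` is a hypothetical helper, not part of PyThaiNLP, and the module-to-extra pairing in the usage comment is an assumption.

```python
import importlib.util


def require_extra(module_name, extra, package="pythainlp"):
    """Raise a helpful ImportError if an optional top-level module is missing.

    Hypothetical helper for illustration only; not part of PyThaiNLP.
    `module_name` should be a top-level module name (no dots).
    """
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name!r} is not installed; "
            f"try: pip install {package}[{extra}]"
        )


# Assumed pairing: PyICU is imported as `icu` and belongs to the `icu` extra.
# require_extra("icu", "icu")
```

Calling the guard at the top of an engine-specific function keeps the base install light while still pointing users at the right extra.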

README-pypi.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
  ![PyThaiNLP Logo](https://avatars0.githubusercontent.com/u/32934255?s=200&v=4)

- # PyThaiNLP 1.7
+ # PyThaiNLP 1.8.0

  [![Codacy Badge](https://api.codacy.com/project/badge/Grade/cb946260c87a4cc5905ca608704406f7)](https://www.codacy.com/app/pythainlp/pythainlp_2?utm_source=github.com&utm_medium=referral&utm_content=PyThaiNLP/pythainlp&utm_campaign=Badge_Grade)[![pypi](https://img.shields.io/pypi/v/pythainlp.svg)](https://pypi.python.org/pypi/pythainlp)
  [![Build Status](https://travis-ci.org/PyThaiNLP/pythainlp.svg?branch=develop)](https://travis-ci.org/PyThaiNLP/pythainlp)
@@ -14,7 +14,7 @@ PyThaiNLP features include Thai word and subword segmentations, soundex, romaniz

  ## What's new in version 1.7 ?

- - Deprecate Python 2 support
+ - Deprecate Python 2 support. (Python 2 compatibility code will be completely dropped in PyThaiNLP 1.8)
  - Refactor pythainlp.tokenize.pyicu for readability
  - Add Thai NER model to pythainlp.ner
  - thai2vec v0.2 - larger vocab, benchmarking results on Wongnai dataset

README.md

Lines changed: 2 additions & 2 deletions
@@ -21,7 +21,7 @@ Python 2 users can still use PyThaiNLP 1.6.
  ## Capabilities

  - Thai word segmentation (```word_tokenize```), including subword segmentation based on Thai Character Cluster (```tcc```) and ETCC (```etcc```)
- - Thai romanization (```romanize```)
+ - Thai romanization and transliteration (```romanize```, ```transliterate```)
  - Thai part-of-speech taggers (```pos_tag```)
  - Read out number to Thai words (```bahttext```, ```num_to_thaiword```)
  - Thai collation (sort by dictionoary order) (```collate```)
@@ -85,7 +85,7 @@ PyThaiNLP is a Python library for
  ## Capabilities

  - Thai word segmentation (```word_tokenize```), with support for Thai Character Clusters (```tcc```) and ETCC (```etcc```)
- - Transcribe Thai into Latin script (```romanize```)
+ - Transcribe Thai into Latin script and phonetic symbols (```romanize```, ```transliterate```)
  - Thai part-of-speech tagging (```pos_tag```)
  - Read out numbers as Thai text (```bahttext```, ```num_to_thaiword```)
  - Sort words in Thai dictionary order (```collate```)

appveyor.yml

Lines changed: 1 addition & 6 deletions
@@ -2,11 +2,6 @@ build: off

  environment:
  matrix:
- - PYTHON: "C:/Python34"
- PYTHON_VERSION: "3.4"
- PYTHON_ARCH: "32"
- PYICU_WHEEL: "https://get.openlp.org/win-sdk/PyICU-1.9.5-cp34-cp34m-win32.whl"
-
  - PYTHON: "C:/Python36"
  PYTHON_VERSION: "3.6"
  PYTHON_ARCH: "32"
@@ -37,7 +32,7 @@ install:
  # - "set ICU_VERSION=62"
  - "%PYTHON%/python.exe -m pip install --upgrade pip"
  - "%PYTHON%/python.exe -m pip install %PYICU_WHEEL%"
- - "%PYTHON%/python.exe -m pip install -e ."
+ - "%PYTHON%/python.exe -m pip install -e .[icu,ner,pos,tokenize,transliterate]"

  test_script:
  - "%PYTHON%/python.exe -m pip --version"

docs/api/romanization.rst

Lines changed: 4 additions & 4 deletions
@@ -1,10 +1,10 @@
  .. currentmodule:: pythainlp.romanization

- pythainlp.romanization
+ pythainlp.transliterate
  ====================================
- The :class:`pythainlp.romanization` turns thai text into a romanized one (put simply, spelled with English).
+ The :class:`pythainlp.transliterate` turns Thai text into a romanized one (put simply, spelled with English).

- .. autofunction:: romanization
- .. currentmodule:: pythainlp.romanization.thai2rom
+ .. autofunction:: transliterate
+ .. currentmodule:: pythainlp.transliterate.thai2rom
  .. autoclass:: thai2rom
  :members: romanize

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@
  # The short X.Y version
  version = ''
  # The full version, including alpha/beta/rc tags
- release = '1.7'
+ release = '1.8.0'


  # -- General configuration ---------------------------------------------------
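Note on version formats: this commit records the release as the string `'1.8.0'` in docs/conf.py and README-pypi.md, while pythainlp/__init__.py stores `__version__` as the float `1.8`. A float cannot represent a three-part version and orders some releases incorrectly. The sketch below illustrates the difference; `parse_version` is a hypothetical helper that handles only purely numeric dotted versions, not pre-release tags.

```python
def parse_version(text):
    """Split a dotted numeric version such as '1.8.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Floats lose information: 1.10 is the same number as 1.1, so it sorts before 1.8.
print(1.10 < 1.8)  # True

# Tuples compare component by component, so 1.10.0 sorts after 1.8.0.
print(parse_version("1.10.0") > parse_version("1.8.0"))  # True
```

For real projects, a maintained parser that follows PEP 440 is usually a better choice than a hand-written one.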

docs/pythainlp-dev-thai.md

Lines changed: 5 additions & 3 deletions
@@ -256,12 +256,13 @@ lentext is the minimum number of words that must

  Returns a dict

- ### romanization
+ ### transliteration

  ```python
- from pythainlp.romanization import romanize
+ from pythainlp.transliterate import romanize, transliterate

  romanize(str, engine="royin")
+ transliterate(str, engine="pyicu")
  ```

  The available engines are:
@@ -275,9 +276,10 @@ romanize(str, engine="royin")
  **Example**

  ```python
- from pythainlp.romanization import romanize
+ from pythainlp.transliterate import romanize, transliterate

  romanize("แมว") # 'maew'
+ transliterate("นก")
  ```

  ### spell

examples/romanization.py

Lines changed: 0 additions & 5 deletions
This file was deleted.

examples/transliterate.py

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+ # -*- coding: utf-8 -*-
+
+ from pythainlp.transliterate import romanize, transliterate
+
+ print(romanize("แมว"))
+ print(transliterate("แมว"))

pythainlp/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
  # -*- coding: utf-8 -*-

- __version__ = 1.7
+ __version__ = 1.8

  thai_alphabets = "กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรลวศษสหฬอฮ" # 44 chars
  thai_vowels = "ฤฦะ\u0e31าำ\u0e34\u0e35\u0e36\u0e37\u0e38\u0e39เแโใไ\u0e45\u0e47" # 19
@@ -24,7 +24,7 @@

  from pythainlp.collation import collate
  from pythainlp.date import now
- from pythainlp.romanization import romanize
+ from pythainlp.transliterate import romanize, transliterate
  from pythainlp.sentiment import sentiment
  from pythainlp.soundex import soundex
  from pythainlp.spell import spell