Description
Model name | Description | Model size | Download | Update date |
---|---|---|---|---|
ch | Chinese and English | 3.71M | inference model / trained model | 2020.9.22 |
ch_tra | Chinese (Traditional) | 5.63M | inference model / trained model | 2021.1.21 |
en | English | 2.56M | inference model / trained model | 2020.9.22 |
fr | French | 2.65M | inference model / trained model | 2021.9.22 |
ar | Arabic | 2.53M | inference model / trained model | 2021.1.21 |
es | Spanish | 2.53M | inference model / trained model | 2021.1.21 |
pt | Portuguese | 2.63M | inference model / trained model | 2021.1.21 |
ru | Russian | 2.63M | inference model / trained model | 2021.1.21 |
ge | German | 2.65M | inference model / trained model | 2020.9.22 |
kr | Korean | 3.9M | inference model / trained model | 2020.9.22 |
jp | Japanese | 4.23M | inference model / trained model | 2020.9.22 |
it | Italian | 2.53M | inference model / trained model | 2021.1.21 |
hi | Hindi | 2.63M | inference model / trained model | 2021.1.21 |
ug | Uyghur | 2.63M | inference model / trained model | 2021.1.21 |
fa | Persian | 2.63M | inference model / trained model | 2021.1.21 |
ur | Urdu | 2.63M | inference model / trained model | 2021.1.21 |
oc | Occitan | 2.53M | inference model / trained model | 2021.1.21 |
mr | Marathi | 2.63M | inference model / trained model | 2021.1.21 |
ne | Nepali | 2.63M | inference model / trained model | 2021.1.21 |
rs_cyrillic | Serbian(cyrillic) | 2.63M | inference model / trained model | 2021.1.21 |
rs_latin | Serbian(latin) | 2.53M | inference model / trained model | 2021.1.21 |
bg | Bulgarian | 2.63M | inference model / trained model | 2021.1.21 |
uk | Ukrainian | 2.63M | inference model / trained model | 2021.1.21 |
be | Belarusian | 2.63M | inference model / trained model | 2021.1.21 |
te | Telugu | 2.63M | inference model / trained model | 2021.1.21 |
kn | Kannada | 2.63M | inference model / trained model | 2021.1.21 |
ta | Tamil | 2.63M | inference model / trained model | 2021.1.21 |
mg | Mongolian | -- | Ongoing | |
bg | Bangla | -- | Need dict and corpus | |
bm | Burmese | -- | Need dict and corpus | call for contribution |
ku_cent | Kurdish (Central) | -- | PR8347 | call for contribution |
od | Odia | -- | PR6348 | call for contribution |
th | Thai | -- | PR6719 issue chat | call for contribution |
More | TBC |
Guideline for new language requests
If you would like to request support for a new language, please open a PR that adds the following two files:
- In the folder ppocr/utils/dict, submit a dictionary file named {language}_dict.txt that contains a list of all characters in the language. See the other files in that folder for the expected format.
- In the folder ppocr/utils/corpus, submit a corpus file named {language}_corpus.txt that contains a list of words in your language. At least 50,000 words per language are probably needed; the more, the better.
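Before opening the PR, it may help to sanity-check the two files locally. The following is only a minimal sketch, not part of PaddleOCR: the function names `read_entries` and `check_language_files` are hypothetical, and the checks assume one entry per line in both files, no duplicate dict entries, and that every non-space character used in the corpus should appear in the dict. Only the file naming and the rough 50,000-word minimum come from the guideline above.

```python
from pathlib import Path

MIN_WORDS = 50_000  # rough minimum suggested in the guideline above


def read_entries(path):
    """Return the non-blank lines of a UTF-8 text file, newlines stripped."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


def check_language_files(dict_path, corpus_path, min_words=MIN_WORDS):
    """Return a list of problems found; an empty list means all checks passed."""
    dict_path, corpus_path = Path(dict_path), Path(corpus_path)
    problems = []

    # File names should follow {language}_dict.txt / {language}_corpus.txt.
    if not dict_path.name.endswith("_dict.txt"):
        problems.append(f"{dict_path.name}: name should end with _dict.txt")
    if not corpus_path.name.endswith("_corpus.txt"):
        problems.append(f"{corpus_path.name}: name should end with _corpus.txt")

    chars = read_entries(dict_path)
    words = read_entries(corpus_path)

    # Assumption: each dict entry should appear only once.
    duplicates = len(chars) - len(set(chars))
    if duplicates:
        problems.append(f"dict has {duplicates} duplicate entries")

    # The guideline suggests roughly 50,000 words as a lower bound.
    if len(words) < min_words:
        problems.append(f"corpus has {len(words)} words, fewer than {min_words}")

    # Assumption: every non-space corpus character should be in the dict.
    known = set("".join(chars))
    used = {ch for word in words for ch in word if not ch.isspace()}
    missing = sorted(used - known)
    if missing:
        sample = "".join(missing[:20])
        problems.append(f"{len(missing)} corpus characters missing from dict: {sample}")

    return problems
```

For example, `check_language_files("ppocr/utils/dict/xx_dict.txt", "ppocr/utils/corpus/xx_corpus.txt")` returns a list of messages to review, where `xx` stands in for your language code. An empty list only means these assumed checks passed, not that the files are guaranteed to be accepted.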
Call for contributions to add new language support for PaddleOCR
For anyone who might be interested in training a new language model, guidance on training the model is provided. We are calling for contributions to add new language support for PaddleOCR.
If your language has unique elements, please let me know in advance by any means, for example with useful links or Wikipedia pages.