A parallel translation corpus of 1,980,000 Chinese-Polish pairs, stored as TXT documents. The data has been cleaned, desensitized, and quality-checked, and can serve as a base corpus for text data analysis and for fields such as machine translation.
For more details, please refer to the link: https://www.nexdata.ai/datasets/nlu/1337?source=Github
TXT
Chinese-Polish parallel corpus data. The content has been preliminarily categorized and covers the fields of technology, healthcare, tourism, spoken language, news, and military.
1.99 million pairs of Chinese-Polish Parallel Corpus Data.
Chinese, Polish
machine translation
Commercial License
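As a rough illustration of how a TXT parallel corpus like this might be read for machine translation work, here is a minimal Python sketch. The file layout is not specified above, so the one-pair-per-line, tab-separated layout (Chinese first, Polish second) is purely an assumption, as are the helper name `iter_pairs` and the sample lines, which are made up and not taken from the corpus.

```python
from typing import Iterable, Iterator, Tuple


def iter_pairs(lines: Iterable[str], sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Yield (chinese, polish) pairs from lines of the ASSUMED form 'zh<sep>pl'."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(sep)
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed layout
        zh, pl = parts[0].strip(), parts[1].strip()
        if zh and pl:
            yield zh, pl


# Made-up sample lines for illustration only (not from the dataset).
sample = [
    "你好\tCześć\n",
    "\n",
    "谢谢\tDziękuję\n",
    "malformed line without separator\n",
]
pairs = list(iter_pairs(sample))
print(pairs)
```

For a real file, the same function could be given an open file handle, for example `open(path, encoding="utf-8")` with a hypothetical `path`. If the actual delivery uses a different separator or separate files per language, the parsing would need to be adjusted accordingly.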