banglawiki/BengaliNLP

Bengali Natural Language Processing (BengaliNLP)

BengaliNLP is a natural language processing toolkit for the Bengali language. It supports tokenizing Bengali text, embedding Bengali words and documents, part-of-speech (POS) tagging, named entity recognition (NER), and text cleaning.

Features

Installation

PIP installer

pip install bengalinlp

Or upgrade an existing installation

pip install -U bengalinlp
  • Python: 3.8, 3.9, 3.10, 3.11
  • OS: Linux, Windows, Mac
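
If you want to check the interpreter before installing, the version list above can be turned into a small guard. This is only a sketch: `is_supported` is a hypothetical helper, not part of bengalinlp, and it assumes the listed versions (3.8 to 3.11) are the complete supported range.

```python
import sys

# Supported range taken from the list above (assumed to be exhaustive).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair falls inside the listed range.

    Hypothetical helper for illustration; not provided by bengalinlp.
    """
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    if is_supported():
        print("This Python version is in the listed range.")
    else:
        print("This Python version is outside the listed range (3.8 to 3.11).")
```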

Build from source

git clone https://github.com/banglawiki/bengalinlp.git
cd bengalinlp
python setup.py install

Sample Usage

from bengalinlp import BasicTokenizer

tokenizer = BasicTokenizer()

raw_text = "আমি বাংলায় গান গাই।"
tokens = tokenizer(raw_text)
print(tokens)
# output: ["আমি", "বাংলায়", "গান", "গাই", "।"]
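
To make the output above easier to interpret, here is a rough, standard-library-only sketch of the kind of splitting it shows: words are separated on whitespace and punctuation such as the danda (।) becomes its own token. This is not how `BasicTokenizer` is implemented; `simple_tokenize` and the `PUNCTUATION` set are assumptions made up for illustration.

```python
import re

# Punctuation marks treated as standalone tokens (an assumed, incomplete set).
PUNCTUATION = "।॥,;:!?()\"'"
_ESCAPED = re.escape(PUNCTUATION)

# Match either a run of non-space, non-punctuation characters (a word)
# or a single punctuation character.
_TOKEN_RE = re.compile("[^\\s" + _ESCAPED + "]+|[" + _ESCAPED + "]")


def simple_tokenize(text):
    """Split text into word and punctuation tokens.

    Illustrative only; not the bengalinlp implementation.
    """
    return _TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(simple_tokenize("আমি বাংলায় গান গাই।"))
```

A regex like this ignores cases a real tokenizer has to handle, such as numbers with separators, abbreviations, and mixed-script text, so prefer the library tokenizer for actual work.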