EduNLP

NLP tools for educational data (e.g., exercises, papers)

Introduction

EduNLP is a library for advanced Natural Language Processing in Python and is one of the projects in the EduX plan of BDAA. It is built on the very latest research and was designed from day one to be used in real educational products.

EduNLP now comes with pretrained pipelines and currently supports segmentation, tokenization, and vectorization. It supports a variety of preprocessing steps for NLP in educational scenarios, such as formula parsing and multi-modal segmentation.
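
To make the idea of segmenting an educational item concrete, here is a minimal sketch in plain Python that splits an exercise into text and formula segments. It does not use EduNLP's API: the `split_item` helper and the assumption that formulas are delimited by single `$` signs are hypothetical and chosen only for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical pattern for illustration: assumes inline formulas are wrapped in single "$".
FORMULA_PATTERN = re.compile(r"\$(.+?)\$")


def split_item(item: str) -> List[Tuple[str, str]]:
    """Split an item into ("text", ...) and ("formula", ...) segments, in order."""
    segments: List[Tuple[str, str]] = []
    cursor = 0
    for match in FORMULA_PATTERN.finditer(item):
        # Plain text between the previous formula (or the start) and this formula.
        text = item[cursor:match.start()].strip()
        if text:
            segments.append(("text", text))
        segments.append(("formula", match.group(1).strip()))
        cursor = match.end()
    # Any plain text left after the last formula.
    tail = item[cursor:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


if __name__ == "__main__":
    for kind, content in split_item("If $x + 1 = 3$, find the value of $x^2$."):
        print(kind, repr(content))
```

A real pipeline would also need to handle figures, escaped dollar signs, and display formulas; this sketch only shows the basic separation of text from formulas.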

EduNLP is commercial open-source software, released under the Apache-2.0 license.

Installation

Clone the repository, then install it from the repository root with pip:

pip install -e .

Or install from PyPI:

pip install EduNLP

Resource

We will continuously publish new datasets in Standard Item Format (SIF) to encourage relevant research. The data resources can be accessed via another EduX project, EduData.

Tutorial

Contribute

EduNLP is still under development. More algorithms and features are going to be added and we always welcome contributions to help make EduNLP better. If you would like to contribute, please follow this guideline.

Citation

If this repository is helpful to you, please cite our work:

@misc{bigdata2021edunlp,
  title={EduNLP},
  author={bigdata-ustc},
  publisher = {GitHub},
  journal = {GitHub repository},
  year = {2021},
  howpublished = {\url{https://github.com/bigdata-ustc/EduNLP}},
}
