shafiahmed/NLP-Tutorial

NLP-Tutorial

Document Classification in Python

A tutorial showing how to use two great libraries, gensim and scikit-learn, to perform document similarity queries and document classification.
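To make the two tasks concrete, here is a minimal sketch of a similarity query and a nearest-neighbor classification built on TF-IDF vectors and cosine similarity. It deliberately uses only the Python standard library, so it is not the gensim or scikit-learn code and not necessarily what classifier.py does. The sample documents, the labels, and every function name below are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data, not the files in the corpus directory.
DOCS = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]
LABELS = ["animals", "animals", "finance", "finance"]


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would do more."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vectors, idf): one sparse term -> weight dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [
        {t: freq * idf[t] for t, freq in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def vectorize(text, idf):
    """Project new text onto the known vocabulary; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {t: freq * idf[t] for t, freq in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, vectors, idf):
    """Similarity query: index of the best-matching document, plus all scores."""
    q = vectorize(query, idf)
    scores = [cosine(q, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def classify(query, vectors, idf, labels):
    """1-nearest-neighbor classification: reuse the closest document's label."""
    best, _ = most_similar(query, vectors, idf)
    return labels[best]


vectors, idf = tfidf_vectors(DOCS)
best, scores = most_similar("markets dropped as investors sold", vectors, idf)
print("closest document:", DOCS[best])
print("predicted label:", classify("cats are pets", vectors, idf, LABELS))
```

The point of the sketch is that classification can reuse the same vector representation as the similarity query. Once documents are vectors, a classifier only needs labeled examples to compare against.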

===== Files

corpus -- A directory of 4 tiny text files
.gitignore -- Files in repo for Git to ignore
classifier.py -- The main file that does everything
requirements.txt -- File used by pip to download dependencies
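Since the corpus is just a directory of small text files, reading it into memory takes only a few lines. The following is a rough sketch under assumptions: the `load_corpus` helper is hypothetical, it assumes the files are UTF-8 text, and it is not taken from classifier.py.

```python
from pathlib import Path


def load_corpus(directory):
    """Return {file name: text} for every regular file directly in directory."""
    docs = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():  # skip subdirectories
            # Assumption: files are UTF-8; undecodable bytes are replaced.
            docs[path.name] = path.read_text(encoding="utf-8", errors="replace")
    return docs


# Only runs when a "corpus" directory exists in the working directory.
if Path("corpus").is_dir():
    for name, text in load_corpus("corpus").items():
        print(name, len(text.split()), "words")
```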

===== Download

All you need to do is clone the repo:

git clone https://github.com/Scripted/NLP-Tutorial

===== Dependencies

In a perfect world, running "pip install -r requirements.txt" would download all the dependencies necessary to run this code. Unfortunately, NumPy and SciPy don't always play nice with pip. So try "pip install -r requirements.txt" first, and if that doesn't work, check out the installation instructions on each module's site: NumPy, SciPy, Gensim, scikit-learn.
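If you are unsure whether the install succeeded, a quick check like the one below can tell you which modules Python still cannot find. This is only a sketch: the `missing_modules` helper is hypothetical, and the list assumes the four libraries named above under their usual import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the four dependencies; scikit-learn imports as "sklearn".
REQUIRED = ["numpy", "scipy", "gensim", "sklearn"]


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

The check only looks modules up without importing them, so it confirms that a package is installed but not that it loads correctly.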

===== Running

Easy enough:

python classifier.py

The output shows the various steps of the algorithm as it works.
