Implementing CNN Text Classification in MXNet

Recently, I have been learning MXNet for Natural Language Processing (NLP). I started from the official example code in the MXNet GitHub repository. However, I found the official code too minimal to cover the whole process from training to prediction, so I modified it.

RNN text classification in MXNet is here.

Main differences from the official version

Inference code was added, so a trained model can be used for prediction.
This version targets MXNet 0.12.1, so some functions used in the original code may be deprecated.
The binary classification task was changed to a multi-category task.
The code for pretrained embeddings was removed, and the data format was changed.
The label shape was changed to (batch_size,).
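
To illustrate what a label of shape (batch_size,) means for a multi-category task, here is a minimal sketch in plain Python: each sentence in a batch carries a single integer class id instead of a vector. The one-hot starting point and the helper name `to_class_indices` are assumptions made only for this illustration; they are not taken from the official example or from this repository.

```python
def to_class_indices(one_hot_batch):
    """Convert rows of shape (batch_size, num_classes) to a flat list of length batch_size.

    Hypothetical helper: each row is assumed to be a one-hot vector.
    """
    # The position of the largest value in each row is that row's class id.
    return [row.index(max(row)) for row in one_hot_batch]


# Three sentences, three classes: one integer label per sentence.
batch = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
labels = to_class_indices(batch)
print(labels)       # [1, 0, 2]
print(len(labels))  # 3, i.e. batch_size
```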

Data

training and validation data

Two text files, one for training and one for validation. Each line has the format: <label> sentence.

<pos> This is the best movie about troubled teens since 1998's whatever.
<neg> This 10th film in the series looks and feels tired.
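
The line format above could be parsed roughly as follows. This is a sketch under assumptions, not the code in data_helpers.py: the function name `parse_labeled_line` and the regular expression are hypothetical, and splitting on whitespace assumes the sentence is already tokenized.

```python
import re

# Assumed layout, following the description above: "<label> sentence".
LINE_PATTERN = re.compile(r"^<([^>]+)>\s*(.*)$")


def parse_labeled_line(line):
    """Split one '<label> sentence' line into (label, tokens)."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("line does not start with <label>: %r" % line)
    label, sentence = match.groups()
    # Whitespace splitting assumes the text was tokenized or segmented beforehand.
    return label, sentence.split()


label, tokens = parse_labeled_line(
    "<neg> This 10th film in the series looks and feels tired.")
print(label, len(tokens))  # neg 10
```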

config data

One label per line. The number of labels equals the total number of classes.

pos
neg
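
A config file like this can be turned into a label-to-id mapping with a few lines of Python. The sketch below is an illustration only: `build_label_index` is a hypothetical name, and assigning ids in file order is an assumption about how the labels might be numbered.

```python
def build_label_index(lines):
    """Map each non-empty label line to a consecutive integer id (file order assumed)."""
    labels = [line.strip() for line in lines if line.strip()]
    return {label: idx for idx, label in enumerate(labels)}


# In practice the lines would come from open(config_path); a list stands in here.
label_to_id = build_label_index(["pos\n", "neg\n"])
num_classes = len(label_to_id)
print(label_to_id["pos"], label_to_id["neg"], num_classes)  # 0 1 2
```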

inference data

One sentence per line, without <label>.

inference data with evaluation

Each line has the format: <label> sentence, the same as the validation file.
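
Because this file carries gold labels, predictions can be scored against them. The README does not say which metric inference.py reports, so the sketch below assumes plain accuracy; the function name `accuracy` is hypothetical.

```python
def accuracy(gold_labels, predicted_labels):
    """Fraction of sentences whose predicted label equals the gold label."""
    if len(gold_labels) != len(predicted_labels):
        raise ValueError("gold and predicted label lists differ in length")
    if not gold_labels:
        return 0.0
    correct = sum(1 for g, p in zip(gold_labels, predicted_labels) if g == p)
    # float() keeps the division exact under Python 2 as well.
    return correct / float(len(gold_labels))


print(accuracy(["pos", "neg", "neg", "pos"], ["pos", "neg", "pos", "pos"]))  # 0.75
```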

It is recommended to tokenize the data beforehand (or word-segment it, for Chinese).

Quick start

python cnn_model.py --train /path/to/train.data --validate /path/to/validate.data --config /path/to/config

python inference.py --test /path/to/inference.data --config /path/to/config --checkpoint 1

python inference.py --test /path/to/inference-evaluation.data --config /path/to/config --checkpoint 1 --evaluation
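
For readers who want to see how the inference flags above fit together, here is a hypothetical argparse sketch. Only the flag names come from the commands shown; the types, defaults, and help texts are assumptions, and the real inference.py may define them differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the inference commands."""
    parser = argparse.ArgumentParser(description="CNN text classification inference")
    parser.add_argument("--test", required=True,
                        help="file with one sentence per line")
    parser.add_argument("--config", required=True,
                        help="file with one label per line")
    # Assumed to be an integer that selects which saved checkpoint to load.
    parser.add_argument("--checkpoint", type=int, default=1)
    # Assumed to be a switch: when set, input lines carry a <label> prefix.
    parser.add_argument("--evaluation", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--test", "/path/to/inference-evaluation.data",
     "--config", "/path/to/config",
     "--checkpoint", "1",
     "--evaluation"])
print(args.checkpoint, args.evaluation)  # 1 True
```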

