neural_nlp

Deep Neural Networks for natural language processing.

Project structure

This repository is organized into the following components:

datagen: handles data download and data iteration
models: generic model structures; not every hyperparameter is exposed as a parameter, but the models should be easy to adjust to your needs
integrations: a specific model applied to a specific dataset
neural network utilities: TODO explain utilities

Datasets

These utility classes download the required data and store a local cache on your system. The data is pre-processed and can be iterated over in batches for use in neural networks.

Reuters-21578

The SGML files are parsed into JSON records, one per line. The fields currently exposed are title, text, topics, and mode. The splits available for this dataset are modapte, lewissplit, cgisplit, and no split. See the README.txt in the download directory for more details. The value of mode is either train or test, depending on the split chosen. Topics may have zero or more values.
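As a rough illustration of the record layout described above, the following sketch reads one JSON record per line and filters on the mode field. The function name read_records and the sample records are hypothetical and are not part of this repository; only the field names (title, text, topics, mode) come from the description above.

```python
import io
import json


def read_records(lines, mode=None):
    """Yield one dict per non-empty JSON line, optionally filtered by mode.

    `read_records` is a hypothetical helper, not this repository's API.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if mode is None or record.get("mode") == mode:
            yield record


# Made-up sample data in the shape described above (one record per line).
sample = io.StringIO(
    '{"title": "A", "text": "first body", "topics": ["grain", "wheat"], "mode": "train"}\n'
    '{"title": "B", "text": "second body", "topics": [], "mode": "test"}\n'
)

train = list(read_records(sample, mode="train"))
```

Because topics may be empty, downstream code should treat it as a list of any length rather than as a single label.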

Stackoverflow titles

This data comes from the NAACL 2015 VSM-NLP workshop paper "Short Text Clustering via Convolutional Neural Networks". Please see http://naacl15vs.github.io/index.html and https://github.com/jacoxu/StackOverflow for details.

This class allows for multiple types of iteration over the data.

straight iteration as (label, title) tuples
iteration of a "query," positive sample, and a set of negative examples; this is for input into the DSSM model described below
a batched version of the previous iteration
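The second and third iteration modes can be sketched as below. This is a minimal illustration under stated assumptions, not the class shipped in this repository: the names iter_triples and batched, the num_negatives and seed parameters, and the sample titles are all hypothetical.

```python
import random
from collections import defaultdict


def iter_triples(pairs, num_negatives=2, seed=0):
    """Yield (query, positive, negatives) from (label, title) pairs.

    Hypothetical helper: the positive shares the query's label and the
    negatives are drawn from titles with any other label.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for label, title in pairs:
        by_label[label].append(title)

    for label, query in pairs:
        positives = [t for t in by_label[label] if t != query]
        negatives = [
            t
            for other, titles in by_label.items()
            if other != label
            for t in titles
        ]
        # Skip queries that cannot form a complete triple.
        if not positives or len(negatives) < num_negatives:
            continue
        yield query, rng.choice(positives), rng.sample(negatives, num_negatives)


def batched(iterable, batch_size):
    """Group items into lists of batch_size; the last batch may be shorter."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


# Made-up (label, title) pairs for illustration only.
pairs = [
    (1, "how to parse json in python"),
    (1, "python json load error"),
    (2, "excel vba loop over rows"),
    (2, "vba copy range"),
    (3, "git rebase vs merge"),
    (3, "undo git commit"),
]

triples = list(iter_triples(pairs))
batches = list(batched(iter_triples(pairs), batch_size=4))
```

Fixing the seed keeps the sampled negatives reproducible between runs, which helps when comparing training runs.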

Models

Deep Structured Semantic Model

See xxx for details.

Integrations

stackoverflow dssm

This integration trains a similarity model over Stack Overflow titles. Here the "query" is a title. The positive sample is a title from the same class, and the negative samples are titles from classes other than the query's class.
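To show how such a triple is typically scored, here is a sketch of the softmax-over-cosine-similarity objective commonly used with DSSM-style models. It assumes the titles have already been encoded as plain vectors; the function names, the smoothing factor gamma, and the toy vectors are assumptions for illustration, and the model in this repository may be implemented differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dssm_loss(query, positive, negatives, gamma=10.0):
    """Negative log-probability of the positive under a softmax over similarities.

    gamma is an assumed smoothing factor, not a value taken from this repository.
    """
    sims = [cosine(query, positive)] + [cosine(query, n) for n in negatives]
    scaled = [gamma * s for s in sims]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_z - scaled[0]


# Toy 2-d "embeddings": the loss is small when the positive is the closest
# candidate to the query and large when a negative is closer.
good = dssm_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
bad = dssm_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
```

Minimizing this loss pushes the query's vector toward titles of the same class and away from titles of other classes.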
