init-random/neural_nlp

neural_nlp

Deep Neural Networks for natural language processing.

Project structure

This repository is organized into the following components:

  • datagen: handles data download and data iteration
  • models: generic model structures; not every hyperparameter is made generic, but these should be easily adjustable to your needs
  • integrations: a specific model applied to a specific dataset
  • neural network utilities: TODO explain utilities

These utility classes download the required data on first use and store a local cache on your system. The data is pre-processed and exposed through batch iteration over the dataset for use in neural networks.
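The download-and-cache behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `cached_download`, its `fetch` parameter, and the choice of file name from the URL are all assumptions made for the example.

```python
import urllib.request
from pathlib import Path


def cached_download(url, cache_dir, fetch=None):
    """Return the local path for `url`, downloading only on a cache miss.

    Hypothetical helper: `fetch` is an optional callable taking a URL and
    returning bytes, which lets the download step be replaced in tests.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the last URL path component is a usable file name.
    target = cache_dir / url.rstrip("/").rsplit("/", 1)[-1]
    if not target.exists():
        if fetch is None:
            # Default fetcher: plain download via the standard library.
            def fetch(u):
                with urllib.request.urlopen(u) as response:
                    return response.read()
        target.write_bytes(fetch(url))
    return target
```

A second call with the same URL and cache directory returns the existing file without contacting the network.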

The SGML files of this dataset (the split names match those of Reuters-21578) are parsed into JSON records, one per line. The fields currently exposed are title, text, topics, and mode. The available splits are modapte, lewissplit, cgisplit, and no split. See the README.txt in the download directory for more details. The value of mode is either train or test, depending on the chosen split. Topics may have zero or more values.
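As an illustration of the record format, the sketch below reads one-JSON-record-per-line input and filters on the mode field. The field names come from the description above; the helper `read_records` and the two sample records are made up for the example and are not taken from the dataset.

```python
import json


def read_records(lines, mode=None):
    """Yield parsed records, optionally keeping only those with the given mode."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if mode is None or record.get("mode") == mode:
            yield record


# Hypothetical sample lines; real records come from the parsed dataset.
sample = [
    '{"title": "A", "text": "aaa", "topics": ["grain", "wheat"], "mode": "train"}',
    '{"title": "B", "text": "bbb", "topics": [], "mode": "test"}',
]

train = list(read_records(sample, mode="train"))
```

Because `read_records` accepts any iterable of strings, an open file object can be passed in directly.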

This data comes from the paper "Short Text Clustering via Convolutional Neural Networks", presented at the NAACL 2015 VSM-NLP workshop. Please see http://naacl15vs.github.io/index.html and https://github.com/jacoxu/StackOverflow for details.

This class supports several types of iteration over the data:

  • straight iteration over (label, title) tuples
  • iteration of a "query," positive sample, and a set of negative examples; this is for input into the DSSM model described below
  • a batched version of the previous iteration
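The second and third iteration modes might look roughly like the sketch below. It is an assumption-laden illustration rather than the class's actual interface: `iter_triplets`, `iter_batches`, the `num_negatives` parameter, and the sample pairs are all hypothetical.

```python
import random
from collections import defaultdict


def iter_triplets(pairs, num_negatives, rng):
    """Yield (query, positive, negatives) built from (label, title) pairs."""
    by_label = defaultdict(list)
    for label, title in pairs:
        by_label[label].append(title)
    for label, query in pairs:
        # Positive candidates: other titles sharing the query's label.
        positives = [t for t in by_label[label] if t != query]
        # Negative candidates: titles from every other label.
        negatives = [t for other, titles in by_label.items()
                     if other != label for t in titles]
        if not positives or len(negatives) < num_negatives:
            continue  # skip queries that cannot form a full triplet
        yield query, rng.choice(positives), rng.sample(negatives, num_negatives)


def iter_batches(triplets, batch_size):
    """Group triplets into lists of at most `batch_size` items."""
    batch = []
    for triplet in triplets:
        batch.append(triplet)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


# Hypothetical (label, title) pairs for illustration only.
pairs = [
    ("python", "How to reverse a list"),
    ("python", "What is a decorator"),
    ("git", "Undo last commit"),
    ("git", "Rebase vs merge"),
    ("sql", "Join two tables"),
]
batches = list(iter_batches(iter_triplets(pairs, 2, random.Random(0)), 3))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when comparing training runs.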

Deep Structured Semantic Model

See xxx for details.

stackoverflow dssm

This integration builds a similarity model of StackOverflow titles. Here the "query" is a title. The positive sample is a title from the same class, and the negative samples are titles from classes other than the query's class.
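For orientation, the training objective commonly associated with DSSM is a softmax over scaled cosine similarities between the query and each candidate, with the loss being the negative log-probability of the positive sample. The sketch below works on plain Python lists and is not this repository's implementation; the function names, the `gamma` value, and the assumption that title vectors already exist are all illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dssm_loss(query, positive, negatives, gamma=10.0):
    """Negative log-probability of the positive under a softmax over candidates.

    `gamma` is a smoothing factor; the value 10.0 is an arbitrary assumption.
    """
    scores = [gamma * cosine(query, positive)]
    scores += [gamma * cosine(query, n) for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[0]


# Hypothetical 2-d title vectors; a real model would produce these.
good = dssm_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [0.0, -1.0]])
bad = dssm_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [0.0, -1.0]])
```

The loss is small when the positive title is closer to the query than every negative (`good`) and large when a negative is closer (`bad`).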
