Notebook for NLP Feature Engineering

Based on Alice Zheng’s great book on feature engineering. You can find Alice’s repo here.

This notebook uses the Yelp challenge dataset to illustrate feature extraction and engineering for natural language processing (NLP). A very simple logistic regression model is used to classify reviews by business category, i.e., ['Nightlife', 'Restaurants']. Methods used for tokenizing and featurizing text reviews are:

bag-of-words (word counting)
n-grams
term frequency-inverse document frequency (tf-idf) normalization
L2 normalization
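
The four methods above can be sketched in a few lines of plain Python. This is only an illustration and not the notebook's code: the toy reviews are invented, the helper names (`tokenize`, `ngrams`, `bag_of_words`, `tfidf`, `l2_normalize`) are hypothetical, and the tf-idf weighting uses one common textbook variant, idf = log(N / df), which may differ from what a given library computes.

```python
import math
from collections import Counter

# Toy reviews, invented for illustration (not taken from the Yelp dataset).
reviews = [
    "great cocktails and great music",
    "the pasta was great",
    "loud music and cheap cocktails",
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(text, max_n=1):
    """Count every 1-gram up to max_n-gram that occurs in the text."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

def tfidf(bags):
    """Weight raw counts by idf = log(N / df) (assumed variant, see above)."""
    n_docs = len(bags)
    # Iterating over a Counter yields each distinct term once per document,
    # so df[term] is the number of documents containing the term.
    df = Counter(term for bag in bags for term in bag)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in bag.items()}
        for bag in bags
    ]

def l2_normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm > 0 else dict(vec)

unigram_bags = [bag_of_words(r) for r in reviews]           # bag-of-words
bigram_bags = [bag_of_words(r, max_n=2) for r in reviews]   # unigrams + bigrams
tfidf_vectors = tfidf(unigram_bags)                         # tf-idf weighting
unit_vectors = [l2_normalize(v) for v in tfidf_vectors]     # L2 normalization
```

Each resulting dict maps a term to its feature value for one review. In practice these sparse vectors would be fed to a classifier such as the logistic regression model mentioned above.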

hrgentry/nlp-yelp
