-
Notifications
You must be signed in to change notification settings - Fork 0
Generates the subcategories of one word; eg. sports generates baseball, volleyball, etc...
License
johncadigan/CategoryGenerator
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Dependencies: Natural Language Toolkit + Wordnet data *JSON export: jstree library included in tree_js *DB export: sqlalchemy, extensions for desired db (eg. python-mysqldb) Program files: tree.py: an implementation of trees hyponym_generator.py: a generic generator for both json and db based branches js_hyponym_trees.py: inherits from tree.py and hyponym_generator.py and makes them export to json db_hyponym_trees.py: inherits tree.py and hyponym_generator.py and makes them export to a database Data files: whitelist.csv: a file to contain existing categories so that they can be marked with '@' during tree generation word-totals.txt: a file of unigrams occuring more than 40,000 times in the google unigram corpus Set-up: Install nltk and its dependencies with pip: $pip install -U numpy $pip install -U pyyaml $pip install -U nltk Download wordnet corpora: $python >>>import nltk >>>nltk.download() Download wordnet and wordnet_ic from the GUI interface Instructions: Run either js_hyponym_trees.py for json export or db_hyponym_trees.py for export to a db via sqlalchemy
About
Generates the subcategories of one word; eg. sports generates baseball, volleyball, etc...
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published