Company Description Classifier

Classifies companies into one of 41 industries based on a text description. Uses sentence-transformer embeddings and scikit-learn's NearestCentroid model.
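A minimal sketch of how such a pipeline could look, not the notebook's actual code. The function names are hypothetical, and the model name `all-MiniLM-L6-v2` is an assumption, since the README does not say which sentence-transformer model was used. The embedding function is passed in as a parameter so the classifier can be tried with any embedder.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid


def embed_with_sentence_transformers(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts into dense vectors (model_name is an assumed choice)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # downloads weights on first use
    return np.asarray(model.encode(list(texts)))


def fit_industry_classifier(texts, labels, embed=embed_with_sentence_transformers):
    """Fit NearestCentroid: one mean embedding (centroid) per industry label."""
    clf = NearestCentroid()
    clf.fit(embed(texts), list(labels))
    return clf


def predict_industry(clf, texts, embed=embed_with_sentence_transformers):
    """Assign each description to the industry with the closest centroid."""
    return list(clf.predict(embed(texts)))
```

With real data, `texts` would be the company descriptions and `labels` the 41 industry categories. NearestCentroid keeps only one vector per class, so training and prediction stay cheap even with many companies.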

The dataset is adapted from a 2013 dump of Crunchbase company info, which contained each company's name, industry category, and Crunchbase permalink. The short description for each company was then retrieved with the free Crunchbase API.

Written in Google Colab.