
nltk-tokenizer

Here are 34 public repositories matching this topic...
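For context, the topic name refers to NLTK's tokenization utilities. A minimal, hedged sketch of the kind of tokenization these repositories build on, using NLTK's Treebank tokenizer (chosen here because it needs no corpus download; the example sentence is illustrative, not from any listed repo):

```python
# Minimal sketch: tokenize a sentence with NLTK's Treebank word tokenizer.
# The Treebank tokenizer splits punctuation and contractions ("doesn't" -> "does", "n't").
from nltk.tokenize import TreebankWordTokenizer

tokens = TreebankWordTokenizer().tokenize("NLTK splits text, doesn't it?")
print(tokens)
```

`nltk.word_tokenize` adds sentence splitting on top of this, but requires downloading the `punkt` model first.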

Text Classification for Resumes: conducted exploratory data analysis (EDA) on a large collection of resumes, vectorized the text with Bag of Words (BoW) and TF-IDF, and built and evaluated multiple models, with Logistic Regression delivering the best performance. Visualized the data with word clouds and histograms.
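The repository's exact pipeline is not shown here; as a hedged sketch of the TF-IDF plus Logistic Regression approach it describes, using a hypothetical toy corpus of resume snippets:

```python
# Sketch only: TF-IDF vectorization feeding a Logistic Regression classifier.
# The resume snippets and labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

resumes = [
    "python machine learning pandas model training",
    "deep learning tensorflow neural networks",
    "ledger accounts payable reconciliation audit",
    "bookkeeping invoices tax accounting audit",
]
labels = ["data_science", "data_science", "accounting", "accounting"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(resumes, labels)
print(clf.predict(["tensorflow model training"])[0])  # -> data_science
```

Swapping `TfidfVectorizer` for `CountVectorizer` gives the plain Bag-of-Words variant of the same pipeline.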

  • Updated Feb 5, 2025
  • Jupyter Notebook

This repository features two Jupyter Notebooks: one for text preprocessing and one showcasing topic modeling with the OCTIS library, applying Latent Dirichlet Allocation (LDA) to the ChiLit Corpus. See the README for detailed instructions, citations, and links to the datasets and libraries used in this project.

  • Updated Jan 29, 2025
  • Jupyter Notebook
