A Modern C++ Data Sciences Toolkit
Updated Apr 17, 2023 - C++
Porter stemming library (C++)
Plagiarism detector in C++ for naive text file matching
A C++ implementation of the Aho-Corasick automaton, which can be applied to full-text indexing
RcppJagger is a wrapper package for Jagger
The app finds and lists the 10 longest letter combinations along with their frequency differences.
A text analysis tool for PDF files.
Example of cleaning unwanted symbols out of a text file
MinHash text analyzer developed for an Algorithmics course.
A C++ project implementing a self-balancing AVL tree for efficient word frequency counting. This program analyzes text files, finds unique words, tracks word occurrences, and prints results in alphabetical order for ease of viewing.
Markov chain N-gram text generator designed to run fast for large N. The goal is fast generation with 6-grams or more.
A case study for a word search application in text
A command-line tool that analyzes your text for undesired expressions in academic writing.
Frequency dictionary implementation based on a custom hash table
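Several of the projects listed above revolve around word frequency counting with alphabetically ordered output. As a rough illustration only, and not taken from any of the listed repositories, the idea can be sketched with the standard library. The sketch assumes std::map (an ordered, balanced-tree container) as a stand-in for a hand-written AVL tree or custom hash table; the function name count_words and the rule that a word is a run of ASCII letters are hypothetical choices made for this example.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Count word occurrences in a stream. std::map keeps its keys sorted,
// so iterating over the result visits words in alphabetical order.
// Assumption: a "word" is a maximal run of ASCII letters, lowercased.
// std::map stands in for a custom AVL tree; this is not any project's code.
std::map<std::string, int> count_words(std::istream& in) {
    std::map<std::string, int> freq;
    std::string word;
    char ch;
    while (in.get(ch)) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            ++freq[word];   // a non-letter ends the current word
            word.clear();
        }
    }
    if (!word.empty()) ++freq[word];  // flush the last word at end of input
    return freq;
}

// Convenience overload for in-memory text (hypothetical helper).
std::map<std::string, int> count_words(const std::string& text) {
    std::istringstream in(text);
    return count_words(in);
}
```

To print the results, iterate over the returned map and write each key with its count; the ordering comes for free from the container. A hash-based variant would swap in std::unordered_map and sort the keys before printing.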