Pyquantify is a command-line tool for semantic analysis of text. It uses natural language processing to analyze raw text, files, or websites, and produces metrics and visualizations for exploring the results.
- 📷 Demo screenshot
- 🎯 Features
- 🧰 Getting Started
- 🧰 Installation
- 📖 Usage Guide
- ❔ FAQ
- 💎 Acknowledgements
📷 Demo screenshot
🎯 Features
- Data Loading: Load text data from raw input, files, or websites with interactive prompts for user input.
- Metrics Generation: Calculate and display key metrics, including character count (with and without spaces), sentence count, word count, and paragraph count.
- Morphological Analysis: Generate a detailed table of word morphology, including word rank, original form, lemmatized form, part-of-speech (POS) tag, percentage occurrence, and count.
- Export Functionality: Optionally export generated metrics, frequency tables, and visualizations to files.
- Visualization:
  - Generate and visualize the frequency of the top 20 words in the text.
  - Create and display a word cloud visualization of processed text data.
- Interactive Commands: Utilize command-line interface commands for actions like displaying metrics, limiting results, searching for specific words, and generating visualizations.
- Summarize Text: Summarize text using a BERT Extractive Summarizer.
- Sentiment Analysis:
  - Perform sentiment analysis on the text.
  - Provides insights into sentiment polarity and subjectivity.
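To give a feel for what the metrics feature reports, here is a minimal standard-library sketch of the same five counts. It is an illustration only, not Pyquantify's actual implementation: the function name `text_metrics`, the dictionary keys, and the naive regex-based sentence and word splitting are all assumptions made for this example.

```python
import re


def text_metrics(text: str) -> dict:
    """Compute simple size metrics for a piece of text (illustrative only)."""
    # Paragraphs: blocks of text separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Sentences: naive split on whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words: runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    return {
        "characters_with_spaces": len(text),
        # "Without spaces" here means all whitespace removed, newlines included.
        "characters_without_spaces": len(re.sub(r"\s", "", text)),
        "sentences": len(sentences),
        "words": len(words),
        "paragraphs": len(paragraphs),
    }


sample = "Hello world. This is a test!\n\nSecond paragraph here."
metrics = text_metrics(sample)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

A real NLP pipeline would normally use a proper tokenizer and sentence segmenter, so counts from the tool may differ from this sketch on text with abbreviations, URLs, or unusual punctuation.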
🧰 Getting Started
If you are building from source, install the required dependencies first:
pip install -r requirements.txt
🧰 Installation
You can install the pyquantify package directly from PyPI using the following command:
pip install pyquantify
Alternatively, to build and install from source:
- Clone the project:
git clone <repository_url>
cd pyquantify
- Build the package:
python3 -m build
- Install the package:
pip install dist/*gz
- Run the tool in terminal:
pyquantify
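If you want to confirm from Python that the installation succeeded, the following sketch looks up the installed distribution's version using the standard library. The helper name `installed_version` is made up for this example, and the distribution name `pyquantify` is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


found = installed_version("pyquantify")
print("pyquantify version:", found if found else "not installed")
```

This only checks the active Python environment, so run it with the same interpreter you used for `pip install`.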
📖 Usage Guide
Pyquantify provides several commands for analyzing and visualizing text data. Below is a guide on how to use the key functionalities:
- Search for a Specific Word in Morphological Analysis:
pyquantify search-word --mode [raw/file/website] --word [desired_word]
--mode: Specify the data loading mode (raw input, file, or website).
--word: Specify the word you want to search for.
- Generate Word Frequency Plot:
pyquantify visualize --mode [raw/file/website] --freq-chart --export
--mode: Specify the data loading mode (raw input, file, or website).
--freq-chart: Flag to generate word frequency chart.
--export: Optional flag to export the frequency plot to a file.
- Generate Word Cloud:
pyquantify visualize --mode [raw/file/website] --wordcloud --export
--mode: Specify the data loading mode (raw input, file, or website).
--wordcloud: Flag to generate word cloud.
--export: Optional flag to export the word cloud to a file.
- Text Analysis and Metrics Generation:
pyquantify analyze --mode [raw/file/website] --n [number_of_rows] --export
--mode: Specify the data loading mode (raw input, file, or website).
--n: Optional parameter to display a specific number of rows in the analysis.
--export: Optional flag to export the analysis results to files.
- Summarize Text:
pyquantify summarize --mode [raw/file/website] --export
--mode: Specify the data loading mode (raw input, file, or website).
--export: Optional flag to export the summary to a file.
- Sentiment Analysis:
pyquantify sentiment-analysis --mode [raw/file/website] --export
--mode: Specify the data loading mode (raw input, file, or website).
--export: Optional flag to export the sentiment analysis results to a file.
- View the Pyquantify GitHub page:
pyquantify --git
Feel free to explore additional options and functionalities by checking the help documentation for each command:
pyquantify [command] --help
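To make the row limit (`--n`) and word search (`--word`) options above more concrete, here is a rough standard-library sketch of a ranked frequency table with a limit and a lookup. It is not Pyquantify's code: the names `frequency_table` and `search_word` are hypothetical, and the sketch skips lemmatization and POS tagging, which the real morphological table includes.

```python
import re
from collections import Counter


def frequency_table(text, limit=None):
    """Return rows of (rank, word, count, percentage), most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = [
        (rank, word, count, round(100 * count / total, 2))
        for rank, (word, count) in enumerate(ordered, start=1)
    ]
    # A limit plays the role of a "show only the first N rows" option.
    return rows if limit is None else rows[:limit]


def search_word(rows, word):
    """Return the table row for `word`, or None if it does not occur."""
    target = word.lower()
    return next((row for row in rows if row[1] == target), None)


text = "the cat sat on the mat and the dog sat too"
table = frequency_table(text)
top_two = frequency_table(text, limit=2)
print(top_two)
print(search_word(table, "Dog"))
```

Ranks here are simple positions in the sorted list, so tied words get consecutive ranks; the tool itself may rank ties differently.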
❔ FAQ
Pyquantify is a tool designed for in-depth analysis of textual data, focusing on extracting meaning and linguistic insights. It provides features like word frequency, morphology, and metrics generation, enhancing data exploration and visualization.
Pyquantify was created for the DSA subject in the fifth semester of college. The goal was to offer a versatile NLP tool, empowering users to analyze and profile text efficiently. The tool's features aim to deepen understanding and exploration of linguistic aspects within textual data.
Originally conceived as a word frequency counter, Pyquantify's development took a different direction. The decision to expand its capabilities was driven by the desire to create a more comprehensive tool for natural language processing. The project evolved to encompass semantic profiling, offering a richer set of features such as morphology analysis, metrics generation, and enhanced data visualization. This shift aimed to provide users with a more powerful and versatile solution for exploring and understanding textual data beyond simple word frequency analysis.
The earlier name, NLPFreq, felt limiting and didn't capture the full scope of the project. Pyquantify more accurately reflects its capabilities as a Python-based tool for quantitative data analysis.
Note: Pyquantify has been tested on Linux and works as expected there. On Windows Subsystem for Linux (WSL), some features may have limited functionality because the WSL environment lacks the complete Linux toolset.