This project demonstrates how to generate Word Clouds in Python using text data. A Word Cloud is a visual representation of text where the size of each word indicates its frequency or importance.
- Downloads datasets directly from Kaggle using kagglehub.
- Reads text files from the dataset for visualization.
- Generates a Word Cloud using the wordcloud library.
- Provides a fallback sample text if no dataset text file is found.
- Visualizes the Word Cloud using Matplotlib.
Dataset
The notebook uses the Kaggle dataset:
This dataset contains mask images and text files that can be used to generate customized word clouds.
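As a rough sketch of how a mask image could shape the cloud: the helper names `find_mask_images` and `masked_wordcloud` are hypothetical, and the list of image extensions is an assumption, since this README does not say how the dataset's files are named. The `mask` argument of `WordCloud` accepts an image array in which white areas are left empty.

```python
import os

# Assumption: masks are stored in common raster image formats.
MASK_EXTENSIONS = (".png", ".jpg", ".jpeg")


def find_mask_images(root):
    """Return sorted paths of image files found anywhere under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(MASK_EXTENSIONS):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def masked_wordcloud(text, mask_path):
    """Build a word cloud constrained to the shape stored in mask_path."""
    # Imported lazily so the file search above works without these packages.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    mask = np.array(Image.open(mask_path))
    # White areas of the mask stay empty; words are drawn everywhere else.
    return WordCloud(background_color="white", mask=mask).generate(text)
```

The object returned by `masked_wordcloud` can be passed to `plt.imshow` in the same way as an unmasked cloud.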
Installation
Make sure you have Python 3.x installed, then install the required libraries:
    pip install kagglehub wordcloud matplotlib
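To confirm the installation succeeded before running the notebook, a small check along these lines may help. It is only a sketch: `missing_packages` is a hypothetical helper, not part of the notebook, and it relies solely on the standard-library `importlib`.

```python
import importlib.util

# The three third-party packages named in the install command.
REQUIRED = ["kagglehub", "wordcloud", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```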
Usage
- Clone or download this notebook.
- Run the notebook in Google Colab or locally.
- The script will:
  - Download the dataset from Kaggle.
  - Search for .txt files in the dataset.
  - Load text data and generate a Word Cloud.
  - Display the Word Cloud using Matplotlib.
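The download, search, and load steps above could be sketched roughly as follows. The function names and `FALLBACK_TEXT` are illustrative assumptions rather than the notebook's actual code, and the dataset handle is left as a parameter because this README does not state it. `kagglehub.dataset_download` returns the local path of the downloaded files.

```python
import os

# Assumption: any short sample string can serve as the fallback text.
FALLBACK_TEXT = "Word clouds are great for visualizing text data and seeing frequent words."


def download_dataset(handle):
    """Download a Kaggle dataset and return its local directory."""
    import kagglehub  # imported lazily; only needed for the download step
    return kagglehub.dataset_download(handle)


def find_text_files(root):
    """Return sorted paths of every .txt file under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".txt"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def load_text(root, fallback=FALLBACK_TEXT):
    """Return the contents of the first non-empty .txt file, else the fallback."""
    for path in find_text_files(root):
        with open(path, encoding="utf-8", errors="ignore") as fh:
            text = fh.read().strip()
        if text:
            return text
    return fallback
```

The string returned by `load_text` can then be passed to `WordCloud(...).generate(...)` and displayed with Matplotlib.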
Example Code
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt

    # Sample text to visualize
    text_data = "Word clouds are great for visualizing text data and seeing frequent words."

    # Build the word cloud on a white 800x400 canvas
    wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text_data)

    # Render the image without axes
    plt.figure(figsize=(10, 5))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis('off')
    plt.show()
Output
The notebook generates a Word Cloud image where:
- Larger words = more frequent in the text.
- Smaller words = less frequent.
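To make the link between frequency and size concrete, here is a simplified illustration that counts words and scales each count against the most frequent word. It is not the wordcloud library's exact algorithm: the library also filters common stopwords by default and applies its own layout and scaling rules. The function name `relative_frequencies` is hypothetical.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its count divided by the highest count (0 to 1)."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}


if __name__ == "__main__":
    freqs = relative_frequencies("data data data cloud cloud word")
    # Words with a higher value would be drawn larger.
    for word, weight in sorted(freqs.items(), key=lambda item: -item[1]):
        print(f"{word}: {weight:.2f}")
```

A dictionary like this can also be handed to `WordCloud.generate_from_frequencies` when you want to control the weights yourself.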
Notes
- If no .txt file is found in the Kaggle dataset, the script uses a fallback sample text.
- You can replace the sample text with your own dataset or custom text.
Requirements
- Python 3.x
- Libraries: kagglehub, wordcloud, matplotlib (os and sys are part of the Python standard library and need no installation)
