A Python tool to analyze and summarize chat logs from .txt files using NLP techniques. Extracts key topics, computes message statistics, and generates summaries.
- Parse single or multiple chat log files
- Extract speaker-specific messages (User/AI)
- Identify main topics using TF-IDF and lemmatization
- Generate summary statistics (message counts, keywords)
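The keyword-extraction idea can be illustrated with a minimal sketch. It is a hand-rolled, standard-library-only stand-in and not the tool's actual implementation: the stop-word list, `tokenize`, and `top_keywords` are hypothetical names chosen for illustration, and lemmatization (which the tool does with NLTK) is left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the tool relies on NLTK's list).
STOP_WORDS = {"a", "an", "the", "is", "are", "me", "about", "what",
              "can", "you", "i", "it", "for", "to", "and", "of"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_keywords(messages, n=5):
    """Rank terms by their TF-IDF score summed over all messages."""
    docs = [tokenize(m) for m in messages]
    docs = [d for d in docs if d]  # skip messages with no usable tokens
    # Document frequency: in how many messages does each term appear?
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = Counter()
    for d in docs:
        for term, count in Counter(d).items():
            tf = count / len(d)
            idf = math.log((1 + len(docs)) / (1 + doc_freq[term])) + 1  # smoothed IDF
            scores[term] += tf * idf
    return [term for term, _ in scores.most_common(n)]


messages = [
    "Tell me about Python",
    "Python is great for data",
    "Can you use Python for AI?",
]
print(top_keywords(messages, n=2))
```

Each message is treated as its own document here, so a term that appears in every message gets a lower IDF weight but can still rank first when it is frequent enough overall.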
- Python 3.12.4
- pip 24.0+
python -m venv venv
# Windows:
venv\Scripts\activate
# Mac/Linux:
source venv/bin/activate
pip install -r requirements.txt
python -m nltk.downloader stopwords wordnet punkt
python ai_chat_summarize_for_single_txt_file.py
Output Example:
Total messages: 4
User messages: 2
AI messages: 2
Summary
- The conversation had 15 exchanges
- The user asked mainly about python and use
- Most common keywords: python, use, hi, tell, sure
python ai_chat_summarize_to_parse_all_txt_and_analysis.py
Output Example:
Total messages: 8
User messages: 4
AI messages: 4
Summary
The conversation had 26 exchanges
The user asked mainly about python and ai
Most common keywords: python, ai, data, hi, learn
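Message counts like those in the output above could be aggregated across several files roughly as follows. This is a sketch under assumptions, not the script's actual code: it assumes every message starts a line with `User:` or `AI:`, and `count_messages` and `format_counts` are hypothetical helper names.

```python
import re
from collections import Counter
from pathlib import Path

# Matches lines that open with a speaker label, e.g. "User: hello".
SPEAKER_LINE = re.compile(r"^(User|AI):", re.MULTILINE)


def count_messages(folder):
    """Count User and AI messages across every .txt file in `folder`."""
    totals = Counter()
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        totals.update(SPEAKER_LINE.findall(text))
    return totals


def format_counts(totals):
    """Render the counts in the same layout as the output example."""
    user, ai = totals["User"], totals["AI"]
    return f"Total messages: {user + ai}\nUser messages: {user}\nAI messages: {ai}"


if __name__ == "__main__":
    # "chat_log" is the input folder named in the project layout.
    print(format_counts(count_messages("chat_log")))
```

Files with other extensions are ignored by the `*.txt` glob, so notes or screenshots in the same folder do not affect the totals.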
jupyter notebook AI_Chat_Log_Summarizer_multiple_txt_parse.ipynb
- Create an assets/ folder: mkdir assets
- Save a screenshot (e.g., sample_output.png) in this folder
.
├── chat_log/ # Folder for input chat logs (.txt files)
├── venv/ # Virtual environment (ignored)
├── assets/ # For screenshots and images
├── .gitignore
├── requirements.txt
├── README.md
├── ai_chat_summarize_for_single_txt_file.py
├── ai_chat_summarize_to_parse_all_txt_and_analysis.py
└── AI_Chat_Log_Summarizer_multiple_txt_parse.ipynb
- Uses NLTK for tokenization and lemmatization
- TF-IDF vectorization for keyword extraction
- Regular expression pattern matching for message parsing:
PATTERN = r'(User|AI):\s*(.*?)(?=\n*User:|\n*AI:|$)'
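To show how such a pattern splits a log into speaker and message pairs, here is a minimal sketch using only the standard `re` module. It assumes the final alternative in the lookahead is the end-of-string anchor `$`; `parse_messages` and the sample text are hypothetical and exist only for illustration.

```python
import re

# Assumed form of the pattern: the last lookahead alternative is the anchor `$`.
PATTERN = r'(User|AI):\s*(.*?)(?=\n*User:|\n*AI:|$)'


def parse_messages(text):
    """Return (speaker, message) tuples in order of appearance."""
    return [(speaker, message.strip()) for speaker, message in re.findall(PATTERN, text)]


sample = (
    "User: Hi\n"
    "AI: Hello!\n"
    "User: Tell me about Python\n"
    "AI: Sure, Python is a programming language."
)

for speaker, message in parse_messages(sample):
    print(f"{speaker}: {message}")
```

Without flags, `.` does not match a newline, so a message that continues onto a second line is skipped by this sketch. Passing `re.DOTALL` to `re.findall` would let the lazy group span line breaks up to the next speaker label.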
- If you get NLTK errors, re-run:
python
>>> import nltk
>>> nltk.download('stopwords')
Repeat for the other required resources (wordnet and punkt).
- For virtual environment issues:
deactivate


