100% Free & Open-Source Meeting Transcription & Summarization
Never miss important details from your meetings again. Shiro transcribes and summarizes your meeting recordings using powerful local AI models and optional cloud summarization.
I was having weekly 1-on-1s with my manager, and every time I'd walk away feeling like I'd captured the key points. But inevitably, a few days later, I'd realize I'd missed something important—an action item, a deadline, or a nuanced piece of feedback.
I tried taking more detailed notes, but then I wasn't fully present in the conversation. I tried recording and rewatching, but who has time to watch an hour-long meeting again? Commercial transcription services were expensive, required a subscription, or weren't something I trusted with my work conversations.
So I built Shiro. It's a simple tool that:
- Extracts audio from meeting recordings (MKV, MP4, etc.)
- Transcribes using OpenAI's Whisper model running locally on your machine
- Summarizes using the Claude API to extract action items, decisions, and key discussion points
No subscriptions. No hidden fees. No data leaving your machine unless you want it to (for summarization). Just a tool built by a developer who was tired of missing details.
Privacy First: Your meeting recordings are sensitive. Shiro runs transcription entirely on your machine—your data never leaves your computer unless you explicitly choose to use the optional cloud summarization.
No Vendor Lock-In: No subscriptions, no credits, no usage limits. Install it once, use it forever.
Community-Driven: The best tools are built by communities. If Shiro helps you, consider contributing back—whether that's code, documentation, bug reports, or just spreading the word.
Transparency: You can see exactly what the code does. No black boxes, no telemetry, no surprises.
- 🎤 Local Speech-to-Text: Uses OpenAI Whisper for accurate transcription
- 🎬 Video Audio Extraction: Automatically extracts audio from MKV, MP4, and other video formats
- 📝 Multiple Output Formats: JSON (detailed), TXT (clean text), SRT (subtitles), Markdown
- 🧠 AI-Powered Summaries: Optional Claude API integration for intelligent meeting analysis
- ⚡ Smart Auto-Detection: Skips already-completed steps (audio extraction, transcription)
- 🔒 Privacy-Focused: All transcription happens on your machine
- 💰 100% Free: No subscriptions, no hidden fees, completely open-source
- 🎯 Action Item Extraction: Automatically identifies tasks, decisions, and follow-ups
- ⏱️ Word-Level Timestamps: Detailed timing information for every word
- 🔧 Automated Setup: One-command installation with automatic Python version management
# Clone the repository
git clone https://github.com/yourusername/shiro.git
cd shiro
# Run the automated installer (handles everything!)
chmod +x install.sh
./install.sh
The macOS installer automatically:
- ✅ Installs Homebrew (if needed)
- ✅ Detects and fixes Python version compatibility issues
- ✅ Installs pyenv and Python 3.12 if you have Python 3.14+
- ✅ Installs ffmpeg via Homebrew
- ✅ Sets up project-specific Python version
- ✅ Creates virtual environment
- ✅ Installs all dependencies
- ✅ Verifies installation
# Clone the repository
git clone https://github.com/yourusername/shiro.git
cd shiro
# Run the automated installer (handles everything!)
chmod +x install-linux.sh
./install-linux.sh
The Linux installer automatically:
- ✅ Detects your Linux distribution (Ubuntu, Debian, Fedora, Arch, etc.)
- ✅ Installs ffmpeg using your package manager
- ✅ Installs Python development headers
- ✅ Verifies Python version (3.10-3.13 required)
- ✅ Creates virtual environment
- ✅ Installs all dependencies
- ✅ Verifies installation
Supported distributions: Ubuntu, Debian, Fedora, RHEL, CentOS, Arch, Manjaro
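The version check the installers perform can be sketched in a few lines of Python. This is only an illustration of the 3.10-3.13 rule stated above, not the installers' actual code; the function name `is_supported_python` is hypothetical.

```python
import sys

# Supported range taken from the installer notes above: 3.10 through 3.13.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 13)


def is_supported_python(version=None):
    """Return True if the (major, minor) version falls in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = "%d.%d" % sys.version_info[:2]
    if is_supported_python():
        print(f"Python {current} is supported")
    else:
        print(f"Python {current} is outside the supported 3.10-3.13 range")
```

Running the file directly reports whether the interpreter you are using would pass the check.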
# Clone the repository
git clone https://github.com/yourusername/shiro.git
cd shiro
# Run the automated installer
install.bat
The Windows installer automatically:
- ✅ Verifies Python installation (3.10-3.13 required)
- ✅ Checks for ffmpeg (provides install instructions if missing)
- ✅ Creates virtual environment
- ✅ Installs all dependencies
- ✅ Verifies installation
Prerequisites for Windows:
- Python 3.10-3.13 from python.org (make sure to check "Add Python to PATH")
- ffmpeg, installed via one of:
  - Chocolatey: choco install ffmpeg
  - Scoop: scoop install ffmpeg
  - Or download from ffmpeg.org
macOS / Linux:
# Activate virtual environment
source venv/bin/activate
# Transcribe and summarize a meeting
python shiro.py meeting_recording.mkv
# Transcribe only (no summarization)
python shiro.py meeting_recording.mkv --no-summary
# Force re-processing (ignore cached files)
python shiro.py meeting_recording.mkv --force
Windows:
# Activate virtual environment
venv\Scripts\activate
# Transcribe and summarize a meeting
python shiro.py meeting_recording.mkv
# Transcribe only (no summarization)
python shiro.py meeting_recording.mkv --no-summary
# Force re-processing (ignore cached files)
python shiro.py meeting_recording.mkv --force
- Copy the example environment file:
cp .env.example .env
- (Optional) Add your Claude API key for summarization:
# Edit .env and add your key
ANTHROPIC_API_KEY=sk-ant-xxxxx
Note: Summarization is optional. Without an API key, Shiro still transcribes your meetings; you just won't get the AI-powered summary and action item extraction.
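To make the "optional" part concrete, here is a minimal sketch of how a script might read the key and decide whether to summarize. How Shiro actually loads .env is not described here, so both helpers (`parse_env_text`, `summarization_enabled`) are hypothetical and only illustrate the behavior described in the note.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip('"').strip("'")
    return values


def summarization_enabled(env=None, no_summary=False):
    """Summarize only if --no-summary was not given and a key is present."""
    if env is None:
        env = os.environ
    if no_summary:
        return False
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    return bool(key)
```

With an empty or missing key the second helper returns False, which matches the transcription-only behavior described in the note.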
┌─────────────────┐
│ Meeting Video │
│ (MKV/MP4) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Audio Extractor │ ──▶ output/meeting_audio.wav
│ (ffmpeg) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Transcriber │ ──▶ output/meeting_transcript.json
│ (Whisper Local) │ ──▶ output/meeting_transcript.txt
└────────┬────────┘ ──▶ output/meeting_transcript.srt
│
▼
┌─────────────────┐
│ Summarizer │ ──▶ output/meeting_summary.md
│ (Claude API) │ ──▶ output/meeting_summary.json
└─────────────────┘
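The first stage of the diagram can be sketched as an ffmpeg call that writes 16 kHz mono WAV. The flags used (`-vn`, `-ac`, `-ar`, `-c:a pcm_s16le`) are standard ffmpeg options, but the function names and the choice of 16-bit PCM are assumptions for illustration, not necessarily what src/audio_extractor.py does.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, wav_path):
    """Build an ffmpeg argument list that writes 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(video_path),
        "-vn",                # drop the video stream
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM
        str(wav_path),
    ]


def extract_audio(video_path, wav_path):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    Path(wav_path).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, wav_path), check=True)
```

Keeping the command builder separate from the code that runs it makes the arguments easy to inspect or log without invoking ffmpeg.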
- Audio Extraction (src/audio_extractor.py)
  - Converts video to 16 kHz mono WAV using ffmpeg
  - Optimized format for speech recognition
  - Smart skip: Won't re-extract if audio file exists
- Transcription (src/transcriber.py)
  - Uses OpenAI Whisper (medium model by default)
  - Runs entirely on your local machine
  - Generates word-level timestamps
  - Outputs: JSON (detailed), TXT (clean), SRT (subtitles)
  - Smart skip: Won't re-transcribe if transcript exists
- Summarization (src/summarizer.py)
  - Optional Claude API integration
  - Extracts: executive summary, discussion points, decisions, action items
  - Cost: ~$0.15 per hour-long meeting
  - Outputs: Markdown (readable), JSON (structured data)
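As an illustration of the SRT output from the transcription stage, the sketch below turns timestamped segments into subtitle text. It assumes segments shaped like Whisper's result entries (dictionaries with `start`, `end`, and `text`); the function names are hypothetical and this is not taken from src/transcriber.py.

```python
def format_srt_time(seconds):
    """Format seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments (dicts with start, end, text) as SRT subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Joining with a newline leaves one blank line between subtitle blocks.
    return "\n".join(blocks)
```

Each block is a running index, a time range, and the spoken text, which is the layout subtitle players expect.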
Shiro intelligently skips completed steps:
# First run: Full pipeline (~10 minutes)
python shiro.py meeting.mkv
# ▶ Extracting audio...
# ▶ Transcribing audio...
# ▶ Generating summary...
# Second run: Only new summary (~30 seconds)
python shiro.py meeting.mkv
# ⏭️ Skipping audio extraction (file already exists)
# ⏭️ Skipping transcription (file already exists)
# ▶ Generating summary...
# Force complete re-processing
python shiro.py meeting.mkv --force
After processing meeting.mkv, you'll find:
output/
├── meeting_audio.wav # Extracted audio (16kHz mono)
├── meeting_transcript.json # Full transcript with timestamps
├── meeting_transcript.txt # Clean text transcript
├── meeting_transcript.srt # Subtitle file
├── meeting_summary.md # Human-readable summary
└── meeting_summary.json # Structured summary data
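The file naming above, together with the skip behavior for audio extraction and transcription, can be sketched as follows. The helper names (`output_paths`, `should_run`) are hypothetical, and the rule "run when forced or when the output is missing" is an assumption based on the described behavior rather than Shiro's actual implementation.

```python
from pathlib import Path


def output_paths(video_file, output_dir="output"):
    """Map a video file to the output files shown in the listing above."""
    stem = Path(video_file).stem
    out = Path(output_dir)
    return {
        "audio": out / f"{stem}_audio.wav",
        "transcript_json": out / f"{stem}_transcript.json",
        "transcript_txt": out / f"{stem}_transcript.txt",
        "transcript_srt": out / f"{stem}_transcript.srt",
        "summary_md": out / f"{stem}_summary.md",
        "summary_json": out / f"{stem}_summary.json",
    }


def should_run(target, force=False):
    """A step runs when forced or when its output file does not exist yet."""
    return force or not Path(target).exists()
```

In this sketch the check would apply to the audio and transcript files only, since the example runs show the summary being regenerated each time.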
python shiro.py <video_file> [options]
Required:
video_file Path to video file (MKV, MP4, etc.)
Optional:
--no-summary Skip summarization (transcription only)
--skip-extraction Skip audio extraction step
--force Force re-processing (ignore cached files)
--whisper-model SIZE Whisper model size (tiny/base/small/medium/large)
--language CODE Language code (en, es, fr, etc.)
--meeting-context TEXT Additional context for summarization
# Transcribe Spanish meeting
python shiro.py meeting.mkv --language es
# Use larger model for better accuracy (slower)
python shiro.py meeting.mkv --whisper-model large
# Transcribe only, no summary
python shiro.py meeting.mkv --no-summary
# Add context for better summarization
python shiro.py meeting.mkv --meeting-context "Weekly sprint planning"

| Task | Duration (1-hour meeting) |
|---|---|
| Audio Extraction | ~30 seconds |
| Transcription (medium) | ~8-10 minutes |
| Transcription (large) | ~15-20 minutes |
| Summarization | ~10-30 seconds |
| Total | ~10-15 minutes |

| Model | Speed | Accuracy | VRAM |
|---|---|---|---|
| tiny | Very Fast | Good | ~1 GB |
| base | Fast | Good | ~1 GB |
| small | Medium | Better | ~2 GB |
| medium | Slower | Great (default) | ~5 GB |
| large | Slowest | Best | ~10 GB |
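If you are unsure which model to pass to --whisper-model, the table can be turned into a small lookup. This is a rough sketch using the approximate VRAM figures above; the function name `largest_model_that_fits` is hypothetical and not part of Shiro.

```python
# Approximate VRAM needs in GB, copied from the table above.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def largest_model_that_fits(available_gb):
    """Return the most accurate model whose VRAM estimate fits, or None."""
    best = None
    for name, needed_gb in MODEL_VRAM_GB:  # ordered from least to most accurate
        if needed_gb <= available_gb:
            best = name
    return best
```

For example, a machine with about 8 GB free would get medium, which is also the default.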
- Claude 3.5 Sonnet: ~$0.10-0.20 per hour-long meeting
- Alternative: Skip summarization entirely (free) and summarize the .txt transcript yourself, for example by uploading it to the free tier of ChatGPT.
Problem: Python 3.14.0 is too new!
Solution: The installer automatically handles this! It will:
- Install pyenv (if needed)
- Install Python 3.12
- Set Shiro to use Python 3.12 automatically
If you still see this error:
# Manually activate pyenv and retry
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
./install.sh
Problem: Audio extraction failed: ffmpeg not found
Solution:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt-get install ffmpeg
Problem: System runs out of memory with large model
Solution: Use a smaller Whisper model
python shiro.py meeting.mkv --whisper-model small
Problem: Your credit balance is too low to access the Anthropic API
Solution: Either:
- Add credits to your Anthropic account at https://console.anthropic.com
- Skip summarization:
python shiro.py meeting.mkv --no-summary
Problem: Shiro re-processes everything even when files exist
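Solution: Skipping completed steps is the default behavior, so make sure you are not passing --force and that the files from the earlier run are still in output/. To force re-processing on purpose:
python shiro.py meeting.mkv --force
Never commit your .env file to Git! The .gitignore file already excludes it, but double-check: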
# Verify .env is not tracked
git status
# If .env appears, remove it immediately
git rm --cached .env
- Use environment variables (already configured)
- Rotate keys regularly at https://console.anthropic.com
- Set usage limits in Anthropic dashboard
- Never share your .env file
- Transcription happens entirely on your machine—no data sent anywhere
- Summarization sends transcript text to Claude API (opt-in)
- Meeting recordings never leave your machine
- No telemetry or usage tracking of any kind
shiro/
├── shiro.py # Main orchestration script
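├── install.sh # Automated installation script (macOS)
├── install-linux.sh # Automated installation script (Linux)
├── install.bat # Automated installation script (Windows)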
├── requirements.txt # Python dependencies
├── .env.example # Environment configuration template
├── .gitignore # Git ignore rules
├── LICENSE # MIT License
│
├── src/
│ ├── __init__.py
│ ├── audio_extractor.py # Audio extraction from video (ffmpeg)
│ ├── transcriber.py # Speech-to-text (Whisper)
│ └── summarizer.py # AI summarization (Claude)
│
├── output/ # Generated files (git-ignored)
│ ├── *_audio.wav
│ ├── *_transcript.json
│ ├── *_transcript.txt
│ ├── *_transcript.srt
│ ├── *_summary.md
│ └── *_summary.json
│
└── venv/ # Python virtual environment (git-ignored)
Contributions are welcome! This project is open-source because the best tools are built by communities.
- Fork the repository
# Click "Fork" on GitHub, then:
git clone https://github.com/YOUR_USERNAME/shiro.git
cd shiro
- Create a feature branch
git checkout -b feature/your-feature-name
- Make your changes
  - Write clean, documented code
  - Follow existing code style
  - Test your changes thoroughly
- Commit and push
git add .
git commit -m "Add: your feature description"
git push origin feature/your-feature-name
- Open a Pull Request
  - Describe what your PR does
  - Reference any related issues
  - Be responsive to feedback
Q: Does this work on Windows?
A: Yes. Run install.bat as described in the Windows installation steps; you need Python 3.10-3.13 and ffmpeg installed first.
Q: Can I use it without an API key?
A: Yes! Transcription works completely offline. You only need an API key for optional summarization.
Q: Is my data private?
A: Transcription happens 100% locally. If you use summarization, only the transcript text is sent to the Claude API.
Q: What languages are supported?
A: Whisper supports 99 languages. Use --language <code> to specify (e.g., --language es for Spanish).
Q: Can I use a different summarization API?
A: Yes! The code is modular. You can easily swap out src/summarizer.py for OpenAI, Gemini, or local models.
Q: Why "Shiro"?
A: Shiro (白) means "white" or "pure" in Japanese, representing the project's focus on transparency and simplicity.
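On swapping the summarization API, as mentioned in the FAQ: one way to keep the backend replaceable is to treat it as a plain function from prompt to text. The sketch below is hypothetical and does not reflect the real layout of src/summarizer.py; `build_prompt`, `summarize`, and the backend signature are assumptions for illustration.

```python
from typing import Callable

# A backend is any callable that takes a prompt string and returns text,
# for example a thin wrapper around Claude, OpenAI, Gemini, or a local model.
Backend = Callable[[str], str]


def build_prompt(transcript: str, meeting_context: str = "") -> str:
    """Assemble the summarization prompt from the transcript and context."""
    parts = [
        "Summarize this meeting transcript. List the executive summary, "
        "discussion points, decisions, and action items."
    ]
    if meeting_context:
        parts.append(f"Meeting context: {meeting_context}")
    parts.append(f"Transcript:\n{transcript}")
    return "\n\n".join(parts)


def summarize(transcript: str, backend: Backend, meeting_context: str = "") -> str:
    """Send the prompt to whichever backend was supplied."""
    return backend(build_prompt(transcript, meeting_context))
```

Swapping providers then means writing a new backend function, while the prompt construction and the rest of the pipeline stay unchanged.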
MIT License - see LICENSE file for details.
TL;DR: You can use, modify, and distribute this software freely, even commercially. Just keep the copyright notice.
Built with ❤️ by Ir0nByte, a developer tired of missing meeting details.
If Shiro saves you time, consider:
- ⭐ Starring the repo
- 🐛 Reporting bugs
- 💡 Suggesting features
- 🔧 Contributing code
- 📢 Sharing with others
Let's build better tools, together.