Your personal AI agent, running on your machine, controlled from Telegram.
Telegram ──▶ Grammy Bot ──▶ Claude Agent SDK ──▶ Your Machine
voice/text   command router agentic runtime      bash, files, code

Discord ──▶ discord.js ──▶ Claude Agent SDK ──▶ Your Machine
voice/text  slash commands agentic runtime      bash, files, code
Claudegram bridges Telegram and Discord to a full Claude Code agent running locally on your machine. Send a message — Claude reads your files, runs commands, writes code, browses Reddit, fetches Medium articles, extracts media from YouTube/Instagram/TikTok, transcribes voice notes, and speaks responses back. All from your phone.
This is not a simple API wrapper. It's the real Claude Code agent with tool access — Bash, file I/O, code editing, web browsing — packaged behind a Telegram and Discord interface with streaming responses, session memory, and rich output formatting.
- Full Claude Code with tool access (Bash, Read, Write, Edit, Glob, Grep)
- Session resume across messages — Claude keeps the conversation context from earlier messages
- Project-based working directories with interactive picker
- Streaming responses with live-updating messages
- Model picker: Sonnet / Opus / Haiku
- Plan mode, explore mode, loop mode
- Teleport sessions to terminal (/teleport)
- /extract — extract content from YouTube, Instagram, and TikTok
- Pull transcripts (plain text, SRT, or VTT subtitles)
- Download audio (MP3) or video (MP4)
- Groq Whisper transcription for videos without subtitles
- Cookie support for age-restricted / private content
- Proxy fallback for IP-blocked platforms
- SSRF protection (blocks private/internal hosts)
- /reddit — posts, subreddits, user profiles with sorting & time filters
- /vreddit — download and send Reddit-hosted videos
- Native TypeScript Reddit API client (no external Python dependency)
- Auto-compression for videos > 50 MB (two-pass encoding)
- Large threads auto-export to JSON
- /medium — fetch paywalled articles via Freedium
- Telegraph Instant View, save as Markdown, or both
- Pure TypeScript, no Python/Playwright needed
- Send a voice note → transcribed via Groq Whisper → fed to Claude
- /transcribe — standalone transcription (reply-to or prompt)
- Audio file transcription (MP3, WAV, FLAC, OGG)
- Large file chunking for files exceeding Groq limits
- /tts — agent responses spoken back as Telegram voice notes
- Groq Orpheus (default): 6 voices — autumn, diana, hannah, austin, daniel, troy
- OpenAI TTS: 13 voices — alloy, ash, ballad, cedar, coral, echo, fable, marin, nova, onyx, sage, shimmer, verse
- Speed adjustment (0.25x – 4.0x), tone instructions (gpt-4o-mini-tts)
- MarkdownV2 formatting with automatic escaping
- Telegraph Instant View for long responses
- Smart chunking that preserves code blocks
- ForceReply interactive prompts for multi-step commands
- Inline keyboards for settings (model, mode, TTS, clear)
- Terminal UI mode with animated spinners and tool status
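Telegram's MarkdownV2 mode rejects a message if any reserved character appears unescaped, which is why outgoing text needs automatic escaping. The following is a minimal sketch of that step, not the project's actual code: `escapeMarkdownV2` is a hypothetical name, and a real implementation would also need to leave intentional formatting and code blocks untouched.

```typescript
// Characters that Telegram's MarkdownV2 mode treats as reserved,
// plus the backslash itself so literal backslashes survive.
const MARKDOWN_V2_RESERVED = /[_*\[\]()~`>#+\-=|{}.!\\]/g;

// Prefix every reserved character with a backslash so it renders literally.
function escapeMarkdownV2(text: string): string {
  return text.replace(MARKDOWN_V2_RESERVED, "\\$&");
}

console.log(escapeMarkdownV2("Built v1.2.0 (fast!)"));
// prints: Built v1\.2\.0 \(fast\!\)
```

Escaping everything is the safe default for plain text such as file paths or error messages, where a stray `.` or `-` would otherwise make the whole message fail to send.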
- Send photos or image documents in chat
- Saved to project under .claudegram/uploads/
- Claude is notified with path + caption for visual context
- Full slash command parity with Telegram
- Gemini Live real-time voice channel conversations (Google 2.5-flash)
- Built-in Google Search, translation, and utility tools in voice
- Factory Droid integration via /droid
- Streaming responses with configurable debounce
- Node.js 18+ with npm
- Claude Code CLI — installed and authenticated (claude in your PATH)
- Telegram bot token — from @BotFather
- Your Telegram user ID — from @userinfobot
git clone https://github.com/lliWcWill/claudegram.git
cd claudegram
cp .env.example .env
Edit .env:
TELEGRAM_BOT_TOKEN=your_bot_token
ALLOWED_USER_IDS=your_user_id
npm install
npm run dev # dev mode with hot reload
Open your bot in Telegram → /start
- /start — Welcome message and getting started guide
- /project — Set working directory (interactive folder browser)
- /newproject <name> — Create and switch to a new project
- /clear — Clear conversation + session (with confirmation)
- /status — Current session info (model, session ID, created date)
- /sessions — List all saved sessions with restore options
- /resume — Pick from recent sessions via inline keyboard
- /continue — Resume most recent session instantly
- /teleport — Fork session to terminal (get a CLI command to continue in your shell)
- /plan <task> — Plan mode for complex, multi-step tasks
- /explore <question> — Explore codebase to answer questions
- /loop <task> — Run iteratively until task complete (max iterations configurable)
- /model — Switch between Sonnet / Opus / Haiku
- /mode — Toggle streaming (live updates) vs. wait (single message)
- /context — Show Claude context window / token usage breakdown
- /extract <url> — Extract content from YouTube, Instagram, or TikTok
  - Text — downloads subtitles or transcribes audio via Groq Whisper
  - Audio — downloads and sends MP3
  - Video — downloads and sends MP4 (compressed if > 50 MB)
  - All — transcript + audio + video
  - Supports subtitle format selection (plain text, SRT, VTT)
- /reddit <target> — Fetch Reddit content
  - Targets: post URL, post ID, r/<subreddit>, u/<username>, share links
  - Flags: --sort <hot|new|top|rising>, --limit <n>, --time <day|week|month|year|all>, --depth <n>, -f <markdown|json>
- /vreddit <url> — Download Reddit-hosted videos (DASH + ffmpeg)
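The /reddit flags listed above can be parsed with a small hand-written loop. This is only a sketch of one way to do it: `parseRedditArgs`, `RedditArgs` and the error messages are hypothetical and not taken from redditfetch.ts, and the sketch assumes the target is the single non-flag token.

```typescript
type RedditSort = "hot" | "new" | "top" | "rising";
type RedditTime = "day" | "week" | "month" | "year" | "all";
type RedditFormat = "markdown" | "json";

interface RedditArgs {
  target: string;
  sort?: RedditSort;
  limit?: number;
  time?: RedditTime;
  depth?: number;
  format?: RedditFormat;
}

// Return `value` typed as one of `allowed`, or throw a descriptive error.
function oneOf<T extends string>(flag: string, value: string, allowed: readonly T[]): T {
  if ((allowed as readonly string[]).indexOf(value) === -1) {
    throw new Error(`${flag} must be one of: ${allowed.join(", ")}`);
  }
  return value as T;
}

// Parse a strictly positive decimal integer.
function positiveInt(flag: string, value: string): number {
  if (!/^\d+$/.test(value) || Number(value) === 0) {
    throw new Error(`${flag} expects a positive integer`);
  }
  return Number(value);
}

// Parse the text that follows "/reddit" into a typed argument object.
function parseRedditArgs(input: string): RedditArgs {
  const tokens = input.trim().split(/\s+/).filter((t) => t !== "");
  const args: RedditArgs = { target: "" };
  for (let token = tokens.shift(); token !== undefined; token = tokens.shift()) {
    if (token.charAt(0) !== "-") {
      if (args.target !== "") throw new Error(`Unexpected extra argument: ${token}`);
      args.target = token;
      continue;
    }
    const value = tokens.shift(); // every flag takes exactly one value
    if (value === undefined) throw new Error(`Missing value for ${token}`);
    switch (token) {
      case "--sort":
        args.sort = oneOf(token, value, ["hot", "new", "top", "rising"] as const);
        break;
      case "--time":
        args.time = oneOf(token, value, ["day", "week", "month", "year", "all"] as const);
        break;
      case "-f":
        args.format = oneOf(token, value, ["markdown", "json"] as const);
        break;
      case "--limit":
        args.limit = positiveInt(token, value);
        break;
      case "--depth":
        args.depth = positiveInt(token, value);
        break;
      default:
        throw new Error(`Unknown flag: ${token}`);
    }
  }
  if (args.target === "") throw new Error("Missing target");
  return args;
}
```

For example, `parseRedditArgs("r/typescript --sort top --time week --limit 5")` yields a target of `r/typescript` with `sort`, `time` and `limit` set, and leaves `depth` and `format` undefined so the configured defaults can apply.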
- /medium <url> — Fetch Medium articles via Freedium
  - Choose: Telegraph Instant View, Markdown file, or both
- /tts — Toggle voice replies on/off, pick voice and provider
- /transcribe — Transcribe audio to text (reply to a voice note or send one after)
- Send voice note — Auto-transcribed and fed to Claude as context
- Send audio file — Auto-transcribed (MP3, WAV, FLAC, OGG supported)
- /file <path> — Download a project file to Telegram
- /telegraph — View Markdown content as a Telegraph Instant View page
- /terminalui — Toggle terminal-style UI (animated spinners, tool status display)
- /ping — Health check (bypasses queue)
- /context — Show Claude context / token usage
- /botstatus — Bot process status (uptime, memory, CPU)
- /restartbot — Restart the bot (with confirmation)
- /cancel — Cancel current request (bypasses queue)
- /softreset — Cancel + clear session
- /commands — Show all available commands
Media Extraction — /extract
Extracts text transcripts, audio, and video from YouTube, Instagram, and TikTok.
System requirements: yt-dlp, ffmpeg, ffprobe
# Install on Debian/Ubuntu
sudo apt install ffmpeg
pip install yt-dlp
# .env (optional)
YTDLP_COOKIES_PATH=/path/to/cookies.txt # for age-restricted content
EXTRACT_TRANSCRIBE_TIMEOUT_MS=180000 # timeout per chunk (ms)
Cookie files enable access to age-restricted YouTube, private Instagram accounts, and mature TikTok content. Export from your browser using the "Get cookies.txt LOCALLY" extension.
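To illustrate how the cookie file and the extraction modes could translate into a yt-dlp invocation, here is a sketch that only builds the argument list. The function name, the option shape and the exact flag selection are assumptions for illustration; extract.ts may invoke yt-dlp differently. The flags themselves (`--cookies`, `--extract-audio`, `--audio-format`, `--write-subs`, `--write-auto-subs`, `--sub-format`, `--skip-download`, `--merge-output-format`, `--no-playlist`, `-o`) are standard yt-dlp options.

```typescript
type ExtractMode = "text" | "audio" | "video";
type SubtitleFormat = "srt" | "vtt";

interface ExtractOptions {
  url: string;
  mode: ExtractMode;
  outputTemplate: string;          // passed to yt-dlp's -o option
  cookiesPath?: string;            // e.g. the value of YTDLP_COOKIES_PATH, if set
  subtitleFormat?: SubtitleFormat; // only used in "text" mode
}

// Build the yt-dlp argument list for one extraction request (hypothetical helper).
function buildYtDlpArgs(opts: ExtractOptions): string[] {
  const args: string[] = ["--no-playlist", "-o", opts.outputTemplate];
  if (opts.cookiesPath) {
    args.push("--cookies", opts.cookiesPath);
  }
  switch (opts.mode) {
    case "text":
      // Fetch subtitles only; uploaded subs first, auto-generated as a fallback.
      args.push("--skip-download", "--write-subs", "--write-auto-subs");
      args.push("--sub-format", opts.subtitleFormat ?? "vtt");
      break;
    case "audio":
      args.push("--extract-audio", "--audio-format", "mp3");
      break;
    case "video":
      args.push("--merge-output-format", "mp4");
      break;
  }
  args.push(opts.url);
  return args;
}
```

Returning an array rather than a shell string means the result can be handed to a process spawner without shell quoting, which matters because the URL comes from a chat message.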
Reddit — /reddit & /vreddit
Native TypeScript Reddit API client. Create a "script" app at reddit.com/prefs/apps.
# .env
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USERNAME=your_bot_account
REDDIT_PASSWORD=your_bot_password
Note: Reddit's script-app OAuth requires the actual account password. Use a dedicated bot account — not your personal Reddit credentials.
Video downloads (/vreddit) need ffmpeg and ffprobe on your PATH. Videos over 50 MB are automatically compressed before sending to Telegram.
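Two-pass compression to a size cap comes down to choosing a video bitrate from the clip duration. The sketch below shows that arithmetic and the matching ffmpeg arguments. It is an assumption about how such compression can be done, not a copy of vreddit.ts: the helper names, the 128 kbps audio budget, the 5% headroom and the libx264/aac codecs are illustrative choices.

```typescript
// Video bitrate (kbps) that fits `targetSizeMb` for a clip of `durationSec`,
// after reserving room for audio and some container overhead.
function targetVideoBitrateKbps(
  durationSec: number,
  targetSizeMb = 50,
  audioKbps = 128,
  headroom = 0.95,
): number {
  if (!(durationSec > 0)) throw new Error("durationSec must be positive");
  const totalKbits = targetSizeMb * 8 * 1000 * headroom; // 1 MB = 8000 kilobits
  const videoKbps = Math.floor(totalKbits / durationSec) - audioKbps;
  if (videoKbps <= 0) throw new Error("clip is too long for the target size");
  return videoKbps;
}

// Argument lists for the analysis pass and the encoding pass.
function twoPassFfmpegArgs(
  input: string,
  output: string,
  videoKbps: number,
  audioKbps = 128,
): { pass1: string[]; pass2: string[] } {
  const common = ["-y", "-i", input, "-c:v", "libx264", "-b:v", `${videoKbps}k`];
  return {
    // Pass 1 only gathers statistics: no audio, output discarded (POSIX null device).
    pass1: common.concat(["-pass", "1", "-an", "-f", "null", "/dev/null"]),
    pass2: common.concat(["-pass", "2", "-c:a", "aac", "-b:a", `${audioKbps}k`, output]),
  };
}

console.log(targetVideoBitrateKbps(600)); // prints 505 for a 10-minute clip
```

The clip duration would typically come from ffprobe, which is one reason it is listed as a requirement alongside ffmpeg.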
Medium — /medium
Pure TypeScript via Freedium mirror — no extra dependencies.
# .env (optional tuning)
FREEDIUM_HOST=freedium-mirror.cfd
MEDIUM_TIMEOUT_MS=15000
Voice Transcription — Groq Whisper
# .env
GROQ_API_KEY=your_groq_key
GROQ_TRANSCRIBE_PATH=/absolute/path/to/groq_transcribe.py
Used for voice note transcription, /transcribe command, and /extract text mode (when videos lack subtitles).
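Audio files that exceed the transcription size limit have to be split before upload. One simple way to plan the split is sketched below, assuming a roughly constant bitrate so that equal time slices give roughly equal file sizes. `planAudioChunks` and `Segment` are hypothetical names, the size limit is passed in by the caller, and the project's actual chunking strategy may differ.

```typescript
interface Segment {
  startSec: number;
  endSec: number;
}

// Split a recording into the smallest number of equal-length time segments
// such that each segment is expected to stay under `maxChunkBytes`.
function planAudioChunks(
  durationSec: number,
  fileSizeBytes: number,
  maxChunkBytes: number,
): Segment[] {
  if (!(durationSec > 0) || !(fileSizeBytes > 0) || !(maxChunkBytes > 0)) {
    throw new Error("duration, file size and chunk limit must all be positive");
  }
  const count = Math.ceil(fileSizeBytes / maxChunkBytes);
  const step = durationSec / count;
  const segments: Segment[] = [];
  for (let i = 0; i < count; i++) {
    // Pin the final segment to the exact end to avoid floating-point drift.
    const endSec = i === count - 1 ? durationSec : (i + 1) * step;
    segments.push({ startSec: i * step, endSec });
  }
  return segments;
}
```

For a 600-second, 50 MB file and an illustrative 19 MB limit this plans three 200-second segments. Each segment could then be cut with ffmpeg and transcribed separately, with the per-chunk timeout applying to each one.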
Text-to-Speech
Two providers available:
Groq Orpheus (default, faster):
# .env
TTS_PROVIDER=groq
# Reuses GROQ_API_KEY from above
TTS_VOICE=troy # autumn, diana, hannah, austin, daniel, troy
TTS_SPEED=1.5
OpenAI TTS (more voices, tone instructions):
# .env
TTS_PROVIDER=openai
OPENAI_API_KEY=your_openai_key
TTS_MODEL=gpt-4o-mini-tts
TTS_VOICE=coral # alloy, ash, ballad, cedar, coral, echo, fable, marin, nova, onyx, sage, shimmer, verse
TTS_SPEED=1.0
TTS_INSTRUCTIONS="Speak in a friendly, natural conversational tone."
Discord Bot
Run the Discord bot alongside or instead of Telegram.
# .env
DISCORD_BOT_TOKEN=your_discord_bot_token
DISCORD_APPLICATION_ID=your_app_id
DISCORD_GUILD_ID=your_guild_id # optional, for instant slash command updates
DISCORD_ALLOWED_USER_IDS=your_discord_id
Discord-exclusive features:
- Gemini Live — real-time voice channel conversations via Google 2.5-flash
- Factory Droid — /droid for autonomous coding sprints
- Voice tools — Google Search, translation, dice, coin flip, math in voice
Requires a Discord application with MESSAGE_CONTENT privileged intent enabled.
All config lives in .env. See .env.example for the full annotated reference.
- TELEGRAM_BOT_TOKEN — Bot token from @BotFather
- ALLOWED_USER_IDS — Comma-separated Telegram user IDs
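Because ALLOWED_USER_IDS is a comma-separated string, it has to be parsed before it can gate access. A minimal sketch of that parsing and the whitelist check follows. The function names are hypothetical and not taken from auth.middleware.ts, and treating an empty list as "deny everyone" is an assumption made here so that a missing variable fails closed.

```typescript
// Parse "123, 456" into ["123", "456"], rejecting anything that is not numeric.
// IDs stay as strings so that large IDs (for example Discord's 64-bit IDs) are not rounded.
function parseAllowedIds(raw: string | undefined): string[] {
  const ids: string[] = [];
  for (const part of (raw ?? "").split(",")) {
    const id = part.trim();
    if (id === "") continue; // tolerate trailing commas and blank entries
    if (!/^\d+$/.test(id)) {
      throw new Error(`Invalid user ID in allow-list: "${id}"`);
    }
    ids.push(id);
  }
  return ids;
}

// True only if the sender is explicitly listed; an empty list allows nobody.
function isAllowedUser(userId: number | string, allowedIds: string[]): boolean {
  return allowedIds.indexOf(String(userId)) !== -1;
}

const allowed = parseAllowedIds("123456789, 987654321");
console.log(isAllowedUser(123456789, allowed)); // prints true
console.log(isAllowedUser(555, allowed));       // prints false
```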
- ANTHROPIC_API_KEY — API key (optional with Claude Max subscription)
- WORKSPACE_DIR — Root directory for project picker (default: $HOME)
- CLAUDE_EXECUTABLE_PATH — Path to Claude Code CLI (default: claude)
- BOT_NAME — Bot name in system prompt (default: Claudegram)
- STREAMING_MODE — streaming or wait (default: streaming)
- STREAMING_DEBOUNCE_MS — Debounce interval for live edits (default: 500)
- MAX_MESSAGE_LENGTH — Character limit before Telegraph fallback (default: 4096)
- DANGEROUS_MODE — Auto-approve all tool permissions (default: false)
- MAX_LOOP_ITERATIONS — Max iterations for /loop (default: 5)
- REDDIT_CLIENT_ID / REDDIT_CLIENT_SECRET — Reddit API credentials
- REDDIT_USERNAME / REDDIT_PASSWORD — Bot Reddit account
- REDDITFETCH_TIMEOUT_MS — Execution timeout (default: 30000)
- REDDITFETCH_DEFAULT_LIMIT — Default post limit (default: 10)
- REDDITFETCH_DEFAULT_DEPTH — Default comment depth (default: 5)
- REDDITFETCH_JSON_THRESHOLD_CHARS — Auto-switch to JSON (default: 8000)
- REDDIT_VIDEO_MAX_SIZE_MB — Max video size before compression (default: 50)
- FREEDIUM_HOST — Freedium mirror host (default: freedium-mirror.cfd)
- FREEDIUM_RATE_LIMIT_MS — Rate limit between requests (default: 2000)
- MEDIUM_TIMEOUT_MS — Fetch timeout (default: 15000)
- MEDIUM_FILE_THRESHOLD_CHARS — File save threshold (default: 8000)
- GROQ_API_KEY — Groq API key for Whisper + Orpheus TTS
- GROQ_TRANSCRIBE_PATH — Path to groq_transcribe.py
- VOICE_SHOW_TRANSCRIPT — Show transcript before agent response (default: true)
- VOICE_MAX_FILE_SIZE_MB — Max voice file size (default: 19)
- VOICE_LANGUAGE — ISO 639-1 language code (default: en)
- TTS_PROVIDER — groq or openai (default: groq)
- TTS_VOICE — Voice name (default: troy for Groq, coral for OpenAI)
- TTS_SPEED — Speech speed 0.25–4.0 (default: 1.5)
- TTS_MAX_CHARS — Max chars before skipping voice (default: 4096)
- OPENAI_API_KEY — OpenAI API key (only for TTS_PROVIDER=openai)
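The TTS settings above interact: the provider picks the default voice, the speed is bounded to 0.25–4.0, and long responses skip voice entirely. The sketch below shows one way to resolve them into a single decision. `planVoiceReply` and `VoiceReplyPlan` are hypothetical names, and clamping an out-of-range speed (rather than rejecting it) is an assumption; the project may validate these values differently.

```typescript
type TtsProvider = "groq" | "openai";
type Env = Record<string, string | undefined>;

interface VoiceReplyPlan {
  provider: TtsProvider;
  voice: string;
  speed: number;
}

// Provider-specific default voices, as documented above.
const DEFAULT_VOICE: Record<TtsProvider, string> = { groq: "troy", openai: "coral" };

// Parse a numeric setting, falling back when it is missing or not a number.
function numberOr(raw: string | undefined, fallback: number): number {
  const n = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
  return isNaN(n) ? fallback : n;
}

// Decide whether and how to speak `text`; returns null when voice should be skipped.
function planVoiceReply(text: string, env: Env): VoiceReplyPlan | null {
  if (text.length > numberOr(env["TTS_MAX_CHARS"], 4096)) return null;
  const provider: TtsProvider = env["TTS_PROVIDER"] === "openai" ? "openai" : "groq";
  const speed = Math.min(4.0, Math.max(0.25, numberOr(env["TTS_SPEED"], 1.5)));
  const voice = env["TTS_VOICE"] || DEFAULT_VOICE[provider];
  return { provider, voice, speed };
}
```

Passing the environment in as a plain object, instead of reading `process.env` inside the function, keeps the logic easy to test with different settings.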
- YTDLP_COOKIES_PATH — Path to cookies.txt for yt-dlp
- EXTRACT_TRANSCRIBE_TIMEOUT_MS — Transcription timeout per chunk (default: 180000)
- DISCORD_BOT_TOKEN — Discord bot token
- DISCORD_APPLICATION_ID — Discord application ID
- DISCORD_GUILD_ID — Guild ID for guild-scoped commands
- DISCORD_ALLOWED_USER_IDS — Comma-separated Discord user IDs
- DISCORD_ALLOWED_ROLE_IDS — Comma-separated Discord role IDs
- DISCORD_STREAMING_DEBOUNCE_MS — Streaming edit debounce (default: 1500)
- GEMINI_API_KEY — Google Gemini API key (for Discord voice channels)
- TERMINAL_UI_DEFAULT — Enable terminal-style UI by default (default: false)
- IMAGE_MAX_FILE_SIZE_MB — Max image upload size (default: 20)
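To show how the documented defaults for the core settings might be applied, here is a hand-rolled sketch covering five of them. The project validates its environment with Zod in config.ts; this example deliberately avoids that dependency to stay self-contained, so `loadCoreConfig`, `CoreConfig` and the validation rules are illustrative assumptions rather than the real schema.

```typescript
type Env = Record<string, string | undefined>;

interface CoreConfig {
  streamingMode: "streaming" | "wait";
  streamingDebounceMs: number;
  maxMessageLength: number;
  dangerousMode: boolean;
  maxLoopIterations: number;
}

// Read a non-negative integer setting, using `fallback` when it is unset.
function readInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!isFinite(n) || Math.floor(n) !== n || n < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return n;
}

// Build the core config from an environment object, applying documented defaults.
function loadCoreConfig(env: Env): CoreConfig {
  const mode = env["STREAMING_MODE"] ?? "streaming";
  if (mode !== "streaming" && mode !== "wait") {
    throw new Error(`STREAMING_MODE must be "streaming" or "wait", got "${mode}"`);
  }
  return {
    streamingMode: mode,
    streamingDebounceMs: readInt(env, "STREAMING_DEBOUNCE_MS", 500),
    maxMessageLength: readInt(env, "MAX_MESSAGE_LENGTH", 4096),
    dangerousMode: env["DANGEROUS_MODE"] === "true", // anything else stays off
    maxLoopIterations: readInt(env, "MAX_LOOP_ITERATIONS", 5),
  };
}
```

Failing at startup on a malformed value, instead of silently falling back, surfaces typos in .env before the bot starts handling messages.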
src/
├── bot/
│ ├── bot.ts # Bot setup, handler registration
│ ├── handlers/
│ │ ├── command.handler.ts # All slash commands + inline keyboards
│ │ ├── message.handler.ts # Text routing, ForceReply dispatch
│ │ ├── voice.handler.ts # Voice download, transcription, agent relay
│ │ └── photo.handler.ts # Image save + agent notification
│ └── middleware/
│ ├── auth.middleware.ts # User whitelist
│ └── stale-filter.ts # Ignore stale messages on restart
├── claude/
│ ├── agent.ts # Claude Agent SDK, session resume, system prompt
│ ├── session-manager.ts # Per-chat session state
│ ├── request-queue.ts # Sequential request queue
│ ├── command-parser.ts # Help text + command descriptions
│ └── session-history.ts # Session persistence
├── media/
│ └── extract.ts # YouTube / Instagram / TikTok extraction
├── reddit/
│ ├── redditfetch.ts # Native TypeScript Reddit API client
│ └── vreddit.ts # Reddit video download + compression
├── medium/
│ └── freedium.ts # Freedium article fetcher
├── audio/
│ └── transcribe.ts # Groq Whisper integration
├── tts/
│ ├── tts.ts # TTS provider routing (Groq / OpenAI)
│ ├── tts-settings.ts # Per-chat voice settings
│ └── voice-reply.ts # TTS hook for agent responses
├── telegram/
│ ├── message-sender.ts # Streaming, chunking, Telegraph routing
│ ├── markdown.ts # MarkdownV2 escaping
│ ├── telegraph.ts # Telegraph Instant View client
│ ├── deduplication.ts # Message dedup
│ └── terminal-settings.ts # Terminal UI settings
├── discord/
│ ├── discord-bot.ts # Discord setup + slash command registration
│ ├── handlers/ # Message, interaction, voice handlers
│ ├── commands/ # 17 slash commands
│ └── voice-channel/ # Gemini Live audio pipeline
├── droid/
│ └── droid-bridge.ts # Factory Droid JSON/streaming bridge
├── utils/
│ ├── download.ts # Secure file downloads
│ ├── resolve-bin.ts # Binary path resolution (systemd-safe)
│ ├── sanitize.ts # Error/path sanitization
│ ├── proxy.ts # Proxy dispatcher for blocked content
│ └── file-type.ts # MIME type detection
├── config.ts # Zod-validated environment config
├── index.ts # Telegram entry point
└── discord-index.ts # Discord entry point
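As an illustration of the stale-message filter in the tree above: after a restart, Telegram delivers updates that queued up while the bot was down, and acting on them could replay old commands. A minimal sketch of the check follows, assuming the filter compares each message's timestamp with the process start time. `isStaleMessage` and the grace period are hypothetical; stale-filter.ts may use a different rule.

```typescript
// Telegram reports a message's `date` as Unix time in seconds,
// while Date.now() returns milliseconds, so the units are aligned here.
function isStaleMessage(
  messageDateSec: number,
  botStartedAtMs: number,
  graceSec = 0,
): boolean {
  return messageDateSec * 1000 < botStartedAtMs - graceSec * 1000;
}

const startedAtMs = 1700000000000; // example start time, in milliseconds
console.log(isStaleMessage(1699999990, startedAtMs)); // prints true: sent 10 s before start
console.log(isStaleMessage(1700000005, startedAtMs)); // prints false: sent after start
```

A small grace period can be useful when the sender's message and the bot's start happen within the same few seconds.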
npm run dev # Dev mode with hot reload (tsx watch)
npm run typecheck # Type check only
npm run build # Compile to dist/
npm start # Run compiled build
./scripts/claudegram-botctl.sh dev start # Start dev mode
./scripts/claudegram-botctl.sh dev restart # Restart dev
./scripts/claudegram-botctl.sh prod start # Start production
./scripts/claudegram-botctl.sh dev log # Tail logs
./scripts/claudegram-botctl.sh dev status # Check if running
If Claudegram is editing its own codebase, use prod mode to avoid hot-reload restarts:
./scripts/claudegram-botctl.sh prod start # No hot reload
# ... let Claude edit files ...
./scripts/claudegram-botctl.sh prod restart # Apply changes
Then /continue or /resume in Telegram to restore your session.
- User whitelist — only approved Telegram/Discord IDs can interact
- Project sandbox — Claude operates within the configured working directory
- Permission mode — uses acceptEdits by default
- Dangerous mode — opt-in auto-approve for all tool permissions
- SSRF protection — media extraction blocks private/internal hosts
- Secrets — loaded from .env (gitignored), never committed
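The SSRF protection mentioned above amounts to refusing URLs that point at the local machine or a private network before any download starts. The sketch below shows a simplified version of such a check. It is not the project's implementation: `isBlockedUrl` is a hypothetical name, all IPv6 literals are blocked as a conservative shortcut, and a hostname that resolves to a private address via DNS is not caught here, so a production check would also validate the resolved IP.

```typescript
// True for IPv4 literals in loopback, private, link-local or "this network" ranges.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (match === null) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 0 ||                          // 0.0.0.0/8
    a === 10 ||                         // 10.0.0.0/8
    a === 127 ||                        // loopback
    (a === 169 && b === 254) ||         // link-local
    (a === 172 && b >= 16 && b <= 31) ||// 172.16.0.0/12
    (a === 192 && b === 168)            // 192.168.0.0/16
  );
}

// True if the URL must not be fetched.
function isBlockedUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is rejected
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  if (/(^|\.)localhost$/.test(host)) return true;
  if (host.charAt(0) === "[") return true; // IPv6 literal: block conservatively
  return isPrivateIPv4(host);
}

console.log(isBlockedUrl("http://192.168.1.10/admin"));           // prints true
console.log(isBlockedUrl("https://www.youtube.com/watch?v=abc")); // prints false
```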
Original project by NachoSEO. Extended with media extraction (YouTube/Instagram/TikTok), Reddit integration (native TypeScript client + video downloads), voice transcription (Groq Whisper), dual TTS (Groq Orpheus + OpenAI), Medium/Freedium integration, Telegraph output, image uploads, Discord bot with Gemini Live voice channels, session continuity, terminal UI, and Factory Droid integration.
MIT