Otterbot 🦦

A Telegram chatbot that serves as a board game assistant for the NCS MA boardgames group. It answers queries about board games, researches game rules from the web, and acts as a knowledgeable assistant during game sessions.

Features

Core Capabilities

🔍 Research Mode: Downloads and indexes game rules, PDFs, YouTube captions, and documentation from the web (up to 30 sources per game)
💬 Hybrid Q&A: Combines internal knowledge base with live web search for comprehensive, up-to-date answers
📚 Web Interface: Beautiful HTML interface to browse downloaded game resources with PDF previews
🤖 Agentic Routing: AI-powered intent classification understands natural language queries
🎯 Context-Aware: Remembers conversation history to infer which game you're asking about

Quick Start

Prerequisites

Python 3.13+
Telegram bot token (from @BotFather)
OpenAI API key
YouTube API key (from the Google Cloud Console)
Domain name with nginx (so users can reach the file browser served by the API)

Installation

# Clone the repository
git clone https://github.com/WildSphee/otterbot
cd otterbot

# Create and activate a virtual environment (assumes Linux or macOS)
python3 -m venv venv
source venv/bin/activate

# Install dependencies into the active environment
pip install poetry
poetry install

# Create .env file in bot/ with your credentials
cat > bot/.env << EOF
OPENAI_API_KEY=
OTTER_BOT_TOKEN=
YOUTUBE_API_KEY=
STORAGE_DIR=storage
DATABASE_NAME=database
API_BASE_URL=http://localhost:8000
EOF

Running the Bot

Option 1: Run both services with one command (Recommended)

bash scripts/start.sh

This starts:

Telegram bot (listens for messages)
FastAPI web server on 0.0.0.0:8000 (browse game files)

Press Ctrl+C to stop both services cleanly.

Using the Bot

In Group Chats ("otter" must appear within the first two words of the message):

hey otter, research Catan
otter what games do you have?
otter how do you win in Wingspan?
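The group-chat rule above can be sketched as a small check. This is only an illustration: the function name, the punctuation stripping, and the whole-word match are assumptions, and the bot's real logic may differ.

```python
import re


def is_addressed_to_bot(text: str, keyword: str = "otter", window: int = 2) -> bool:
    """Return True if `keyword` is one of the first `window` words of `text`."""
    # Strip leading/trailing punctuation from each word so "otter," still matches.
    words = [re.sub(r"^\W+|\W+$", "", w).lower() for w in text.split()[:window]]
    return keyword in words
```

With this check, "hey otter, research Catan" would trigger the bot, while "so hey otter, research Catan" would not, because "otter" is the third word.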

In Direct Messages (no prefix needed):

research Catan
what games are available?
how do tiebreakers work in Catan?

Available Commands:

Research a game: otter research [game name] - Downloads rules, PDFs, YouTube tutorials
Ask questions: otter [question about game] - Answers using internal docs + web search
List games: otter what games do you have? - Shows library with AI-generated descriptions
General chat: otter hello - Friendly conversation

Browse Files:

In Telegram: Use WebApp buttons sent by the bot (tap "📂 View [Game] Files")
In Browser: Visit http://your-server:8000/games/{game_id}/files for a beautiful interface

How It Works

Research Logic Flow

When you ask OtterBot to research a game (e.g., otter research Catan), here's what happens under the hood:

Key Steps Explained

Intent Classification (bot/otterrouter.py:60)
- Uses LLM to classify user intent
- Extracts game name from natural language
BGG URL Discovery (bot/tools.py:118-165)
- First: Try BGG XML API with exact=1 parameter for better matching
- If the API returns 401 (BGG now requires authentication): falls back to Google search
- Google Fallback: Searches site:boardgamegeek.com/boardgame for accurate URL
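A minimal sketch of the lookup-then-fallback pattern described above. The `fetch` and `google_fallback` callables are hypothetical stand-ins injected by the caller, and this version falls back on any non-200 status or empty result, not only on 401.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import Callable, Optional

BGG_SEARCH = "https://boardgamegeek.com/xmlapi2/search"


def build_search_url(name: str) -> str:
    """Build a BGG XML API search URL that asks for exact name matches."""
    query = urllib.parse.urlencode({"query": name, "type": "boardgame", "exact": 1})
    return f"{BGG_SEARCH}?{query}"


def parse_first_game_url(xml_text: str) -> Optional[str]:
    """Return the boardgame URL for the first <item> in the response, if any."""
    item = ET.fromstring(xml_text).find("item")
    if item is None or item.get("id") is None:
        return None
    return f"https://boardgamegeek.com/boardgame/{item.get('id')}"


def discover_bgg_url(
    name: str,
    fetch: Callable[[str], tuple[int, str]],  # hypothetical: returns (status, body)
    google_fallback: Callable[[str], Optional[str]],  # hypothetical search helper
) -> Optional[str]:
    status, body = fetch(build_search_url(name))
    if status == 200:
        url = parse_first_game_url(body)
        if url:
            return url
    # 401 (authentication required), other errors, or no match: use the fallback.
    return google_fallback(name)
```

Injecting the HTTP and search helpers keeps the decision logic testable without network access.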
Parallel Data Fetching (bot/tools.py:390-418) - 3 concurrent tasks:
- Task 1 - Web Research: OpenAI Responses API finds 20-30 authoritative sources
- Task 2 - BGG Metadata: Fetches the actual BGG page HTML, extracts 8000 characters (including JSON-LD), and has the LLM parse difficulty and player count
- Task 3 - YouTube: Searches for tutorial, validates URL with oEmbed API, falls back to Google if needed
YouTube Validation (bot/tools.py:73-105, 425-431)
- Uses YouTube oEmbed API to check if video exists
- Filters out deleted/unavailable videos
- Google search fallback if initial search fails
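One way to check a video through YouTube's oEmbed endpoint is sketched below. The function names and the injectable `opener` parameter are assumptions made for illustration; any HTTP error or non-200 status is treated as "unavailable".

```python
import urllib.error
import urllib.parse
import urllib.request

OEMBED_ENDPOINT = "https://www.youtube.com/oembed"


def oembed_url(video_url: str) -> str:
    """Build the oEmbed lookup URL for a given video URL."""
    query = urllib.parse.urlencode({"url": video_url, "format": "json"})
    return f"{OEMBED_ENDPOINT}?{query}"


def video_exists(video_url: str, opener=urllib.request.urlopen, timeout: float = 10.0) -> bool:
    """Return True only if the oEmbed lookup answers with HTTP 200."""
    try:
        with opener(oembed_url(video_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, TimeoutError):
        # HTTPError is a subclass of URLError, so error statuses also land here.
        return False
```

A caller could filter candidate links with `[u for u in candidates if video_exists(u)]` and only then fall back to another search.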
Source Download & Processing (bot/tools.py:244-361)
- PDFs: Downloaded and stored as-is
- HTML Pages: Downloaded, with the text extracted to a companion .txt file
- YouTube Videos: Captions fetched via YouTube Transcript API and saved as .txt
- External Links: URL saved without download (for references, videos without captions)
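The four handling paths above amount to a routing decision per source. The sketch below shows one possible way to make it; the function name, the returned labels, and the `has_captions` flag are hypothetical, and the project may detect types differently.

```python
from urllib.parse import urlparse


def classify_source(url: str, content_type: str = "", has_captions: bool = False) -> str:
    """Pick a handling path: 'pdf', 'html', 'youtube_captions', or 'external_link'."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()

    is_youtube = host in ("youtube.com", "youtu.be") or host.endswith(".youtube.com")
    if is_youtube:
        # Videos without captions are kept as a link only.
        return "youtube_captions" if has_captions else "external_link"
    if "application/pdf" in content_type or path.endswith(".pdf"):
        return "pdf"
    if "text/html" in content_type:
        return "html"
    return "external_link"
```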
Vector Index Creation (datasources/ingest.py)
- All text files chunked into ~500-token segments
- Embedded using OpenAI text-embedding-3-small
- Stored in FAISS index for semantic search
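The chunking step can be illustrated as follows. This sketch approximates tokens with whitespace-separated words, which is an assumption; a real tokenizer would give different boundaries. The embedding and FAISS steps are left out.

```python
def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most `max_tokens` words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()  # rough stand-in for real tokenization
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

Each returned chunk would then be embedded and added to the index together with a pointer back to its source file.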
Metadata Enrichment (bot/llms/openai.py:161-240, bot/tools.py:467-511)
- LLM extracts actual difficulty score and player count from real BGG HTML content
- Auto-generates 2-3 sentence game description from downloaded sources
- Saves BGG URL (validated), YouTube link (validated), difficulty, player count, description
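Since the BGG step mentions JSON-LD, here is a hedged sketch of pulling JSON-LD blocks out of page HTML before handing anything to the LLM. The regular expression and function name are illustrative, and which fields a BGG page actually exposes is not assumed.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks.
JSON_LD_PATTERN = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def extract_json_ld(html: str) -> list[dict]:
    """Return every JSON-LD object found in the HTML, skipping malformed blocks."""
    blocks = []
    for raw in JSON_LD_PATTERN.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore blocks that are not valid JSON
        if isinstance(data, dict):
            blocks.append(data)
    return blocks
```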
Response (bot/tools.py:525-561, bot/otterrouter.py:135-140)
- Sends message with game description, metadata, and links
- Includes difficulty, player count, YouTube tutorial, BGG link
- Attaches WebApp button to browse files in Telegram
- Updates game status to "ready"

Query Logic Flow

When you ask a question about a game (e.g., otter how do you win in Catan?):

Query Steps Explained

Game Name Extraction (bot/tools.py:164-207)
- LLM extracts game name from question
- Fuzzy matches against available games (60% similarity threshold)
- Falls back to recent chat history if no explicit mention
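Fuzzy matching with a 60% threshold and a history fallback could be sketched with the standard library's `difflib`. The function names and the substring-based history scan are assumptions, and the project may use a different similarity measure.

```python
from difflib import SequenceMatcher
from typing import Optional


def match_game(candidate: str, available: list[str], threshold: float = 0.6) -> Optional[str]:
    """Return the most similar available game, or None if below the threshold."""
    best_name, best_score = None, 0.0
    for name in available:
        score = SequenceMatcher(None, candidate.lower(), name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def resolve_game(extracted: Optional[str], history: list[str], available: list[str]) -> Optional[str]:
    """Prefer the explicitly mentioned game; otherwise scan recent messages, newest first."""
    if extracted:
        match = match_game(extracted, available)
        if match:
            return match
    for message in reversed(history):
        for name in available:
            if name.lower() in message.lower():
                return name
    return None
```

For example, a misspelled "wingspn" still resolves to "Wingspan", while "chess" matches nothing in a library that contains only Catan and Wingspan.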
Context Retrieval (bot/tools.py:481-507)
- If game is researched: FAISS semantic search for relevant chunks
- Returns top 5 most relevant passages + source citations
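To show what a top-5 retrieval returns, the sketch below uses a brute-force cosine similarity ranking in pure Python. It stands in for the FAISS search and is not how FAISS works internally; the tuple layout `(text, source, vector)` is an assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: list[float],
    chunks: list[tuple[str, str, list[float]]],  # (text, source, embedding)
    k: int = 5,
) -> list[tuple[str, str]]:
    """Return the k most similar passages with their source names, best first."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return [(text, source) for text, source, _ in ranked[:k]]
```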
Hybrid Answer Generation (bot/llms/openai.py:63-110)
- OpenAI Responses API with web_search tool
- Combines internal knowledge base + live web search
- Ensures fresh, comprehensive answers
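How the retrieved passages might be combined with the question before the model call is sketched below. The prompt wording and function name are invented for illustration, and the actual request to the OpenAI Responses API with the web search tool is deliberately omitted.

```python
def build_hybrid_prompt(question: str, game: str, passages: list[tuple[str, str]]) -> str:
    """Assemble one prompt from (text, source) passages and the user's question."""
    if passages:
        context = "\n\n".join(
            f"[{i}] ({source}) {text}"
            for i, (text, source) in enumerate(passages, start=1)
        )
    else:
        context = "(no internal documents available)"
    return (
        f"Game: {game}\n\n"
        f"Internal knowledge base excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer from the excerpts first, and use web search for anything they do not cover."
    )
```

Numbering the excerpts gives the model a simple way to refer back to internal sources in its answer.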
Source Attribution (bot/tools.py:566-600)
- Researched games: Shows internal file citations + link to full file browser
- Non-researched games: Adds disclaimer suggesting research for better results
- All answers end with 🦦
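The attribution rules above can be sketched as a small formatter. The exact wording of the source line and the disclaimer is made up here; only the branching (citations versus disclaimer) and the closing 🦦 follow the description.

```python
from typing import Optional


def format_answer(answer: str, citations: list[str], files_url: Optional[str] = None) -> str:
    """Append source attribution (or a research hint) and the closing otter emoji."""
    lines = [answer.strip(), ""]
    if citations:
        unique = list(dict.fromkeys(citations))  # drop duplicates, keep order
        lines.append("Sources: " + ", ".join(unique))
        if files_url:
            lines.append(f"All files: {files_url}")
    else:
        lines.append(
            "Note: this game has not been researched yet. "
            'Try "otter research [game name]" for better results.'
        )
    lines.append("🦦")
    return "\n".join(lines)
```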

Development

Code Quality

# Format and lint code
bash scripts/lint.sh

# Check only (no fixes)
bash scripts/lint.sh . --check

License

This project is licensed under the MIT License.

See the LICENSE file for the full license text.

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.
