Pixly is a desktop overlay that acts as your gaming assistant, combining AI chat with automated, privacy-friendly screenshot capture and a game-specific Retrieval-Augmented Generation (RAG) knowledge base. Pixly detects what game you're playing, retrieves relevant, curated knowledge (wikis, user-supplied YouTube descriptions, and forum posts) via a local vector database, and grounds Gemini responses on those sources.
Make sure to star our repository; your support is much appreciated.
Important
Hacktoberfest 2025 Participant. Please make sure to star this repo.
- Table of Contents
- Contributing, Setup and Install
- What Pixly Does
- Architecture Overview
- Knowledge Base & Data Flow
- Game Detection
- API Surface (Selected)
- Project Structure
- Technology Stack
- How Components Work Together
- Security & Privacy
- License
- Acknowledgments
We welcome Hacktoberfest 2025 contributors! Whether you're adding new games to the knowledge base, improving the UI, or enhancing AI capabilities, your contributions matter.
- For contributing, see CONTRIBUTING.md
- For setup and installation, see INSTALL.md
- Intelligent, game-focused chat using Google Gemini with a "Game Expert" system prompt
- Contextual help based on your active game (process detection and/or user message)
- Optional screenshot-powered context for visual analysis
- RAG pipeline over per-game CSV knowledge with local vector search (Chroma)
- Modern desktop overlay for chatting, settings, and screenshot gallery
Pixly is organized into three main layers: UI Overlay, Backend API, and AI/RAG services, all running locally.
- CustomTkinter-based floating overlay, always-on-top, draggable
- Chat window with typing indicator and styled messages (user vs assistant)
- Settings window to manage screenshot capture and set the Google API key (persisted to `.env` via backend)
- Screenshot gallery with View and Delete actions
- FastAPI server exposes HTTP endpoints on 127.0.0.1:8000
- Responsibilities:
- Route chat requests to Gemini
- Manage screenshots (start/stop capture, list, view, delete)
- Detect current game (process + manual keywords)
- Manage per-game knowledge ingestion and vectorization
- Provide vector search and knowledge stats
- Manage API key configuration (.env persistence + live reconfigure)
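The API key responsibility (persist to `.env`, report a masked preview) could look roughly like the following. This is a minimal sketch and not Pixly's actual code: the function names `mask_key` and `upsert_env_var`, the variable name `GOOGLE_API_KEY`, and the masking format are all assumptions made for illustration.

```python
from pathlib import Path


def mask_key(key: str, visible: int = 4) -> str:
    """Return a preview that reveals only the last `visible` characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


def upsert_env_var(env_path: Path, name: str, value: str) -> None:
    """Set NAME=value in a .env file, replacing an existing entry if present."""
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    # Drop any previous assignment of the same variable (hypothetical format).
    kept = [line for line in lines if not line.startswith(f"{name}=")]
    kept.append(f"{name}={value}")
    env_path.write_text("\n".join(kept) + "\n")
```

Calling `upsert_env_var(Path(".env"), "GOOGLE_API_KEY", key)` twice leaves a single entry holding the latest value, which is the behavior a settings endpoint would likely want.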
Key modules:
- `backend/backend.py`: API endpoints and routing
- `backend/chatbot.py`: Gemini client configuration, runtime reconfigure, and chat logic (with RAG context injection)
- `backend/screenshot.py`: Encrypted screenshot capture and storage; database operations; delete support
- `backend/game_detection.py`: Game detection via running processes, recent screenshots, and user message keywords
- `backend/knowledge_manager.py`: CSV ingestion and text extraction (wiki + forum; YouTube entries use user-provided description)
- `backend/vector_service.py`: Chroma persistent client, collection management, chunking, embeddings, and semantic queries
- Model: `Google Gemini 2.5 Flash Lite` for responses
- System prompt (`PROMPTS.txt`) defines "Game Expert" persona and instructs grounding answers in retrieved snippets (WIKI / YOUTUBE / FORUM) with URLs
- Vector DB: `Chroma` (persistent on disk in `vector_db/`)
- Embeddings: sentence-transformers by default (configurable); text is chunked and embedded per content piece
- Retrieval: top-k relevant chunks by cosine similarity; included as context in the prompt
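To make the retrieval step concrete, here is a small pure-Python sketch of top-k selection by cosine similarity. In Pixly this ranking is delegated to Chroma; the functions `cosine_similarity` and `top_k` and the toy two-dimensional vectors below are illustrative assumptions, not the project's code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(
        chunks,
        key=lambda chunk_id: cosine_similarity(query, chunks[chunk_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings; real ones would come from the embedding model.
example_chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
best = top_k([1.0, 0.1], example_chunks, k=2)
```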
Pixly's knowledge is curated per game via CSV files that live in games_info/. The CSV schema is simple and contributor-friendly:
wiki,wiki_desc,youtube,yt_desc,forum,forum_desc
- wiki: URL to a relevant wiki page; Pixly extracts textual content
- wiki_desc: Contributor-provided description of the wiki URL
- youtube: URL to a relevant video; Pixly does not auto-transcribe; it uses the contributor-provided description
- yt_desc: Contributor-provided description of the YouTube URL
- forum: URL to a relevant forum/thread; Pixly extracts textual content
- forum_desc: Contributor-provided description of the forum URL
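As an illustration of the schema above, the following sketch reads a game CSV with the standard library and rejects files whose header does not match. The function name `load_game_rows` and the strict header check are assumptions; the project's own validation may be more lenient.

```python
import csv
import io

EXPECTED_COLUMNS = ["wiki", "wiki_desc", "youtube", "yt_desc", "forum", "forum_desc"]


def load_game_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text and return one dict per row, keyed by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: the header must match the documented schema exactly.
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


# A hypothetical row with only the wiki columns filled in.
sample = (
    "wiki,wiki_desc,youtube,yt_desc,forum,forum_desc\n"
    "https://example.com/wiki,Crafting basics,,,,\n"
)
rows = load_game_rows(sample)
```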
Processing pipeline per game:
- Load CSV for the game (e.g., `games_info/minecraft.csv`)
- Extract text from wiki and forum URLs; keep YouTube descriptions as-is
- Clean and chunk text into manageable segments (e.g., ~512 tokens)
- Generate embeddings for each chunk and persist into Chroma collections
- On chat, detect game and run a semantic search to retrieve top snippets, then ground Gemini's response on those
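The chunking step in this pipeline can be sketched as follows. This is a simplified stand-in, not the code in `vector_service.py`: it counts whitespace-separated words rather than model tokens, and the `overlap` parameter is an assumption added so that neighboring chunks share some context.

```python
def chunk_text(text: str, max_words: int = 512, overlap: int = 32) -> list[str]:
    """Split text into overlapping word windows of at most `max_words` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks
```

Each returned chunk would then be embedded and stored; a token-aware splitter would replace `text.split()` if exact token budgets matter.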
Vector DB collections are organized by game and source type, e.g. minecraft_wiki, minecraft_youtube, minecraft_forum.
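A naming helper for this convention might look like the sketch below. The function `collection_name` and the slug normalization (lowercasing and replacing runs of other characters with underscores) are assumptions; only the `<game>_<source>` pattern comes from the examples above.

```python
import re

SOURCE_TYPES = ("wiki", "youtube", "forum")


def collection_name(game: str, source_type: str) -> str:
    """Build a collection name such as 'minecraft_wiki'."""
    if source_type not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {source_type}")
    # Assumed normalization: lowercase, non-alphanumeric runs become '_'.
    slug = re.sub(r"[^a-z0-9]+", "_", game.lower()).strip("_")
    return f"{slug}_{source_type}"
```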
Pixly uses a layered strategy to infer the current game:
- Process Detection: Scans running processes for known executables
- Screenshot Context: Uses recent screenshot metadata (app/window) when available
- Manual Override: Detects game mentions in the user's message (e.g., "I'm playing Minecraft")
The detection result is passed into the RAG layer to scope retrieval to the active game's knowledge base.
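The layered strategy can be illustrated with a pure function that takes the three signals as plain inputs. Everything here is hypothetical: the lookup tables, the function names, and in particular the priority order (message mention first, then processes, then window title) are assumptions, since the text does not specify how the layers are ranked.

```python
from typing import Iterable, Optional

# Hypothetical lookup tables; the real lists live in the detection service.
KNOWN_EXECUTABLES = {"minecraft.exe": "minecraft", "eldenring.exe": "elden_ring"}
KEYWORDS = {"minecraft": "minecraft", "elden ring": "elden_ring"}


def match_keyword(text: Optional[str]) -> Optional[str]:
    """Return the first game whose keyword appears in the text, if any."""
    if not text:
        return None
    lowered = text.lower()
    for keyword, game in KEYWORDS.items():
        if keyword in lowered:
            return game
    return None


def detect_game(
    process_names: Iterable[str],
    window_title: Optional[str] = None,
    message: Optional[str] = None,
) -> Optional[str]:
    """Combine the three signals; an explicit mention in the message wins."""
    override = match_keyword(message)
    if override:
        return override
    for name in process_names:
        game = KNOWN_EXECUTABLES.get(name.lower())
        if game:
            return game
    # Fall back to the window title recorded with a recent screenshot.
    return match_keyword(window_title)
```

In a running system, `process_names` would come from a process scan (for example via psutil) and `window_title` from screenshot metadata.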
- `POST /chat`: Chat with Gemini; auto-detects game; augments prompt with retrieved snippets
- `POST /screenshots/start?interval=30`: Start periodic capture
- `POST /screenshots/stop`: Stop capture
- `GET /screenshots/recent?limit=10&application=...`: List recent screenshots (metadata)
- `GET /screenshots/{id}`: Fetch a screenshot's image data (base64)
- `DELETE /screenshots/{id}`: Delete a screenshot entry
- `POST /games/detect`: Detect current game (optionally pass message for keyword hints)
- `GET /games/list`: Enumerate detection-supported games, CSV-available games, and games with vectors
- `GET /games/{game}/knowledge/validate`: Validate CSV schema
- `POST /games/{game}/knowledge/process`: Ingest CSV and build vectors in Chroma
- `POST /games/{game}/knowledge/search`: Vector search within a game (query, content_types, limit)
- `GET /games/{game}/knowledge/stats`: Document counts per source type
- `GET /settings/api-key`: Report whether the Gemini API key is configured (masked preview)
- `POST /settings/api-key`: Persist API key to `.env` and live-reconfigure the chatbot
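For readers who want to call these endpoints from a script, the sketch below builds two requests with the standard library without sending them. The helper names are hypothetical, and sending `query`, `content_types`, and `limit` as a JSON body is an assumption; check the request schemas before relying on it.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"


def recent_screenshots_request(limit: int = 10, application: Optional[str] = None) -> Request:
    """Build GET /screenshots/recent with optional application filter."""
    params = {"limit": str(limit)}
    if application:
        params["application"] = application
    return Request(f"{BASE_URL}/screenshots/recent?{urlencode(params)}", method="GET")


def knowledge_search_request(game: str, query: str, content_types: list[str], limit: int = 5) -> Request:
    """Build POST /games/{game}/knowledge/search (JSON body is an assumption)."""
    body = json.dumps({"query": query, "content_types": content_types, "limit": limit})
    return Request(
        f"{BASE_URL}/games/{quote(game, safe='')}/knowledge/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, either request can be sent with `urllib.request.urlopen(request)`.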
pixly/
├── backend/
│   └── backend.py              # FastAPI app initialization
├── routers/                    # Contains all the API routers
│   ├── chat.py                 # Stores chat endpoints
│   ├── game_detection.py       # Stores game detection and vector search endpoints
│   ├── screenshot.py           # Stores screenshot endpoints
│   └── setting.py              # Stores settings endpoints
├── services/                   # Contains all the backend services
│   ├── chatbot.py              # Gemini integration, RAG-aware chat, runtime reconfigure
│   ├── screenshot.py           # Encrypted screenshot capture, DB ops, delete support
│   ├── game_detection.py       # Process/message/screenshot-based game detection
│   ├── knowledge_manager.py    # CSV ingestion and content extraction (wiki/forum)
│   └── vector_service.py       # Chroma collections, embeddings, and search
├── schemas/                    # Contains the schemas for the various requests
│   ├── chat.py
│   ├── game_detection.py
│   ├── knowledge_search.py
│   └── settings.py
├── overlay.py                  # CustomTkinter overlay (chat, settings, screenshot viewer)
├── games_info/                 # Per-game CSVs (e.g., minecraft.csv)
├── vector_db/                  # Chroma persistent storage
├── PROMPTS.txt                 # System persona + RAG grounding instructions
├── run.py                      # Backend server launcher
├── pyproject.toml              # Dependencies and metadata
├── screenshots.db              # Encrypted screenshot database (auto-created)
├── screenshot_key.key          # Encryption key (auto-generated)
└── README.md                   # Project documentation
- UI/Frontend: CustomTkinter (modern theming and widgets for Python GUIs)
- API/Backend: FastAPI (async Python web framework) + Uvicorn (ASGI server)
- AI: Google Gemini 2.5 Flash Lite via `google-generativeai`
- RAG: Chroma (persistent local vector DB) + sentence-transformers (embeddings)
- Data: CSV-based per-game knowledge; SQLite for screenshots; Fernet for encryption
- System: psutil + pywin32 for Windows process/window info; Pillow for imaging
Notes:
- The embedding model is configurable; by default we use a sentence-transformers model suitable for local inference. The system can be switched to a different embedder (e.g., Mistral embeddings) with minor changes in `vector_service.py`.
- The persona and grounding behavior are controlled by `PROMPTS.txt` so Gemini cites sources from retrieved snippets and focuses answers on gaming topics.
- The overlay sends chat requests to the backend.
- The backend detects the current game and queries Chroma for relevant snippets (wiki, YouTube description, forum).
- Retrieved snippets are added to the prompt to ground Gemini's response.
- If a screenshot is provided, it's included as a multimodal input to Gemini for visual context.
- The overlay displays a typing indicator while waiting and distinguishes user vs assistant messages for readability.
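The grounding step in this flow, where retrieved snippets are added to the prompt, can be sketched as simple string assembly. The `Snippet` dataclass, the `build_prompt` function, and the exact prompt layout are assumptions for illustration; the real layout is governed by `PROMPTS.txt` and the chatbot service.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    source_type: str  # "wiki", "youtube", or "forum"
    url: str
    text: str


def build_prompt(system_prompt: str, snippets: list[Snippet], user_message: str) -> str:
    """Prepend labeled snippets (with URLs) to the user's message."""
    blocks = [f"[{s.source_type.upper()}] {s.url}\n{s.text}" for s in snippets]
    context = "\n\n".join(blocks) if blocks else "(no snippets retrieved)"
    return f"{system_prompt}\n\nContext:\n{context}\n\nUser: {user_message}"
```

Labeling each snippet with its source type and URL gives the model what it needs to cite sources in the style the system prompt asks for.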
- Local-first design: screenshots, vectors, and CSVs are stored on your machine
- Encrypted screenshot blobs at rest using Fernet (AES)
- API key managed locally via the settings UI and `.env` persistence
- No telemetry or external data collection
MIT License; see LICENSE.
- Google Gemini for AI capabilities
- CustomTkinter for modern GUI components
- FastAPI for a robust backend framework
- The Hacktoberfest 2025 community for open-source collaboration
- All our amazing contributors who make this project possible!