This project was developed by:
- Manav Pandya - Core Architecture & Backend Development
- Tushar Jain - AI Integration & Query Engine
- Prayag Bhatt - API Development & Infrastructure
- Overview
- Architecture
- Problem Statement
- Solution
- Features
- Getting Started
- API Documentation
- Technology Stack
Githubify is an intelligent codebase comprehension platform designed to accelerate developer onboarding and reduce the learning curve for new team members, interns, and open-source contributors. By leveraging AI-powered analysis, Githubify transforms complex repositories into understandable, queryable knowledge bases.
Organizations face significant challenges when onboarding new developers:
- Time-Intensive Onboarding: New developers spend weeks understanding codebases, consuming valuable senior developer time
- Knowledge Bottleneck: Limited availability of senior developers to guide newcomers through complex architectures
- Context Loss: Documentation often becomes outdated, making it difficult to understand design decisions and code relationships
- Steep Learning Curve: Large repositories with multiple modules and dependencies are overwhelming for newcomers
- Reduced Productivity: Delayed understanding leads to slower contribution cycles and increased frustration
- The Cost: Companies lose an average of 30-50 hours of senior developer time per new hire during onboarding, while new developers take 3-6 months to become fully productive.
- Intelligent Repository Analysis: Automatically ingests and analyzes GitHub repositories to build a comprehensive knowledge graph
- Natural Language Queries: Ask questions in plain English about code functionality, architecture, and implementation
- Fast Onboarding: Reduce onboarding time from weeks to days
- Repository Insights: Understand codebase structure, patterns, and best practices
- For Senior Developers: Reduced time spent on repetitive onboarding explanations
- For Teams & New Developers: Faster time-to-productivity and more consistent knowledge sharing
- For Open Source Projects: Lower barrier to entry for new contributors
- Ingestion Service: Clones and analyzes repositories, extracting code structure and relationships
- Knowledge Base: Stores processed repository information in a queryable format
- Query Engine: Processes natural language queries and retrieves relevant information
- API Layer: FastAPI-based REST interface for client interactions
Ensure you have the following installed:
- Python: Version 3.7 or higher
- UV Package Manager: For dependency management
If you don't have UV installed, install it first:
# On macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
git clone https://github.com/Prayag2003/githubify-ethereum-bots-futurestack-hack
cd githubify-ethereum-bots-futurestack-hack/server
# Create virtual environment
uv venv
# Activate virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install dependencies
uv pip install -r requirements.txt
Create a .env
file in the server directory:
CEREBRAS_API_KEY=""
PINECONE_API_KEY=""
uvicorn app.main:app --reload
The server will start on http://localhost:8000
. You can access the interactive API documentation at http://localhost:8000/docs
.
http://localhost:8000
Analyze and ingest a GitHub repository into the knowledge base.
Endpoint: POST /repos/ingest
Request Body:
{
"github_url": "https://github.com/username/repository-name"
}
Response:
{
"repo_id": "ab4de4a4-6262-4dfa-8f68-9863d5a7082d",
"status": "success",
"message": "Repository ingested successfully",
"files_processed": 42,
"analysis_time": "15.3s"
}
Status Codes:
200 OK
: Repository ingested successfully400 Bad Request
: Invalid GitHub URL404 Not Found
: Repository not found500 Internal Server Error
: Processing error
Ask questions about an ingested repository.
Endpoint: POST /query
Request Body:
{
"repo_id": "ab4de4a4-6262-4dfa-8f68-9863d5a7082d",
"query": "What does this codebase do?"
}
Response:
{
"answer": "This codebase implements a Discord bot for the Across Protocol...",
"relevant_files": [
"src/bot/main.py",
"src/handlers/message_handler.py"
],
"confidence": 0.95,
"timestamp": "2025-10-04T10:30:00Z"
}
Status Codes:
200 OK
: Query processed successfully400 Bad Request
: Invalid request format404 Not Found
: Repository ID not found500 Internal Server Error
: Query processing error
Get all ingested repositories.
Endpoint: GET /repos
Response:
{
"repositories": [
{
"repo_id": "ab4de4a4-6262-4dfa-8f68-9863d5a7082d",
"name": "across-protocol-discord-bot",
"url": "https://github.com/Prayag2003/across-protocol-discord-bot",
"ingested_at": "2025-10-04T09:15:00Z",
"files_count": 42
}
]
}
curl -X POST \
http://localhost:8000/repos/ingest \
-H 'Content-Type: application/json' \
-d '{
"github_url": "https://github.com/Prayag2003/across-protocol-discord-bot"
}'
curl -X POST \
http://localhost:8000/query \
-H 'Content-Type: application/json' \
-d '{
"repo_id": "ab4de4a4-6262-4dfa-8f68-9863d5a7082d",
"query": "Explain the main architecture of this application"
}'
curl -X POST \
http://localhost:8000/query \
-H 'Content-Type: application/json' \
-d '{
"repo_id": "ab4de4a4-6262-4dfa-8f68-9863d5a7082d",
"query": "Where is the authentication logic implemented?"
}'
- FastAPI: Modern, fast web framework for building APIs
- Python 3.7+: Core programming language
- UV: Fast Python package installer and resolver
- Uvicorn: ASGI server for running FastAPI applications
- OpenAI GPT: Natural language understanding and response generation
- LangChain: Framework for developing LLM-powered applications
- Vector Embeddings: For semantic code search
- SQLite: Lightweight database for metadata
- Vector Database: For code embeddings (Pinecone Vector DB)
- Git/GitHub API: Repository access and management
- AST Parsers: Code structure analysis
Made with β€οΈ by Team Ethereum Bots