
Add comprehensive Codegen SDK + graph-sitter integration #80

Draft
codegen-sh[bot] wants to merge 4 commits into main from
codegen/zam-995-codegen-sdk-use-with-graph_sitter

@codegen-sh

@codegen-sh codegen-sh bot commented May 31, 2025

🚀 Comprehensive Codegen SDK + graph-sitter Integration

This PR implements a complete integration between the Codegen SDK and graph-sitter for advanced codebase analysis and AI-powered improvements.

✨ Key Features

  • 🔍 Structural Analysis: Analyze code structure, dependencies, call graphs, and relationships
  • 🤖 AI-Powered Insights: Use Codegen SDK for intelligent code analysis and suggestions
  • 📊 Function Context: Get detailed context for any function including call sites and dependencies
  • 💡 Improvement Suggestions: Generate actionable improvement recommendations
  • 🔧 Custom AI Provider: Configure graph-sitter to use Codegen SDK as the AI backend
  • 🎯 Interactive Mode: Run interactive sessions for exploratory code analysis

📁 Files Added

  • codegen_graph_sitter_integration.py - Main integration class
  • examples/basic_usage.py - Basic usage examples
  • examples/advanced_integration.py - Advanced integration with custom AI provider
  • README.md - Comprehensive documentation
  • requirements.txt - Updated dependencies

🎯 Usage Examples

Basic Analysis

from codegen_graph_sitter_integration import CodegenGraphSitterIntegration

integration = CodegenGraphSitterIntegration(
    org_id="your-org-id", 
    token="your-token",
    repo_path="fastapi/fastapi"
)

# Analyze codebase structure
analysis = integration.analyze_codebase_structure()
print(f"Most called function: {analysis.most_called_function['name']}")

AI-Powered Analysis

# Get improvement suggestions
suggestions = integration.suggest_improvements()

# Custom analysis with context
result = integration.analyze_with_codegen_ai(
    "Analyze for security vulnerabilities",
    context_data={"functions": len(integration.codebase.functions)}
)
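The description does not show how `context_data` reaches the agent. One plausible approach, sketched here under assumptions, is to serialize the context and append it to the prompt before it is sent. The helper name `build_full_prompt` and the prompt layout are hypothetical and are not taken from the PR.

```python
import json
from typing import Any, Mapping, Optional


def build_full_prompt(
    prompt: str, context_data: Optional[Mapping[str, Any]] = None
) -> str:
    """Hypothetical helper: append JSON-serialized context to a prompt."""
    if not context_data:
        return prompt
    # sort_keys keeps the output stable; default=str covers non-JSON values.
    context_json = json.dumps(
        dict(context_data), indent=2, sort_keys=True, default=str
    )
    return f"{prompt}\n\nContext:\n{context_json}"


full_prompt = build_full_prompt(
    "Analyze for security vulnerabilities",
    context_data={"functions": 128},  # illustrative value only
)
print(full_prompt)
```

Serializing to JSON keeps the context machine-readable, which may help if the same payload is later logged or replayed.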

Advanced Integration

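import os

from examples.advanced_integration import AdvancedCodegenGraphSitter

# Credentials are read from the variables listed under Configuration Options
org_id = os.environ["CODEGEN_ORG_ID"]
token = os.environ["CODEGEN_TOKEN"]
repo_path = "fastapi/fastapi"

advanced = AdvancedCodegenGraphSitter(org_id, token, repo_path)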

# Use codebase.ai() with Codegen backend
function = advanced.codebase.get_function("process_data")
result = advanced.codebase.ai(
    "Improve this function's implementation",
    target=function,
    context={"call_sites": function.call_sites}
)

🔧 Configuration Options

Option 1: Use Codegen API

export CODEGEN_ORG_ID="your-org-id"
export CODEGEN_TOKEN="your-api-token"

Option 2: Use OpenAI API (alternative)

export OPENAI_API_KEY="your-openai-key"
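A minimal sketch of how a script might choose between the two options at startup. The function name `select_backend`, the returned labels, and the rule that Codegen credentials take priority over the OpenAI key are all assumptions made for illustration; the PR does not state how the choice is made.

```python
import os
from typing import Mapping


def select_backend(env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: pick an AI backend from environment variables."""
    # Assumed priority: Codegen credentials win when both options are set.
    if env.get("CODEGEN_ORG_ID") and env.get("CODEGEN_TOKEN"):
        return "codegen"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    raise RuntimeError(
        "Set CODEGEN_ORG_ID and CODEGEN_TOKEN, or OPENAI_API_KEY."
    )


# Example with an explicit mapping instead of the real environment.
print(select_backend({"OPENAI_API_KEY": "your-openai-key"}))
```

Failing early with a clear message avoids a less obvious authentication error later in the run.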

🎯 Use Cases

  • Code Quality Assessment: Identify unused functions, complexity metrics
  • Performance Optimization: Find bottlenecks in most-called functions
  • Security Review: AI-powered security vulnerability analysis
  • Documentation Generation: Auto-generate comprehensive docs
  • Refactoring Assistance: AI-guided code improvements

🧪 Testing

Run the examples to test the integration:

# Basic examples
python examples/basic_usage.py

# Advanced examples  
python examples/advanced_integration.py

# Interactive mode
python codegen_graph_sitter_integration.py

This implementation delivers what was requested: an integration between the Codegen SDK and graph-sitter that enables codebase analysis with AI-powered insights.



Summary by Sourcery

Add a full-featured integration between the Codegen SDK and graph-sitter to enable advanced codebase analysis and AI-driven improvements, including a new integration module, example scripts, updated dependencies, and comprehensive documentation.

New Features:

  • Perform structural code analysis with graph-sitter (call graphs, dependencies, dead code detection).
  • Leverage the Codegen SDK for AI-powered insights and contextual code suggestions.
  • Retrieve detailed function context including call sites, dependencies, and implementation metrics.
  • Generate actionable improvement recommendations via AI.
  • Configure a custom Codegen AI provider for graph-sitter’s codebase.ai().
  • Run interactive analysis sessions for exploratory workflows.

Enhancements:

  • Add a new "codegen_graph_sitter_integration.py" module that orchestrates SDK and graph-sitter functionality.
  • Provide basic and advanced usage examples in dedicated scripts under the examples/ directory.
  • Refactor the README.md to document installation, configuration, API reference, usage patterns, and troubleshooting.

Build:

  • Update requirements.txt with codegen, graph-sitter, and optional development dependencies.

Documentation:

  • Completely overhaul the project README to cover new integration features, configuration options, API reference, example workflows, and contributing guidelines.

Chores:

  • Remove legacy Cloudflare Postgres setup tooling and configuration.

codegen-sh bot added 4 commits May 28, 2025 01:36

- Automated setup script for local Postgres exposure via Cloudflare Workers
- Creates dedicated database and read-only user for Codegen
- Deploys Cloudflare Worker proxy with health endpoints
- Saves credentials to .env file for easy integration
- Includes Windows batch and PowerShell scripts for easy setup
- Comprehensive testing and status reporting
- Full documentation with troubleshooting guide

- Add support for multiple authentication methods
- Try common default passwords automatically
- Support environment variables for admin credentials
- Add interactive password prompt as fallback
- Update documentation with authentication troubleshooting
- Handle Windows authentication scenarios

- Switch from API token to Global API Key authentication
- Add support for Cloudflare email requirement
- Update environment variables and batch scripts
- Create specialized script with user's credentials
- Fix Cloudflare Worker creation authentication

- Created main integration class with structural analysis
- Added function context analysis with dependencies and call sites
- Implemented AI-powered improvement suggestions
- Added advanced integration with custom AI provider
- Created interactive analysis mode
- Added comprehensive examples and documentation
- Supports both Codegen API and OpenAI API backends

@sourcery-ai

sourcery-ai bot commented May 31, 2025

Reviewer's Guide

This PR introduces a comprehensive integration between the Codegen SDK and graph-sitter by adding a new integration class, advanced and basic usage examples, a fully rewritten README, and updated dependencies to enable AI-driven code analysis and improvement workflows.

Sequence Diagram for Codebase Structural Analysis

sequenceDiagram
    actor User
    participant CGSI as CodegenGraphSitterIntegration
    participant GSCodebase as graph_sitter.Codebase

    User->>CGSI: new CodegenGraphSitterIntegration(org_id, token, repo_path)
    activate CGSI
    CGSI->>GSCodebase: Codebase.from_repo(repo_path)
    activate GSCodebase
    GSCodebase-->>CGSI: codebase instance
    deactivate GSCodebase
    deactivate CGSI

    User->>CGSI: analyze_codebase_structure()
    activate CGSI
    CGSI->>GSCodebase: Access functions, call_sites, dependencies etc.
    activate GSCodebase
    GSCodebase-->>CGSI: Structural data (details for CodebaseAnalysis)
    deactivate GSCodebase
    CGSI-->>User: CodebaseAnalysis object
    deactivate CGSI
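The structural step in the diagram above reads functions and their call sites and condenses them into a summary. A rough sketch of that reduction, using a stand-in `FunctionInfo` dataclass rather than real graph-sitter objects; the class, the helper names, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FunctionInfo:
    """Stand-in for a graph-sitter function; only the fields used here."""
    name: str
    call_sites: List[str] = field(default_factory=list)


def most_called(functions: List[FunctionInfo]) -> Optional[FunctionInfo]:
    """Return the function with the most call sites, or None if empty."""
    return max(functions, key=lambda f: len(f.call_sites), default=None)


def unused(functions: List[FunctionInfo]) -> List[str]:
    """Names of functions with no recorded call sites."""
    return [f.name for f in functions if not f.call_sites]


# Made-up sample data for illustration.
functions = [
    FunctionInfo("parse", ["main.py:10", "cli.py:4"]),
    FunctionInfo("render", ["main.py:22"]),
    FunctionInfo("legacy_export"),
]
top = most_called(functions)
print(top.name if top else None, unused(functions))
```

A function with no call sites is only a candidate for dead code: entry points and dynamically dispatched functions can look unused to static analysis.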

Sequence Diagram for AI-Powered Analysis via Codegen SDK

sequenceDiagram
    actor User
    participant CGSI as CodegenGraphSitterIntegration
    participant CodegenAgent as codegen.Agent

    User->>CGSI: analyze_with_codegen_ai(prompt, context_data)
    activate CGSI
    CGSI->>CodegenAgent: agent.run(full_prompt)
    activate CodegenAgent
    CodegenAgent-->>CGSI: Task object
    deactivate CodegenAgent
    Note right of CGSI: CGSI polls task.refresh() until completed
    CGSI->>CodegenAgent: task.refresh()
    activate CodegenAgent
    CodegenAgent-->>CGSI: Updated Task (status: completed)
    deactivate CodegenAgent
    CGSI-->>User: AI analysis result (task.result)
    deactivate CGSI
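The polling step noted in the diagram above can be sketched as a loop with a timeout. This is a sketch under stated assumptions, not the PR's implementation: `wait_for_task`, `FakeTask`, the `"failed"` status, and the default intervals are hypothetical. Only `refresh()`, the `"completed"` status, and `result` come from the diagram.

```python
import time
from typing import Any, Callable, Protocol


class TaskLike(Protocol):
    """Minimal shape assumed for the task object shown in the diagram."""
    status: str
    result: Any

    def refresh(self) -> None: ...


def wait_for_task(
    task: TaskLike,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Refresh the task until it completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        task.refresh()
        if task.status == "completed":
            return task.result
        if task.status == "failed":  # assumed status name
            raise RuntimeError("task failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{task.status}' after {timeout}s")
        sleep(poll_interval)


class FakeTask:
    """Stand-in task that completes after a fixed number of refreshes."""

    def __init__(self, refreshes_needed: int) -> None:
        self.status = "pending"
        self.result: Any = None
        self._remaining = refreshes_needed

    def refresh(self) -> None:
        self._remaining -= 1
        if self._remaining <= 0:
            self.status = "completed"
            self.result = "analysis text"


print(wait_for_task(FakeTask(3), poll_interval=0.0))
```

Bounding the loop with a deadline keeps an interactive session from hanging if the remote task never reaches a terminal state.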

Class Diagram for Advanced Integration Components

classDiagram
    class Agent {
        <<SDK Class>>
        +run(prompt: str) object
    }
    class Codebase {
        <<SDK Class>>
        +from_repo(repo_path: str) Codebase
        +set_ai_provider(provider: object) (conceptual)
        +ai(prompt: str, target: object, context: Dict) str
        +get_function(name: str) object
        +get_class(name: str) object
    }
    class CodegenAIProvider {
        -agent: Agent
        +__init__(org_id: str, token: str)
        +generate(prompt: str, context: Dict): str
    }
    class AdvancedCodegenGraphSitter {
        -org_id: str
        -token: str
        -repo_path: str
        -ai_provider: CodegenAIProvider
        -codebase: Codebase
        +__init__(org_id: str, token: str, repo_path: str)
        -_configure_codebase_ai()
        +analyze_function_with_ai(function_name: str, analysis_type: str): str
        +refactor_function_with_ai(function_name: str, refactor_goal: str): str
        +generate_documentation_with_ai(target_type: str, target_name: str): str
        +batch_analyze_functions(analysis_type: str): Dict
        +create_improvement_plan(): str
    }
    AdvancedCodegenGraphSitter "1" *-- "1" CodegenAIProvider : uses
    AdvancedCodegenGraphSitter "1" *-- "1" Codebase : uses
    CodegenAIProvider "1" *-- "1" Agent : uses
    AdvancedCodegenGraphSitter ..> CodegenAIProvider : configures Codebase with
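The provider relationship in the class diagram above can be illustrated with a small wrapper. This sketch differs from the diagram in two labeled ways: it takes an already-built agent instead of `org_id` and `token`, and it treats the return value of `run` as directly printable rather than as a task that needs polling. `AgentLike`, `CodegenAIProviderSketch`, and `EchoAgent` are hypothetical names.

```python
from typing import Any, Dict, Optional, Protocol


class AgentLike(Protocol):
    """Anything exposing run(prompt), as the Agent class in the diagram does."""

    def run(self, prompt: str) -> Any: ...


class CodegenAIProviderSketch:
    """Illustrative wrapper that forwards a prompt plus context to an agent."""

    def __init__(self, agent: AgentLike) -> None:
        self.agent = agent

    def generate(
        self, prompt: str, context: Optional[Dict[str, Any]] = None
    ) -> str:
        # Flatten the context into "key: value" lines after the prompt.
        lines = [prompt]
        for key, value in (context or {}).items():
            lines.append(f"{key}: {value}")
        return str(self.agent.run("\n".join(lines)))


class EchoAgent:
    """Test double that returns the prompt it was given."""

    def run(self, prompt: str) -> str:
        return f"echo: {prompt}"


provider = CodegenAIProviderSketch(EchoAgent())
reply = provider.generate("Summarize", {"call_sites": 3})
print(reply)
```

Injecting the agent keeps the wrapper testable without network access or credentials.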

File-Level Changes

Change: Added new integration implementation between Codegen SDK and graph-sitter
Details:
  • Implemented CodegenGraphSitterIntegration class with initialization and codebase loading
  • Defined CodebaseAnalysis dataclass and methods for structural and dependency analysis
  • Integrated codegen.Agent with graph_sitter.Codebase for AI-powered queries
  • Built methods for analysis, AI prompts, improvement suggestions, PR creation, and interactive mode
Files: codegen_graph_sitter_integration.py

Change: Introduced advanced AI provider and custom integration for graph-sitter
Details:
  • Created CodegenAIProvider that wraps codegen.Agent for prompt execution
  • Developed AdvancedCodegenGraphSitter to configure graph-sitter's AI backend
  • Implemented methods for function analysis, AI-driven refactoring, documentation, batch workflows, and improvement planning
Files: examples/advanced_integration.py

Change: Provided basic usage examples showcasing key integration features
Details:
  • Added scripts demonstrating structural analysis, function context, AI analysis, and improvement suggestions
  • Illustrated custom security and code-quality analysis using CodegenGraphSitterIntegration
Files: examples/basic_usage.py

Change: Revamped README with comprehensive instructions and documentation
Details:
  • Rewrote project overview and feature list for Codegen SDK + graph-sitter integration
  • Updated installation, setup, quick start, and usage sections with new code snippets
  • Removed legacy Cloudflare Postgres content and added API reference, examples, contributing, and support info
Files: README.md

Change: Updated project dependencies to support the new integration
Details:
  • Replaced old database and request libraries with core packages (codegen, graph-sitter)
  • Added optional dependencies (rich, click, pydantic) and development tools (pytest, black, flake8, mypy)
  • Expanded requirements.txt to reflect new integration needs
Files: requirements.txt


@korbit-ai

korbit-ai bot commented May 31, 2025

By default, I don't review pull requests opened by bots. If you would like me to review this pull request anyway, you can request a review via the /korbit-review command in a comment.

@coderabbitai

coderabbitai bot commented May 31, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

