@codegen-sh codegen-sh bot commented Mar 23, 2025

This PR adds a new example that demonstrates a complete CI/CD flow using Codegen's components. The integrated CI/CD flow combines several existing examples into a cohesive pipeline that handles everything from requirements gathering to deployment.

Key Features

  • Requirements & Planning Hub: Captures and analyzes requirements from Linear, breaks them down into manageable subtasks
  • AI-Assisted Development: Generates code changes based on requirements, creates PRs with detailed documentation
  • Comprehensive Code Review: Reviews PRs with multiple perspectives, provides feedback via GitHub and Slack
  • Continuous Knowledge & Assistance: Provides context and assistance throughout the pipeline via Slack

Implementation Details

  • Uses the modern FileIndex instead of the deprecated VectorIndex for semantic search
  • Implements an event bus for communication between components
  • Provides both Modal-based deployment and standalone mode for local development
  • Includes comprehensive documentation and configuration templates
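
For readers unfamiliar with the event bus pattern mentioned above, here is a minimal sketch of an in-process publish/subscribe bus. It is an illustration only, not the implementation in this PR: the `EventBus` class, its `subscribe` and `publish` methods, and the exact `Event` fields are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    # Hypothetical event shape; the PR's Event class may carry other fields.
    type: str
    data: Dict[str, Any] = field(default_factory=dict)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class EventBus:
    """Synchronous in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber and return how many were called."""
        handlers = list(self._handlers.get(event.type, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
received: List[str] = []
bus.subscribe("ticket_created", lambda e: received.append(e.data["identifier"]))
delivered = bus.publish(Event(type="ticket_created", data={"identifier": "ENG-1"}))
```

Components that only know the bus and the event types stay decoupled from each other, which is presumably why the pipeline uses one.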

Usage

  1. Create a .env file from the template
  2. Deploy with Modal: modal deploy app.py
  3. Create a Linear ticket with the "Codegen" label
  4. The system will automatically analyze the ticket, generate code changes, create a PR, and review it

This example serves as a reference architecture for building AI-powered CI/CD pipelines with Codegen.

Comment on lines +109 to +133
@modal.method()
def process_linear_event(self, event: Dict[str, Any]) -> Dict[str, Any]:
    """Process a Linear event.

    Args:
        event: Linear webhook event

    Returns:
        Response data
    """
    logger.info(f"Processing Linear event: {event.get('action')}")
    return app.linear.handle(event)

@modal.method()
def process_github_event(self, event: Dict[str, Any]) -> Dict[str, Any]:
    """Process a GitHub event.

    Args:
        event: GitHub webhook event

    Returns:
        Response data
    """
    logger.info(f"Processing GitHub event: {event.get('action')}")
    return app.github.handle(event)

The methods process_linear_event and process_github_event lack error handling for the operations app.linear.handle(event) and app.github.handle(event). This could lead to unhandled exceptions if these methods fail or if the event data is malformed, potentially crashing the application or causing unexpected behavior.

Recommendation:
Implement try-except blocks around these calls to handle possible exceptions gracefully. Log the errors and return appropriate error responses to ensure the application remains stable and provides useful feedback on failures.
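
A minimal sketch of this recommendation, assuming the handlers take a dict and return a dict as in the quoted code. The `safe_handle` helper and the shape of the error response are hypothetical, not part of the PR.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def safe_handle(
    source: str,
    handler: Callable[[Dict[str, Any]], Dict[str, Any]],
    event: Any,
) -> Dict[str, Any]:
    """Run a webhook handler and convert failures into an error response."""
    if not isinstance(event, dict):
        logger.error("Rejected %s event: payload is not a JSON object", source)
        return {"status": "error", "reason": "malformed event"}
    try:
        return handler(event)
    except Exception:
        # logger.exception records the full traceback for troubleshooting.
        logger.exception("Failed to process %s event", source)
        return {"status": "error", "reason": "handler failed"}


def failing_handler(event: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for app.github.handle raising on bad input.
    raise RuntimeError("boom")


ok = safe_handle("linear", lambda e: {"status": "ok"}, {"action": "create"})
failed = safe_handle("github", failing_handler, {"action": "opened"})
malformed = safe_handle("github", failing_handler, None)
```

In the PR, `process_linear_event` could call something like `safe_handle("linear", app.linear.handle, event)`, so one failing event does not take down the worker.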

Comment on lines +109 to +133
@modal.method()
def process_linear_event(self, event: Dict[str, Any]) -> Dict[str, Any]:
    """Process a Linear event.

    Args:
        event: Linear webhook event

    Returns:
        Response data
    """
    logger.info(f"Processing Linear event: {event.get('action')}")
    return app.linear.handle(event)

@modal.method()
def process_github_event(self, event: Dict[str, Any]) -> Dict[str, Any]:
    """Process a GitHub event.

    Args:
        event: GitHub webhook event

    Returns:
        Response data
    """
    logger.info(f"Processing GitHub event: {event.get('action')}")
    return app.github.handle(event)

The methods process_linear_event and process_github_event are called within asynchronous endpoints but are themselves not asynchronous. This could block the event loop if the processing is I/O intensive, affecting the application's performance.

Recommendation:
Define process_linear_event and process_github_event with async def and make the I/O inside them non-blocking. Changing the signature alone is not enough: if app.linear.handle and app.github.handle remain synchronous, run them in a worker thread so the event loop stays free to serve other requests.
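
One way to keep the event loop free while the underlying handlers are still synchronous is to offload them to a worker thread. The sketch below uses only the standard library; `handle_event_blocking` is a hypothetical stand-in for `app.github.handle`, and nothing here is specific to Modal.

```python
import asyncio
import time
from typing import Any, Dict, List


def handle_event_blocking(event: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a synchronous handler such as app.github.handle."""
    time.sleep(0.05)  # simulates blocking network or disk I/O
    return {"status": "ok", "action": event.get("action")}


async def process_event(event: Dict[str, Any]) -> Dict[str, Any]:
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so other coroutines keep running in the meantime.
    return await asyncio.to_thread(handle_event_blocking, event)


async def main() -> List[Dict[str, Any]]:
    events = [{"action": "opened"}, {"action": "closed"}]
    # Both events are processed concurrently; results keep the input order.
    return await asyncio.gather(*(process_event(e) for e in events))


results = asyncio.run(main())
```

If the handlers are later rewritten with an async HTTP client, the `to_thread` wrapper can be dropped and the calls awaited directly.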

Comment on lines +53 to +70
def handle_ticket_created(self, event: Event) -> None:
    """Handle ticket created event.

    Args:
        event: Ticket created event
    """
    logger.info("Handling ticket created event")

    # Extract issue data
    issue_data = event.data.get("issue", {})
    subtasks = event.data.get("subtasks", [])

    # Create LinearIssue object
    issue = LinearIssue(**issue_data)

    # Process the ticket
    self.process_ticket(issue, subtasks)

The methods handle_ticket_created and handle_ticket_updated lack error handling mechanisms. This could lead to unhandled exceptions if there are issues with the data extraction or during the processing of the tickets.

Recommended Solution:
Implement try-except blocks to catch and handle potential exceptions appropriately. Log the errors and consider providing a fallback or retry mechanism if applicable.
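
For the retry part of this suggestion, a small helper with exponential backoff might look like the sketch below. The name `with_retries`, its parameters, and the choice of retryable exception types are assumptions for illustration.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

logger = logging.getLogger(__name__)
T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == attempts:
                logger.error("Giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Attempt %d failed (%s); retrying in %.2fs", attempt, exc, delay)
            time.sleep(delay)
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky_step() -> str:
    # Fails twice with a transient error, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "processed"


result = with_retries(flaky_step, attempts=3, base_delay=0.0)
```

Note that process_ticket posts comments and opens a PR, so retrying it as a whole could duplicate those side effects. Retries are safer around individual steps that are idempotent.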

Comment on lines +90 to +134
def process_ticket(self, issue: LinearIssue, subtasks: List[str]) -> None:
    """Process a Linear ticket and generate code changes.

    Args:
        issue: Linear issue
        subtasks: List of subtasks extracted from the issue
    """
    logger.info(f"Processing ticket {issue.identifier}")

    # Initialize codebase
    self.initialize_codebase()

    # Perform code research to understand the context
    research_query = f"Research context for: {issue.title}"
    research_result = perform_code_research(self.codebase, research_query)

    # Comment on the issue with research findings
    self.linear_client.comment_on_issue(
        issue.id,
        f"I've researched the codebase and found relevant context for your request. Here's what I found:\n\n{research_result.findings}\n\nRelevant files:\n" + "\n".join([f"- {file}" for file in research_result.relevant_files]),
    )

    # Format the message for the code agent
    query = format_linear_message(issue.title, issue.description)

    # Create a CodeAgent to generate code changes
    agent = CodeAgent(self.codebase)

    # Run the agent to generate code changes
    logger.info("Generating code changes")
    agent.run(query)

    # Create a PR with the changes
    branch_name = generate_branch_name(issue.identifier, issue.title)
    pr_title = f"[{issue.identifier}] {issue.title}"
    pr_body = f"Codegen generated PR for issue: {issue.url}\n\n{issue.description}"

    logger.info(f"Creating PR with branch {branch_name}")
    create_pr_result = create_pr(self.codebase, pr_title, pr_body, head_branch=branch_name)

    # Comment on the issue with the PR link
    self.linear_client.comment_on_issue(
        issue.id,
        f"I've created a PR with the changes: {create_pr_result.url}",
    )

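The method process_ticket interpolates issue.title and issue.description directly into the research query, the agent prompt, the branch name, and the PR title and body without validation or sanitization. Because this text comes from the Linear ticket, it could be used for injection attacks (including prompt injection against the code agent) or cause malformed data to be processed.

Recommended Solution:
Validate and sanitize issue.title and issue.description before using them in queries or other sensitive operations: enforce types and length limits, strip control characters, and restrict the branch name to a safe character set. This will help prevent potential security vulnerabilities and ensure data integrity.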
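
A sketch of what such validation could look like. The helper names `clean_text` and `safe_branch_slug` and the length limits are assumptions; the PR's own `generate_branch_name` may already do part of this.

```python
import re

MAX_TITLE_LENGTH = 200  # assumed limit, adjust to taste

# ASCII control characters except tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(value: object, max_length: int) -> str:
    """Require a string, drop control characters, and cap the length."""
    if not isinstance(value, str):
        raise ValueError("expected a string")
    cleaned = _CONTROL_CHARS.sub("", value).strip()
    if not cleaned:
        raise ValueError("value is empty after cleaning")
    return cleaned[:max_length]


def safe_branch_slug(identifier: str, title: str, max_length: int = 60) -> str:
    """Build a branch name from lowercase letters, digits, and hyphens only."""
    raw = f"{identifier}-{title}".lower()
    slug = re.sub(r"[^a-z0-9]+", "-", raw).strip("-")
    return slug[:max_length].rstrip("-")


title = clean_text("Fix login\x00 bug; rm -rf /", MAX_TITLE_LENGTH)
branch = safe_branch_slug("ENG-42", title)
```

Cleaning like this protects the branch name and log output, but it does not by itself prevent prompt injection. The agent should still treat ticket text as untrusted input.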

Comment on lines +54 to +70
def initialize_codebase(self) -> None:
    """Initialize the codebase and index if not already initialized."""
    if self.codebase is None:
        logger.info(f"Initializing codebase for {self.config.github.repo}")
        self.codebase = create_codebase(self.config.github.repo, ProgrammingLanguage.PYTHON)

        # Initialize file index
        self.index = FileIndex(self.codebase)

        # Try to load existing index or create new one
        index_path = os.path.join(self.codebase.repo_path, ".codegen", "file_index.pkl")
        try:
            self.index.load(index_path)
        except FileNotFoundError:
            logger.info("Creating new file index")
            self.index.create()
            self.index.save(index_path)

The initialize_codebase method handles FileNotFoundError when attempting to load an existing index, but it does not account for other possible exceptions such as permission errors or issues with file corruption. This lack of comprehensive error handling could lead to unhandled exceptions and disrupt the application flow.

Recommendation:
Extend the exception handling beyond FileNotFoundError. Catching OSError covers PermissionError and other I/O failures (IOError has been an alias of OSError since Python 3.3). A corrupt index file will typically surface as a deserialization error rather than an OSError, so it needs its own handler. Log each failure with enough detail to diagnose it, and fall back to rebuilding the index where that is safe.
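
The sketch below shows the load-or-rebuild pattern with broader handling. It assumes a pickle-backed index, which the `.pkl` file name suggests but the quoted code does not confirm; `load_or_rebuild` and the `build` callback are hypothetical stand-ins for `FileIndex.load`, `create`, and `save`.

```python
import logging
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable

logger = logging.getLogger(__name__)


def load_or_rebuild(path: Path, build: Callable[[], Any]) -> Any:
    """Load a pickled index, rebuilding it if the file is missing or unusable."""
    try:
        with path.open("rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        logger.info("No index at %s; creating a new one", path)
    except (pickle.UnpicklingError, EOFError) as exc:
        # Truncated or corrupt file; other exception types are possible too.
        logger.warning("Index at %s is corrupt (%s); rebuilding", path, exc)
    except OSError as exc:
        # Covers PermissionError, IsADirectoryError, and other I/O failures.
        logger.warning("Could not read index at %s (%s); rebuilding", path, exc)

    index = build()
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("wb") as fh:
            pickle.dump(index, fh)
    except OSError as exc:
        logger.warning("Could not save index to %s (%s); continuing in memory", path, exc)
    return index


with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / ".codegen" / "file_index.pkl"
    first = load_or_rebuild(index_path, lambda: {"files": ["app.py"]})   # builds and saves
    second = load_or_rebuild(index_path, lambda: {"files": []})          # loads from disk
```

The order of the `except` clauses matters: FileNotFoundError is a subclass of OSError, so it has to be listed first to keep its quieter log message.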


filepath: str
content: str
action: str # "create", "modify", "delete"

The action field in the CodeChange class is expected to have specific values ('create', 'modify', 'delete'). However, there is no validation enforcing this, which could lead to errors if incorrect values are provided. Consider adding validation in the constructor to ensure that the action value is one of the expected options:

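from dataclasses import dataclass
from typing import Optional

@dataclass
class CodeChange:
    filepath: str
    content: str
    action: str  # "create", "modify", or "delete"
    old_content: Optional[str] = None

    def __post_init__(self):
        # Reject any action outside the supported set at construction time.
        if self.action not in ('create', 'modify', 'delete'):
            raise ValueError(f"Invalid action: {self.action}")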
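
An alternative sketch, if changing the field type is acceptable: an Enum makes the allowed values explicit to type checkers as well as at runtime. The `ChangeAction` name is hypothetical and not part of the PR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChangeAction(str, Enum):
    CREATE = "create"
    MODIFY = "modify"
    DELETE = "delete"


@dataclass
class CodeChange:
    filepath: str
    content: str
    action: ChangeAction
    old_content: Optional[str] = None

    def __post_init__(self) -> None:
        # Accepts either a ChangeAction or its string value;
        # anything else raises ValueError.
        self.action = ChangeAction(self.action)


change = CodeChange(filepath="app.py", content="print('hi')", action="modify")
```

Because `ChangeAction` also subclasses `str`, existing comparisons such as `change.action == "modify"` keep working.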

"""
return Event(
type=event_type,
timestamp=datetime.now(),

datetime.now() returns a naive timestamp in the host's local time zone, which can lead to inconsistencies when events are logged from hosts in different time zones. Recommendation: Use datetime.now(timezone.utc) to get a timezone-aware UTC timestamp. Avoid datetime.utcnow(): it also returns a naive datetime, and it is deprecated as of Python 3.12.
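
A short illustration of the difference, using only the standard library:

```python
from datetime import datetime, timezone

naive = datetime.now()              # local wall-clock time, tzinfo is None
aware = datetime.now(timezone.utc)  # UTC with an explicit offset attached

# The ISO 8601 form of an aware timestamp carries its offset, so it can be
# parsed back without guessing the zone.
stamp = aware.isoformat()
round_tripped = datetime.fromisoformat(stamp)
```

Aware timestamps can be compared and sorted across hosts, whereas comparing a naive value with an aware one raises a TypeError.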

Comment on lines +138 to +143
try:
    index.load(index_path)
except FileNotFoundError:
    logger.info("Creating new file index")
    index.create()
    index.save(index_path)

The error handling for file index operations only catches FileNotFoundError and assumes no other types of I/O errors could occur. This might not be sufficient for robust error handling. Recommendation: Extend the error handling to catch OSError, which covers PermissionError and other file operation errors (IOError is an alias of OSError in Python 3). Additionally, log the error details to help with troubleshooting.

Comment on lines +41 to +54
def setup(self):
    """Set up the standalone app."""
    logger.info("Setting up standalone app")

    # Load configuration
    self.config = load_config_from_env()

    # Initialize components
    # Note: In standalone mode, we don't have a CodegenApp instance
    # so we'll need to mock it or adapt the components
    self.development_engine = create_development_engine()
    self.review_system = create_review_system()
    self.knowledge_assistant = create_knowledge_assistant()

Lack of Error Handling

The setup method initializes various components such as development_engine, review_system, and knowledge_assistant without any error handling mechanisms. If any of these initializations fail, the application could crash or behave unpredictably.

Recommendation:
Wrap the initialization calls within try-except blocks. Log the errors and handle them appropriately, possibly with a graceful shutdown or a retry mechanism, depending on the nature of the error.
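
A sketch of one way to do this without repeating a try-except per component. `init_components` and `SetupError` are hypothetical names; the factories stand in for `create_development_engine`, `create_review_system`, and `create_knowledge_assistant`.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


class SetupError(RuntimeError):
    """Raised when a required component cannot be initialized."""


def init_components(factories: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Create each component in order, failing fast with a clear error."""
    components: Dict[str, Any] = {}
    for name, factory in factories.items():
        try:
            components[name] = factory()
        except Exception as exc:
            # Log the traceback, then surface which component failed.
            logger.exception("Failed to initialize %s", name)
            raise SetupError(f"could not initialize {name}") from exc
        logger.info("Initialized %s", name)
    return components


components = init_components({
    "development_engine": lambda: "engine",   # placeholder objects
    "review_system": lambda: "reviewer",
})
```

The caller can catch `SetupError` at startup and exit cleanly, or retry when the cause is transient, such as an unreachable service.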

Comment on lines +77 to +92
@web_app.post("/github/webhook")
async def github_webhook(request: Request):
    """Handle GitHub webhook events."""
    event = await request.json()
    logger.info(f"Received GitHub webhook: {event.get('action')}")
    # In standalone mode, we'll just log the event
    return {"status": "ok"}

# Add Linear webhook endpoint
@web_app.post("/linear/webhook")
async def linear_webhook(request: Request):
    """Handle Linear webhook events."""
    event = await request.json()
    logger.info(f"Received Linear webhook: {event.get('action')}")
    # In standalone mode, we'll just log the event
    return {"status": "ok"}

Security Concern: Unvalidated and Unsanitized Input

The webhook handlers for GitHub and Linear parse and log incoming requests without any form of input validation or sanitization. Although only the action field is logged today, a caller-controlled value could be used for log injection, and logging more of the payload later could leak sensitive information.

Recommendation:
Implement input validation and sanitization before processing or logging the data. Ensure that only expected and safe data is handled by the application.
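
A sketch of the checks a handler could run before trusting or logging a payload. The signature check follows GitHub's documented `X-Hub-Signature-256` scheme (HMAC-SHA256 of the raw body, hex encoded, prefixed with `sha256=`). The helper names and the log-sanitizing rule are assumptions, and the Linear endpoint would need the equivalent check from Linear's own documentation.

```python
import hashlib
import hmac
import json
import re
from typing import Any, Dict

# Anything other than letters, digits, underscore, dot, or hyphen is replaced.
_UNSAFE_FOR_LOGS = re.compile(r"[^\w.\-]")


def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def parse_event(body: bytes) -> Dict[str, Any]:
    """Parse the payload and require a JSON object."""
    try:
        event = json.loads(body)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        raise ValueError("payload is not valid JSON") from exc
    if not isinstance(event, dict):
        raise ValueError("payload must be a JSON object")
    return event


def loggable_action(event: Dict[str, Any], max_length: int = 40) -> str:
    """Return a log-safe form of the action field (no newlines or control characters)."""
    action = event.get("action")
    if not isinstance(action, str):
        return "<missing>"
    return _UNSAFE_FOR_LOGS.sub("_", action)[:max_length]


secret = b"example-secret"
body = b'{"action": "opened\\nFAKE LOG LINE"}'
signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

is_valid = verify_github_signature(secret, body, signature)
action_for_log = loggable_action(parse_event(body))
```

The signature must be computed over the raw request bytes, so the handler would read the body before parsing it as JSON.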
