
mainj in #14

Open · wants to merge 312 commits into main

Conversation

@kaigouthro (Owner) commented Sep 29, 2024

hm

Summary by Sourcery

Refactor the CLI tool to support new features such as vision models and enhanced prompt handling, introduce a new benchmarking system for custom agents, and improve the overall structure and usability of the application. Add support for linting, improve error handling in diff processing, and update documentation and tests to reflect the new changes.

New Features:

  • Introduce a new decorator dcommand to simplify the creation of Typer command classes with dataclass fields.
  • Add support for loading prompts from files and directories, including handling of image directories for vision-capable models.
  • Implement a new CLI command structure with enhanced options for model selection, prompt files, and execution modes.
  • Add support for linting Python files using the black library, with configuration options for line length.
  • Introduce a new Prompt class to handle text and image inputs for language models, supporting both text and image URLs (a rough sketch follows this list).
  • Add support for running benchmarks with custom agents using the bench command, including configuration via TOML files.
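
The summary only names the new Prompt class; as a point of reference, here is a minimal sketch of what a text-plus-image prompt container might look like, assuming OpenAI-style content blocks. The field names and the to_message_content method are illustrative, not the PR's actual API.

from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Prompt:
    """Container pairing a text prompt with optional image URLs for vision models."""

    text: str
    image_urls: List[str] = field(default_factory=list)

    def to_message_content(self) -> List[Dict[str, Any]]:
        # One text block followed by one image_url block per image, in the
        # content-list shape accepted by OpenAI-style chat APIs.
        content: List[Dict[str, Any]] = [{"type": "text", "text": self.text}]
        for url in self.image_urls:
            content.append({"type": "image_url", "image_url": {"url": url}})
        return content

Text-only models would simply consume prompt.text; the real class in gpt_engineer/core/prompt.py may expose a different interface.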

Bug Fixes:

  • Fix issues with handling of diffs in the salvage_correct_hunks function, including improved error handling and validation.
  • Resolve issues with the load_prompt function to correctly handle cases where the prompt file is a directory or empty.
  • Fix token usage calculation to account for both text and base64 encoded images in messages.

Enhancements:

  • Refactor the CLI tool to use a more modular and extensible structure, allowing for easier addition of new features and options.
  • Improve the handling of system information output for debugging purposes, including detailed package and environment information.
  • Enhance the file selection process in the CLI tool to support sorting and skipping file selection based on configuration.
  • Refactor the AI class to support vision models and improve message collapsing logic for more efficient processing.
  • Improve logging and log archiving in the DiskMemory class, including support for appending to existing logs and archiving old ones (a rough sketch follows this list).
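
The DiskMemory changes are only summarized above; the following is a rough sketch of the append-and-archive idea using plain file operations and hypothetical helper names, not the class's real methods.

import shutil
from datetime import datetime
from pathlib import Path


def append_to_log(memory_dir: Path, name: str, entry: str) -> None:
    """Append an entry to a named log file inside the project's memory directory."""
    log_path = memory_dir / "logs" / name
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(entry.rstrip("\n") + "\n")


def archive_logs(memory_dir: Path) -> None:
    """Move the current logs directory into a timestamped archive folder."""
    logs = memory_dir / "logs"
    if not logs.exists():
        return
    archive = memory_dir / "archive" / datetime.now().strftime("%Y%m%d_%H%M%S")
    archive.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(logs), str(archive))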

Build:

  • Update Docker setup instructions and scripts to improve usability and ensure correct permissions for generated files.
  • Add a new Docker Compose setup for easier deployment and management of the application in containerized environments.

CI:

  • Update CI workflow to include specific paths for triggering builds and improve concurrency management.
  • Temporarily disable code coverage reporting due to rate limiting issues with the coverage service.

Documentation:

  • Update documentation to include new features and usage instructions for the CLI tool, including vision model support.
  • Add detailed setup instructions for using open-source models with the application, including hardware acceleration options.
  • Provide examples and test scripts for verifying the setup of open LLMs using both the OpenAI API and Langchain interfaces (a minimal sketch follows this list).
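
The example scripts themselves live under docs/examples/open_llms/; below is a minimal sketch of the OpenAI-API-style check, assuming a llama.cpp server already running locally. The port, model name, and prompt are placeholders.

from openai import OpenAI

# llama.cpp's server exposes an OpenAI-compatible /v1 endpoint; the API key is
# required by the client but ignored by the local server.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-needed")

response = client.chat.completions.create(
    model="local-model",  # placeholder; many local servers ignore or remap this name
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response.choices[0].message.content)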

Tests:

  • Add new test cases for the salvage_correct_hunks function to ensure correct handling of complex diffs and error scenarios.
  • Introduce tests for the new Prompt class and its integration with the CLI tool, ensuring correct handling of text and image inputs.
  • Add tests for the new linting functionality to verify correct application of linting rules and handling of different file types (see the sketch after this list).
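
Those linting tests are not reproduced in this summary; here is a rough sketch of what one might look like, built around a hypothetical lint_code helper that wraps black.format_str.

import black


def lint_code(source: str, line_length: int = 88) -> str:
    """Hypothetical stand-in for the PR's linting helper: format source with black."""
    return black.format_str(source, mode=black.Mode(line_length=line_length))


def test_lint_code_normalizes_spacing_and_quotes():
    messy = "x=1\ny =   'hello'\n"
    formatted = lint_code(messy)
    assert "x = 1" in formatted
    assert '"hello"' in formatted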

Chores:

  • Remove deprecated and unused files from the repository to clean up the codebase and improve maintainability.

ATheorell and others added 30 commits · March 19, 2024 20:08

  • This replaces the `main()` function docstring, which was confusing for end users and also not well formatted.
  • image prompts - Entrypoint prompt - additional CLI argument

sourcery-ai bot commented Sep 29, 2024

Reviewer's Guide by Sourcery

This pull request introduces significant changes to the gpt-engineer project, including updates to core functionality, improvements to the benchmarking system, and new features such as support for open-source models and vision capabilities. The changes span multiple files and include modifications to the project structure, dependencies, and configuration.

Sequence Diagrams

Main CLI Execution Flow

sequenceDiagram
    participant User
    participant CLI
    participant AI
    participant FileSystem
    User->>CLI: Run gpte command
    CLI->>AI: Initialize AI with config
    CLI->>FileSystem: Load prompt and files
    alt Improve mode
        CLI->>AI: Improve existing code
    else Generate mode
        CLI->>AI: Generate new code
    end
    AI->>FileSystem: Write generated/improved code
    CLI->>User: Display results and stats

Benchmark Execution Flow

sequenceDiagram
    participant User
    participant BenchCLI
    participant BenchConfig
    participant Benchmark
    participant Agent
    User->>BenchCLI: Run benchmark command
    BenchCLI->>BenchConfig: Load configuration
    BenchCLI->>Benchmark: Initialize benchmark
    loop For each task
        Benchmark->>Agent: Run task
        Agent->>Benchmark: Return result
    end
    BenchCLI->>User: Display benchmark results
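
The diagram above has BenchCLI asking BenchConfig to load its configuration from a TOML file. A minimal sketch of that loading step is shown below, assuming Python 3.11+ (tomllib) and a placeholder file name; the real TOML schema is not spelled out in this PR description.

import tomllib  # standard library in Python 3.11+; use the tomli package on older versions


def load_bench_config(path: str) -> dict:
    """Read a benchmark configuration TOML file into a plain dictionary."""
    with open(path, "rb") as f:  # tomllib requires a binary file handle
        return tomllib.load(f)


config = load_bench_config("bench_config.toml")  # placeholder path
print(config)  # e.g. tables selecting which benchmarks (APPS, MBPP, ...) to run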

File-Level Changes

Refactored and enhanced the main CLI interface
  • Added new command-line options for improved flexibility
  • Implemented support for vision-capable models
  • Added functionality to handle image inputs
  • Improved error handling and logging
  • Updated the execution flow to accommodate new features
  Files: gpt_engineer/applications/cli/main.py

Updated AI core functionality and token usage tracking
  • Added support for vision models
  • Improved message handling and collapsing
  • Enhanced token usage tracking for different model types
  • Updated chat model creation logic
  Files: gpt_engineer/core/ai.py, gpt_engineer/core/token_usage.py

Refactored and improved the benchmarking system
  • Added support for APPS and MBPP benchmarks
  • Implemented a new configuration system for benchmarks
  • Updated the benchmark running and result reporting logic
  Files: gpt_engineer/benchmark/__main__.py, gpt_engineer/benchmark/run.py, gpt_engineer/benchmark/bench_config.py, gpt_engineer/benchmark/benchmarks/apps/load.py, gpt_engineer/benchmark/benchmarks/mbpp/load.py

Added support for open-source and local models
  • Implemented functionality to use llama.cpp for inference
  • Added documentation for setting up and using open-source models
  • Created example scripts for testing open LLM setups
  Files: docs/open_models.md, docs/examples/open_llms/README.md, docs/examples/open_llms/openai_api_interface.py, docs/examples/open_llms/langchain_interface.py

Enhanced file handling and project structure
  • Implemented a new Prompt class for handling text and image inputs
  • Updated file selection logic and added linting capabilities
  • Improved disk memory handling and logging
  Files: gpt_engineer/core/prompt.py, gpt_engineer/applications/cli/file_selector.py, gpt_engineer/core/linting.py, gpt_engineer/core/default/disk_memory.py

Updated project configuration and dependencies
  • Modified CI workflow
  • Updated Docker configuration
  • Added new dependencies and updated existing ones
  Files: .github/workflows/ci.yaml, docker/README.md, docker/entrypoint.sh, docker-compose.yml

Improved documentation and project information
  • Updated README with new features and instructions
  • Modified contribution guidelines
  • Added information about significant contributors
  Files: README.md, .github/CONTRIBUTING.md



@sourcery-ai bot left a comment


Hey @kaigouthro - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 3 issues found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


load_dotenv(dotenv_path=os.path.join(os.getcwd(), ".env"))


def concatenate_paths(base_path, sub_path):

suggestion: Consider using os.path.join instead of custom path concatenation

The custom concatenate_paths function might be unnecessary. Python's os.path.join() could be used to achieve the same result more reliably across different operating systems.
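
A sketch of that replacement, keeping the signature shown in the snippet above:

import os


def concatenate_paths(base_path, sub_path):
    # Defer to the standard library so separators are handled per-platform.
    return os.path.join(base_path, sub_path)


# Or drop the wrapper entirely and call os.path.join(base_path, sub_path) at each call site.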

@@ -105,12 +107,17 @@ def __init__(
self.azure_endpoint = azure_endpoint
self.model_name = model_name
self.streaming = streaming
self.vision = (

suggestion: Reconsider the method of determining vision capability

The current method of determining vision capability based on model name might be fragile. Consider implementing a more robust method, perhaps by querying the model's capabilities directly if possible.

        self.vision = self._check_vision_capability()

    def _check_vision_capability(self):
        # TODO: Implement a more robust method to check vision capability
        # This could involve querying the model's capabilities directly
        return ("vision-preview" in self.model_name or
                ("gpt-4-turbo" in self.model_name and "preview" not in self.model_name))

@@ -89,9 +93,55 @@ def num_tokens(self, txt: str) -> int:
"""
return len(self._tiktoken_tokenizer.encode(txt))

def num_tokens_for_base64_image(

suggestion: Simplify the token calculation for base64 images

The current implementation for calculating tokens for base64 images is quite complex. Consider if there's a simpler way to estimate this, perhaps using a fixed ratio of bytes to tokens, which might be good enough for most use cases.

    def estimate_tokens_for_base64_image(self, image_base64: str) -> int:
        # Assuming a rough estimate of 0.75 tokens per byte
        return int(len(image_base64) * 0.75)

return dcommand_decorator


@dcommand(main.main)

suggestion (testing): Consider adding tests for new CLI options

The main function has been updated with several new options like use_cache, skip_file_selection, no_execution, etc. It would be beneficial to add tests that cover these new options to ensure they work as expected.

def test_use_cache_option(self, tmp_path):
    p = tmp_path / "projects/example"
    p.mkdir(parents=True)
    (p / "prompt").write_text(prompt_text)
    args = DefaultArgumentsMain(str(p), use_cache=True, llm_via_clipboard=True, no_execution=True)
    args()
    assert args.use_cache, "use_cache option not set"

def test_no_execution_option(self, tmp_path):
    p = tmp_path / "projects/example"
    p.mkdir(parents=True)
    (p / "prompt").write_text(prompt_text)
    args = DefaultArgumentsMain(str(p), no_execution=True, llm_via_clipboard=True)
    args()
    assert args.no_execution, "no_execution option not set"



def load_and_test_diff(
def parse_chats_with_regex(

suggestion (testing): Rename function to better reflect its purpose

The function name parse_chats_with_regex doesn't accurately describe what it does. Consider renaming it to something like load_and_parse_diff to better reflect its functionality.

@@ -14,7 +14,7 @@ def test_start(monkeypatch):
ai = AI("gpt-4")

# act
response_messages = ai.start("system prompt", "user prompt", "step name")

suggestion (testing): Updated AI start method call to use keyword argument

The test has been updated to use a keyword argument for step_name. This change improves clarity and matches any changes in the AI class interface. Ensure that this change is consistent with the actual implementation.

response_messages = ai.start(
    system_prompt="system prompt",
    user_prompt="user prompt",
    step_name="step name"
)
