Conversation
This replaces the `main()` function docstring, which was confusing for end users and poorly formatted.
- image prompts - Entrypoint prompt - additional CLI argument
- Improve `--help` output
- …port-template Update bug report template to include installation method
- …new-codes-from-gpte 1197 failed to update new codes from gpte
- Usability improvements
Reviewer's Guide by Sourcery

This pull request introduces significant changes to the gpt-engineer project, including updates to the core functionality, improvements to the benchmarking system, and additions of new features such as support for open-source models and vision capabilities. The changes span multiple files and include modifications to the project structure, dependencies, and configuration.

Sequence Diagrams

Main CLI Execution Flow

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant AI
    participant FileSystem
    User->>CLI: Run gpte command
    CLI->>AI: Initialize AI with config
    CLI->>FileSystem: Load prompt and files
    alt Improve mode
        CLI->>AI: Improve existing code
    else Generate mode
        CLI->>AI: Generate new code
    end
    AI->>FileSystem: Write generated/improved code
    CLI->>User: Display results and stats
```
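As a hedged illustration of the improve-vs-generate branch in the flow above (the helper names below are placeholders, not the actual gpt-engineer API), the dispatch can be sketched as:

```python
# Illustrative sketch of the CLI's improve-vs-generate dispatch.
# `generate` and `improve` stand in for the real AI-backed steps.
def run_gpte(prompt, improve_mode, generate, improve):
    if improve_mode:
        return improve(prompt)   # Improve mode: improve existing code
    return generate(prompt)      # Generate mode: generate new code

result = run_gpte(
    "add a --verbose flag",
    improve_mode=False,
    generate=lambda p: f"# generated for: {p}",
    improve=lambda p: f"# improved for: {p}",
)
print(result)  # → # generated for: add a --verbose flag
```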
Benchmark Execution Flow

```mermaid
sequenceDiagram
    participant User
    participant BenchCLI
    participant BenchConfig
    participant Benchmark
    participant Agent
    User->>BenchCLI: Run benchmark command
    BenchCLI->>BenchConfig: Load configuration
    BenchCLI->>Benchmark: Initialize benchmark
    loop For each task
        Benchmark->>Agent: Run task
        Agent->>Benchmark: Return result
    end
    BenchCLI->>User: Display benchmark results
```
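The per-task loop in this diagram can be sketched in plain Python (the class names below are illustrative stand-ins, not the actual bench API):

```python
class Agent:
    """Stand-in for a custom agent; run() returns one per-task result."""
    def run(self, task):
        return {"task": task, "passed": True}

class Benchmark:
    def __init__(self, tasks):
        self.tasks = tasks

def run_benchmark(benchmark, agent):
    results = []
    for task in benchmark.tasks:          # loop: For each task
        results.append(agent.run(task))   # Agent returns result to Benchmark
    return results

results = run_benchmark(Benchmark(["task-1", "task-2"]), Agent())
print(len(results))  # → 2
```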
Hey @kaigouthro - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟡 Testing: 3 issues found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
```python
load_dotenv(dotenv_path=os.path.join(os.getcwd(), ".env"))


def concatenate_paths(base_path, sub_path):
```
suggestion: Consider using `os.path.join` instead of custom path concatenation
The custom `concatenate_paths` function might be unnecessary. Python's `os.path.join()` could be used to achieve the same result more reliably across different operating systems.
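As a quick sketch of the point (the naive helper below is a hypothetical stand-in, since the custom function's body isn't shown in this diff), `os.path.join` picks the platform's separator and needs no manual string surgery:

```python
import os.path

def concatenate_paths(base_path, sub_path):
    # Naive manual concatenation (illustrative stand-in for the custom helper)
    return base_path.rstrip("/") + "/" + sub_path.lstrip("/")

print(concatenate_paths("projects/", "/example"))  # → projects/example
# os.path.join handles separators in a platform-aware way
print(os.path.join("projects", "example"))
```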
```diff
@@ -105,12 +107,17 @@ def __init__(
         self.azure_endpoint = azure_endpoint
         self.model_name = model_name
         self.streaming = streaming
         self.vision = (
```
suggestion: Reconsider the method of determining vision capability
The current method of determining vision capability based on model name might be fragile. Consider implementing a more robust method, perhaps by querying the model's capabilities directly if possible.
```python
self.vision = self._check_vision_capability()

def _check_vision_capability(self):
    # TODO: Implement a more robust method to check vision capability
    # This could involve querying the model's capabilities directly
    return (
        "vision-preview" in self.model_name
        or ("gpt-4-turbo" in self.model_name and "preview" not in self.model_name)
    )
```
```diff
@@ -89,9 +93,55 @@ def num_tokens(self, txt: str) -> int:
         """
         return len(self._tiktoken_tokenizer.encode(txt))

     def num_tokens_for_base64_image(
```
suggestion: Simplify the token calculation for base64 images
The current implementation for calculating tokens for base64 images is quite complex. Consider if there's a simpler way to estimate this, perhaps using a fixed ratio of bytes to tokens, which might be good enough for most use cases.
```python
def estimate_tokens_for_base64_image(self, image_base64: str) -> int:
    # Assuming a rough estimate of 0.75 tokens per byte
    return int(len(image_base64) * 0.75)
```
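One caveat worth noting (the sketch below assumes the same hypothetical 0.75 tokens-per-byte ratio): base64 inflates data by roughly a third, so an estimate based on the encoded string length differs from one based on the decoded byte count:

```python
import base64

def estimate_tokens_for_base64_image(image_base64, tokens_per_byte=0.75):
    # Hypothetical fixed-ratio estimate; real token counts depend on the
    # model's image tiling rules, not on payload size.
    decoded_size = len(base64.b64decode(image_base64))
    return int(decoded_size * tokens_per_byte)

payload = base64.b64encode(b"\x00" * 1200).decode()
print(len(payload))                               # → 1600 (base64 overhead)
print(estimate_tokens_for_base64_image(payload))  # → 900
```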
```python
return dcommand_decorator


@dcommand(main.main)
```
suggestion (testing): Consider adding tests for new CLI options
The main function has been updated with several new options, such as `use_cache`, `skip_file_selection`, and `no_execution`. It would be beneficial to add tests that cover these new options to ensure they work as expected.
```python
def test_use_cache_option(self, tmp_path):
    p = tmp_path / "projects/example"
    p.mkdir(parents=True)
    (p / "prompt").write_text(prompt_text)
    args = DefaultArgumentsMain(str(p), use_cache=True, llm_via_clipboard=True, no_execution=True)
    args()
    assert args.use_cache, "use_cache option not set"


def test_no_execution_option(self, tmp_path):
    p = tmp_path / "projects/example"
    p.mkdir(parents=True)
    (p / "prompt").write_text(prompt_text)
    args = DefaultArgumentsMain(str(p), no_execution=True, llm_via_clipboard=True)
    args()
    assert args.no_execution, "no_execution option not set"
```
```python
def load_and_test_diff(
def parse_chats_with_regex(
```
suggestion (testing): Rename function to better reflect its purpose
The function name `parse_chats_with_regex` doesn't accurately describe what it does. Consider renaming it to something like `load_and_parse_diff` to better reflect its functionality.
```diff
@@ -14,7 +14,7 @@ def test_start(monkeypatch):
     ai = AI("gpt-4")

     # act
     response_messages = ai.start("system prompt", "user prompt", "step name")
```
suggestion (testing): Update the AI start method call to use keyword arguments
The test has been updated to pass `step_name` as a keyword argument. This improves clarity and matches the updated `AI` class interface. Ensure that this change is consistent with the actual implementation.

```python
response_messages = ai.start(
    system_prompt="system prompt",
    user_prompt="user prompt",
    step_name="step name",
)
```
hm
Summary by Sourcery
Refactor the CLI tool to support new features such as vision models and enhanced prompt handling, introduce a new benchmarking system for custom agents, and improve the overall structure and usability of the application. Add support for linting, improve error handling in diff processing, and update documentation and tests to reflect the new changes.
New Features:
- Add a `dcommand` decorator to simplify the creation of Typer command classes with dataclass fields.
- Add linting via the `black` library, with configuration options for line length.
- Add a `Prompt` class to handle text and image inputs for language models, supporting both text and image URLs.
- Add a benchmarking system for custom agents via the `bench` command, including configuration via TOML files.

Bug Fixes:
- Fix the `salvage_correct_hunks` function, including improved error handling and validation.
- Fix the `load_prompt` function to correctly handle cases where the prompt file is a directory or empty.

Enhancements:
- Improve the `DiskMemory` class, including support for appending to logs and archiving old logs.

Build:

CI:

Documentation:

Tests:
- Add tests for the `salvage_correct_hunks` function to ensure correct handling of complex diffs and error scenarios.
- Add tests for the `Prompt` class and its integration with the CLI tool, ensuring correct handling of text and image inputs.

Chores: