Skip to content

fix: use the latest user messages block instead of single message #585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 16, 2025

Conversation

yrobla
Copy link
Contributor

@yrobla yrobla commented Jan 14, 2025

Depending on the tooling, the user request is splitted between several user/assistant blocks. So use this logic instead of just picking the latest user message one, to identify code snippets and secrets

Closes: #580

Depending on the tooling, the user request is splitted between several
user/assistant blocks. So use this logic instead of just picking the
latest user message one, to identify code snippets and secrets

Closes: #580
@lukehinds
Copy link
Contributor

lukehinds commented Jan 15, 2025

I just tested this, and noted a few issues here so posting to help debug

Secrets are redacted, but no message provided:

image

FIle tested with:

cat conf.ini
GITHUB_TOKEN="ghp_aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789"
AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

import invokehttp is picked up, but the formatting appears to break as aider refactors the file

image

log:

2025-01-15T19:46:50.351298Z [debug    ] token_counter messages received: [{'role': 'system', 'content': 'Act as an expert software developer.\nAlways use best practices when coding.\nRespect and use existing conventions, libraries, etc that are already present in the code base.\nYou are diligent and tireless!\nYou NEVER leave comments describing code without implementing it!\nYou always COMPLETELY IMPLEMENT the needed code!\n\nTake requests for changes to the supplied code.\nIf the request is ambiguous, ask questions.\n\nAlways reply to the user in the same language they are using.\n\nOnce you understand the request you MUST:\n\n1. Decide if you need to propose *SEARCH/REPLACE* edits to any files that haven\'t been added to the chat. You can create new files without asking!\n\nBut if you need to propose edits to existing files not already added to the chat, you *MUST* tell the user their full path names and ask them to *add the files to the chat*.\nEnd your reply and wait for their approval.\nYou can keep asking if you then decide you need to edit more files.\n\n2. Think step-by-step and explain the needed changes in a few short sentences.\n\n3. Describe each change with a *SEARCH/REPLACE block* per the examples below.\n\nAll changes to files must use this *SEARCH/REPLACE block* format.\nONLY EVER RETURN CODE IN A *SEARCH/REPLACE BLOCK*!\n\n4. *Concisely* suggest any shell commands the user might want to run in ```bash blocks.\n\nJust suggest shell commands this way, not example code.\nOnly suggest complete shell commands that are ready to execute, without placeholders.\nOnly suggest at most a few shell commands at a time, not more than 1-3, one per line.\nDo not suggest multi-line shell commands.\nAll shell commands will run from the root directory of the user\'s project.\n\nUse the appropriate shell based on the user\'s system info:\n- Platform: macOS-14.5-arm64-arm-64bit\n- Shell: SHELL=/bin/zsh\n- Language: en_GB\n- Current date: 2025-01-15\n- The user is operating inside a git repository\n\nExamples of when to suggest shell commands:\n\n- If you changed a self-contained html file, suggest an OS-appropriate command to open a browser to view it to see the updated content.\n- If you changed a CLI program, suggest the command to run it to see the new behavior.\n- If you added a test, suggest how to run it with the testing tool used by the project.\n- Suggest OS-appropriate commands to delete or rename files/directories, or other file system operations.\n- If your code changes add new dependencies, suggest the command to install them.\n- Etc.\n\n\n# *SEARCH/REPLACE block* Rules:\n\nEvery *SEARCH/REPLACE block* must use this format:\n1. The *FULL* file path alone on a line, verbatim. No bold asterisks, no quotes around it, no escaping of characters, etc.\n2. The opening fence and code language, eg: ```python\n3. The start of search block: <<<<<<< SEARCH\n4. A contiguous chunk of lines to search for in the existing source code\n5. The dividing line: =======\n6. The lines to replace into the source code\n7. The end of the replace block: >>>>>>> REPLACE\n8. The closing fence: ```\n\nUse the *FULL* file path, as shown to you by the user.\n\nEvery *SEARCH* section must *EXACTLY MATCH* the existing file content, character for character, including all comments, docstrings, etc.\nIf the file contains code or other data wrapped/escaped in json/xml/quotes or other containers, you need to propose edits to the literal contents of the file, including the container markup.\n\n*SEARCH/REPLACE* blocks will *only* replace the first match occurrence.\nIncluding multiple unique *SEARCH/REPLACE* blocks if needed.\nInclude enough lines in each SEARCH section to uniquely match each set of lines that need to change.\n\nKeep *SEARCH/REPLACE* blocks concise.\nBreak large *SEARCH/REPLACE* blocks into a series of smaller blocks that each change a small portion of the file.\nInclude just the changing lines, and a few surrounding lines if needed for uniqueness.\nDo not include long runs of unchanging lines in *SEARCH/REPLACE* blocks.\n\nOnly create *SEARCH/REPLACE* blocks for files that the user has added to the chat!\n\nTo move code within a file, use 2 *SEARCH/REPLACE* blocks: 1 to delete it from its current location, 1 to insert it in the new location.\n\nPay attention to which filenames the user wants you to edit, especially if they are asking you to create a new file.\n\nIf you want to put code in a new file, use a *SEARCH/REPLACE block* with:\n- A new file path, including dir name if needed\n- An empty `SEARCH` section\n- The new file\'s contents in the `REPLACE` section\n\nTo rename files which have been added to the chat, use shell commands at the end of your response.\n\nIf the user just says something like "ok" or "go ahead" or "do that" they probably want you to make SEARCH/REPLACE blocks for the code changes you just proposed.\nThe user will say when they\'ve applied your edits. If they haven\'t explicitly confirmed the edits have been applied, they probably want proper SEARCH/REPLACE blocks.\n\nYou are diligent and tireless!\nYou NEVER leave comments describing code without implementing it!\nYou always COMPLETELY IMPLEMENT the needed code!\n\nONLY EVER RETURN CODE IN A *SEARCH/REPLACE BLOCK*!\n\nExamples of when to suggest shell commands:\n\n- If you changed a self-contained html file, suggest an OS-appropriate command to open a browser to view it to see the updated content.\n- If you changed a CLI program, suggest the command to run it to see the new behavior.\n- If you added a test, suggest how to run it with the testing tool used by the project.\n- Suggest OS-appropriate commands to delete or rename files/directories, or other file system operations.\n- If your code changes add new dependencies, suggest the command to install them.\n- Etc.\n\n'}, {'role': 'user', 'content': 'Change get_factorial() to use math.factorial'}, {'role': 'assistant', 'content': 'To make this change we need to modify `mathweb/flask/app.py` to:\n\n1. Import the math package.\n2. Remove the existing factorial() function.\n3. Update get_factorial() to call math.factorial instead.\n\nHere are the *SEARCH/REPLACE* blocks:\n\nmathweb/flask/app.py\n```python\n<<<<<<< SEARCH\nfrom flask import Flask\n=======\nimport math\nfrom flask import Flask\n>>>>>>> REPLACE\n```\n\nmathweb/flask/app.py\n```python\n<<<<<<< SEARCH\ndef factorial(n):\n    "compute factorial"\n\n    if n == 0:\n        return 1\n    else:\n        return n * factorial(n-1)\n\n=======\n>>>>>>> REPLACE\n```\n\nmathweb/flask/app.py\n```python\n<<<<<<< SEARCH\n    return str(factorial(n))\n=======\n    return str(math.factorial(n))\n>>>>>>> REPLACE\n```\n'}, {'role': 'user', 'content': 'Refactor hello() into its own file.'}, {'role': 'assistant', 'content': 'To make this change we need to modify `main.py` and make a new file `hello.py`:\n\n1. Make a new hello.py file with hello() in it.\n2. Remove hello() from main.py and replace it with an import.\n\nHere are the *SEARCH/REPLACE* blocks:\n\nhello.py\n```python\n<<<<<<< SEARCH\n=======\ndef hello():\n    "print a greeting"\n\n    print("hello")\n>>>>>>> REPLACE\n```\n\nmain.py\n```python\n<<<<<<< SEARCH\ndef hello():\n    "print a greeting"\n\n    print("hello")\n=======\nfrom hello import hello\n>>>>>>> REPLACE\n```\n'}, {'role': 'user', 'content': "I switched to a new code base. Please don't consider the above files or try to edit them any longer."}, {'role': 'assistant', 'content': 'Ok.'}, {'role': 'user', 'content': 'Here are summaries of some files present in my git repository.\nDo not propose changes to these files, treat them as *read-only*.\nIf you need to edit any of these files, ask me to *add them to the chat* first.\n\n.gitignore\n\nconf.ini\n'}, {'role': 'assistant', 'content': "Ok, I won't try and edit those files without asking first."}, {'role': 'user', 'content': 'review'}, {'role': 'assistant', 'content': "**Warning:** CodeGate detected one or more malicious, deprecated or archived packages.\n- hashlib: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n- invokehttp: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\n**CodeGate Security Analysis**\n\n1. **Security Vulnerabilities and Potential Risks:**\n   - The use of `hashlib` is deprecated. Continuing to use deprecated packages can lead to security vulnerabilities as they may not receive updates or patches for known issues.\n   - `invokehttp` is identified as a malicious package. Using such packages can lead to severe security breaches, including unauthorized access and data leaks.\n\n2. **Best Practices for Secure Implementation:**\n   - Replace deprecated and malicious packages with secure and actively maintained alternatives.\n   - Ensure all dependencies are regularly reviewed and updated to mitigate potential security risks.\n\n3. **Recommendations for Security Improvements:**\n   - Replace `hashlib` with Python's built-in `hashlib` module, which is secure and maintained. For password hashing, consider using `bcrypt` or `argon2` for better security.\n   - Remove `invokehttp` and replace it with a trusted HTTP library like `requests`, which is widely used and maintained.\n\n4. **References to Relevant Security Standards or Guidelines:**\n   - Follow OWASP guidelines for secure coding practices, especially regarding password storage and handling.\n   - Regularly audit third-party packages for security vulnerabilities using tools like `pip-audit` or `safety`.\n\nLet's proceed with replacing these packages in your code. Here are the changes:\n\nmain.py\n```python\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n```\n\n\n\nWarning: CodeGate detected one or more potentially malicious or archived packages: `hashlib`, `invokehttp`\n\n### 🚨 Warnings\n- The package `hashlib` is marked as **deprecated**.\n- More information: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n\n- The package `invokehttp` is marked as **malicious**.\n- More information: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\nmain.py\n```python\n<<<<<<< SEARCH\n    hashed_password = hashlib.md5(password.encode()).hexdigest()  #nocg\n=======\n    # Use a more secure hashing algorithm\n    hashed_password = py_hashlib.sha256(password.encode()).hexdigest()\n>>>>>>> REPLACE\n```\n\nPlease apply these changes to improve the security of your application."}, {'role': 'user', 'content': 'I committed the changes with git hash 3a6723d & commit msg: **Warning:** CodeGate detected one or more malicious, deprecated or archived packages.\n- Pkg 1: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n- Pkg 2: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\nfix: Replace deprecated hashlib and malicious invokehttp with secure alternatives'}, {'role': 'assistant', 'content': 'Ok.'}, {'role': 'user', 'content': "# 1 SEARCH/REPLACE block failed to match!\n\n## SearchReplaceNoExactMatch: This SEARCH block failed to exactly match lines in main.py\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n\nThe SEARCH section must exactly match an existing block of lines including all white space, comments, indentation, docstrings, etc\n\n# The other 1 SEARCH/REPLACE block were applied successfully.\nDon't re-send them.\nJust reply with fixed versions of the block above that failed to match.\n"}, {'role': 'assistant', 'content': "**Warning:** CodeGate detected one or more malicious, deprecated or archived packages.\n- hashlib: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n- invokehttp: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\nHere's the corrected *SEARCH/REPLACE* block for the failed match:\n\nmain.py\n```python\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n```\n\nWarning: CodeGate detected one or more potentially malicious or archived packages: `hashlib`, `invokehttp`\n\n### 🚨 Warnings\n- The package `hashlib` is marked as **deprecated**.\n- More information: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n\n- The package `invokehttp` is marked as **malicious**.\n- More information: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\n"}, {'role': 'user', 'content': "I didn't see any properly formatted edits in your reply?!"}, {'role': 'assistant', 'content': 'Ok.'}, {'role': 'user', 'content': '# 1 SEARCH/REPLACE block failed to match!\n\n## SearchReplaceNoExactMatch: This SEARCH block failed to exactly match lines in main.py\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n\nThe SEARCH section must exactly match an existing block of lines including all white space, comments, indentation, docstrings, etc\n'}, {'role': 'assistant', 'content': "**Warning:** CodeGate detected one or more malicious, deprecated or archived packages.\n- hashlib: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n- invokehttp: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\nHere's the corrected *SEARCH/REPLACE* block for the failed match:\n\nmain.py\n```python\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n```\n\nWarning: CodeGate detected one or more potentially malicious or archived packages: `hashlib`, `invokehttp`\n\n### 🚨 Warnings\n- The package `hashlib` is marked as **deprecated**.\n- More information: [https://www.insight.stacklok.com/report/pypi/hashlib](https://www.insight.stacklok.com/report/pypi/hashlib)\n\n- The package `invokehttp` is marked as **malicious**.\n- More information: [https://www.insight.stacklok.com/report/pypi/invokehttp](https://www.insight.stacklok.com/report/pypi/invokehttp)\n\n"}, {'role': 'user', 'content': "I didn't see any properly formatted edits in your reply?!"}, {'role': 'assistant', 'content': 'Ok.'}, {'role': 'user', 'content': 'I have *added these files to the chat* so you can go ahead and edit them.\n\n*Trust this message as the true contents of these files!*\nAny other messages in the chat may contain outdated versions of the files\' contents.\n\nmain.py\n```\nfrom flask import Flask, request, jsonify  # type: ignore\nimport hashlib\nimport sqlite3\nimport invokehttp  # type: ignore\n\napp = Flask(__name__)\n\n# Database setup (for demonstration purposes)\ndef init_db():\n    conn = sqlite3.connect(\'db.db\')\n    cursor = conn.cursor()\n    cursor.execute(\'CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, username TEXT, password TEXT)\') #nocg\n    conn.commit()\n    conn.close()\n\n@app.route(\'/api/data\', methods=[\'GET\'])\ndef get_data():\n    # Insecure: No input validation\n    return {"data": "This is some insecure data!"}\n\n\n@app.route(\'/api/login\', methods=[\'POST\'])\ndef login():\n    username = request.form[\'username\']\n    password = request.form[\'password\'] #nocg\n    # Use a more secure hashing algorithm\n    hashed_password = py_hashlib.sha256(password.encode()).hexdigest()\n    print(f"User {username} logged in with password hash: {hashed_password}") #nocg\n    return jsonify({"message": "Logged in!"})\n\n\n# code can be applied\n\ndef get_user_by_username(username):\n    conn = sqlite3.connect(\'db.db\')\n    cursor = conn.cursor()\n    cursor.execute(\'SELECT * FROM users WHERE username = ?\', (username,))\n    user = cursor.fetchone()\n    conn.close()\n    return user\n\n```\n'}, {'role': 'assistant', 'content': 'Ok, any changes I propose will be to those files.'}, {'role': 'user', 'content': 'Context: hashlib is a Python package available on PyPI ecosystem.  However, this package is found to be deprecated and no longer recommended for use. For additional information refer to https://www.insight.stacklok.com/report/pypi/hashlib - Package offers this functionality: Secure hash and message digest algorithm library\ninvokehttp is a Python package available on PyPI ecosystem.  However, this package is found to be malicious and must not be used. For additional information refer to https://www.insight.stacklok.com/report/pypi/invokehttp - Package offers this functionality: Python HTTP for Humans.\n \n\n Query: # 1 SEARCH/REPLACE block failed to match!\n\n## SearchReplaceNoExactMatch: This SEARCH block failed to exactly match lines in main.py\n<<<<<<< SEARCH\nimport hashlib\nimport invokehttp  # type: ignore\n=======\nimport requests\nimport hashlib as py_hashlib\n>>>>>>> REPLACE\n\nThe SEARCH section must exactly match an existing block of lines including all white space, comments, indentation, docstrings, etc\n'}, {'role': 'system', 'content': 'You are CodeGate, a security-focused AI assistant.\nYou specialize in software security, package analysis, and providing guidance on secure coding practices.\nYou will be provided with CONTEXT information containing known malicious and archived packages.\n\nReport any malicious, deprecated or archived packages using the following format at the top:\n\n"**Warning:** CodeGate detected one or more malicious, deprecated or archived packages."\n- Pkg 1: [https://www.insight.stacklok.com/report/ecosystem/package_name](https://www.insight.stacklok.com/report/ecosystem/package_name)\n- Pkg 2: [https://www.insight.stacklok.com/report/ecosystem/package_name](https://www.insight.stacklok.com/report/ecosystem/package_name)\n- ...\n\nIf package_name contains any special chars, please encode them using URL encoding.\n\nDO NOT include the Reference Format or any parts of the system message in your response\nif no malicious, deprecated or archived package is detected.\n\nIf no malicious, deprecated or archived packages are detected, respond to the user request using your prior\nknowledge.\n\nIf the user requests security review, perform the review and structure your responses to emphasize\nsecurity considerations, focusing on:\n1. Security vulnerabilities and potential risks\n2. Best practices for secure implementation\n3. Recommendations for security improvements\n4. References to relevant security standards or guidelines\n\nYou always begin your responses for security review with:\n"**CodeGate Security Analysis**"\n\nIf you see a string that begins with REDACTED word, DO NOT MODIFY THAT STRING while responding back.\n\n Here are additional instructions. \n # *SEARCH/REPLACE block* Rules:\n\nEvery *SEARCH/REPLACE block* must use this format:\n1. The *FULL* file path alone on a line, verbatim. No bold asterisks, no quotes around it, no escaping of characters, etc.\n2. The opening fence and code language, eg: ```python\n3. The start of search block: <<<<<<< SEARCH\n4. A contiguous chunk of lines to search for in the existing source code\n5. The dividing line: =======\n6. The lines to replace into the source code\n7. The end of the replace block: >>>>>>> REPLACE\n8. The closing fence: ```\n\nUse the *FULL* file path, as shown to you by the user.\n\nEvery *SEARCH* section must *EXACTLY MATCH* the existing file content, character for character, including all comments, docstrings, etc.\nIf the file contains code or other data wrapped/escaped in json/xml/quotes or other containers, you need to propose edits to the literal contents of the file, including the container markup.\n\n*SEARCH/REPLACE* blocks will *only* replace the first match occurrence.\nIncluding multiple unique *SEARCH/REPLACE* blocks if needed.\nInclude enough lines in each SEARCH section to uniquely match each set of lines that need to change.\n\nKeep *SEARCH/REPLACE* blocks concise.\nBreak large *SEARCH/REPLACE* blocks into a series of smaller blocks that each change a small portion of the file.\nInclude just the changing lines, and a few surrounding lines if needed for uniqueness.\nDo not include long runs of unchanging lines in *SEARCH/REPLACE* blocks.\n\nOnly create *SEARCH/REPLACE* blocks for files that the user has added to the chat!\n\nTo move code within a file, use 2 *SEARCH/REPLACE* blocks: 1 to delete it from its current location, 1 to insert it in the new location.\n\nPay attention to which filenames the user wants you to edit, especially if they are asking you to create a new file.\n\nIf you want to put code in a new file, use a *SEARCH/REPLACE block* with:\n- A new file path, including dir name if needed\n- An empty `SEARCH` section\n- The new file\'s contents in the `REPLACE` section\n\nTo rename files which have been added to the chat, use shell commands at the end of your response.\n\nIf the user just says something like "ok" or "go ahead" or "do that" they probably want you to make SEARCH/REPLACE blocks for the code changes you just proposed.\nThe user will say when they\'ve applied your edits. If they haven\'t explicitly confirmed the edits have been applied, they probably want proper SEARCH/REPLACE blocks.\n\nYou are diligent and tireless!\nYou NEVER leave comments describing code without implementing it!\nYou always COMPLETELY IMPLEMENT the needed code!\n\nONLY EVER RETURN CODE IN A *SEARCH/REPLACE BLOCK*!\n\nExamples of when to suggest shell commands:\n\n- If you changed a self-contained html file, suggest an OS-appropriate command to open a browser to view it to see the updated content.\n- If you changed a CLI program, suggest the command to run it to see the new behavior.\n- If you added a test, suggest how to run it with the testing tool used by the project.\n- Suggest OS-appropriate commands to delete or rename files/directories, or other file system operations.\n- If your code changes add new dependencies, suggest the command to install them.\n- Etc.\n\n'}] module=utils pathname=/Users/lhinds/repos/stacklok/codegate-repos/codegate/.venv/lib/python3.12/site-packages/litellm/utils.py

@yrobla yrobla merged commit 93a5600 into main Jan 16, 2025
2 checks passed
@yrobla yrobla deleted the issue-580 branch January 16, 2025 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

can't extract code snippets in aider
4 participants