fix: critical security vulnerabilities in tools #740

MervinPraison · 2025-07-07T11:49:14Z

User description

Fixed command injection in shell_tools.py by forcing shell=False
Implemented sandboxed execution in python_tools.py with restricted builtins
Fixed SQL injection in duckdb_tools.py using parameterized queries
Added path traversal protection in file_tools.py
Added SSRF protection in spider_tools.py with URL validation

These changes address the critical security issues identified in issue #737

PR Type

Bug fix

Description

Fixed critical security vulnerabilities across multiple tool modules
Implemented SQL injection prevention with parameterized queries
Added path traversal protection with input validation
Enforced sandboxed Python execution with restricted builtins
Prevented SSRF attacks with URL validation
Disabled shell execution to prevent command injection

Changes diagram

flowchart LR
  A["Input Validation"] --> B["SQL Injection Prevention"]
  A --> C["Path Traversal Protection"]
  A --> D["SSRF Protection"]
  B --> E["Parameterized Queries"]
  C --> F["Path Normalization"]
  D --> G["URL Validation"]
  H["Execution Sandboxing"] --> I["Restricted Builtins"]
  H --> J["Shell Disabled"]

Changes walkthrough 📝

Relevant files

Bug fix

duckdb_tools.py `SQL injection prevention with parameterized queries` src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py Added `_validate_identifier()` method to prevent SQL injection Implemented parameterized queries for table existence checks Added input validation for column names and types Used safe identifiers with proper quoting throughout	+47/-16
file_tools.py `Path traversal protection for file operations` src/praisonai-agents/praisonaiagents/tools/file_tools.py Added `_validate_path()` method to prevent path traversal attacks Implemented path normalization and suspicious pattern detection Applied path validation to all file operations (read, write, copy, move, delete) Added checks for '..' and '~' patterns in file paths	+52/-10
python_tools.py `Sandboxed Python execution with restricted builtins` src/praisonai-agents/praisonaiagents/tools/python_tools.py Created restricted `safe_builtins` dictionary with limited functions Added security validation for dangerous code patterns Disabled dangerous functions like `__import__`, `eval`, `exec`, `open` Implemented pattern matching to block restricted operations	+84/-4
shell_tools.py `Command injection prevention by disabling shell` src/praisonai-agents/praisonaiagents/tools/shell_tools.py Removed `shell` parameter from `execute_command()` method Forced `shell=False` in subprocess execution for security Always use `shlex.split()` to properly parse commands Prevented shell injection by disabling shell execution	+8/-11
spider_tools.py `SSRF protection with comprehensive URL validation` src/praisonai-agents/praisonaiagents/tools/spider_tools.py Added `_validate_url()` method to prevent SSRF attacks Implemented URL scheme validation (only http/https allowed) Added IP address validation to block private/internal ranges Blocked localhost, metadata services, and internal domains	+55/-0

Need help?
Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
Check out the documentation for more information.

Summary by CodeRabbit

Security Improvements
- Enhanced SQL operations to validate and safely quote identifiers, reducing SQL injection risks.
- Added robust file path validation to prevent unauthorized file system access.
- Strengthened Python code execution by restricting built-in functions and blocking dangerous code patterns.
- Improved shell command execution by removing shell access and enforcing argument parsing for safer operations.
- Introduced comprehensive URL validation to block potentially unsafe or internal network requests during web scraping.

- Fixed command injection in shell_tools.py by forcing shell=False - Implemented sandboxed execution in python_tools.py with restricted builtins - Fixed SQL injection in duckdb_tools.py using parameterized queries - Added path traversal protection in file_tools.py - Added SSRF protection in spider_tools.py with URL validation These changes address the critical security issues identified in issue #737 Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

coderabbitai · 2025-07-07T11:49:20Z

Walkthrough

Security enhancements were made across multiple utility modules. SQL identifier validation and parameterized queries were added to DuckDB tools. File operations now validate paths to prevent traversal. Python code execution restricts builtins and scans for dangerous patterns. Shell command execution enforces no shell usage. Web scraping now validates URLs to block SSRF vectors.

Changes

File(s)	Change Summary
.../tools/duckdb_tools.py	Added `_validate_identifier` for SQL identifier validation; parameterized queries and validation in `load_csv`.
.../tools/file_tools.py	Introduced `_validate_path` for path normalization and traversal prevention; applied to all file operations.
.../tools/python_tools.py	Enhanced `execute_code` with restricted builtins and code scanning for dangerous patterns.
.../tools/shell_tools.py	Removed `shell` parameter; always splits command and runs with `shell=False`.
.../tools/spider_tools.py	Added `_validate_url` for SSRF protection; `scrape_page` now validates URLs before scraping.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DuckDBTools

    User->>DuckDBTools: load_csv(table_name, ...)
    DuckDBTools->>DuckDBTools: _validate_identifier(table_name)
    DuckDBTools->>DuckDBTools: Validate/quote identifiers
    DuckDBTools->>DuckDBTools: Use parameterized queries
    DuckDBTools-->>User: Result

sequenceDiagram
    participant User
    participant FileTools

    User->>FileTools: read_file(filepath)
    FileTools->>FileTools: _validate_path(filepath)
    FileTools->>FileTools: Normalize and check path
    FileTools->>FileTools: Perform file operation
    FileTools-->>User: File content or error

sequenceDiagram
    participant User
    participant PythonTools

    User->>PythonTools: execute_code(code)
    PythonTools->>PythonTools: Scan code for dangerous patterns
    alt Dangerous pattern found
        PythonTools-->>User: Security error
    else
        PythonTools->>PythonTools: Setup safe_builtins
        PythonTools->>PythonTools: Execute code in safe context
        PythonTools-->>User: Execution result
    end

sequenceDiagram
    participant User
    participant ShellTools

    User->>ShellTools: execute_command(command)
    ShellTools->>ShellTools: shlex.split(command)
    ShellTools->>ShellTools: subprocess.run(..., shell=False)
    ShellTools-->>User: Command output

sequenceDiagram
    participant User
    participant SpiderTools

    User->>SpiderTools: scrape_page(url)
    SpiderTools->>SpiderTools: _validate_url(url)
    alt URL invalid
        SpiderTools-->>User: Error (invalid/potentially dangerous URL)
    else
        SpiderTools->>SpiderTools: Perform HTTP request and scrape
        SpiderTools-->>User: Scraped data
    end

Poem

🐇
Security’s the word, we rabbits decree,
Paths are now safe, and SQL’s squeaky clean.
Shells are forbidden, our code’s on a leash,
URLs inspected before we unleash.
With every hop, we’re harder to breach—
A warren of safety, in code we teach!

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

gemini-code-assist

Summary of Changes

Hello @MervinPraison, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on addressing a suite of critical security vulnerabilities across various utility modules. The changes harden the system against common attack vectors such as SQL injection, path traversal, command injection, and SSRF, while also implementing a more secure sandboxed environment for Python code execution. The primary goal is to significantly improve the overall security posture of the application's toolset.

Highlights

SQL Injection Prevention: Implemented robust SQL injection prevention in duckdb_tools.py by introducing a _validate_identifier method for sanitizing table and column names, and by utilizing parameterized queries for all data and identifier interactions.
Path Traversal Protection: Added comprehensive path traversal protection in file_tools.py by introducing a _validate_path static method. This method normalizes and validates all file and directory paths before any file system operations are performed, preventing access to unauthorized locations.
Sandboxed Python Execution: Enhanced python_tools.py to provide a more secure sandboxed execution environment. This was achieved by explicitly defining a safe_builtins dictionary to restrict dangerous built-in functions and by adding a pre-execution check to scan the input code for known dangerous patterns.
Command Injection Mitigation: Addressed command injection vulnerabilities in shell_tools.py by removing the shell parameter from the execute_command function and explicitly setting shell=False in all subprocess.Popen calls. Commands are now always safely split using shlex.
SSRF Protection: Introduced Server-Side Request Forgery (SSRF) protection in spider_tools.py with the addition of a _validate_url method. This method rigorously checks URLs for safe schemes, valid hostnames, and blocks access to local, private, and suspicious internal network addresses before making any HTTP requests.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

qodo-merge-pro · 2025-07-07T11:50:28Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis ✅ 737 - PR Code Verified Compliant requirements: • Fixed SQL injection vulnerabilities in duckdb_tools.py • Implemented path traversal protection in file_tools.py • Added sandboxed execution in python_tools.py • Prevented command injection in shell_tools.py • Added SSRF protection in spider_tools.py Requires further human verification: • Comprehensive security testing of all implemented protections • Validation that all attack vectors are properly blocked • Performance impact assessment of security measures
⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 Security concerns Incomplete security implementations: While the PR addresses major security vulnerabilities, several implementations have potential weaknesses. The SQL injection prevention uses sqlite_master which may not work correctly with DuckDB. The Python sandbox can potentially be bypassed through object introspection or encoded strings. The path traversal protection lacks proper boundary enforcement and could be circumvented with symbolic links or other filesystem features. The SSRF protection doesn't handle all edge cases like IPv6 addresses or domain redirects.
⚡ Recommended focus areas for review SQL Injection Risk The parameterized query implementation uses sqlite_master table which may not exist in DuckDB. The table existence check and column type validation regex may have edge cases that could be exploited. exists = conn.execute(""" SELECT name FROM sqlite_master WHERE type='table' AND name=? """, [table_name]).fetchone() is not None Sandbox Bypass The dangerous pattern matching uses simple string matching which could be bypassed with obfuscation techniques. The safe_builtins approach may still allow access to dangerous functionality through object introspection. dangerous_patterns = [ '__import__', 'import ', 'from ', 'exec', 'eval', 'compile', 'open(', 'file(', 'input(', 'raw_input', '__subclasses__', '__bases__', '__globals__', '__code__', '__class__', 'globals(', 'locals(', 'vars(' ] code_lower = code.lower() for pattern in dangerous_patterns: if pattern.lower() in code_lower: return { 'result': None, 'stdout': '', 'stderr': f'Security Error: Code contains restricted pattern: {pattern}', 'success': False } Path Validation The path validation only checks for basic patterns but doesn't enforce boundaries or validate against a whitelist of allowed directories. The normpath and abspath operations may not prevent all traversal attacks. if '..' in filepath or filepath.startswith('~'): raise ValueError(f"Suspicious path pattern detected: {filepath}")

qodo-merge-pro · 2025-07-07T11:51:14Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Fix incorrect database syntax The query uses SQLite syntax (`sqlite_master`) instead of DuckDB syntax. DuckDB uses `information_schema.tables` for table existence checks. This will cause runtime errors when executing against a DuckDB database. src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py [148-152] # Check if table exists using parameterized query exists = conn.execute(""" - SELECT name FROM sqlite_master - WHERE type='table' AND name=? + SELECT table_name FROM information_schema.tables + WHERE table_name = ? """, [table_name]).fetchone() is not None Apply / Chat Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies that the PR uses SQLite syntax (`sqlite_master`) in a DuckDB tool, which would cause a runtime error, and provides the correct standard SQL syntax.	High
Security	Use AST for security validation The pattern matching is too simplistic and can be bypassed with techniques like string concatenation, getattr, or whitespace variations. Consider using AST parsing for more robust code analysis. src/praisonai-agents/praisonaiagents/tools/python_tools.py [120-135] -dangerous_patterns = [ - '__import__', 'import ', 'from ', 'exec', 'eval', - 'compile', 'open(', 'file(', 'input(', 'raw_input', - '__subclasses__', '__bases__', '__globals__', '__code__', - '__class__', 'globals(', 'locals(', 'vars(' -] +import ast -code_lower = code.lower() -for pattern in dangerous_patterns: - if pattern.lower() in code_lower: +try: + tree = ast.parse(code) + for node in ast.walk(tree): + if isinstance(node, ast.Import) or isinstance(node, ast.ImportFrom): + return {'result': None, 'stdout': '', 'stderr': 'Security Error: Import statements not allowed', 'success': False} + if isinstance(node, ast.Call) and isinstance(node.func, ast.Name): + if node.func.id in ['eval', 'exec', 'compile', 'open', '__import__']: + return {'result': None, 'stdout': '', 'stderr': f'Security Error: Function {node.func.id} not allowed', 'success': False} +except SyntaxError: + return {'result': None, 'stdout': '', 'stderr': 'Security Error: Invalid Python syntax', 'success': False} `[To ensure code accuracy, apply this suggestion manually]` Suggestion importance[1-10]: 8 __ Why: The suggestion correctly points out that simple string matching for security is brittle and proposes a much more robust solution using Abstract Syntax Tree (AST) parsing to prevent bypasses.	Medium
More

cursor

Bug: Security Check Overly Restrictive, Easily Bypassed

The new security check in execute_code uses simple substring matching for dangerous patterns, making it both overly restrictive and easily bypassable. It causes false positives by rejecting legitimate code containing patterns like 'import ', 'from ', or 'open(' in comments, strings, or variable names (e.g., "important_data", "reopen(file)"). This exact string matching is also easily circumvented by using alternative whitespace (e.g., import\tos) or formatting, rendering the security ineffective against determined attackers.

src/praisonai-agents/praisonaiagents/tools/python_tools.py#L119-L135

PraisonAI/src/praisonai-agents/praisonaiagents/tools/python_tools.py

Lines 119 to 135 in abe4d6c

    
           # Security check: validate code doesn't contain dangerous patterns 
        
           dangerous_patterns = [ 
        
               '__import__', 'import ', 'from ', 'exec', 'eval',  
        
               'compile', 'open(', 'file(', 'input(', 'raw_input', 
        
               '__subclasses__', '__bases__', '__globals__', '__code__', 
        
               '__class__', 'globals(', 'locals(', 'vars(' 
        
           ] 
        
           code_lower = code.lower() 
        
           for pattern in dangerous_patterns: 
        
               if pattern.lower() in code_lower: 
        
                   return { 
        
                       'result': None, 
        
                       'stdout': '', 
        
                       'stderr': f'Security Error: Code contains restricted pattern: {pattern}', 
        
                       'success': False 
        
                   }

Fix in Cursor • Fix in Web

Bug: Path Validation Flaws Enable Traversal

The _validate_path method, intended for path traversal prevention, has two flaws:

The if '..' in filepath check is overly broad, incorrectly rejecting legitimate filenames containing '..' (e.g., "file..txt") instead of specifically checking for path traversal sequences like ../ or ..\\.
It only validates the original input path for suspicious patterns. While it converts the path to an absolute path, it critically fails to validate that this resolved absolute path remains within an expected safe directory. This allows an attacker to bypass the .. check by providing an absolute path (e.g., /etc/passwd), leading to a path traversal vulnerability.

src/praisonai-agents/praisonaiagents/tools/file_tools.py#L24-L48

PraisonAI/src/praisonai-agents/praisonaiagents/tools/file_tools.py

Lines 24 to 48 in abe4d6c

    
               @staticmethod 
        
               def _validate_path(filepath: str) -> str: 
        
                   """ 
        
                   Validate and normalize a file path to prevent path traversal attacks. 
        
                   Args: 
        
                       filepath: Path to validate 
        
                   Returns: 
        
                       str: Normalized absolute path 
        
                   Raises: 
        
                       ValueError: If path contains suspicious patterns 
        
                   """ 
        
                   # Normalize the path 
        
                   normalized = os.path.normpath(filepath) 
        
                   absolute = os.path.abspath(normalized) 
        
                   # Check for suspicious patterns 
        
                   if '..' in filepath or filepath.startswith('~'): 
        
                       raise ValueError(f"Suspicious path pattern detected: {filepath}") 
        
                   # Additional check: ensure the resolved path doesn't escape expected boundaries 
        
                   # This is a basic check - in production, you'd want to define allowed directories 
        
                   return absolute

Fix in Cursor • Fix in Web

BugBot free trial expires on July 22, 2025
You have used $0.00 of your $50.00 spend limit so far. Manage your spend limit in the Cursor dashboard.

Was this report helpful? Give feedback by reacting with 👍 or 👎

gemini-code-assist

Code Review

This pull request introduces security fixes addressing command injection, SQL injection, path traversal, SSRF, and unsafe code execution vulnerabilities. The path traversal fix in file_tools.py still allows access to arbitrary files via absolute paths. The SSRF protection in spider_tools.py is vulnerable to bypass via HTTP redirects. The SQL type validation in duckdb_tools.py is a bit too strict and might reject valid types.

gemini-code-assist · 2025-07-07T11:52:09Z

src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py

+                        if not re.match(r'^[A-Z][A-Z0-9_]*(\([0-9,]+\))?$', col_type.upper()):
+                            raise ValueError(f"Invalid column type: {col_type}")


The regex for validating column types is too restrictive. It doesn't allow for spaces within the parentheses, which is common for types like DECIMAL(10, 2). This will cause a ValueError for valid schema definitions.

The regex should be updated to accommodate optional whitespace.

if not re.match(r'^[A-Z][A-Z0-9_]*($[0-9,\s]+$)?$', col_type.upper()): raise ValueError(f"Invalid column type: {col_type}")

gemini-code-assist · 2025-07-07T11:52:10Z

src/praisonai-agents/praisonaiagents/tools/file_tools.py

+        # Normalize the path
+        normalized = os.path.normpath(filepath)
+        absolute = os.path.abspath(normalized)
+
+        # Check for suspicious patterns
+        if '..' in filepath or filepath.startswith('~'):
+            raise ValueError(f"Suspicious path pattern detected: {filepath}")
+
+        # Additional check: ensure the resolved path doesn't escape expected boundaries
+        # This is a basic check - in production, you'd want to define allowed directories
+        return absolute


The current path validation logic in _validate_path is an improvement but is not sufficient to prevent all path traversal attacks. An attacker can still access any file on the filesystem by providing an absolute path (e.g., /etc/passwd), as the check only looks for .. and ~ in the path string.

A more robust approach is to define a safe base directory and ensure that the resolved absolute path of any file operation is within that directory. This would prevent the tool from accessing unintended files outside its designated workspace.

agent_workspace = os.path.abspath(os.getcwd()) if os.path.isabs(filepath): raise ValueError(f"Absolute paths are not permitted: {filepath}") full_path = os.path.normpath(os.path.join(agent_workspace, filepath)) if not full_path.startswith(agent_workspace): raise ValueError(f"Path traversal attempt detected: {filepath}") return full_path

gemini-code-assist · 2025-07-07T11:52:10Z

src/praisonai-agents/praisonaiagents/tools/spider_tools.py

+            # Validate URL to prevent SSRF
+            if not self._validate_url(url):
+                return {"error": f"Invalid or potentially dangerous URL: {url}"}


The URL validation is a great step towards preventing Server-Side Request Forgery (SSRF). However, it can be bypassed if the initial, valid URL returns an HTTP redirect to a malicious or internal URL. The requests library follows redirects by default, and the validation is not performed on subsequent URLs in a redirect chain.

To fix this, you should disable automatic redirects in your session.get() call (allow_redirects=False) and handle them manually in a loop, validating each new URL before making the next request. You should also limit the number of redirects to prevent infinite loops.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (2)

src/praisonai-agents/praisonaiagents/tools/spider_tools.py (1)
73-76: Simplify the return statement.

The static analysis tool correctly identified that this can be simplified by returning the negated condition directly.

Apply this simplification:
-            # Block metadata service endpoints
-            if hostname in ['169.254.169.254', 'metadata.google.internal']:
-                return False
-            
-            return True
+            # Block metadata service endpoints
+            return hostname not in ['169.254.169.254', 'metadata.google.internal']
src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py (1)
173-174: Column type validation may be too restrictive.

The regex pattern for column types is very strict and might reject valid DuckDB types like DECIMAL(10,2) or VARCHAR(255).

Consider a more flexible pattern:
-                        if not re.match(r'^[A-Z][A-Z0-9_]*($[0-9,]+$)?$', col_type.upper()):
+                        if not re.match(r'^[A-Z][A-Z0-9_]*($[0-9,\s]+$)?(\s+[A-Z]+)*$', col_type.upper()):
This allows for spaces in precision/scale and additional type modifiers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48fae95 and abe4d6c.

📒 Files selected for processing (5)

src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py (3 hunks)
src/praisonai-agents/praisonaiagents/tools/file_tools.py (8 hunks)
src/praisonai-agents/praisonaiagents/tools/python_tools.py (1 hunks)
src/praisonai-agents/praisonaiagents/tools/shell_tools.py (2 hunks)
src/praisonai-agents/praisonaiagents/tools/spider_tools.py (2 hunks)

🧰 Additional context used

🪛 Ruff (0.11.9)

src/praisonai-agents/praisonaiagents/tools/spider_tools.py

73-76: Return the negated condition directly

Inline condition

(SIM103)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Cursor BugBot
GitHub Check: Run tests and collect coverage
GitHub Check: test-core (3.11)
GitHub Check: quick-test

🔇 Additional comments (9)

src/praisonai-agents/praisonaiagents/tools/shell_tools.py (2)

59-66: LGTM! Excellent security improvement.

The platform-specific command splitting using shlex.split with appropriate posix flags is well-implemented. This correctly handles quote parsing differences between Windows and Unix-like systems while preventing command injection.

79-79: Security enhancement implemented correctly.

Hardcoding shell=False effectively prevents command injection vulnerabilities by ensuring commands are never executed through a shell interpreter.

src/praisonai-agents/praisonaiagents/tools/file_tools.py (1)

63-65: Path validation integration looks good.

The integration of path validation into file operations is correctly implemented and will effectively prevent path traversal attacks when the validation logic is fixed.

src/praisonai-agents/praisonaiagents/tools/spider_tools.py (2)

30-79: Excellent SSRF protection implementation.

The URL validation covers comprehensive attack vectors including private IP ranges, localhost variants, metadata services, and internal domains. This significantly enhances security against SSRF attacks.

124-127: URL validation integration is secure.

The early validation check in scrape_page effectively prevents SSRF attacks by blocking dangerous URLs before any network requests.
src/praisonai-agents/praisonaiagents/tools/python_tools.py (2)

52-107: Sandboxing implementation provides good baseline security.

The restricted builtins dictionary effectively limits available functions and disables dangerous operations. The explicit None assignments for dangerous functions prevent their use.

153-157: Security concern: eval() usage in sandboxed environment.

The use of eval() for extracting expression values could potentially be exploited, even with restricted builtins.

Consider replacing this with safer alternatives or additional validation:
-                        result = eval(
-                            compile(ast.Expression(tree.body[-1].value), '<string>', 'eval'),
-                            globals_dict,
-                            locals_dict
-                        )
+                        # Safer approach: get the value from locals after execution
+                        if hasattr(tree.body[-1].value, 'id'):
+                            var_name = tree.body[-1].value.id
+                            result = locals_dict.get(var_name)
+                        else:
+                            result = None
src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py (2)

34-51: SQL identifier validation implementation is secure.

The regex pattern correctly validates identifier format and the quoting approach safely handles reserved words and special characters.

186-191: Parameterized queries effectively prevent SQL injection.

The use of parameterized queries for file paths and validated identifiers for table names provides robust protection against SQL injection attacks.

coderabbitai · 2025-07-07T11:53:08Z

src/praisonai-agents/praisonaiagents/tools/file_tools.py

+        # Additional check: ensure the resolved path doesn't escape expected boundaries
+        # This is a basic check - in production, you'd want to define allowed directories
+        return absolute


🛠️ Refactor suggestion

Enhance boundary validation for production use.

The current implementation lacks enforcement of allowed directory boundaries, which is critical for preventing unauthorized file access.

Consider implementing a more robust boundary check:

+ # Define allowed base directories (configure as needed) + allowed_bases = [os.getcwd(), '/tmp', '/var/tmp'] # Example + # Additional check: ensure the resolved path doesn't escape expected boundaries - # This is a basic check - in production, you'd want to define allowed directories + if not any(absolute.startswith(base) for base in allowed_bases): + raise ValueError(f"Path outside allowed directories: {filepath}") + return absolute

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Additional check: ensure the resolved path doesn't escape expected boundaries

# This is a basic check - in production, you'd want to define allowed directories

return absolute

# Define allowed base directories (configure as needed)

allowed_bases = [os.getcwd(), '/tmp', '/var/tmp'] # Example

# Additional check: ensure the resolved path doesn't escape expected boundaries

if not any(absolute.startswith(base) for base in allowed_bases):

raise ValueError(f"Path outside allowed directories: {filepath}")

return absolute

🤖 Prompt for AI Agents

In src/praisonai-agents/praisonaiagents/tools/file_tools.py around lines 46 to 48, the current code returns the absolute path without verifying if it resides within allowed directories. To fix this, implement a check that compares the resolved absolute path against a predefined list of allowed base directories and only return the path if it is within these boundaries; otherwise, raise an exception or handle the error to prevent unauthorized file access.

coderabbitai · 2025-07-07T11:53:08Z

src/praisonai-agents/praisonaiagents/tools/file_tools.py

+        # Check for suspicious patterns
+        if '..' in filepath or filepath.startswith('~'):
+            raise ValueError(f"Suspicious path pattern detected: {filepath}")


⚠️ Potential issue

Fix path validation logic to prevent bypass.

The suspicious pattern check is performed on the original filepath before normalization, which could miss path traversal attempts that become apparent only after normalization.

Apply this fix to check patterns after normalization:

- # Check for suspicious patterns - if '..' in filepath or filepath.startswith('~'): - raise ValueError(f"Suspicious path pattern detected: {filepath}") + # Check for suspicious patterns in normalized path + if '..' in normalized or '~' in normalized: + raise ValueError(f"Suspicious path pattern detected: {filepath}")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Check for suspicious patterns

if '..' in filepath or filepath.startswith('~'):

raise ValueError(f"Suspicious path pattern detected: {filepath}")

# Check for suspicious patterns in normalized path

if '..' in normalized or '~' in normalized:

raise ValueError(f"Suspicious path pattern detected: {filepath}")

🤖 Prompt for AI Agents

In src/praisonai-agents/praisonaiagents/tools/file_tools.py around lines 42 to 44, the path validation checks for suspicious patterns on the original filepath before normalization, which can be bypassed. To fix this, normalize the filepath first and then perform the checks for '..' or '~' patterns on the normalized path to ensure path traversal attempts are detected reliably.

coderabbitai · 2025-07-07T11:53:08Z

src/praisonai-agents/praisonaiagents/tools/python_tools.py

+            dangerous_patterns = [
+                '__import__', 'import ', 'from ', 'exec', 'eval', 
+                'compile', 'open(', 'file(', 'input(', 'raw_input',
+                '__subclasses__', '__bases__', '__globals__', '__code__',
+                '__class__', 'globals(', 'locals(', 'vars('
+            ]
+
+            code_lower = code.lower()
+            for pattern in dangerous_patterns:
+                if pattern.lower() in code_lower:
+                    return {
+                        'result': None,
+                        'stdout': '',
+                        'stderr': f'Security Error: Code contains restricted pattern: {pattern}',
+                        'success': False
+                    }


🛠️ Refactor suggestion

Improve dangerous pattern detection accuracy.

The current pattern detection has some issues that could lead to false positives or bypasses:

Mixed case sensitivity handling

Overly broad patterns like 'from ' that could block legitimate code

Missing some potential bypass patterns

Consider this more precise pattern detection:

- dangerous_patterns = [ - '__import__', 'import ', 'from ', 'exec', 'eval', - 'compile', 'open(', 'file(', 'input(', 'raw_input', - '__subclasses__', '__bases__', '__globals__', '__code__', - '__class__', 'globals(', 'locals(', 'vars(' - ] - - code_lower = code.lower() - for pattern in dangerous_patterns: - if pattern.lower() in code_lower: + # Use more precise regex patterns to reduce false positives + dangerous_patterns = [ + r'\b__import__\b', + r'\bimport\s+\w+', + r'\bfrom\s+\w+\s+import\b', + r'\bexec\s*\(', + r'\beval\s*\(', + r'\bcompile\s*\(', + r'\bopen\s*\(', + r'\bglobals\s*\(', + r'\blocals\s*\(', + r'__\w+__' # Dunder attributes + ] + + import re + for pattern in dangerous_patterns: + if re.search(pattern, code, re.IGNORECASE):

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

dangerous_patterns = [

'__import__', 'import ', 'from ', 'exec', 'eval',

'compile', 'open(', 'file(', 'input(', 'raw_input',

'__subclasses__', '__bases__', '__globals__', '__code__',

'__class__', 'globals(', 'locals(', 'vars('

]

code_lower = code.lower()

for pattern in dangerous_patterns:

if pattern.lower() in code_lower:

return {

'result': None,

'stdout': '',

'stderr': f'Security Error: Code contains restricted pattern: {pattern}',

'success': False

}

# Use more precise regex patterns to reduce false positives

dangerous_patterns = [

r'\b__import__\b',

r'\bimport\s+\w+',

r'\bfrom\s+\w+\s+import\b',

r'\bexec\s*\(',

r'\beval\s*\(',

r'\bcompile\s*\(',

r'\bopen\s*\(',

r'\bglobals\s*\(',

r'\blocals\s*\(',

r'__\w+__' # Dunder attributes

]

import re

for pattern in dangerous_patterns:

if re.search(pattern, code, re.IGNORECASE):

return {

'result': None,

'stdout': '',

'stderr': f'Security Error: Code contains restricted pattern: {pattern}',

'success': False

}

🤖 Prompt for AI Agents

In src/praisonai-agents/praisonaiagents/tools/python_tools.py around lines 120 to 135, improve the dangerous pattern detection by making it case-sensitive and more precise to avoid false positives and bypasses. Replace broad patterns like 'from ' with more specific ones, and add additional known bypass patterns. Use exact substring matching or regex with word boundaries instead of simple lowercase containment checks to increase accuracy.

coderabbitai · 2025-07-07T11:53:09Z

src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py

+            exists = conn.execute("""
                SELECT name FROM sqlite_master 
-                WHERE type='table' AND name='{table_name}'
-            """).fetchone() is not None
+                WHERE type='table' AND name=?
+            """, [table_name]).fetchone() is not None


⚠️ Potential issue

Fix table existence check for DuckDB.

The query uses sqlite_master which is SQLite-specific, but this is a DuckDB tool. DuckDB uses different system tables.

Apply this fix for DuckDB compatibility:

- exists = conn.execute(""" - SELECT name FROM sqlite_master - WHERE type='table' AND name=? - """, [table_name]).fetchone() is not None + exists = conn.execute(""" + SELECT table_name FROM information_schema.tables + WHERE table_name=? + """, [table_name]).fetchone() is not None

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

exists = conn.execute("""

SELECT name FROM sqlite_master

WHERE type='table' AND name='{table_name}'

""").fetchone() is not None

WHERE type='table' AND name=?

""", [table_name]).fetchone() is not None

exists = conn.execute("""

SELECT table_name FROM information_schema.tables

WHERE table_name=?

""", [table_name]).fetchone() is not None

🤖 Prompt for AI Agents

In src/praisonai-agents/praisonaiagents/tools/duckdb_tools.py around lines 149 to 152, the table existence check uses a SQLite-specific query on sqlite_master, which is incompatible with DuckDB. Replace this query with a DuckDB-compatible one by querying the information_schema.tables or the DuckDB system catalog to check if the table exists. Adjust the SQL to use DuckDB's metadata tables and ensure the parameterized query checks for the table name correctly.

gemini-code-assist bot reviewed Jul 7, 2025

View reviewed changes

qodo-merge-pro bot added Review effort 4/5 Possible security concern labels Jul 7, 2025

MervinPraison merged commit d7fcaf8 into main Jul 7, 2025
10 of 11 checks passed

cursor bot reviewed Jul 7, 2025

View reviewed changes

gemini-code-assist bot reviewed Jul 7, 2025

View reviewed changes

coderabbitai bot reviewed Jul 7, 2025

View reviewed changes

	# Security check: validate code doesn't contain dangerous patterns
	dangerous_patterns = [
	'__import__', 'import ', 'from ', 'exec', 'eval',
	'compile', 'open(', 'file(', 'input(', 'raw_input',
	'__subclasses__', '__bases__', '__globals__', '__code__',
	'__class__', 'globals(', 'locals(', 'vars('
	]

	code_lower = code.lower()
	for pattern in dangerous_patterns:
	if pattern.lower() in code_lower:
	return {
	'result': None,
	'stdout': '',
	'stderr': f'Security Error: Code contains restricted pattern: {pattern}',
	'success': False
	}

	@staticmethod
	def _validate_path(filepath: str) -> str:
	"""
	Validate and normalize a file path to prevent path traversal attacks.

	Args:
	filepath: Path to validate

	Returns:
	str: Normalized absolute path

	Raises:
	ValueError: If path contains suspicious patterns
	"""
	# Normalize the path
	normalized = os.path.normpath(filepath)
	absolute = os.path.abspath(normalized)

	# Check for suspicious patterns
	if '..' in filepath or filepath.startswith('~'):
	raise ValueError(f"Suspicious path pattern detected: {filepath}")

	# Additional check: ensure the resolved path doesn't escape expected boundaries
	# This is a basic check - in production, you'd want to define allowed directories
	return absolute

		if not re.match(r'^[A-Z][A-Z0-9_]*(\([0-9,]+\))?$', col_type.upper()):
		raise ValueError(f"Invalid column type: {col_type}")

Uh oh!

fix: critical security vulnerabilities in tools #740

fix: critical security vulnerabilities in tools #740

Uh oh!

Conversation

MervinPraison commented Jul 7, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

PR Type

Description

Changes diagram

Changes walkthrough 📝

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jul 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Footnotes

Uh oh!

qodo-merge-pro bot commented Jul 7, 2025

PR Reviewer Guide 🔍

Uh oh!

Uh oh!

qodo-merge-pro bot commented Jul 7, 2025

PR Code Suggestions ✨

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Bug: Security Check Overly Restrictive, Easily Bypassed

Bug: Path Validation Flaws Enable Traversal

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 7, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

MervinPraison commented Jul 7, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jul 7, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)