fix: Mock html2text in content truncation test #28
Conversation
- Add @patch for html2text.HTML2Text in truncation test
- Mock handle() to return large content for proper testing
- Fixes test failure on CI after dependabot merge

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
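A hedged sketch of what this kind of patch-based truncation test can look like; the `@patch` target path, the stand-in `truncate()` helper, and the `MAX_CONTENT_LENGTH` value below are illustrative assumptions, not the repository's actual identifiers.

```python
from unittest.mock import MagicMock, patch

MAX_CONTENT_LENGTH = 100_000  # assumed cap used by the scraper
SUFFIX = "... [content truncated]"


def truncate(markdown: str) -> str:
    """Stand-in for the scraper's truncation step."""
    if len(markdown) > MAX_CONTENT_LENGTH:
        return markdown[:MAX_CONTENT_LENGTH] + SUFFIX
    return markdown


@patch("html2text.HTML2Text")  # a real test would patch where the scraper imports it
def test_truncates_large_converted_content(mock_html2text_cls: MagicMock) -> None:
    # Mock handle() so the "converted" markdown is far larger than the cap
    converter = mock_html2text_cls.return_value
    converter.handle.return_value = "x" * (MAX_CONTENT_LENGTH * 2)

    markdown = converter.handle("<html><body>big page</body></html>")

    assert truncate(markdown).endswith(SUFFIX)
    assert len(truncate(markdown)) <= MAX_CONTENT_LENGTH + len(SUFFIX)
```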
Walkthrough

Refactors and helper extraction across scraping, search, streaming, health, and client code; adds numerous test factories/builders/mocks and updates unit tests to use them; small doc updates. Changes introduce many private helpers, centralize logic, and standardize test fixtures and SSE message formatting.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant ScraperAPI as scrape_search_result
participant HTTP as _fetch_content (requests)
participant Readability as _convert_with_readability
participant HTML2Text as _configure_html2text / html2text.HTML2Text
participant Fallback as _convert_fallback
participant Trunc as _truncate_if_needed
ScraperAPI->>HTTP: GET url with browser headers
alt binary/pdf
HTTP-->>ScraperAPI: binary content
ScraperAPI->>ScraperAPI: _handle_binary_content -> markdown/pdf metadata
else plaintext
HTTP-->>ScraperAPI: text/plain
ScraperAPI->>ScraperAPI: _handle_plain_text -> markdown
else html
HTTP-->>ScraperAPI: html content
ScraperAPI->>Readability: try extract main content
alt Readability success
Readability-->>ScraperAPI: extracted HTML
ScraperAPI->>HTML2Text: _configure_html2text + convert
HTML2Text-->>ScraperAPI: markdown
else Readability fail
Readability-->>ScraperAPI: error
ScraperAPI->>Fallback: _convert_fallback (title + html2text)
Fallback-->>ScraperAPI: markdown
end
end
ScraperAPI->>Trunc: _truncate_if_needed(markdown)
Trunc-->>ScraperAPI: final markdown + metadata / error
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Poem
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
- Create MockSerperResponse and SerperResponseFactory for Serper tests
- Create MockDDGS, MockDDGSClass and DDGSFactory for DuckDuckGo tests
- Remove all raw MagicMock property assignments (mock.json.return_value = ...)
- Use factory methods instead: SerperResponseFactory.two_results()
- Follows builder/factory pattern: one class per file, no raw mocks

All 44 tests pass locally.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
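To make the "no raw mocks" rule concrete, here is a hedged before/after sketch; the factory internals and the Serper payload shape ("organic" results) are assumptions rather than the exact contents of docker/tests/factories.

```python
from unittest.mock import MagicMock

# Before: a raw MagicMock configured inline in the test
raw_mock = MagicMock()
raw_mock.json.return_value = {"organic": [{"title": "First", "link": "https://example.com/1"}]}


# After: a typed mock produced by a named factory method
class MockSerperResponse:
    def __init__(self, payload: dict, status_code: int = 200) -> None:
        self._payload = payload
        self.status_code = status_code

    def json(self) -> dict:
        return self._payload


class SerperResponseFactory:
    @staticmethod
    def two_results() -> MockSerperResponse:
        return MockSerperResponse(
            {
                "organic": [
                    {"title": "First", "link": "https://example.com/1"},
                    {"title": "Second", "link": "https://example.com/2"},
                ]
            }
        )


typed_mock = SerperResponseFactory.two_results()
assert len(typed_mock.json()["organic"]) == 2
```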
- Create APISearchResultBuilder with fluent interface
- Add a_serper_result() and a_duckduckgo_result() pre-configured builders
- Replace inline APISearchResult() creation in test_search_service.py
- All tests still passing

Remaining inline creations in:
- tests/unit/services/test_search_processor.py (3 instances)
- tests/unit/tools/test_search_tool.py (2 instances)
- tests/unit/models/test_models.py (1 instance for testing model itself)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…tion

- Replace all inline APISearchResult() calls with builder pattern
- Updated test_search_processor.py (3 instances)
- Updated test_search_tool.py (2 instances)
- Now using an_api_search_result().with_*().build() everywhere

All inline object creation eliminated except test_models.py (acceptable - testing the model itself). No more raw mocks or inline test data creation. Full builder/factory pattern compliance.

All 44 tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
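A hedged sketch of the fluent builder these commits describe; the dataclass stand-in and its default field values are assumptions, not the real APISearchResult model.

```python
from dataclasses import dataclass


@dataclass
class APISearchResult:
    title: str = ""
    link: str = ""
    snippet: str = ""


class APISearchResultBuilder:
    """Fluent builder so tests read as intent rather than field soup."""

    def __init__(self) -> None:
        self._result = APISearchResult(title="Example", link="https://example.com", snippet="...")

    def with_title(self, title: str) -> "APISearchResultBuilder":
        self._result.title = title
        return self

    def with_link(self, link: str) -> "APISearchResultBuilder":
        self._result.link = link
        return self

    def build(self) -> APISearchResult:
        return self._result


def an_api_search_result() -> APISearchResultBuilder:
    return APISearchResultBuilder()


# Usage in a test
result = an_api_search_result().with_title("Docs").with_link("https://docs.example.com").build()
assert result.link == "https://docs.example.com"
```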
- Split mock_ddgs.py (3 classes) into 3 separate files:
  - mock_ddgs_search_client.py (MockDDGS)
  - mock_ddgs_context_manager.py (MockDDGSContextManager)
  - mock_ddgs_class.py (MockDDGSClass)
- Extract _convert_organic_results() in serper_client.py to reduce nesting

Now fully compliant with 1 class per file rule.

Remaining: services/content_scraper.py has 6-level nesting (needs extraction)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Refactored scrape_search_result() from 165-line function with 6-level nesting to well-organized helper methods with max 2-level nesting.

Changes:
- Extract _fetch_content() for HTTP requests
- Extract _handle_plain_text() and _handle_binary_content() for special content types
- Extract _configure_html2text() for converter configuration
- Extract _extract_language_from_classes() to reduce code block processing nesting
- Extract _process_code_blocks() and _process_math_elements() for HTML preprocessing
- Extract _convert_to_markdown() for markdown conversion pipeline
- Extract _convert_with_readability() and _convert_fallback() for content extraction strategies
- Extract _truncate_if_needed() for content length management

Results:
- Reduced nesting from 6 levels to 2 levels (well under 3-level limit)
- Main function now 40 lines (was 165 lines)
- Each helper method has single, clear responsibility
- All 44 unit tests pass

Follows "methods for concepts" principle and max 3-level nesting requirement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
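As a small illustration of the extracted helpers, two of the simpler ones could look roughly like this; the converter settings and exact signatures are assumptions based on the commit message, not the contents of content_scraper.py.

```python
import html2text


def _configure_html2text() -> html2text.HTML2Text:
    """Build a converter tuned for readable markdown output (illustrative settings)."""
    converter = html2text.HTML2Text()
    converter.ignore_images = False
    converter.ignore_links = False
    converter.body_width = 0  # do not hard-wrap converted lines
    return converter


def _extract_language_from_classes(css_classes: list[str]) -> str:
    """Pull a language hint such as 'language-python' out of a code block's CSS classes."""
    for css_class in css_classes:
        if css_class.startswith("language-"):
            return css_class.removeprefix("language-")
    return ""
```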
Document the architecture standards established during refactoring:
- Maximum 3-level nesting depth with method extraction
- One class per file (production and test code)
- Methods for concepts (single responsibility helpers)
- No raw mocks (typed mocks + factories + builders)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Fixed nesting and raw mock violations in production and test code:

**Production code fixes:**
- clients/duckduckgo_client.py: Extract _convert_ddg_result() helper to reduce nesting from 6 to 3 levels

**Test infrastructure fixes:**
- Create MockHttpResponse class for typed HTTP mocks
- Create HttpResponseFactory with methods: success(), plaintext(), html_with_title(), pdf(), error_404(), timeout(), connection_error()
- tests/test_mcp_server.py: Replace all MagicMock() usage with HttpResponseFactory

**Results:**
- All 44 unit tests pass
- No raw MagicMock() with property assignment
- All high/medium priority violations fixed
- Follows typed mocks + factory pattern standards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
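A compact sketch, assuming the MockHttpResponse/HttpResponseFactory shape listed above; the defaults and the full set of factory methods in docker/tests/factories may differ.

```python
import requests


class MockHttpResponse:
    """Typed stand-in for requests.Response used by scraper tests."""

    def __init__(self, content: bytes, headers: dict, status_code: int = 200) -> None:
        self.content = content
        self.text = content.decode("utf-8", errors="replace")
        self.headers = headers
        self.status_code = status_code

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise requests.HTTPError(f"{self.status_code} Error")


class HttpResponseFactory:
    @staticmethod
    def success(content: bytes = b"<html><body><p>Test content</p></body></html>") -> MockHttpResponse:
        return MockHttpResponse(content, {"Content-Type": "text/html"})

    @staticmethod
    def error_404() -> MockHttpResponse:
        return MockHttpResponse(b"Not Found", {"Content-Type": "text/plain"}, status_code=404)

    @staticmethod
    def timeout() -> requests.exceptions.Timeout:
        return requests.exceptions.Timeout("Request timed out")
```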
Reduced control flow nesting from 6 to 3 levels by extracting helpers:
- _get_health_status(), _get_unhealthy_status()
- _client_not_found_response(), _client_error_response()
- _load_client_file()
- _get_server_configuration(), _get_server_endpoints(), _get_server_capabilities()
- _get_detailed_status(), _get_status_error()
- _get_root_info()

Results:
- Max nesting now 3 levels (was 6)
- Each helper has single, clear responsibility
- All endpoints simplified to try/return pattern

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
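A minimal sketch of the try/return endpoint shape described above, assuming FastAPI wiring like the rest of the server; the helper bodies and payload fields are placeholders, and the generic error message anticipates the hardening discussed later in this review.

```python
import logging
import time

from fastapi import FastAPI
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)
app = FastAPI()


def _get_health_status() -> dict:
    """Placeholder healthy payload."""
    return {"status": "healthy", "timestamp": time.time()}


def _get_unhealthy_status(detail: str) -> dict:
    """Placeholder unhealthy payload."""
    return {"status": "unhealthy", "detail": detail, "timestamp": time.time()}


@app.get("/health")
async def health_check() -> JSONResponse:
    try:
        return JSONResponse(content=_get_health_status())
    except Exception:
        logger.exception("Health check failed")
        return JSONResponse(status_code=500, content=_get_unhealthy_status("Internal server error"))
```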
| """ | ||
| return JSONResponse( | ||
| status_code=500, | ||
| content={"error": "Failed to serve WebCat client", "details": error}, |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Copilot Autofix
AI 4 months ago
To fix the problem, we need to avoid exposing potentially sensitive exception messages to end users through the API. Instead, we should log the complete exception details on the server for debugging and provide users with a generic error message. The specific fix involves modifying the _client_error_response function (lines 70-83) to not include the raw exception details in the API response. Additionally, in the /client endpoint handler (lines 210-226), when catching an exception, log the exception details and return a generic error to the client.
- Update `_client_error_response(error: str)` so it does not transmit the exception details, only a generic message.
- At the exception handling site (line 223+), ensure that the raw exception is logged but the client only receives the generic error response.
- Fix only the code shown in the snippet, without changing the overall functionality: logging stays; the only alteration is what is returned to the user.
No new imports or dependencies are required; use the existing logger.
```diff
@@ -67,18 +67,11 @@
 )


-def _client_error_response(error: str) -> JSONResponse:
-    """Create response for client loading error.
-
-    Args:
-        error: Error message
-
-    Returns:
-        JSONResponse with 500 status
-    """
+def _client_error_response() -> JSONResponse:
+    """Create response for client loading error (generic, without exposing exception details)."""
     return JSONResponse(
         status_code=500,
-        content={"error": "Failed to serve WebCat client", "details": error},
+        content={"error": "Failed to serve WebCat client"},
     )


@@ -222,7 +213,7 @@
         return _client_not_found_response(client_path)
     except Exception as e:
         logger.error(f"Failed to serve WebCat client: {str(e)}")
-        return _client_error_response(str(e))
+        return _client_error_response()


 @app.get("/status")
 async def server_status():
```
| "timestamp": time.time(), | ||
| }, | ||
| ) | ||
| return JSONResponse(status_code=500, content=_get_unhealthy_status(str(e))) |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Copilot Autofix
AI 4 months ago
To fix this problem, we should stop passing str(e) (the string representation of the exception) to the client in the JSON response.
Instead, pass a generic error message such as "An internal error has occurred" (or similar) to the client, while logging the specific exception detail (and optionally traceback) for server-side debugging.
Specifically, in setup_health_endpoints, inside the /health endpoint, change:
`return JSONResponse(status_code=500, content=_get_unhealthy_status(str(e)))`

to:

`return JSONResponse(status_code=500, content=_get_unhealthy_status("An internal error has occurred"))`

and keep the logging as-is.
No change to imports or new methods is needed.
```diff
@@ -205,7 +205,7 @@
         return _get_health_status()
     except Exception as e:
         logger.error(f"Health check failed: {str(e)}")
-        return JSONResponse(status_code=500, content=_get_unhealthy_status(str(e)))
+        return JSONResponse(status_code=500, content=_get_unhealthy_status("An internal error has occurred"))


 @app.get("/client")
 async def sse_client():
```
| "timestamp": time.time(), | ||
| }, | ||
| ) | ||
| return JSONResponse(status_code=500, content=_get_status_error(str(e))) |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Copilot Autofix
AI 4 months ago
The correct way to fix this issue is to ensure that only generic error information is sent to the client, with specific or sensitive error details logged for internal diagnosis.
- In file `docker/health.py`, specifically in the `/status` endpoint handler, modify the error response so that the `"details"` field does not include `str(e)` (the exception's string representation).
- Instead, return a general message to the client (e.g., "Internal server error") while logging the full exception details (including a stack trace) server-side using the `logging` module.
- Only edit the code in the region shown above, specifically the try/except in the `/status` endpoint, and the argument passed to `_get_status_error`.
- No additional imports are necessary as logging is already set up.
```diff
@@ -230,8 +230,8 @@
     try:
         return _get_detailed_status()
     except Exception as e:
-        logger.error(f"Status check failed: {str(e)}")
-        return JSONResponse(status_code=500, content=_get_status_error(str(e)))
+        logger.error("Status check failed", exc_info=True)
+        return JSONResponse(status_code=500, content=_get_status_error("Internal server error"))


 @app.get("/")
 async def root():
```
Reduced control flow nesting from 7 to 3 levels by extracting:
- _format_sse_message() for SSE message formatting
- _handle_search_operation() for search logic
- _handle_health_operation() for health checks
- _get_server_info() for server metadata
- _generate_webcat_stream() extracted to module level

Results:
- Max nesting reduced from 7 to 3 levels
- SSE generator logic now at module level
- Each operation handler has clear responsibility
- Simplified webcat_stream endpoint

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
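As a rough illustration of the _format_sse_message() helper named above, a minimal SSE formatter might look like this; the exact payload fields and signature in docker/demo_server.py are assumptions here.

```python
import json
import time
from typing import Any


def _format_sse_message(event: str, **fields: Any) -> str:
    """Serialize an event payload as a Server-Sent Events data frame."""
    payload = {"type": event, "timestamp": time.time(), **fields}
    return f"data: {json.dumps(payload)}\n\n"


# Stream handlers then simply yield formatted frames
print(_format_sse_message("status", message="Searching for: python"), end="")
```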
Reduced control flow nesting from 7 to 3 levels by reusing helpers:
- _format_sse_message() for SSE message formatting
- _handle_search_operation() for search logic
- _handle_health_operation() for health checks
- _get_server_info() for server metadata
- _generate_webcat_stream() extracted to module level

Results:
- Max nesting reduced from 7 to 3 levels
- Simplified webcat_stream endpoint
- Consistent structure with demo_server.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Reduced control flow nesting from 6 to 3 levels by extracting:
- _fetch_with_serper() for Serper API calls
- _fetch_with_duckduckgo() for DuckDuckGo fallback
- _format_no_results_error() for error formatting
- _format_search_error() for exception formatting
- _process_and_format_results() for result processing

Results:
- Max nesting reduced from 6 to 3 levels in search_function
- Each fetch method has clear responsibility
- Simplified error handling with dedicated formatters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
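For orientation, here is a hedged sketch of the Serper-then-DuckDuckGo fallback flow these helpers implement; the signatures follow the commit message, but the bodies are illustrative stubs rather than the project's real implementations.

```python
from typing import Any, Dict, List, Optional, Tuple


async def _fetch_with_serper(query: str, api_key: str) -> Tuple[List[Dict[str, Any]], str]:
    # Placeholder: the real helper calls the Serper API off the event loop.
    return [], "Serper API"


async def _fetch_with_duckduckgo(query: str, has_api_key: bool) -> Tuple[List[Dict[str, Any]], str]:
    # Placeholder: the real helper calls the DuckDuckGo fallback.
    return [], "DuckDuckGo (free fallback)"


def _format_no_results_error(query: str, search_source: str) -> Dict[str, Any]:
    return {
        "error": "No search results found from any source.",
        "query": query,
        "search_source": search_source,
    }


async def search_function(query: str, api_key: Optional[str]) -> Dict[str, Any]:
    """Try Serper when a key is configured; fall back to DuckDuckGo on empty results."""
    results: List[Dict[str, Any]] = []
    search_source = "none"
    if api_key:
        results, search_source = await _fetch_with_serper(query, api_key)
    if not results:
        results, search_source = await _fetch_with_duckduckgo(query, has_api_key=bool(api_key))
    if not results:
        return _format_no_results_error(query, search_source)
    return {"query": query, "search_source": search_source, "results": results}
```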
Actionable comments posted: 13
♻️ Duplicate comments (1)
docker/health.py (1)
81-81: Information exposure through exception details.

Exposing raw exception messages (`str(e)`) in API responses can leak sensitive information such as stack traces, file paths, or internal implementation details. Consider sanitizing error messages before returning them to external users.

For example, use generic messages in production:

```diff
-        content={"error": "Failed to serve WebCat client", "details": error},
+        content={"error": "Failed to serve WebCat client", "details": "Internal server error"},
```

Or implement a sanitization function that filters sensitive patterns from error messages before exposing them.
Also applies to: 208-208, 234-234
🧹 Nitpick comments (3)
docker/health.py (1)
207-207: Use `logging.exception` for automatic traceback inclusion.

Replace `logging.error(f"... {str(e)}")` with `logging.exception(...)` to automatically include the exception traceback, which aids debugging.

Apply this pattern to all three error handlers:

```diff
-    logger.error(f"Health check failed: {str(e)}")
+    logger.exception("Health check failed")

-    logger.error(f"Failed to serve WebCat client: {str(e)}")
+    logger.exception("Failed to serve WebCat client")

-    logger.error(f"Status check failed: {str(e)}")
+    logger.exception("Status check failed")
```

Also applies to: 224-224, 233-233
docker/demo_server.py (1)
199-201: Use `logger.exception` when surfacing unexpected failures.

Switch to `logger.exception` so we capture the traceback while keeping the SSE error payload.

```diff
-        except Exception as e:
-            logger.error(f"Error in SSE stream: {str(e)}")
-            yield _format_sse_message("error", message=str(e))
+        except Exception as exc:
+            logger.exception("Error in SSE stream")
+            yield _format_sse_message("error", message=str(exc))
```

docker/simple_demo.py (1)
179-181: Capture tracebacks with `logger.exception`.

Match the other module and emit the traceback while still returning the SSE error payload.

```diff
-        except Exception as e:
-            logger.error(f"Error in SSE stream: {str(e)}")
-            yield _format_sse_message("error", message=str(e))
+        except Exception as exc:
+            logger.exception("Error in SSE stream")
+            yield _format_sse_message("error", message=str(exc))
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- CLAUDE.md (1 hunks)
- docker/api_tools.py (1 hunks)
- docker/clients/duckduckgo_client.py (2 hunks)
- docker/clients/serper_client.py (2 hunks)
- docker/demo_server.py (2 hunks)
- docker/health.py (1 hunks)
- docker/services/content_scraper.py (2 hunks)
- docker/simple_demo.py (2 hunks)
- docker/tests/builders/api_search_result_builder.py (1 hunks)
- docker/tests/factories/ddgs_factory.py (1 hunks)
- docker/tests/factories/http_response_factory.py (1 hunks)
- docker/tests/factories/mock_ddgs_class.py (1 hunks)
- docker/tests/factories/mock_ddgs_context_manager.py (1 hunks)
- docker/tests/factories/mock_ddgs_search_client.py (1 hunks)
- docker/tests/factories/mock_http_response.py (1 hunks)
- docker/tests/factories/mock_serper_response.py (1 hunks)
- docker/tests/factories/serper_response_factory.py (1 hunks)
- docker/tests/test_mcp_server.py (5 hunks)
- docker/tests/unit/clients/test_duckduckgo_client.py (4 hunks)
- docker/tests/unit/clients/test_serper_client.py (5 hunks)
- docker/tests/unit/services/test_search_processor.py (4 hunks)
- docker/tests/unit/services/test_search_service.py (3 hunks)
- docker/tests/unit/tools/test_search_tool.py (3 hunks)
✅ Files skipped from review due to trivial changes (1)
- CLAUDE.md
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: Format all Python code with Black (line-length=88; target Python 3.11)
Sort imports with isort using profile="black"
Require type hints for all function signatures (checked by mypy)
Use Google-style docstrings for public functions
Files:
- docker/tests/unit/clients/test_serper_client.py
- docker/tests/unit/services/test_search_service.py
- docker/tests/unit/clients/test_duckduckgo_client.py
- docker/tests/factories/mock_ddgs_search_client.py
- docker/health.py
- docker/tests/factories/ddgs_factory.py
- docker/services/content_scraper.py
- docker/tests/unit/tools/test_search_tool.py
- docker/tests/unit/services/test_search_processor.py
- docker/simple_demo.py
- docker/tests/factories/mock_serper_response.py
- docker/api_tools.py
- docker/clients/duckduckgo_client.py
- docker/tests/factories/mock_ddgs_context_manager.py
- docker/tests/factories/mock_http_response.py
- docker/tests/factories/mock_ddgs_class.py
- docker/tests/factories/serper_response_factory.py
- docker/tests/builders/api_search_result_builder.py
- docker/tests/test_mcp_server.py
- docker/demo_server.py
- docker/tests/factories/http_response_factory.py
- docker/clients/serper_client.py
docker/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
docker/**/*.py: Prefer absolute imports (e.g., from docker.mcp_server import ...)
All Python files in docker/ are included in coverage; add tests rather than excluding files
Files:
- docker/tests/unit/clients/test_serper_client.py
- docker/tests/unit/services/test_search_service.py
- docker/tests/unit/clients/test_duckduckgo_client.py
- docker/tests/factories/mock_ddgs_search_client.py
- docker/health.py
- docker/tests/factories/ddgs_factory.py
- docker/services/content_scraper.py
- docker/tests/unit/tools/test_search_tool.py
- docker/tests/unit/services/test_search_processor.py
- docker/simple_demo.py
- docker/tests/factories/mock_serper_response.py
- docker/api_tools.py
- docker/clients/duckduckgo_client.py
- docker/tests/factories/mock_ddgs_context_manager.py
- docker/tests/factories/mock_http_response.py
- docker/tests/factories/mock_ddgs_class.py
- docker/tests/factories/serper_response_factory.py
- docker/tests/builders/api_search_result_builder.py
- docker/tests/test_mcp_server.py
- docker/demo_server.py
- docker/tests/factories/http_response_factory.py
- docker/clients/serper_client.py
docker/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
docker/tests/**/*.py: Use pytest markers: @pytest.mark.unit for unit tests, @pytest.mark.integration for integration tests, @pytest.mark.api for external API tests
Unit tests should mock external dependencies and run fast; avoid network calls
Tests that depend on external services must be marked @pytest.mark.integration
When patching requests, patch at the module-under-test level (e.g., @patch('mcp_server.requests.get'))
Files:
- docker/tests/unit/clients/test_serper_client.py
- docker/tests/unit/services/test_search_service.py
- docker/tests/unit/clients/test_duckduckgo_client.py
- docker/tests/factories/mock_ddgs_search_client.py
- docker/tests/factories/ddgs_factory.py
- docker/tests/unit/tools/test_search_tool.py
- docker/tests/unit/services/test_search_processor.py
- docker/tests/factories/mock_serper_response.py
- docker/tests/factories/mock_ddgs_context_manager.py
- docker/tests/factories/mock_http_response.py
- docker/tests/factories/mock_ddgs_class.py
- docker/tests/factories/serper_response_factory.py
- docker/tests/builders/api_search_result_builder.py
- docker/tests/test_mcp_server.py
- docker/tests/factories/http_response_factory.py
docker/tests/test_mcp_server.py
📄 CodeRabbit inference engine (CLAUDE.md)
Add unit tests for new MCP tools in docker/tests/test_mcp_server.py (plus additional test files as needed)
Files:
docker/tests/test_mcp_server.py
🧠 Learnings (1)
📚 Learning: 2025-10-03T20:19:57.835Z
Learnt from: CR
PR: Kode-Rex/webcat#0
File: CLAUDE.md:0-0
Timestamp: 2025-10-03T20:19:57.835Z
Learning: Applies to docker/tests/test_mcp_server.py : Add unit tests for new MCP tools in docker/tests/test_mcp_server.py (plus additional test files as needed)
Applied to files:
docker/tests/test_mcp_server.py
🧬 Code graph analysis (19)
- docker/tests/unit/clients/test_serper_client.py (2)
  - docker/clients/serper_client.py (1)
    - fetch_search_results (38-64)
  - docker/tests/factories/serper_response_factory.py (4)
    - SerperResponseFactory (13-74)
    - two_results (38-57)
    - empty (29-35)
    - with_results (17-26)
- docker/tests/unit/services/test_search_service.py (2)
  - docker/tests/builders/api_search_result_builder.py (3)
    - a_duckduckgo_result (67-74)
    - a_serper_result (57-64)
    - build (43-49)
  - docker/services/search_service.py (1)
    - fetch_with_fallback (18-46)
- docker/tests/unit/clients/test_duckduckgo_client.py (1)
  - docker/tests/factories/ddgs_factory.py (5)
    - DDGSFactory (14-83)
    - two_results (41-53)
    - empty (31-38)
    - with_exception (56-66)
    - with_results (18-28)
- docker/tests/factories/ddgs_factory.py (2)
  - docker/tests/factories/mock_ddgs_class.py (1)
    - MockDDGSClass (12-21)
  - docker/tests/factories/mock_ddgs_search_client.py (1)
    - MockDDGS (11-47)
- docker/services/content_scraper.py (1)
  - docker/tests/factories/http_response_factory.py (1)
    - timeout (80-86)
- docker/tests/unit/tools/test_search_tool.py (1)
  - docker/tests/builders/api_search_result_builder.py (2)
    - an_api_search_result (52-54)
    - with_link (33-36)
- docker/tests/unit/services/test_search_processor.py (1)
  - docker/tests/builders/api_search_result_builder.py (2)
    - an_api_search_result (52-54)
    - with_link (33-36)
- docker/simple_demo.py (1)
  - docker/demo_server.py (5)
    - _format_sse_message (64-75)
    - _handle_search_operation (78-102)
    - _handle_health_operation (105-127)
    - _get_server_info (130-142)
    - _generate_webcat_stream (145-201)
- docker/api_tools.py (3)
  - docker/clients/serper_client.py (1)
    - fetch_search_results (38-64)
  - customgpt/function_app.py (1)
    - fetch_duckduckgo_search_results (147-185)
  - docker/services/search_processor.py (1)
    - process_search_results (15-39)
- docker/clients/duckduckgo_client.py (1)
  - docker/models/api_search_result.py (1)
    - APISearchResult (11-22)
- docker/tests/factories/mock_ddgs_context_manager.py (1)
  - docker/tests/factories/mock_ddgs_search_client.py (1)
    - MockDDGS (11-47)
- docker/tests/factories/mock_http_response.py (1)
  - docker/tests/factories/mock_ddgs_search_client.py (1)
    - text (29-47)
- docker/tests/factories/mock_ddgs_class.py (2)
  - docker/tests/factories/mock_ddgs_context_manager.py (1)
    - MockDDGSContextManager (11-24)
  - docker/tests/factories/mock_ddgs_search_client.py (1)
    - MockDDGS (11-47)
- docker/tests/factories/serper_response_factory.py (1)
  - docker/tests/factories/mock_serper_response.py (1)
    - MockSerperResponse (11-37)
- docker/tests/builders/api_search_result_builder.py (1)
  - docker/models/api_search_result.py (1)
    - APISearchResult (11-22)
- docker/tests/test_mcp_server.py (2)
  - docker/tests/factories/http_response_factory.py (4)
    - HttpResponseFactory (13-124)
    - success (17-35)
    - with_http_error (52-65)
    - plaintext (38-49)
  - docker/tests/conftest.py (1)
    - mock_get (67-68)
- docker/demo_server.py (1)
  - docker/simple_demo.py (5)
    - _format_sse_message (44-55)
    - _handle_search_operation (58-82)
    - _handle_health_operation (85-107)
    - _get_server_info (110-122)
    - _generate_webcat_stream (125-181)
- docker/tests/factories/http_response_factory.py (1)
  - docker/tests/factories/mock_http_response.py (1)
    - MockHttpResponse (11-44)
- docker/clients/serper_client.py (1)
  - docker/models/api_search_result.py (1)
    - APISearchResult (11-22)
🪛 GitHub Check: CodeQL
docker/health.py
[warning] 81-81: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
[warning] 208-208: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
[warning] 234-234: Information exposure through an exception
Stack trace information flows to this location and may be exposed to an external user.
🪛 Ruff (0.13.2)
docker/health.py
109-109: Probable insecure usage of temporary file or directory: "/tmp"
(S108)
206-206: Do not catch blind exception: Exception
(BLE001)
207-207: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
207-207: Use explicit conversion flag
Replace with conversion flag
(RUF010)
223-223: Do not catch blind exception: Exception
(BLE001)
224-224: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
224-224: Use explicit conversion flag
Replace with conversion flag
(RUF010)
232-232: Do not catch blind exception: Exception
(BLE001)
233-233: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
233-233: Use explicit conversion flag
Replace with conversion flag
(RUF010)
docker/services/content_scraper.py
152-152: Use explicit conversion flag
Replace with conversion flag
(RUF010)
255-255: Do not catch blind exception: Exception
(BLE001)
257-257: Use explicit conversion flag
Replace with conversion flag
(RUF010)
docker/simple_demo.py
179-179: Do not catch blind exception: Exception
(BLE001)
180-180: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
180-180: Use explicit conversion flag
Replace with conversion flag
(RUF010)
docker/tests/factories/mock_serper_response.py
37-37: Avoid specifying long messages outside the exception class
(TRY003)
docker/api_tools.py
391-391: Do not catch blind exception: Exception
(BLE001)
392-392: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
392-392: Use explicit conversion flag
Replace with conversion flag
(RUF010)
docker/tests/factories/mock_ddgs_class.py
19-19: Unused method argument: args
(ARG002)
19-19: Unused method argument: kwargs
(ARG002)
docker/demo_server.py
199-199: Do not catch blind exception: Exception
(BLE001)
200-200: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
200-200: Use explicit conversion flag
Replace with conversion flag
(RUF010)
docker/tests/factories/http_response_factory.py
19-19: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🔇 Additional comments (11)
docker/tests/unit/tools/test_search_tool.py (2)
13-13: LGTM! Builder pattern improves test maintainability.

The migration to `an_api_search_result().build()` centralizes test data construction and aligns with the broader test infrastructure improvements across the codebase.

Also applies to: 25-25
65-71: LGTM! Clean fluent builder usage.

The fluent setter chaining clearly expresses test intent and makes the test data construction more readable.
docker/clients/duckduckgo_client.py (2)
25-38: LGTM! Well-structured helper function.

The extracted helper improves code organization by separating the result conversion logic. Type hints and Google-style docstring are properly provided as per coding guidelines.
63-63: LGTM! Clean refactor using the helper.

The list comprehension now delegates to `_convert_ddg_result`, improving maintainability while preserving the original behavior.

docker/tests/unit/services/test_search_service.py (1)
11-14: LGTM! Consistent builder usage across tests.

The migration to pre-configured builders (`a_serper_result()`, `a_duckduckgo_result()`) standardizes test data construction and improves maintainability. Updated assertions correctly reflect the builder defaults.

Also applies to: 23-23, 31-31, 39-39, 47-47, 56-56
docker/tests/unit/services/test_search_processor.py (1)
12-12: LGTM! Consistent builder pattern adoption.

All tests correctly use the `an_api_search_result()` builder with fluent setters, improving test maintainability and readability.

Also applies to: 21-27, 45-54, 85-91
docker/tests/factories/mock_ddgs_class.py (1)
12-21: LGTM! Clean mock wrapper implementation.

The `MockDDGSClass` correctly implements a callable wrapper that returns a context manager. The unused `*args, **kwargs` in `__call__` are intentional for API compatibility with the real DDGS constructor signature. Type hints and docstrings are properly provided.

docker/tests/unit/clients/test_duckduckgo_client.py (2)
11-11: LGTM! Clean migration to DDGSFactory.

The migration from `MagicMock` to typed `DDGSFactory` improves type safety and test clarity. The double-call pattern (e.g., `DDGSFactory.two_results()()`) correctly constructs the mock class and then invokes it to get the context manager.

Also applies to: 20-20, 34-34, 46-48, 59-59
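To make the double-call pattern concrete, here is a hedged sketch of a callable mock class that hands back a context manager; the real MockDDGSClass, MockDDGSContextManager, and MockDDGS in docker/tests/factories may differ in detail.

```python
from typing import Any, Dict, List


class MockDDGS:
    """Stand-in search client returning canned results."""

    def __init__(self, results: List[Dict[str, Any]]) -> None:
        self._results = results

    def text(self, query: str, max_results: int = 10) -> List[Dict[str, Any]]:
        return self._results[:max_results]


class MockDDGSContextManager:
    """Mirrors `with DDGS() as ddgs:` usage in the client under test."""

    def __init__(self, client: MockDDGS) -> None:
        self._client = client

    def __enter__(self) -> MockDDGS:
        return self._client

    def __exit__(self, *exc_info: object) -> None:
        return None


class MockDDGSClass:
    """Callable that stands in for the DDGS class itself."""

    def __init__(self, client: MockDDGS) -> None:
        self._client = client

    def __call__(self, *args: object, **kwargs: object) -> MockDDGSContextManager:
        return MockDDGSContextManager(self._client)


# First call builds the fake "class"; the second call (via `with`) yields the client.
ddgs_class = MockDDGSClass(MockDDGS([{"title": "Example", "href": "https://example.com"}]))
with ddgs_class() as ddgs:
    assert ddgs.text("query")[0]["title"] == "Example"
```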
37-41: LGTM! Improved test approach.

The test now verifies behavior through results rather than inspecting mock call details, which is a more robust testing practice.
docker/tests/factories/mock_ddgs_context_manager.py (1)
11-24: LGTM! Correct context manager implementation.

The `MockDDGSContextManager` properly implements the context manager protocol with appropriate type hints and docstrings, enabling realistic `with`-statement usage in tests.

docker/health.py (1)
20-196: LGTM! Well-structured helper functions.

The extracted helper functions improve modularity and testability. All functions have proper type hints and Google-style docstrings as per coding guidelines. The refactoring maintains the original behavior while making the code more maintainable.
```python
async def _fetch_with_serper(query: str, api_key: str):
    """Fetch search results using Serper API.

    Args:
        query: Search query
        api_key: Serper API key

    Returns:
        Tuple of (results, search_source)
    """
    import asyncio

    from mcp_server import fetch_search_results

    logger.info("Using Serper API for search")
    results = await asyncio.get_event_loop().run_in_executor(
        None, fetch_search_results, query, api_key
    )
    return results, "Serper API"
```
🛠️ Refactor suggestion | 🟠 Major
Add return types and favour asyncio.to_thread.
New helpers must have explicit return annotations per our typing rules, and we can avoid the deprecated asyncio.get_event_loop() by switching to asyncio.to_thread(...). Consider:
```diff
-from typing import TYPE_CHECKING, Any, Dict, List, Optional
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple
 ...
-if TYPE_CHECKING:
-    pass
+if TYPE_CHECKING:
+    from docker.models.api_search_result import APISearchResult
 ...
-async def _fetch_with_serper(query: str, api_key: str):
+async def _fetch_with_serper(
+    query: str, api_key: str
+) -> Tuple[List["APISearchResult"], str]:
 ...
-    results = await asyncio.get_event_loop().run_in_executor(
-        None, fetch_search_results, query, api_key
-    )
+    results = await asyncio.to_thread(fetch_search_results, query, api_key)
```

Apply the same pattern to the DuckDuckGo helper so static typing and asyncio usage stay consistent.
🤖 Prompt for AI Agents
In docker/api_tools.py around lines 254 to 272, the _fetch_with_serper helper is
missing an explicit return type annotation and uses the deprecated
asyncio.get_event_loop().run_in_executor; update its signature to include the
correct async return type (e.g., -> Tuple[List[...], str] or the precise types
used by fetch_search_results) and replace the run_in_executor call with
asyncio.to_thread(fetch_search_results, query, api_key) to run the blocking
function in a thread. Also apply the same explicit return annotation and switch
to asyncio.to_thread for the DuckDuckGo helper so both helpers are properly
typed and use modern asyncio patterns.
```python
async def _fetch_with_duckduckgo(query: str, has_api_key: bool):
    """Fetch search results using DuckDuckGo.

    Args:
        query: Search query
        has_api_key: Whether Serper API key was configured

    Returns:
        Tuple of (results, search_source)
    """
    import asyncio

    from mcp_server import fetch_duckduckgo_search_results

    if not has_api_key:
        logger.info("No Serper API key configured, using DuckDuckGo fallback")
    else:
        logger.warning("No results from Serper API, trying DuckDuckGo fallback")

    results = await asyncio.get_event_loop().run_in_executor(
        None, fetch_duckduckgo_search_results, query
    )
    return results, "DuckDuckGo (free fallback)"
```
🛠️ Refactor suggestion | 🟠 Major
Ensure the DuckDuckGo helper is fully typed.
Please mirror the Serper changes here: add a precise return type (e.g., Tuple[Optional[List["APISearchResult"]], str]) and swap asyncio.get_event_loop().run_in_executor for asyncio.to_thread(...) to satisfy our typing policy and avoid deprecated APIs.
```python
def _format_no_results_error(query: str, search_source: str) -> dict:
    """Format error response for no results.

    Args:
        query: Search query
        search_source: Source that was used

    Returns:
        Error dictionary
    """
    return {
        "error": "No search results found from any source.",
        "query": query,
        "search_source": search_source,
    }


def _format_search_error(error: str, query: str, search_source: str) -> dict:
    """Format error response for search failure.

    Args:
        error: Error message
        query: Search query
        search_source: Source that was attempted

    Returns:
        Error dictionary
    """
    return {
        "error": f"Search failed: {error}",
        "query": query,
        "search_source": search_source,
    }


async def _process_and_format_results(results, query: str, search_source: str):
    """Process search results and format response.

    Args:
        results: Raw search results
        query: Search query
        search_source: Source of results

    Returns:
        Formatted results dictionary
    """
    import asyncio

    from mcp_server import process_search_results

    processed_results = await asyncio.get_event_loop().run_in_executor(
        None, process_search_results, results
    )

    return {
        "query": query,
        "search_source": search_source,
        "results": [result.model_dump() for result in processed_results],
    }
```
🛠️ Refactor suggestion | 🟠 Major
Type the formatting utilities.
Our guidelines require explicit return types. _format_no_results_error, _format_search_error, and _process_and_format_results should return Dict[str, Any] (or a refined mapping) and accept properly typed results (e.g., List["APISearchResult"]). Please add those annotations and, for _process_and_format_results, consider asyncio.to_thread for consistency with the fetch helpers.
🤖 Prompt for AI Agents
In docker/api_tools.py around lines 300 to 358, the three helpers lack explicit
typing: add return annotations of Dict[str, Any] for _format_no_results_error
and _format_search_error and for _process_and_format_results change the
signature to accept results: List["APISearchResult"] (or a more specific
Sequence type) and return Dict[str, Any]; inside _process_and_format_results
replace asyncio.get_event_loop().run_in_executor(...) with
asyncio.to_thread(process_search_results, results) for consistency with fetch
helpers and keep importing typing (Dict, Any, List) and forward-reference
APISearchResult in quotes or via TYPE_CHECKING import.
```python
async def _handle_search_operation(search_func, query: str, max_results: int):
    """Handle search operation and yield results.

    Args:
        search_func: Search function to call
        query: Search query
        max_results: Maximum results to return

    Yields:
        SSE formatted messages
    """
    yield _format_sse_message("status", message=f"Searching for: {query}")

    result = await search_func(query)

    # Limit results if needed
    if result.get("results") and len(result["results"]) > max_results:
        result["results"] = result["results"][:max_results]
        result["note"] = f"Results limited to {max_results} items"

    yield _format_sse_message("data", data=result)
    num_results = len(result.get("results", []))
    yield _format_sse_message(
        "complete", message=f"Search completed. Found {num_results} results."
    )


async def _handle_health_operation(health_func):
    """Handle health check operation.

    Args:
        health_func: Health check function to call (or None)

    Yields:
        SSE formatted messages
    """
    yield _format_sse_message("status", message="Checking server health...")

    if health_func:
        result = await health_func()
        yield _format_sse_message("data", data=result)
        yield _format_sse_message("complete", message="Health check completed")
    else:
        basic_health = {
            "status": "healthy",
            "service": "webcat-demo",
            "timestamp": time.time(),
        }
        yield _format_sse_message("data", data=basic_health)
        yield _format_sse_message("complete", message="Basic health check completed")


def _get_server_info() -> dict:
    """Get server information dictionary.

    Returns:
        Server info dictionary
    """
    return {
        "service": "WebCat MCP Demo Server",
        "version": "2.2.0",
        "status": "connected",
        "operations": ["search", "health"],
        "timestamp": time.time(),
    }


async def _generate_webcat_stream(
    webcat_functions, operation: str, query: str, max_results: int
):
    """Generate SSE stream for WebCat operations.

    Args:
        webcat_functions: Dictionary of WebCat functions
        operation: Operation to perform
        query: Search query
        max_results: Maximum results

    Yields:
        SSE formatted messages
    """
    try:
        # Send connection message
        yield _format_sse_message(
            "connection",
            status="connected",
            message="WebCat stream started",
            operation=operation,
        )

        if operation == "search" and query:
            search_func = webcat_functions.get("search")
            if search_func:
                async for msg in _handle_search_operation(
                    search_func, query, max_results
                ):
                    yield msg
            else:
                yield _format_sse_message(
                    "error", message="Search function not available"
                )

        elif operation == "health":
            health_func = webcat_functions.get("health_check")
            async for msg in _handle_health_operation(health_func):
                yield msg

        else:
            # Just connection - send server info
            yield _format_sse_message("data", data=_get_server_info())
            yield _format_sse_message("complete", message="Connection established")

        # Keep alive with heartbeat
        heartbeat_count = 0
        while True:
            await asyncio.sleep(30)
            heartbeat_count += 1
            yield _format_sse_message(
                "heartbeat", timestamp=time.time(), count=heartbeat_count
```
🛠️ Refactor suggestion | 🟠 Major
Add the missing type hints per project standard.
Our Python guideline requires every function signature to be fully annotated. Please add the concrete typing for the new helpers (_handle_search_operation, _handle_health_operation, _generate_webcat_stream, and the _get_server_info return type) so mypy stays green and usage is clear. Suggested patch:
```diff
@@
-import json
-import logging
+import json
+import logging
+from typing import Any, AsyncGenerator, Awaitable, Callable, Dict, Mapping, Optional
@@
-async def _handle_search_operation(search_func, query: str, max_results: int):
+async def _handle_search_operation(
+    search_func: Callable[[str], Awaitable[Dict[str, Any]]],
+    query: str,
+    max_results: int,
+) -> AsyncGenerator[str, None]:
@@
-async def _handle_health_operation(health_func):
+async def _handle_health_operation(
+    health_func: Optional[Callable[[], Awaitable[Dict[str, Any]]]],
+) -> AsyncGenerator[str, None]:
@@
-def _get_server_info() -> dict:
+def _get_server_info() -> Dict[str, Any]:
@@
-async def _generate_webcat_stream(
-    webcat_functions, operation: str, query: str, max_results: int
-):
+async def _generate_webcat_stream(
+    webcat_functions: Mapping[str, Callable[..., Awaitable[Any]]],
+    operation: str,
+    query: str,
+    max_results: int,
+) -> AsyncGenerator[str, None]:
```

As per coding guidelines
🤖 Prompt for AI Agents
In docker/demo_server.py lines 78-196, the new helper functions lack type hints;
add concrete annotations to satisfy project typing rules. Annotate
_handle_search_operation as accepting search_func: Callable[[str],
Awaitable[Dict[str, Any]]], query: str, max_results: int and returning
AsyncGenerator[str, None]; annotate _handle_health_operation as accepting
health_func: Optional[Callable[[], Awaitable[Dict[str, Any]]]] and returning
AsyncGenerator[str, None]; annotate _generate_webcat_stream as accepting
webcat_functions: Dict[str, Callable[..., Awaitable[Dict[str, Any]]]],
operation: str, query: str, max_results: int and returning AsyncGenerator[str,
None]; annotate _get_server_info return type as Dict[str, Any]; ensure required
typing imports (AsyncGenerator, Awaitable, Callable, Dict, Optional, Any) are
added at the top of the file.
```python
    if len(content) > MAX_CONTENT_LENGTH:
        return content[:MAX_CONTENT_LENGTH] + "... [content truncated]"
    return content
```
Fix truncation to honor MAX_CONTENT_LENGTH.
We slice to the max limit and then append the ellipsis, so the final string overshoots MAX_CONTENT_LENGTH. That breaks the contract of the cap (and defeats downstream safeguards relying on it). Trim enough characters before appending the suffix.
```diff
-    if len(content) > MAX_CONTENT_LENGTH:
-        return content[:MAX_CONTENT_LENGTH] + "... [content truncated]"
+    if len(content) > MAX_CONTENT_LENGTH:
+        suffix = "... [content truncated]"
+        if len(suffix) >= MAX_CONTENT_LENGTH:
+            return suffix[:MAX_CONTENT_LENGTH]
+        return content[: MAX_CONTENT_LENGTH - len(suffix)] + suffix
```
🤖 Prompt for AI Agents
In docker/services/content_scraper.py around lines 219 to 221, the truncation
logic appends the ellipsis/suffix after slicing to MAX_CONTENT_LENGTH which
causes the returned string to exceed MAX_CONTENT_LENGTH; change to compute the
suffix (e.g. "... [content truncated]") and if MAX_CONTENT_LENGTH is larger than
the suffix length, slice content to MAX_CONTENT_LENGTH - len(suffix) and append
the suffix, otherwise return the suffix trimmed to MAX_CONTENT_LENGTH so the
final string never exceeds MAX_CONTENT_LENGTH.
| content: bytes = b"<html><body><p>Test content</p></body></html>", | ||
| headers: dict = None, | ||
| ) -> MockHttpResponse: | ||
| """Create a successful HTTP response.""" | ||
| """Create successful HTML response. | ||
|
|
||
| Args: | ||
| content: HTML content as bytes | ||
| headers: Response headers | ||
|
|
||
| Returns: | ||
| MockHttpResponse with 200 status | ||
| """ | ||
| default_headers = {"Content-Type": "text/html"} | ||
| if headers: | ||
| default_headers.update(headers) | ||
| return MockHttpResponse( | ||
| status_code=status_code, | ||
| content=content.encode("utf-8"), | ||
| text=content, | ||
| headers={"Content-Type": content_type}, | ||
| should_raise_for_status=False, | ||
| content=content, headers=default_headers, status_code=200 | ||
| ) |
🛠️ Refactor suggestion | 🟠 Major
Tighten the headers type.
headers: dict = None trips our typing checks. Please change it to headers: Optional[Dict[str, str]] = None (and import Dict, Optional) so mypy stays happy.
🧰 Tools
🪛 Ruff (0.13.2)
19-19: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
In docker/tests/factories/http_response_factory.py around lines 18 to 35, the
parameter annotation uses a bare dict which fails mypy; change the signature to
use headers: Optional[Dict[str, str]] = None and add the corresponding typing
imports (from typing import Dict, Optional) at the top of the file; keep the
existing runtime behavior (default_headers and update) unchanged.
```diff
     def connection_error(message: str = "Connection error"):
-        """Create a Timeout exception."""
+        """Create exception for connection error.
+
+        Args:
+            message: Error message
+
+        Returns:
+            RequestException to be used with side_effect
+        """
         return requests.exceptions.RequestException(message)
```
🛠️ Refactor suggestion | 🟠 Major
Add the return annotation.
connection_error should declare it returns requests.exceptions.RequestException to satisfy our type-hint policy.
🤖 Prompt for AI Agents
In docker/tests/factories/http_response_factory.py around lines 68 to 78, the
function connection_error is missing a return type annotation; update the
signature to declare it returns requests.exceptions.RequestException (e.g. def
connection_error(message: str = "Connection error") ->
requests.exceptions.RequestException:) and ensure the requests module is
imported/available in the file so the annotation resolves.
```diff
     def timeout():
         """Create timeout exception.

         Returns:
             Timeout exception to be used with side_effect
         """
-        return requests.Timeout("Request timed out after 5 seconds")
+        return requests.exceptions.Timeout("Request timed out")
```
🛠️ Refactor suggestion | 🟠 Major
Type the timeout helper.
Likewise, timeout() needs an explicit return type (e.g., -> requests.exceptions.Timeout) per the typing guidelines.
🤖 Prompt for AI Agents
In docker/tests/factories/http_response_factory.py around lines 80 to 86, the
timeout() helper is missing an explicit return type; add a return type
annotation -> requests.exceptions.Timeout to the function signature so it
conforms to the typing guidelines, e.g., adjust the def line to include that
return type and keep the existing docstring and return statement unchanged.
```python
    def json(self) -> Dict[str, Any]:
        """Return empty dict for JSON responses."""
        return {}
```
Restore JSON parsing behaviour.
Returning {} ignores any JSON payload configured on the mock and breaks callers that expect the text/body to be parsed. Please revert to decoding self.text (fall back to self.content when needed), for example:
```diff
     def json(self) -> Dict[str, Any]:
-        """Return empty dict for JSON responses."""
-        return {}
+        """Return parsed JSON payload."""
+        if self.text:
+            return json.loads(self.text)
+        if self.content:
+            return json.loads(self.content.decode())
+        return {}
```

(Import json at the top.) This keeps the mock faithful to requests.Response.json().
🤖 Prompt for AI Agents
In docker/tests/factories/mock_http_response.py around lines 42-44, the json()
method currently returns a hardcoded {} which ignores any configured payload;
restore real JSON parsing by importing json at the top and replacing the body
with code that attempts to parse self.text via json.loads(self.text) and, if
self.text is empty, fall back to decoding self.content to a str and parsing
that; let json decoding errors propagate the same way requests.Response.json()
would (do not swallow exceptions) so callers receive the same behavior as the
real requests client.
```python
    def raise_for_status(self):
        """Raise HTTPError if status code indicates error."""
        if self.status_code >= 400:
            import requests

            raise requests.HTTPError(f"{self.status_code} Error")
```
🛠️ Refactor suggestion | 🟠 Major
Add return type annotation for raise_for_status.
This helper is missing a return annotation, which violates the project requirement that every function signature be fully typed. Please add -> None to keep mypy and the shared guidelines happy. As per coding guidelines.
```diff
-    def raise_for_status(self):
+    def raise_for_status(self) -> None:
```
+ def raise_for_status(self) -> None:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def raise_for_status(self): | |
| """Raise HTTPError if status code indicates error.""" | |
| if self.status_code >= 400: | |
| import requests | |
| raise requests.HTTPError(f"{self.status_code} Error") | |
| def raise_for_status(self) -> None: | |
| """Raise HTTPError if status code indicates error.""" | |
| if self.status_code >= 400: | |
| import requests | |
| raise requests.HTTPError(f"{self.status_code} Error") |
🧰 Tools
🪛 Ruff (0.13.2)
37-37: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In docker/tests/factories/mock_serper_response.py around lines 32 to 37, the
method raise_for_status is missing a return type annotation; update its
signature to include -> None so the function is fully typed per project
guidelines, leaving implementation unchanged.
fix: Mock html2text in content truncation test
Fixes test failure on main after dependabot merge.
All 44 tests now pass locally and should pass in CI.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation