Skip to content

Staging#214

Merged
gkorland merged 5 commits intomainfrom
staging
Sep 4, 2025
Merged

Staging#214
gkorland merged 5 commits intomainfrom
staging

Conversation

@gkorland
Copy link
Contributor

@gkorland gkorland commented Sep 3, 2025

Summary by CodeRabbit

  • Refactor
    • Standardized 401 Unauthorized responses across database, token, and graph endpoints for consistent API behavior.
  • Improvements
    • Clearer unauthorized message to help users resolve access issues.
    • Improved input validation and sanitization for graph-related requests, enhancing reliability and safer logging.
  • New Features
    • Added a deployment configuration for App Runner to simplify running the application in production.
  • Chores
    • Centralized shared response definitions to reduce duplication and ensure consistent messaging.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 3, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Centralized 401 Unauthorized response handling by introducing UNAUTHORIZED_RESPONSE and applying it across tokens, database, and graph routes. Added Pydantic request models and input-sanitization and DB-type helper utilities in api/routes/graphs.py. Added an App Runner deployment config apprunner.yaml. No route signatures changed.

Changes

Cohort / File(s) Summary
Standardized 401 response
api/routes/tokens.py, api/routes/database.py, api/routes/graphs.py
Added UNAUTHORIZED_RESPONSE constant and replaced inline 401 response descriptions with this shared constant across endpoints.
Graph route models & utilities
api/routes/graphs.py
Added Pydantic models GraphData, ChatRequest, ConfirmRequest. Added helpers get_database_type_and_loader, sanitize_query, sanitize_log_input, and internal _graph_name. Routing signatures unchanged.
App Runner config
apprunner.yaml
Added deployment config to run the app with Pipenv and Uvicorn exposing port 8000.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Router as API Router
  participant Auth as Auth Check
  participant Graphs as Graphs Controller
  participant Logger as Logger

  Client->>Router: POST /graphs/... (payload)
  Router->>Auth: Validate token/session
  alt Unauthorized
    Auth-->>Router: 401 UNAUTHORIZED_RESPONSE
    Router-->>Client: 401 {"description": "Unauthorized - Please log in or provide a valid API token"}
  else Authorized
    Router->>Graphs: Parse & validate (Pydantic models)
    Graphs->>Logger: sanitize_query / sanitize_log_input
    Graphs->>Graphs: get_database_type_and_loader (if DB URL used)
    Graphs-->>Client: 200 / 4xx (existing logic)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Staging #214 — Appears to introduce and apply UNAUTHORIZED_RESPONSE across api/routes/*, matching the 401 centralization in this PR.
  • Staging #154 — Modifies api/routes/graphs.py with helpers like _graph_name and sanitization utilities similar to this change.

Poem

A twitch of whiskers, hops in code,
One 401 now wears a shared robe.
Graphs get models, queries scrubbed fine,
I stash deployments in a neat YAML line.
Carrots for tests, a tiny thump—hooray, commit is mine! 🥕


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c2b4581 and 9611590.

📒 Files selected for processing (1)
  • apprunner.yaml (1 hunks)
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch staging

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions
Copy link

github-actions bot commented Sep 3, 2025

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
api/routes/tokens.py (1)

126-136: Risk: deleting by last 4 chars can remove multiple tokens.

RIGHT(token.id, 4) match is ambiguous; if two tokens share the same suffix for a user, both will be deleted. Guard and delete by exact id once uniquely identified.

Apply this refactor to first resolve the suffix to a single token, then delete by full id; return 409 when ambiguous:

-        # Delete the token
-        delete_query = """
-        MATCH (user:Identity {email:$user_email, provider:'api'})-[:HAS_TOKEN]->(token:Token)
-        WHERE RIGHT(token.id, 4)=$token_id
-        DELETE token
-        RETURN COUNT(*) AS deleted_count
-        """
-
-        result = await organizations_graph.query(delete_query, {
-            "user_email": user_email,
-            "token_id": token_id
-        })
+        # Resolve last-4 suffix to a single token id, then delete by exact id
+        match_query = """
+        MATCH (user:Identity {email:$user_email, provider:'api'})-[:HAS_TOKEN]->(token:Token)
+        WHERE RIGHT(token.id, 4) = $token_suffix
+        RETURN token.id AS id
+        """
+        match_res = await organizations_graph.query(match_query, {
+            "user_email": user_email,
+            "token_suffix": token_id,
+        })
+        ids = [row[0] for row in (match_res.result_set or [])]
+        if not ids:
+            raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Token not found")
+        if len(ids) > 1:
+            raise HTTPException(status_code=status.HTTP_409_CONFLICT,
+                                detail="Multiple tokens match this suffix; specify a unique token")
+
+        delete_by_id = """
+        MATCH (user:Identity {email:$user_email, provider:'api'})-[:HAS_TOKEN]->(token:Token {id:$token_full_id})
+        DELETE token
+        RETURN 1 AS deleted_count
+        """
+        result = await organizations_graph.query(delete_by_id, {
+            "user_email": user_email,
+            "token_full_id": ids[0],
+        })

Optionally, document in the route that path parameter expects a last-4 suffix.

Also applies to: 142-147

api/routes/graphs.py (2)

34-63: Fix mutable default and tighten request models.

  • ConfirmRequest.chat uses a mutable default list; switch to Field(default_factory=list).
  • Optional: enforce a non-empty chat list at the schema level for ChatRequest to reduce runtime checks.
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class ChatRequest(BaseModel):
@@
-    chat: list[str]
+    chat: list[str]  # consider enforcing non-empty via validators/Field if desired
@@
 class ConfirmRequest(BaseModel):
@@
-    chat: list = []
+    chat: list[str] = Field(default_factory=list)

395-399: Sanitize SQL text before logging.

Even generated SQL can include user content; prevent log injection/newlines.

-            logging.info("Generated SQL query: %s", answer_an['sql_query'])  # nosemgrep
+            logging.info("Generated SQL query: %s",
+                         sanitize_log_input(answer_an.get('sql_query', '')))  # nosemgrep
🧹 Nitpick comments (8)
api/routes/tokens.py (1)

14-14: Decouple shared response constants from route modules.

To avoid inter-route dependencies (database/graphs importing tokens), move UNAUTHORIZED_RESPONSE to a small shared module (e.g., api/http/responses.py) and import from there.

api/routes/database.py (3)

20-21: Avoid duplicating MESSAGE_DELIMITER across modules.

Consider a single source (e.g., api.config or api.constants) used by both database.py and graphs.py to prevent drift with the frontend delimiter.


86-86: Minor style nit: assign without creating a list.

Slightly cleaner and clearer.

-            success, result = [False, ""]
+            success, result = False, ""

88-106: Consider adding a timeout to schema loading.

External DB connections can hang; wrap loader.load with a timeout configurable via Config to fail fast and surface an actionable error.

-                async for progress in loader.load(request.state.user_id, url):
+                # e.g., Config.DB_CONNECT_TIMEOUT seconds
+                try:
+                    async with asyncio.timeout(getattr(Config, "DB_CONNECT_TIMEOUT", 60)):
+                        async for progress in loader.load(request.state.user_id, url):
+                            success, result = progress
+                            if success:
+                                steps_counter += 1
+                                yield json.dumps(
+                                    {"type": "reasoning_step", "message": f"Step {steps_counter}: {result}"}
+                                ) + MESSAGE_DELIMITER
+                            else:
+                                break
+                except TimeoutError:
+                    yield json.dumps(
+                        {"type": "error", "message": "Timeout while connecting/loading schema"}
+                    ) + MESSAGE_DELIMITER
+                    return
api/routes/graphs.py (4)

92-101: Docstring vs implementation mismatch in sanitize_log_input.

Docstring says “wrap in repr()” but code doesn’t. Either update the docstring or actually wrap.

-    return value.replace('\n', ' ').replace('\r', ' ').replace('\t', ' ')
+    cleaned = value.replace('\n', ' ').replace('\r', ' ').replace('\t', ' ')
+    return repr(cleaned)

85-87: Defaulting unknown URLs to Postgres can misroute operations.

If backward-compat isn’t required, prefer failing fast by returning (None, None) on unknown scheme so callers surface a clear error.

-    # Default to PostgresLoader for backward compatibility
-    return 'postgresql', PostgresLoader
+    # Unknown scheme: fail fast
+    return None, None

27-28: Single source of truth for MESSAGE_DELIMITER.

Same suggestion as database.py—lift this to a shared constant to keep API and frontend coordinated.


102-114: Optional: strengthen graph_id validation.

If FalkorDB has name constraints, validate against a safe pattern (e.g., [A-Za-z0-9_-]{1,200}) to avoid problematic characters.

 def _graph_name(request: Request, graph_id:str) -> str:
-    if not graph_id or not isinstance(graph_id, str):
+    if not graph_id or not isinstance(graph_id, str):
         raise HTTPException(status_code=400, detail="Invalid graph_id")
 
-    graph_id = graph_id.strip()[:200]
+    graph_id = graph_id.strip()[:200]
+    # e.g., enforce allowed chars if desired
+    # import re at top if adopting this
+    # if not re.fullmatch(r"[A-Za-z0-9_-]{1,200}", graph_id):
+    #     raise HTTPException(status_code=400, detail="graph_id contains invalid characters")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 784acf0 and c2b4581.

📒 Files selected for processing (3)
  • api/routes/database.py (2 hunks)
  • api/routes/graphs.py (8 hunks)
  • api/routes/tokens.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

All Python code should pass pylint (use make lint)

Files:

  • api/routes/tokens.py
  • api/routes/database.py
  • api/routes/graphs.py
🧬 Code graph analysis (1)
api/routes/graphs.py (1)
app/ts/modules/config.ts (1)
  • MESSAGE_DELIMITER (5-5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: deploy
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
  • GitHub Check: e2e-tests
🔇 Additional comments (5)
api/routes/tokens.py (2)

14-14: Good centralization of 401 response metadata.

Defining UNAUTHORIZED_RESPONSE once improves consistency and avoids drift.


29-30: Consistent reuse of UNAUTHORIZED_RESPONSE across endpoints.

Nice cleanup; reduces duplication and keeps docs uniform.

Also applies to: 74-75, 114-115

api/routes/database.py (1)

15-16: Adopting shared UNAUTHORIZED_RESPONSE looks good.

Keeps OpenAPI docs consistent with other routes.

Also applies to: 32-33

api/routes/graphs.py (2)

24-25: Good reuse of UNAUTHORIZED_RESPONSE.

This aligns graph routes with the standardized 401 docs.


116-117: 401 docs standardized across graph endpoints.

Nice consistency pass.

Also applies to: 138-139, 237-238, 273-274, 667-668, 831-832

@gkorland gkorland merged commit d25d91d into main Sep 4, 2025
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant