Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR introduces MCP endpoints and token management, converts database connection and chat flows to streaming, refactors loaders to async generators with distinct-value extraction, updates auth flows (OAuth + API tokens), revises config/env variables, removes legacy loaders/examples/validators, updates frontend UI/TS for connect + tokens, and adjusts tests/CI to new flows. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as User (UI)
participant FE as Frontend (TS)
participant BE as FastAPI /database
participant L as Loader (Postgres/MySQL)
Note over U,FE: Connect Database (streaming)
U->>FE: Click "Connect Database"
FE->>BE: POST /database {url,type}
activate BE
BE-->>FE: stream: reasoning_step("Starting…")
BE->>L: loader.load(url) (async generator)
loop Progress
L-->>BE: yield (ok,msg)
BE-->>FE: stream: reasoning_step(msg)
end
alt success
L-->>BE: final (True, "Loaded")
BE-->>FE: stream: final_result(success)
else error
L-->>BE: final (False, err)
BE-->>FE: stream: error(err)
end
deactivate BE
FE->>FE: Update steps UI / reload on success
sequenceDiagram
autonumber
participant U as User
participant FE as Frontend
participant BE as FastAPI /login/*
participant OA as OAuth Provider
participant UM as User Mgmt
Note over U,FE: OAuth login issuing API token
U->>FE: Click Login (Google/GitHub)
FE->>BE: GET /login/{provider}
BE->>OA: Redirect to provider auth
OA->>BE: Callback with code
BE->>OA: Exchange code for token
OA-->>BE: Access token
BE->>OA: Fetch userinfo
OA-->>BE: User profile
BE->>UM: ensure_user_in_organizations(user)
UM-->>BE: user stored
BE->>BE: callback_handler → issue api_token
BE-->>FE: Redirect "/" + Set-Cookie(api_token)
FE->>U: Authenticated session
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
Dependency ReviewThe following issues were found:
License IssuesPipfile
Pipfile.lock
OpenSSF ScorecardScorecard details
Scanned Files
|
There was a problem hiding this comment.
Actionable comments posted: 27
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (17)
api/agents/relevancy_agent.py (1)
74-89: Async method calls a blocking client; switch to async to avoid event-loop blocking
completion(...)is synchronous. In an async FastAPI context this can block workers under load.Apply:
-from litellm import completion +from litellm import acompletion @@ - completion_result = completion( + completion_result = await acompletion( model=Config.COMPLETION_MODEL, messages=self.messages, temperature=0, )If keeping the sync client is required, wrap with
anyio.to_thread.run_sync(...)instead..github/workflows/e2e-tests.yml (1)
58-60: Wrong env var names: using FLASK_ in a FastAPI app will cause misconfiguration*Docs and code use FASTAPI_; E2E may not boot with FLASK_.
Apply:
- echo "FLASK_SECRET_KEY=test-secret-key-for-ci" >> .env - echo "FLASK_DEBUG=False" >> .env + echo "FASTAPI_SECRET_KEY=test-secret-key-for-ci" >> .env + echo "FASTAPI_DEBUG=False" >> .env + echo "APP_ENV=ci" >> .envtests/e2e/README.md (1)
1-238: Fix CI spellcheck failure in root README.mdSpellcheck reports a misspelling: “”. Replace with “HTML content” (and avoid angle brackets which look like tags) or add a dictionary whitelist if intentional.
Happy to open a follow-up PR to patch README.md to unblock CI.
api/graph.py (2)
304-310: Bug: results unpacking drops column-only results.When only column_embeddings exist, results[0] is ignored and nothing is returned.
Apply:
- # Unpack results based on what tasks we ran - tables_des = results[0] if table_embeddings else [] - tables_by_columns_des = results[1] if (table_embeddings and column_embeddings) else [] + # Unpack results based on what tasks we ran + if table_embeddings and column_embeddings: + tables_des = results[0] + tables_by_columns_des = results[1] + elif table_embeddings: + tables_des = results[0] + tables_by_columns_des = [] + elif column_embeddings: + tables_des = [] + tables_by_columns_des = results[0] + else: + tables_des = [] + tables_by_columns_des = []
257-260: Guard against empty queries_history to avoid IndexError.Calling queries_history[-1] with an empty list will crash.
Apply:
- graph = db.select_graph(graph_id) - user_query = queries_history[-1] - previous_queries = queries_history[:-1] + if not queries_history: + return [] + graph = db.select_graph(graph_id) + user_query = queries_history[-1] + previous_queries = queries_history[:-1]api/memory/graphiti_tool.py (3)
51-55: Vector index creation is not idempotent; repeated runs can fail.Guard the CREATE with an existence check or swallow the “already exists” error so factory init is safe.
Apply this diff:
- vector_size = Config.EMBEDDING_MODEL.get_vector_size() - driver = self.graphiti_client.driver - await driver.execute_query(f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}") + vector_size = Config.EMBEDDING_MODEL.get_vector_size() + driver = self.graphiti_client.driver + try: + await driver.execute_query( + f"CREATE VECTOR INDEX FOR (p:Query) ON (p.embeddings) " + f"OPTIONS {{dimension:{vector_size}, similarityFunction:'euclidean'}}" + ) + except Exception as e: + # tolerate re-creation attempts + if "already exists" not in str(e).lower(): + raise
201-246: Cypher injection risk and brittle escaping; parameterize queries.Manual quote-escaping is error-prone. Use Cypher parameters for user/sql/error strings and rely on MERGE to dedupe.
Apply this diff:
- # Check if Query node with same user_query and sql_query already exists - relationship_type = "SUCCESS" if success else "FAILED" - - # Escape quotes in the query and SQL for Cypher - escaped_query = query.replace("'", "\\'").replace('"', '\\"') - escaped_sql = sql_query.replace("'", "\\'").replace('"', '\\"') - escaped_error = error.replace("'", "\\'").replace('"', '\\"') if error else "" - embeddings = Config.EMBEDDING_MODEL.embed(escaped_query)[0] + # Check if Query node with same user_query and sql_query already exists + relationship_type = "SUCCESS" if success else "FAILED" + embeddings = Config.EMBEDDING_MODEL.embed(query)[0] @@ - check_query = f""" - MATCH (db:Entity {{uuid: "{database_node_uuid}"}}) - MATCH (db)-[r]->(q:Query) - WHERE q.user_query = "{escaped_query}" AND q.sql_query = "{escaped_sql}" - RETURN q.uuid as existing_query_uuid - LIMIT 1 - """ + check_query = """ + MATCH (db:Entity {uuid: $database_node_uuid})-[]->(q:Query) + WHERE q.user_query = $user_query AND q.sql_query = $sql_query + RETURN q.uuid as existing_query_uuid + LIMIT 1 + """ @@ - check_result = await graph_driver.execute_query(check_query) + records, _, _ = await graph_driver.execute_query( + check_query, + database_node_uuid=database_node_uuid, + user_query=query, + sql_query=sql_query, + ) @@ - if check_result[0]: # If records exist + if records: # If records exist print(f"Query with same user_query and sql_query already exists, skipping creation") return True @@ - cypher_query = f""" - MATCH (db:Entity {{uuid: "{database_node_uuid}"}}) - MERGE (q:Query {{ - user_query: "{escaped_query}", - sql_query: "{escaped_sql}", - success: {str(success).lower()}, - error: "{escaped_error}", - timestamp: timestamp(), - embeddings: vecf32($embedding) - }}) - CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q) - RETURN q.uuid as query_uuid - """ + cypher_query = f""" + MATCH (db:Entity {{uuid: $database_node_uuid}}) + MERGE (q:Query {{ + user_query: $user_query, + sql_query: $sql_query + }}) + SET q.success = $success, + q.error = coalesce($error, ""), + q.timestamp = timestamp(), + q.embeddings = vecf32($embedding) + CREATE (db)-[:{relationship_type} {{timestamp: timestamp()}}]->(q) + RETURN q.uuid as query_uuid + """ @@ - result = await graph_driver.execute_query(cypher_query, embedding=embeddings) + await graph_driver.execute_query( + cypher_query, + database_node_uuid=database_node_uuid, + user_query=query, + sql_query=sql_query, + success=bool(success), + error=error, + embedding=embeddings, + ) return TrueAlso applies to: 229-240, 205-209
26-26: Avoid blocking the event loop; use litellm’s async client.summarize_conversation is async but uses the sync completion(); switch to acompletion().
Apply this diff:
-from litellm import completion +from litellm import acompletion- response = completion( + response = await acompletion( model=Config.COMPLETION_MODEL, messages=[{"role": "user", "content": prompt}], temperature=0.1 )Also applies to: 571-575
api/loaders/postgres_loader.py (1)
453-459: Bug: awaiting an async generator returns a generator object, not (success, message)
PostgresLoader.loadis an async generator. You must consume it and use the last tuple. Current code will raise TypeError and never refresh.- # Reuse the existing load method to reload the schema - success, message = await PostgresLoader.load(prefix, db_url) - - if success: - logging.info("Graph schema refreshed successfully.") - return True, message - - logging.error("Schema refresh failed") - return False, "Failed to reload schema" + # Reuse the existing load method to reload the schema (consume async generator) + last_success, last_message = False, "No progress emitted" + async for success, message in PostgresLoader.load(prefix, db_url): + last_success, last_message = success, message + + if last_success: + logging.info("Graph schema refreshed successfully.") + return True, last_message + + logging.error("Schema refresh failed") + return False, "Failed to reload schema"tests/e2e/test_basic_functionality.py (1)
1-3: Silence Ruff S101 for pytest assertsKeep idiomatic pytest asserts without lint noise.
+ # ruff: noqa: S101 # Allow `assert` in tests """ Test basic application functionality. """api/app_factory.py (1)
105-119: Don’t swallow HTTPException; also fix Ruff ARG001 by marking request unused.Reorder the checks so HTTPException is re-raised first, and rename the unused param to _request to satisfy Ruff.
Apply:
- async def handle_oauth_error(request: Request, exc: Exception): # pylint: disable=unused-argument + async def handle_oauth_error(_request: Request, exc: Exception): """Handle OAuth-related errors gracefully""" - # Check if it's an OAuth-related error - # TODO check this scenario, pylint: disable=fixme - if "token" in str(exc).lower() or "oauth" in str(exc).lower(): - logging.warning("OAuth error occurred: %s", exc) - return RedirectResponse(url="/", status_code=302) - - # If it's an HTTPException, re-raise so FastAPI handles it properly - if isinstance(exc, HTTPException): - raise exc + # If it's an HTTPException, re-raise so FastAPI handles it properly + if isinstance(exc, HTTPException): + raise exc + # Best-effort OAuth error handling + if "token" in str(exc).lower() or "oauth" in str(exc).lower(): + logging.warning("OAuth error occurred: %s", exc) + return RedirectResponse(url="/", status_code=302)api/routes/auth.py (1)
128-135: Guard against missing email to prevent downstream auth failures.token_required encodes email; missing email will 401 after setting a cookie. Fail early or derive a stable identifier.
if user_info: user_data = { 'id': user_info.get('id') or user_info.get('sub'), 'email': user_info.get('email'), 'name': user_info.get('name'), 'picture': user_info.get('picture'), } + if not user_data['email']: + logging.warning("Google user is missing email; refusing login") + raise HTTPException(status_code=400, detail="Email is required for authentication") @@ if user_info: user_data = { 'id': user_info.get('id'), 'email': email, 'name': user_info.get('name'), 'picture': user_info.get('avatar_url'), } + if not user_data['email']: + logging.warning("GitHub user is missing email; refusing login") + raise HTTPException(status_code=400, detail="Email is required for authentication")Also applies to: 219-226
api/loaders/mysql_loader.py (3)
175-212: Ensure connection/cursor are closed on all paths; log with stack traces.Current path closes only on success. On exceptions, resources can leak; also prefer logging.exception per TRY400.
Apply:
- try: + conn = None + cursor = None + try: # Parse connection URL conn_params = MySQLLoader._parse_mysql_url(connection_url) # Connect to MySQL database - conn = pymysql.connect(**conn_params) - cursor = conn.cursor(DictCursor) + conn = pymysql.connect(**conn_params) + cursor = conn.cursor(DictCursor) @@ - # Close database connection - cursor.close() - conn.close() + # (closed in finally) @@ - except pymysql.MySQLError as e: - logging.error("MySQL connection error: %s", e) - yield False, f"MySQL connection error: {str(e)}" - except Exception as e: # pylint: disable=broad-exception-caught - logging.error("Error loading MySQL schema: %s", e) - yield False, f"Error loading MySQL schema: {str(e)}" + except pymysql.MySQLError as e: + logging.exception("MySQL connection error") + yield False, f"MySQL connection error: {e!s}" + except Exception as e: # pylint: disable=broad-exception-caught + logging.exception("Error loading MySQL schema") + yield False, f"Error loading MySQL schema: {e!s}" + finally: + try: + if cursor is not None: + cursor.close() + except Exception: # best-effort + pass + try: + if conn is not None: + conn.close() + except Exception: + pass
486-494: Bug: treating async generator as awaitable.MySQLLoader.load() yields; you must consume it to get the final status/message.
Apply:
- success, message = await MySQLLoader.load(prefix, db_url) - - if success: - logging.info("Graph schema refreshed successfully.") - return True, message - - logging.error("Schema refresh failed") - return False, "Failed to reload schema" + success = False + message = "" + async for success, message in MySQLLoader.load(prefix, db_url): + pass + if success: + logging.info("Graph schema refreshed successfully.") + return True, message + else: + logging.error("Schema refresh failed") + return False, "Failed to reload schema"
566-574: Exception paths can raise UnboundLocalError; include stack traces.cursor may be undefined when only conn exists. Also prefer logging.exception.
Apply:
- except pymysql.MySQLError as e: - # Rollback in case of error - if 'conn' in locals(): - conn.rollback() - cursor.close() - conn.close() - logging.error("MySQL query execution error: %s", e) - raise MySQLQueryError(f"MySQL query execution error: {str(e)}") from e + except pymysql.MySQLError as e: + if 'conn' in locals(): + try: + conn.rollback() + except Exception: + pass + try: + if 'cursor' in locals(): + cursor.close() + except Exception: + pass + try: + conn.close() + except Exception: + pass + logging.exception("MySQL query execution error") + raise MySQLQueryError(f"MySQL query execution error: {e!s}") from e- except Exception as e: - # Rollback in case of error - if 'conn' in locals(): - conn.rollback() - cursor.close() - conn.close() - logging.error("Error executing SQL query: %s", e) - raise MySQLQueryError(f"Error executing SQL query: {str(e)}") from e + except Exception as e: + if 'conn' in locals(): + try: + conn.rollback() + except Exception: + pass + try: + if 'cursor' in locals(): + cursor.close() + except Exception: + pass + try: + conn.close() + except Exception: + pass + logging.exception("Error executing SQL query") + raise MySQLQueryError(f"Error executing SQL query: {e!s}") from eAlso applies to: 580-581
app/public/css/menu.css (1)
208-214: Scope dropdown styles to a dedicated class
.header-buttonis used globally (e.g. the “Connect Database” button in app/templates/components/chat_header.j2), so redefining it in app/public/css/menu.css (lines 208–214) will impact all header buttons. Rename this rule to a specific container class:-.header-button { +.custom-dropdown { position: relative; display: inline-block; min-width: 140px; }api/agents/analysis_agent.py (1)
24-33: Add unit tests for parse_response missing keys
Ensureparse_responsealways returns"sql_query","ambiguities", and"missing_information"and add tests covering cases where fields are absent or JSON parsing falls back.
There was a problem hiding this comment.
Pull Request Overview
This PR addresses pylint warnings by adding appropriate disable comments throughout the codebase to suppress specific lint violations that are acceptable for test files and specific use cases.
Key changes:
- Added pylint disable comments for commonly accepted violations in test files (line-too-long, broad-exception-caught)
- Resolved unused variable warnings by replacing unused assignments with underscore variables
- Added disable comments for specific pylint rules where the code pattern is intentional
Reviewed Changes
Copilot reviewed 18 out of 20 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_mysql_loader.py | Added protected-access disable for test accessing private methods |
| tests/e2e/test_unauthenticated_flow.py | Removed unused import, added line-too-long disables, replaced unused variable |
| tests/e2e/test_file_upload.py | Added multiple pylint disables for test-specific patterns |
| tests/e2e/test_basic_functionality.py | Added line-too-long disable and replaced unused variables |
| tests/e2e/test_api_endpoints.py | Added unused-argument and line-too-long disables |
| tests/e2e/pages/home_page.py | Added broad-exception-caught and broad-exception-raised disables |
| tests/e2e/pages/base_page.py | Moved inline comment to separate line for better readability |
| tests/e2e/fixtures/test_data.py | Added consider-using-with disable for test fixture |
| tests/e2e/examples/dev_auth_bypass.py | Added imports and disable comments for development example |
| tests/conftest.py | Added multiple pylint disables for test configuration patterns |
| api/routes/tokens.py | Added line-too-long disables for logging statements |
| api/routes/graphs.py | Multiple fixes including unused variables, function signature disables |
| api/memory/graphiti_tool.py | Added disable=all for external integration |
| api/graph.py | Added too-many-locals disable for complex function |
| api/auth/user_management.py | Added too-many-positional-arguments disable |
| api/auth/oauth_handlers.py | Added broad-exception-caught disable with explanatory comment |
| api/app_factory.py | Added various disables for middleware and exception handler patterns |
| .github/wordlist.txt | Added "OpenAI's" to spell check whitelist |
Summary by CodeRabbit
New Features
Changes
Documentation
Tests
Chores