Enhance run information retrieval and trigger graph functionality#340
Enhance run information retrieval and trigger graph functionality#340NiveditJain merged 5 commits intoexospherehost:mainfrom
Conversation
- Updated `get_run_info` to calculate total run counts more efficiently by storing the result in a variable, improving readability and performance. - Modified the status assignment in `get_run_info` to return `RunStatusEnum.FAILED` when no runs are found, enhancing error handling. - Refactored `trigger_graph` to only insert new stores if there are any, preventing unnecessary database operations and improving efficiency.
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Caution Review failedThe pull request is closed. 📝 WalkthroughSummary by CodeRabbit
WalkthroughReplaced per-run counting with a bulk PyMongo aggregation in get_runs, removed two helper functions and build RunListItem entries from aggregation (with new status logic and early-empty return), added a guard to skip Store.insert_many when no items, and added a non-unique index on (run_id, status) in the State model. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant Controller as GetRuns Controller
participant DB as MongoDB (States)
participant Lookup as Local lookup_table
Client->>Controller: Request runs for namespace (page/size)
Controller->>DB: Query runs list (namespace) -> runs + total count
DB-->>Controller: runs list
alt no runs
Controller-->>Client: Return empty RunsResponse
else runs exist
Controller->>DB: Aggregation across run_id -> counts (total, success, pending, errored, retried)
DB-->>Controller: Aggregated rows
Controller->>Lookup: Build run_id -> Run (graph_name, created_at)
loop For each aggregation row
Controller->>Controller: Determine status (PENDING if pending>0 else FAILED if errored>0 else SUCCESS)
Controller-->>Client: Add RunListItem(run_id, counts, status, created_at)
end
alt runs missing from aggregation
Controller->>Controller: Append RunListItem for missing runs (zero counts, status=FAILED)
end
Controller-->>Client: Return RunsResponse(sorted by created_at desc, total)
end
sequenceDiagram
autonumber
participant Orchestrator as Trigger Graph
participant Store as Store Model / DB
Orchestrator->>Orchestrator: Build new_stores list
alt new_stores not empty
Orchestrator->>Store: insert_many(new_stores)
Store-->>Orchestrator: insert result
else new_stores empty
note right of Orchestrator: Skip DB insert
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (2)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Summary of Changes
Hello @NiveditJain, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request focuses on enhancing the efficiency and robustness of the state management system. It optimizes data retrieval for run information and refines database interactions during graph triggering, leading to more performant and reliable operations.
Highlights
- Run Information Retrieval Optimization: The
get_run_infofunction now calculates the total run count once and stores it in a variable, improving efficiency by reducing redundant database queries and enhancing code readability. - Improved Run Status Handling: The
get_run_infofunction has been updated to correctly assignRunStatusEnum.FAILEDwhen no runs are found, providing better error handling for empty run states. - Efficient Graph Triggering: The
trigger_graphfunction now conditionally inserts new stores into the database only if thenew_storeslist is not empty, preventing unnecessary database operations and improving performance.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Code Review
This pull request introduces some good efficiency improvements. The change to get_run_info reduces one database call, and the change in trigger_graph avoids a database call for empty stores. My review focuses on a significant performance optimization in get_run_info by using a database aggregation query to fetch all run statistics in a single call, which will resolve a potential N+1 query problem. I also suggest a minor stylistic improvement in trigger_graph to follow Python best practices.
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
state-manager/app/controller/get_runs.py(1 hunks)state-manager/app/controller/trigger_graph.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
state-manager/app/controller/get_runs.py (4)
state-manager/app/models/db/state.py (1)
State(13-97)dashboard/src/types/state-manager.ts (1)
RunListItem(192-202)state-manager/app/models/run_models.py (2)
RunListItem(14-24)RunStatusEnum(9-12)state-manager/app/models/state_status_enum.py (1)
StateStatusEnum(4-20)
state-manager/app/controller/trigger_graph.py (2)
state-manager/app/models/db/store.py (1)
Store(5-36)state-manager/app/models/db/state.py (1)
insert_many(55-61)
🔇 Additional comments (1)
state-manager/app/controller/get_runs.py (1)
31-32: Behavior change: zero-state runs now surface as FAILED — confirm product intent.If a Run exists without any State rows (e.g., partial failures, legacy data), the UI will show FAILED. Verify this is desired; otherwise consider PENDING or a distinct “EMPTY” state.
- Updated the `get_runs` function to utilize MongoDB aggregation for improved efficiency in retrieving run statistics. - Introduced a lookup table to map run IDs to their corresponding run objects, streamlining data processing. - Enhanced error handling by returning an empty response when no runs are found, ensuring clarity in API responses. - Refactored the response structure to include detailed run statistics, improving the overall data returned to the client.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
state-manager/app/controller/get_runs.py (2)
1-2: Remove unused imports to fix Ruff errors.These imports are no longer used after the refactoring and are causing pipeline failures.
-import asyncio -from beanie.operators import In, NotIn
136-138: Remove obsoleteget_run_statusreferences in tests
In state-manager/tests/unit/controller/test_get_runs.py, delete theget_run_statusimport and all its usages to match its removal from app/controller/get_runs.py.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
state-manager/app/controller/get_runs.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
state-manager/app/controller/get_runs.py (5)
dashboard/src/types/state-manager.ts (2)
RunsResponse(204-210)RunListItem(192-202)state-manager/app/models/run_models.py (3)
RunsResponse(26-32)RunListItem(14-24)RunStatusEnum(9-12)state-manager/app/models/db/state.py (1)
State(13-104)state-manager/app/models/state_status_enum.py (1)
StateStatusEnum(4-20)state-manager/app/models/db/run.py (1)
Run(7-25)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/controller/get_runs.py
[error] 1-1: ImportError: cannot import name 'get_run_status' from 'app.controller.get_runs'. The tests import get_run_status in tests/unit/controller/test_get_runs.py; ensure the function is defined or exported in get_runs.py and the import path is correct. Failing command: 'pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml'.
🪛 GitHub Actions: Ruff check on changed files only
state-manager/app/controller/get_runs.py
[error] 1-1: Ruff: F401 - 'asyncio' imported but unused.
[error] 2-2: Ruff: F401 - 'beanie.operators.In' imported but unused.
[error] 2-2: Ruff: F401 - 'beanie.operators.NotIn' imported but unused.
🔇 Additional comments (3)
state-manager/app/controller/get_runs.py (3)
33-87: Excellent optimization using bulk aggregation!The refactoring from multiple per-run queries to a single bulk aggregation significantly reduces database round-trips from O(n) to O(1), which will dramatically improve performance for pages with many runs.
105-105: Verify the status logic change for empty runs.The status logic has been changed from the original implementation. The new logic sets status to
PENDINGif there are pending states, otherwiseFAILEDif there are errors, otherwiseSUCCESS. This differs from the original which returnedFAILEDfor empty runs (total_count=0). Please confirm this behavior change is intentional, especially for runs with no states.The original implementation had:
FAILEDwhentotal_count == 0 or errored_count > 0SUCCESSwhenpending_count == 0PENDINGotherwiseYour implementation has:
PENDINGwhenpending_count > 0FAILEDwhenerrored_count > 0(andpending_count == 0)SUCCESSotherwiseThis means runs with both pending and errored states will now show as
PENDINGinstead ofFAILED.
111-126: Good handling of runs without states.The logic correctly handles runs that exist in the namespace but have no associated states, marking them as
FAILEDwith zero counts. This maintains data integrity and provides a complete view of all runs.
- Updated the `get_runs` function to utilize MongoDB aggregation for enhanced performance in retrieving run statistics. - Modified the total count calculation to directly query the database, ensuring accurate results when no runs are found. - Refactored test cases to include new scenarios for handling empty results and verifying aggregation pipeline structure. - Improved the overall clarity and organization of the test suite, ensuring comprehensive coverage of run status calculations and edge cases.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
get_run_infoto calculate total run counts more efficiently by storing the result in a variable, improving readability and performance.get_run_infoto returnRunStatusEnum.FAILEDwhen no runs are found, enhancing error handling.trigger_graphto only insert new stores if there are any, preventing unnecessary database operations and improving efficiency.