-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix PR cycle time calculation for draft PRs #636
base: main
Are you sure you want to change the base?
Conversation
- Modified cycle time calculation to start timing when PR first becomes ready for review - Added logic to detect the first transition from draft to ready state - First ready event timestamp is used instead of PR creation time when applicable - Provides more accurate metrics by excluding draft preparation time Resolves issue middlewarehq#634
WalkthroughThe changes introduce a new optional parameter, Changes
Sequence Diagram(s)sequenceDiagram
participant Handler as GithubETLHandler
participant API as GithubApiService
participant Repo as GithubRepository
participant PR as PullRequest
Handler->>Repo: Retrieve pull requests data
Handler->>PR: Process pull request
Handler->>API: Get PR timeline with github_repo and pr.number
API-->>Handler: Return PR timeline events
Handler->>Handler: Determine earliest "ready for review" event
Handler->>Handler: Call create_pr_metrics with pr_earliest_event
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (1)
58-104
: Consider adding a small comment explaining the draft PR cycle time logicWhile the implementation is correct, adding a brief comment explaining why we're using
first_open_time
instead ofpr.created_at
would make the code more maintainable for future developers who might not be familiar with draft PR workflows.# Add comment before first_ready_event declaration + # For draft PRs, cycle time should start when the PR becomes ready for review (state changes to OPEN) + # rather than when it was initially created as a draft first_ready_event = next( ( e
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
(2 hunks)
🔇 Additional comments (4)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (4)
58-65
: Good implementation of finding the first OPEN state eventThe implementation correctly uses
next()
with a generator expression to efficiently find the first event where the state is OPEN, with a proper fallback to None if no such event exists.
67-69
: Well-handled fallback logic for first_open_timeThe code appropriately sets
first_open_time
to either the timestamp of the first OPEN event (if found) or falls back to the PR creation time, which ensures backward compatibility for PRs without explicit state changes.
104-104
: Correctly updated cycle_time calculationThis change properly implements the fix by using
first_open_time
instead of the PR creation time for calculating cycle time, which accurately reflects the time a PR spent in active review.
58-104
: Verify this change with edge casesThe implementation looks correct, but it's worth verifying with some edge cases:
- PRs that were never in draft state
- PRs that have multiple open/draft transitions
- PRs created before this feature was implemented
The current implementation should handle all these cases correctly, but testing would confirm.
Please get the issue assigned to yourself by commenting on the issue, before you start working on it. This helps others not pick up tasks that are already being worked on by someone else. |
Apologies, I forgot to get the issue assigned before starting. I'll make sure to do that next time. Thanks for the reminder! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is incomplete and the logic is flawed. Also please include tests for the changes you make.
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (9)
backend/analytics_server/mhq/exapi/github.py (1)
117-119
: Add return type hint, docstring, and error handling.The new
get_pr_timeline
method works correctly but could be improved with:
- Return type hint
- Docstring explaining purpose and parameters
- Error handling for API exceptions
- def get_pr_timeline(self, github_repo:GithubRepository, pr_number:int): - issue = github_repo.get_issue(pr_number) - return issue.get_timeline() + def get_pr_timeline(self, github_repo: GithubRepository, pr_number: int) -> GithubPaginatedList: + """ + Get the timeline of events for a pull request. + + Args: + github_repo: GitHub repository object + pr_number: Pull request number + + Returns: + Timeline of events for the pull request + + Raises: + GithubException: If there's an error retrieving the timeline + """ + try: + issue = github_repo.get_issue(pr_number) + return issue.get_timeline() + except GithubException as e: + LOG.error(f"Error getting PR timeline for PR #{pr_number}: {e}") + raisebackend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (3)
21-21
: Improve the type hint forpr_earliest_event
.The type hint
Optional[object]
is very generic. A more specific type would improve code readability and enable better IDE support.- pr_earliest_event:Optional[object] + pr_earliest_event: Optional[Any] = NoneAlso consider adding type imports:
from typing import Any
56-56
: Improve parameter spacing and type hints.The method signature has inconsistent spacing and uses a generic type for
pr_earliest_event
.- def get_pr_performance(pr: PullRequest, pr_events: [PullRequestEvent],pr_earliest_event:Optional[object]): + def get_pr_performance(pr: PullRequest, pr_events: List[PullRequestEvent], pr_earliest_event: Optional[Any] = None):
90-94
: Improve indentation and handling ofpr_earliest_event.created_at
.The current indentation makes this conditional expression hard to read. Also, there's no fallback if
pr_earliest_event
exists but doesn't have acreated_at
attribute.- cycle_time = pr.state_changed_at - ( - pr_earliest_event.created_at.astimezone(timezone.utc) - if pr_earliest_event and isinstance(pr_earliest_event.created_at, datetime) - else pr.created_at - ) + first_open_time = ( + pr_earliest_event.created_at.astimezone(timezone.utc) + if pr_earliest_event and hasattr(pr_earliest_event, 'created_at') and isinstance(pr_earliest_event.created_at, datetime) + else pr.created_at + ) + cycle_time = pr.state_changed_at - first_open_timebackend/analytics_server/mhq/service/code/sync/etl_github_handler.py (5)
149-151
: Add space after comma in method call.Maintain consistent spacing in method calls.
- str(org_repo.id), github_pr, github_repo + str(org_repo.id), github_pr, github_repo
160-160
: Fix parameter spacing.The spacing after commas is inconsistent.
- self, repo_id: str, pr: GithubPullRequest,github_repo:GithubRepository + self, repo_id: str, pr: GithubPullRequest, github_repo: GithubRepository
175-176
: Improve variable spacing and variable naming.The code lacks spaces around operators and could use a more descriptive variable name.
- pr_timeline_events=self._api.get_pr_timeline(github_repo,pr.number) - pr_earliest_ready_for_review= self.get_first_ready_for_review_event(pr_timeline_events) + pr_timeline_events = self._api.get_pr_timeline(github_repo, pr.number) + pr_earliest_ready_for_review = self.get_first_ready_for_review_event(pr_timeline_events)
188-188
: Fix parameter spacing.Add space after comma for consistency.
- pr_model, pr_events_model_list, pr_commits_model_list,pr_earliest_ready_for_review + pr_model, pr_events_model_list, pr_commits_model_list, pr_earliest_ready_for_review
194-202
: Improve type annotations and add docstring.The current type hint for
events
is too generic. Also, the method would benefit from a docstring explaining its purpose.- def get_first_ready_for_review_event(events: List) -> Optional[object]: + def get_first_ready_for_review_event(events: List[Any]) -> Optional[Any]: + """ + Find the earliest 'ready_for_review' event from the timeline events. + + This method filters PR timeline events to find when a PR was first marked + as ready for review after being in draft state. + + Args: + events: List of PR timeline events + + Returns: + The earliest 'ready_for_review' event or None if no such events exist + """ ready_events = [ event for event in events if getattr(event, "event", None) == "ready_for_review" and hasattr(event, "created_at") ] if not ready_events: return None earliest_event = min(ready_events, key=lambda e: e.created_at) return earliest_eventAlso, consider importing
Any
from typing:from typing import Any
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/analytics_server/mhq/exapi/github.py
(1 hunks)backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
(5 hunks)backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
(4 hunks)
…t retrieval logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (4)
21-22
: Add default value for optional parameterThe new
pr_earliest_event
parameter is marked asOptional
but doesn't have a default value, which could cause issues if the method is called without this parameter from existing code.- pr_earliest_event:Optional[datetime] + pr_earliest_event: Optional[datetime] = None
26-26
: Fix spacing in parameter listMissing space after comma for better readability.
- pr_performance = self.get_pr_performance(pr, pr_events,pr_earliest_event) + pr_performance = self.get_pr_performance(pr, pr_events, pr_earliest_event)
56-57
: Improve parameter formatting and add default valueThe parameter list has inconsistent spacing and type annotation style. Also, similar to the
create_pr_metrics
method, the optional parameter should have a default value.- def get_pr_performance(pr: PullRequest, pr_events: [PullRequestEvent],pr_earliest_event:Optional[datetime]): + def get_pr_performance(pr: PullRequest, pr_events: List[PullRequestEvent], pr_earliest_event: Optional[datetime] = None):
91-96
: Add comments explaining the purpose of this logicIt would be helpful to add a comment explaining what
first_open_time
andpr_earliest_event
represent, especially since this is a core part of the fix for calculating cycle time for draft PRs. This helps future developers understand the intent of the code.+ # Determine when the PR was first actually open for review. + # For draft PRs, this is when they were marked as "ready for review", not when they were created. first_open_time = ( pr_earliest_event if pr_earliest_event and isinstance(pr_earliest_event, datetime) else pr.created_at ).astimezone(timezone.utc)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
(5 hunks)backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
🔇 Additional comments (1)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (1)
97-97
: LGTM: Correct cycle time calculationThe updated calculation correctly uses
first_open_time
instead ofpr.created_at
, addressing the issue with draft PRs where cycle time should be calculated from when the PR was first ready for review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (3)
1-1
: Remove unused import.The
timezone
fromdatetime
is imported but not used in the code. It's best to remove unused imports to keep the code clean.-from datetime import datetime,timedelta,timezone +from datetime import datetime,timedelta🧰 Tools
🪛 Ruff (0.8.2)
1-1:
datetime.timezone
imported but unusedRemove unused import:
datetime.timezone
(F401)
56-56
: LGTM!Method signature properly updated to include the new parameter. Consider adding a space after the comma for better readability.
-def get_pr_performance(pr: PullRequest, pr_events: [PullRequestEvent],pr_earliest_event:Optional[datetime]): +def get_pr_performance(pr: PullRequest, pr_events: [PullRequestEvent], pr_earliest_event: Optional[datetime]):
103-114
: Consider updating first_review_time calculation.The
first_review_time
calculation still usespr.created_at
and doesn't account for the newfirst_open_time
. For consistency with the cycle time calculation, consider usingfirst_open_time
here as well.first_review_time=( - (first_review.created_at - pr.created_at).total_seconds() + (first_review.created_at - first_open_time).total_seconds() if first_review else -1 ),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
(5 hunks)backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
🧰 Additional context used
🪛 Ruff (0.8.2)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
1-1: datetime.timezone
imported but unused
Remove unused import: datetime.timezone
(F401)
🔇 Additional comments (5)
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py (5)
2-2
: LGTM!Good addition of
Optional
fromtyping
to support the new parameter functionality.
21-21
: LGTM!The addition of the
pr_earliest_event
parameter enables tracking the earliest "ready for review" event for draft PRs, which is essential for accurate cycle time calculation.
26-26
: LGTM!Correctly passing the new parameter to the
get_pr_performance
method.
91-98
: LGTM! Logic handles draft PRs correctly.Good implementation of the logic to determine the
first_open_time
based on the earliest "ready for review" event or the PR creation time. This resolves the issue of cycle time calculation for draft PRs by considering when they actually became ready for review rather than when they were initially created.
183-183
: LGTM!No changes were made to the
get_rework_cycles
method, but it works correctly with the updated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/analytics_server/tests/service/code/sync/test_etl_github_handler.py (1)
370-385
: Good edge case handling for no ready events.This test properly verifies the behavior when no "ready_for_review" events are present in the timeline data, ensuring the method returns
None
as expected.Consider extracting the
MockTimelineEvent
class to reduce duplication with the previous test, though keeping it self-contained is also acceptable for test clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/analytics_server/tests/service/code/sync/test_etl_github_handler.py
(1 hunks)
🔇 Additional comments (2)
backend/analytics_server/tests/service/code/sync/test_etl_github_handler.py (2)
349-368
: Well-structured test for finding the earliest ready event.The test effectively verifies that the
get_first_ready_for_review_event
method correctly identifies the earliest "ready_for_review" event from a list of mixed timeline events, which is essential for accurately calculating PR cycle time.
387-391
: Appropriate empty list edge case handling.This test verifies that the method correctly handles an empty list of events by returning
None
, which is an important edge case to cover for production reliability.
This pull request includes several changes to enhance the processing and analysis of pull requests in the
backend/analytics_server
module. The key changes include adding a method to retrieve the pull request timeline, updating thecreate_pr_metrics
function to account for the earliest "ready for review" event, and modifying theprocess_pr
function to incorporate this new data.Enhancements to pull request processing:
backend/analytics_server/mhq/exapi/github.py
: Added a new methodget_pr_timeline
to retrieve the timeline of a pull request.Updates to metrics calculation:
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
: Updated thecreate_pr_metrics
function to include an optionalpr_earliest_event
parameter and adjusted theget_pr_performance
function to use this parameter for calculating the cycle time. [1] [2] [3]Changes to pull request processing flow:
backend/analytics_server/mhq/service/code/sync/etl_github_handler.py
: Modified theprocess_pr
function to call the newget_pr_timeline
method and retrieve the earliest "ready for review" event for use in metrics calculation. [1] [2] [3] [4]Additional imports:
backend/analytics_server/mhq/service/code/sync/etl_code_analytics.py
: Addeddatetime
andtimezone
to the imports to handle timezone-aware datetime objects.Linked Issue(s)
Resolves issue #634
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Tests