
airbyte-ci: ignore CAT results when running on community connectors #39361

Conversation

@alafanechere (Contributor) commented Jun 10, 2024

What

We decided it is acceptable for CI to pass on community connectors as long as their acceptance test results are no worse than they are on master.

Detection of "not made worse" is implemented in #39363.

To do so, we have to change the CI logic slightly to allow CAT to fail on community connectors while still emitting a 🟢 CI status.

This keeps the original CAT results available for debugging without failing CI when they do not pass.

How

  1. Introduce a consider_result_in_overall_status flag at the StepResult level (a sketch follows this list).
    It allows a Step to emit a StepResult which won't be counted in the overall success / failure computation for a pipeline.

  2. Update the HTML report to show when a step's result was excluded from the overall status.

  3. Change the definition of success at the Report level to take this new consider_result_in_overall_status flag into account.
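
A minimal sketch of how these pieces fit together, using simplified stand-in classes rather than the actual airbyte-ci ones (the field and property names follow the diffs in the review below):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StepStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    SKIPPED = "skipped"


@dataclass
class StepResult:
    status: StepStatus
    # Defaults to True; set to False only for results that should be
    # reported but not counted in the pipeline's overall status.
    consider_in_overall_status: bool = True


@dataclass
class Report:
    steps_results: List[StepResult] = field(default_factory=list)

    @property
    def failed_steps(self) -> List[StepResult]:
        return [r for r in self.steps_results if r.status is StepStatus.FAILURE]

    @property
    def considered_failed_steps(self) -> List[StepResult]:
        # Failures flagged as not considered are filtered out here.
        return [r for r in self.failed_steps if r.consider_in_overall_status]

    @property
    def successful_steps(self) -> List[StepResult]:
        return [r for r in self.steps_results if r.status is StepStatus.SUCCESS]

    @property
    def skipped_steps(self) -> List[StepResult]:
        return [r for r in self.steps_results if r.status is StepStatus.SKIPPED]

    @property
    def success(self) -> bool:
        # Only "considered" failures can fail the pipeline.
        return len(self.considered_failed_steps) == 0 and (
            len(self.skipped_steps) > 0 or len(self.successful_steps) > 0
        )
```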

vercel bot commented Jun 10, 2024

The latest updates on your projects.

1 Ignored Deployment

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| airbyte-docs | ⬜️ Ignored | Visit Preview | Jun 11, 2024 1:41pm |

@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 557a7c3 to 1db035d June 10, 2024 13:00
@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 1db035d to bc78187 June 10, 2024 13:05
@alafanechere marked this pull request as ready for review June 10, 2024 13:21
@alafanechere requested a review from a team as a code owner June 10, 2024 13:21
@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from bc78187 to 6a006a5 June 10, 2024 13:21
@alafanechere changed the base branch from augustin/06-07-airbyte-ci_make_uploaded_artifacts_path_deterministic to augustin/06-07-airbyte-ci_make_AcceptanceTest_produce_jsonl_reports June 10, 2024 13:21
@alafanechere force-pushed the augustin/06-07-airbyte-ci_make_AcceptanceTest_produce_jsonl_reports branch from f3eadca to 3d1653c June 10, 2024 13:52
@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 6a006a5 to 51799c8 June 10, 2024 13:52
@alafanechere mentioned this pull request Jun 10, 2024
@natikgadzhi (Contributor) left a comment


I think I understand what's going on now! To approve this, I'd like one more engineer to take a look.

My ask is about variable naming clarity. It feels to me that the intent behind consider_result_in_overall_status is basically is_hard_failure.

  • I.e. if it's a hard fail (consider in overall result == true), fail the pipeline if this step failed.
  • If it's not a hard fail, then only fail the pipeline if the corresponding check on master does not fail.

If my understanding of the logic is correct, what would be a good name for this? Perhaps it should be soft_result or fail_contingent_on_master?

```python
@@ -326,13 +326,18 @@ async def get_step_result(self, container: Container) -> StepResult:
            content=container.file(self.REPORT_LOG_PATH),
            to_upload=True,
        )
        # When the current connector is not certified we want to report the acceptance test results
        # but not consider them in the overall status of the test pipeline
        consider_result_in_overall_status = (
            self.context.connector.metadata.get("supportLevel") == "certified"
            or self.context.ci_context == CIContext.MASTER
        )
```
Contributor

@bnchrch this might change if we go for a different gate

Contributor

Can we pull this out into a helper method so it's easier to find?
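
One possible shape for such a helper, assuming it lives on the step class that runs the acceptance tests (hypothetical; the merged code may differ):

```python
# Hypothetical extraction of the gate above into a named method, so the
# condition is easier to find than inline in get_step_result.
def _should_consider_result_in_overall_status(self) -> bool:
    return (
        self.context.connector.metadata.get("supportLevel") == "certified"
        or self.context.ci_context == CIContext.MASTER
    )
```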

```python
@@ -60,6 +60,10 @@ def json_report_remote_storage_key(self) -> str:
    def failed_steps(self) -> List[StepResult]:
        return [step_result for step_result in self.steps_results if step_result.status is StepStatus.FAILURE]

    @property
    def considered_failed_steps(self) -> List[StepResult]:
        return [step_result for step_result in self.failed_steps if step_result.consider_in_overall_status]
```
Contributor

Doesn't this just return either the full list of all failed steps, or none at all? consider_in_overall_status is just a bool that stays the same for all steps, no?

Contributor

It's a StepResult property with a default value of True, which we've only overridden for the AcceptanceTests StepResult. So failed unit tests or QA checks, for example, are still considered.
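
For illustration, the override point might look like this where the acceptance test step builds its result (a sketch using the names from the diffs above; the real StepResult carries more fields):

```python
# Only the acceptance test step passes the flag explicitly. Every other
# Step constructs its StepResult without it, so the default of True
# applies and failed unit tests or QA checks still fail the pipeline.
return StepResult(
    step=self,
    status=step_status,  # determined earlier in get_step_result
    consider_in_overall_status=consider_result_in_overall_status,
)
```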

@octavia-squidington-iv requested a review from a team June 10, 2024 17:51

```jinja
@@ -162,7 +162,11 @@ function copyToClipBoard(htmlElement) {
{% if step_result.status == StepStatus.SUCCESS %}
<label for="{{ step_result.step.title }}" class="lbl-toggle success">{{ step_result.step.title }} | {{ format_duration(step_result.step.run_duration) }}</label>
{% elif step_result.status == StepStatus.FAILURE %}
{% if not step_result.consider_in_overall_status %}
<label for="{{ step_result.step.title }}" class="lbl-toggle failure">{{ step_result.step.title }} | Ignored | {{ format_duration(step_result.step.run_duration) }}</label>
```
Contributor

Suggested change:

```diff
- <label for="{{ step_result.step.title }}" class="lbl-toggle failure">{{ step_result.step.title }} | Ignored | {{ format_duration(step_result.step.run_duration) }}</label>
+ <label for="{{ step_result.step.title }}" class="lbl-toggle failure">{{ step_result.step.title }} | Ignored (Failed) | {{ format_duration(step_result.step.run_duration) }}</label>
```

or something to indicate that we won't ignore it if it succeeds (and to point out that it did fail although it was ignored) would be helpful, I think.

```python
@@ -70,7 +74,7 @@ def skipped_steps(self) -> List[StepResult]:

    @property
    def success(self) -> bool:
-        return len(self.failed_steps) == 0 and (len(self.skipped_steps) > 0 or len(self.successful_steps) > 0)
+        return len(self.considered_failed_steps) == 0 and (len(self.skipped_steps) > 0 or len(self.successful_steps) > 0)
```
Contributor

Edge case: if we only return failed and ignored steps, this doesn't return true (because no steps were successful/skipped). I don't think we need to worry about this too much, but we're definitely getting out of our enum groups and into more complicated stuff.

@erohmensing (Contributor) commented Jun 10, 2024

> My ask is about variable naming clarity. It feels to me that the intent behind consider_result_in_overall_status is basically is_hard_failure.
>
> I.e. if it's a hard fail (consider in overall result == true), fail the pipeline if this step failed.
> If it's not a hard fail, then only fail the pipeline if the corresponding check on master does not fail.

This is interesting - we could even pull this back into our enums, instead of creating a second option, by creating StepStatus.SOFT_FAILURE and StepStatus.HARD_FAILURE, for example.

Contributor Author

I like this new enum strategy, but it would require a significant refactor to change the failure detection everywhere it's used. I suggest doing it if we have multiple "soft failure" implementations coming up.
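
For reference, the enum alternative floated above might look roughly like this (a sketch of the idea only, not code from the PR; adopting it would mean auditing every place that compares against StepStatus.FAILURE, which is the refactor cost mentioned):

```python
from enum import Enum


class StepStatus(Enum):
    SUCCESS = "success"
    # Reported in the HTML report but ignored by Report.success.
    SOFT_FAILURE = "soft_failure"
    # Counts toward the overall status and fails the pipeline.
    HARD_FAILURE = "hard_failure"
    SKIPPED = "skipped"
```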

Contributor Author

> Edge case: if we only return failed and ignored steps, this doesn't return true (because no steps were successful/skipped). I don't think we need to worry about this too much, but we're definitely getting out of our enum groups and into more complicated stuff.

I remember that I originally set this to return not self.failed_steps (which we could transform to return not self.considered_failed_steps). I added the and clause to fix another edge case - I should have added a comment to remember which one!
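
To make the edge case concrete, using the sketch classes from the PR description above (illustrative only):

```python
# A report whose only step failed but was ignored: considered_failed_steps
# is empty, yet there are no successful or skipped steps either, so the
# "and" clause makes success evaluate to False rather than True.
report = Report(
    steps_results=[
        StepResult(status=StepStatus.FAILURE, consider_in_overall_status=False)
    ]
)
assert report.considered_failed_steps == []
assert report.success is False
```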

@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch 3 times, most recently from ff30368 to 8d9929b June 11, 2024 12:37
@alafanechere (Contributor, Author) commented Jun 11, 2024

Merge activity

  • Jun 11, 9:00 AM EDT: @alafanechere started a stack merge that includes this pull request via Graphite.
  • Jun 11, 9:41 AM EDT: Graphite rebased this pull request as part of a merge.
  • Jun 11, 10:02 AM EDT: @alafanechere merged this pull request with Graphite.

@alafanechere changed the base branch from augustin/06-07-airbyte-ci_make_AcceptanceTest_produce_jsonl_reports to graphite-base/39361 June 11, 2024 13:18
@alafanechere changed the base branch from graphite-base/39361 to master June 11, 2024 13:39
@alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 8d9929b to c9c37bb June 11, 2024 13:41
@alafanechere merged commit 721ca46 into master Jun 11, 2024
29 checks passed
@alafanechere deleted the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch June 11, 2024 14:02