
Implement check-result: test check failure affects test result by default #3239

Open · wants to merge 6 commits into base: main

Conversation

@martinhoyer (Collaborator) commented Sep 25, 2024

Trying to address #3185

Pull Request Checklist

  • implement the feature
  • write the documentation
  • extend the test coverage
  • update the specification
  • modify the json schema
  • mention the version
  • include a release note

@martinhoyer added the status | need help, status | need tests, status | need docs, and area | results labels Sep 25, 2024
@martinhoyer martinhoyer self-assigned this Sep 25, 2024
@martinhoyer martinhoyer changed the title Have failure in check result propagate to the test result Implement check-result: test check failure affects test result by default Sep 26, 2024
@martinhoyer (Collaborator, Author) commented Sep 26, 2024

Rebased onto @happz's result-store-original-result #3147 and tried to use beakerlib for the first time. It feels weird to do all the assertgreps, but I've merely been following the existing examples.

Still a proof of concept, but at least I finally figured out that the "after-test" checks were not available without the change in execute/internal.py.

@happz (Collaborator) commented Sep 26, 2024

@martinhoyer if you change the base branch to the one from #3147, the diff should reduce a bit.

@martinhoyer (Collaborator, Author):

tmt.utils.GeneralError: Test check 'dmesg' was not found in check registry.

perplexing

@psss psss added this to the 1.38 milestone Oct 1, 2024
@martinhoyer martinhoyer removed the status | need help Extra attention is needed label Oct 1, 2024

@happz (Collaborator) commented Oct 9, 2024

> Trying to implement the check results as discussed yesterday. Started from scratch. Still don't understand specs, schemas and docs generation. Why can't we just have one place to define things?
>
> > We can. It somehow started separately, in an organic way, as the project grew, and the amount of work necessary to turn the tide and introduce a single source of truth is not trivial. The work is ongoing, and one possible answer might be that I, who also dislike the currently fractured state of things, am unable to change it all in one night.
>
> Right, but isn't that a sunk cost fallacy? I'm not saying you should do it alone in one night, but perhaps have a defined "state" to strive for and think about whether it wouldn't be worth working towards in the long run.

Yeah, that was just an example. There is a long-term goal, it just takes many small steps to get there. Just recently plugin docs began to be generated from plugin sources. That's useful on its own, but it also served as an experiment on what we would need to render plugin schemas from plugin sources. And so on: we're moving toward a much simpler state of things, with fewer sources of truth; it's just very slow.

@martinhoyer martinhoyer force-pushed the check-results branch 2 times, most recently from 8a3d12b to 1bec7f6 Compare October 11, 2024 19:28
@martinhoyer (Collaborator, Author):

Not at all sure about what's happening in execute/internal

@@ -493,15 +493,3 @@ definitions:
type: string
# yamllint disable-line rule:line-length
pattern: "^\\d{2,}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+\\+\\d{2}:\\d{2}$"
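For reference, the pattern above accepts timestamps with mandatory fractional seconds and a positive UTC offset only. A quick standalone sanity check (the sample timestamps are illustrative):

```python
import re

# Same pattern as in the schema, without the YAML escaping.
pattern = r"^\d{2,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+\+\d{2}:\d{2}$"

# Matches: fractional seconds and a positive offset are present.
print(bool(re.match(pattern, "2024-10-11T19:28:00.123456+02:00")))

# Does not match: fractional seconds are required by the pattern.
print(bool(re.match(pattern, "2024-10-11T19:28:00+02:00")))
```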

@martinhoyer (Collaborator, Author):

Does it make sense to move it to test.yaml, or is check expected to be used elsewhere in the future?


I am hoping we can move the triggering of checks to the plan, with the expectation that the check will operate on each test in the plan. But I am not sure whether that work has been green-lit yet.

@@ -30,8 +30,10 @@
{% else %}
Default: {% for default_item in actual_default %}``{{ default_item.pattern | default(default_item) }}``{% if not loop.last %}, {% endif %}{% endfor %}
{% endif %}
{% elif actual_default.__class__.__name__ == 'CheckResultInterpret' %}
@martinhoyer (Collaborator, Author):

An ugly workaround.
@happz, do you have any cleaner solution to point me to, please? I spent way more time than I'd like to admit trying to solve this.

@happz (Collaborator):

Yeah, this is not very nice :/ What fields forced you to take this road? result is the new one, but it's a string, just like how, so I don't immediately recall what the problem might be. What's logged by the script?

@happz (Collaborator):

Nevermind, I can try it out.

@martinhoyer (Collaborator, Author) commented Oct 16, 2024

It says CheckResultInterpret.RESPECT is the 'actual_default'. I've tried adding exporter=lambda result: result.value to the Check.result field, but to no avail.
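The underlying issue can be reproduced in isolation: an enum member is not a plain string, so a serializer has to export its .value explicitly. A minimal sketch (Check and to_spec here are simplified stand-ins, not the actual tmt container API):

```python
import dataclasses
import enum


class CheckResultInterpret(enum.Enum):
    RESPECT = "respect"
    XFAIL = "xfail"
    INFO = "info"


@dataclasses.dataclass
class Check:
    """Simplified stand-in for a tmt check definition."""

    how: str
    result: CheckResultInterpret = CheckResultInterpret.RESPECT

    def to_spec(self) -> dict:
        # The enum member must be exported as its plain string value,
        # otherwise consumers see 'CheckResultInterpret.RESPECT'.
        return {"how": self.how, "result": self.result.value}


print(Check(how="dmesg").to_spec())  # {'how': 'dmesg', 'result': 'respect'}
```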

@happz (Collaborator):

Yep, got it, it's an enum, I suppose it's the first one in plugins, and we can solve it once and for all :)

diff --git a/docs/scripts/generate-plugins.py b/docs/scripts/generate-plugins.py
index a37d83c1..9368f731 100755
--- a/docs/scripts/generate-plugins.py
+++ b/docs/scripts/generate-plugins.py
@@ -1,6 +1,7 @@
 #!/usr/bin/env python3
 
 import dataclasses
+import enum
 import sys
 import textwrap
 from typing import Any
@@ -106,6 +107,12 @@ def container_intrinsic_fields(container: ContainerClass) -> list[str]:
     return field_names
 
 
+def is_enum(value: Any) -> bool:
+    """ Find out whether a given value is an enum member """
+
+    return isinstance(value, enum.Enum)
+
+
 def _create_step_plugin_iterator(registry: tmt.plugins.PluginRegistry[tmt.steps.Method]):
     """ Create iterator over plugins of a given registry """
 
@@ -184,6 +191,7 @@ def main() -> None:
         STEP=step_name,
         PLUGINS=plugin_generator,
         REVIEWED_PLUGINS=REVIEWED_PLUGINS,
+        is_enum=is_enum,
         container_fields=tmt.utils.container_fields,
         container_field=tmt.utils.container_field,
         container_ignored_fields=container_ignored_fields,
diff --git a/docs/templates/plugins.rst.j2 b/docs/templates/plugins.rst.j2
index 5b99149c..5969cc4c 100644
--- a/docs/templates/plugins.rst.j2
+++ b/docs/templates/plugins.rst.j2
@@ -30,7 +30,7 @@
             {% else %}
     Default: {% for default_item in actual_default %}``{{ default_item.pattern | default(default_item) }}``{% if not loop.last %}, {% endif %}{% endfor %}
             {% endif %}
-        {% elif actual_default.__class__.__name__ == 'CheckResultInterpret' %}
+        {% elif is_enum(actual_default) %}
     Default: ``{{ actual_default.value }}``
         {% else %}
             {% set _ = LOGGER.warn("%s/%s.%s: could not render default value, '%s'" | format(STEP, plugin_id, field_name, actual_default), shift=0) %}
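The is_enum helper from the diff can be exercised standalone (the enum definition below is a simplified stand-in for tmt's CheckResultInterpret):

```python
import enum
from typing import Any


class CheckResultInterpret(enum.Enum):
    RESPECT = "respect"
    XFAIL = "xfail"
    INFO = "info"


def is_enum(value: Any) -> bool:
    """Find out whether a given value is an enum member"""
    return isinstance(value, enum.Enum)


# Any enum member is detected, so the template no longer needs to
# hard-code the 'CheckResultInterpret' class name.
print(is_enum(CheckResultInterpret.RESPECT))  # True
print(is_enum("respect"))                     # False
print(CheckResultInterpret.RESPECT.value)     # respect
```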

@martinhoyer (Collaborator, Author):

Nice, thanks.

@martinhoyer (Collaborator, Author):

@psss @happz I've marked the 'outdated' discussions as resolved, as it got quite messy with all the refactoring - sorry. I believe all the points have been incorporated though.

@martinhoyer added the ci | full test label Oct 16, 2024
@martinhoyer (Collaborator, Author):

/packit build

@martinhoyer added the status | need help label Oct 17, 2024
@psss added the priority | must label Oct 18, 2024
@@ -38,7 +38,7 @@ rlJournalStart
run "errr" "/test/error" "" 2
run "pass" "/test/xfail-fail" "fail" 0
run "fail" "/test/xfail-pass" "pass" 1
@martinhoyer (Collaborator, Author):

Are we ok with this? I see no reason to add a note when the original and actual results are the same, no?


return _result.interpret_result(invocation.test.result)
interpret_checks = {check.how: check.result for check in invocation.test.check}
@martinhoyer (Collaborator, Author):

This way doesn't really work when multiple check.how/check.name values are the same. Surely there has to be a better way to add/match a CheckResultInterpret to each Check?
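The collision is easy to demonstrate in isolation: keying the mapping by check.how silently drops all but the last entry when two checks share the same how (values illustrative):

```python
# Two checks of the same kind, with different result interpretations.
checks = [
    {"how": "dmesg", "result": "respect"},
    {"how": "dmesg", "result": "xfail"},
]

# Later keys overwrite earlier ones, so the 'respect' entry is lost.
interpret_checks = {check["how"]: check["result"] for check in checks}
print(interpret_checks)  # {'dmesg': 'xfail'}
```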

@happz (Collaborator):

I believe we spoke about a way: there is a trivial mapping between a result and its parent test; TestInvocation stores results & points to the test. We did not need to transition from a check result to a check, therefore there is no such mapping yet; we need to establish one. My proposal was to add a dedicated mapping to the TestInvocation class which would map CheckResult to Check instances, and it would be populated in _run_checks_for_test() (most likely), as that is the place where check results are born. Something like this:

diff --git a/tmt/steps/execute/__init__.py b/tmt/steps/execute/__init__.py
index 407b4ac0..7f77d179 100644
--- a/tmt/steps/execute/__init__.py
+++ b/tmt/steps/execute/__init__.py
@@ -16,6 +16,7 @@ import fmf.utils
 
 import tmt
 import tmt.base
+import tmt.checks
 import tmt.log
 import tmt.steps
 import tmt.utils
@@ -175,6 +176,9 @@ class TestInvocation:
     results: list[Result] = dataclasses.field(default_factory=list)
     check_results: list[CheckResult] = dataclasses.field(default_factory=list)
 
+    check_result_to_check: dict[CheckResult, tmt.checks.Check] = dataclasses.field(
+        default_factory=dict)
+
     check_data: dict[str, Any] = field(default_factory=dict)
 
     return_code: Optional[int] = None
@@ -954,6 +958,8 @@ class ExecutePlugin(tmt.steps.Plugin[ExecuteStepDataT, None]):
                 result.end_time = format_timestamp(timer.end_time)
                 result.duration = format_duration(timer.duration)
 
+                invocation.check_result_to_check[result] = check
+
             results += check_results
 
         return results

It's not clean, it's not nice, I'd like to make it look and feel like the test invocation, but I don't have that ready and I promise to polish it later. With this, the following should work: use this mapping to collect check result/interpretation pairs, and use the right one for the given check result when calling the interpretation method:

interpret_checks = {
    check_result: check.result for check_result, check in invocation.check_result_to_check.items()
}

...

check_result.interpret_check_result(interpret_checks[check_result])
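The proposal above can be sketched end-to-end with simplified mock classes (not the actual tmt containers): mapping by object identity keeps two same-named checks distinct where a {how: result} dict would collapse them.

```python
import dataclasses


# eq=False keeps object-identity hashing, so two checks or results with
# identical fields remain distinct dictionary keys.
@dataclasses.dataclass(eq=False)
class Check:
    """Simplified stand-in for tmt.checks.Check."""
    how: str
    result: str  # interpretation: "respect" / "xfail" / "info"


@dataclasses.dataclass(eq=False)
class CheckResult:
    """Simplified stand-in for tmt's CheckResult."""
    name: str
    outcome: str


# Two checks sharing the same 'how', as in the problematic case above.
check_a = Check(how="dmesg", result="respect")
check_b = Check(how="dmesg", result="xfail")
result_a = CheckResult(name="dmesg", outcome="fail")
result_b = CheckResult(name="dmesg", outcome="fail")

# Populated where check results are born (per the proposal,
# in _run_checks_for_test()).
check_result_to_check = {result_a: check_a, result_b: check_b}

# Each check result now resolves to its own check's interpretation.
interpret_checks = {cr: c.result for cr, c in check_result_to_check.items()}
print(interpret_checks[result_a], interpret_checks[result_b])  # respect xfail
```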

if interpret == CheckResultInterpret.INFO:
self.result = ResultOutcome.INFO

elif interpret == CheckResultInterpret.XFAIL and self.event != CheckEvent.BEFORE_TEST:
@martinhoyer (Collaborator, Author):

I was thinking about adding logic that would check both before and after, and make the xfail 'pass' only if the after check (or both) failed, but I wasn't sure if it's needed. ?

@martinhoyer (Collaborator, Author):

On second thought, this does need to be improved.
So, what are the expectations?
before fails > pass
before passes > pass
after fails > pass, or after fails > fail?
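For illustration, here is one possible reading of these expectations as a standalone sketch. The enums are simplified, not the actual tmt classes, and the before/after semantics shown are just one of the options being asked about (before-test checks keep their outcome; after-test checks are inverted):

```python
import enum


class ResultOutcome(enum.Enum):
    PASS = "pass"
    FAIL = "fail"


class CheckEvent(enum.Enum):
    BEFORE_TEST = "before-test"
    AFTER_TEST = "after-test"


def interpret_xfail(outcome: ResultOutcome, event: CheckEvent) -> ResultOutcome:
    """Expected-failure logic: invert the outcome for after-test checks
    only; before-test checks keep their original outcome."""
    if event is CheckEvent.BEFORE_TEST:
        return outcome
    if outcome is ResultOutcome.FAIL:
        return ResultOutcome.PASS
    if outcome is ResultOutcome.PASS:
        return ResultOutcome.FAIL
    return outcome


# An after-test check that failed, under xfail, counts as a pass.
print(interpret_xfail(ResultOutcome.FAIL, CheckEvent.AFTER_TEST).value)  # pass
```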
