Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the result of an empty plan configurable #3042

Open
psss opened this issue Jun 24, 2024 · 7 comments
Open

Make the result of an empty plan configurable #3042

psss opened this issue Jun 24, 2024 · 7 comments
Labels
area | results Related to how tmt stores and shares results priority | should medium priority, should be included in the next release step | execute Stuff related to the execute step
Milestone

Comments

@psss
Copy link
Collaborator

psss commented Jun 24, 2024

When there is no test discovered (and executed) in any of the enabled plans which tmt run detects, it reports 3 as the exit code. This is then handled as an error when executed in Testing Farm. Some users would like to make it configurable how the empty plan scenario is handled.

The way proposed during the last discussion was to introduce a new option under the plan. It could look like this:

/plan:
    discover:
        how: fmf
    execute:
        how: tmt
    empty-plan-result: pass / warn / fail / error / skip

Possibly the new empty-plan-result key could be stored under some of the steps, perhaps execute?

/plan:
    discover:
        how: fmf
    execute:
        how: tmt
        empty-plan-result: pass / warn / fail / error / skip

The thing is that actually we don't have any plan result implemented. There are only results.yaml and based on them the appropriate exit code is reported. So should it be rather something like:

empty-plan-exit-code: 0

Another approach could be that the execute step would create a dummy result with the provided value. This would also suggest that the new option should belong under the execute step. Thoughts?

Related issue: https://issues.redhat.com/browse/TFT-2065

@psss psss added step | execute Stuff related to the execute step area | results Related to how tmt stores and shares results labels Jun 24, 2024
@psss
Copy link
Collaborator Author

psss commented Jun 24, 2024

@fhrdina1, @kkaarreell, @lukaszachy, @thrix, @happz, @Athwale, @psklenar, here's summary from our discussion. Please have a look.

@happz
Copy link
Collaborator

happz commented Jun 24, 2024

The thing is that actually we don't have any plan result implemented. There are only results.yaml and based on them the appropriate exit code is reported.

That is slightly inaccurate, I'd say: results.yaml is created per plan, so in this sense, we do have "plan result" - we don't have "plan exit code" because the eventual exit code is decided based on results collected from all plans of a run. And the "plan result" aka "plan exit code" derived from per-plan results never materializes.

This is mostly fine in TF, where tmt run is created for each discovered non-empty plan, and then "all results" are suddenly "plan results". So here a flag would work seamlessly. But then runs with multiple plans arrive on the scene. How about turning the piece of code that decided the exit code (tmt.base.Run.finish()) into something like this:

        results = [
            result
            for plan in self.plans
            for result in plan.execute.results()]
        ...
        raise SystemExit(tmt.result.results_to_exit_code(results))

# could become:
        exit_codes = [
            tmt.result.results_to_exit_code(plan.execute.results())
            for plan in self.plans]
        ...
        raise SystemExit(max(exit_codes))

And then we can change the loop and start involving the plan's flag to override that plan's evaluation of its results. This would materialize "plan result", in the sense of "an exit code a plan would produce if there were no other plans that might make the result even worse".

So should it be rather something like:

empty-plan-exit-code: 0

All in all, both approaches, a flag to turn "empty" into "success" or a flag to turn "empty" into a given exit code, seem fine to me.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

@LecrisUT
Copy link
Contributor

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

For reference, ctest uses language like ctest --no-tests={ignore,error}

@psss
Copy link
Collaborator Author

psss commented Jun 26, 2024

That is slightly inaccurate, I'd say: results.yaml is created per plan, so in this sense, we do have "plan result" - we don't have "plan exit code" because the eventual exit code is decided based on results collected from all plans of a run. And the "plan result" aka "plan exit code" derived from per-plan results never materializes.

I meant that currently we even do not define any final result value like pass, fail, warn for the plan, results.yaml is just a pile of individual test results.

And then we can change the loop and start involving the plan's flag to override that plan's evaluation of its results. This would materialize "plan result", in the sense of "an exit code a plan would produce if there were no other plans that might make the result even worse".

Yes, "counting" the final exit code for each plan makes sense. On the other hand, I'm thinking, whether it would not be more natural for user to state the expected empty plan result as pass, fail... instead of specifying the exit code which usually is not that interesting from the user point of view.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

If we provide overal result for each plan then we can do this worst-wins math, but for exit codes it's hard to say what is actually "worst". For example, how do we compare what's better, 3 (no test results) or 4 (all tests skipped)?

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

Interesting thought, storing the empty-plan-result flag in the discover phase could also make sense. But do we really need that level of granularity? Any tangible/specific use case on your mind?

@LecrisUT
Copy link
Contributor

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

Interesting thought, storing the empty-plan-result flag in the discover phase could also make sense. But do we really need that level of granularity? Any tangible/specific use case on your mind?

I can see it go either way:

  • it could help early exit if the discover phase returns empty so that prepare is not executed
  • could be a more natural flow when you consider plugins for discover, e.g.:
    /plan:
      discover:
        - how: cmake
          no_tests: pass
        - how: fmf
          path: path/to/additional_static_tests
          no_tests: fail
  • but then if there are multiple discover steps, then each item must define their parameter. Support inheritance when parent is a list and child is not a list fmf#228 might be useful for this though. Although, why not have it in both, where execute would treat all discover phase as a whole.

@happz
Copy link
Collaborator

happz commented Jun 26, 2024

On the other hand, I'm thinking, whether it would not be more natural for user to state the expected empty plan result as pass, fail... instead of specifying the exit code which usually is not that interesting from the user point of view.

+1. It would follow the approach used by result in test specification, using words instead of numbers.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

If we provide overal result for each plan then we can do this worst-wins math, but for exit codes it's hard to say what is actually "worst". For example, how do we compare what's better, 3 (no test results) or 4 (all tests skipped)?

Yep, that's the fun part :) We have 1 exit code and N plans, so eventually, we will need to reduce them somehow into a single number. Maybe both 3 and 4 have the same level of "badness", who knows...

@psss
Copy link
Collaborator Author

psss commented Aug 9, 2024

This issue has been escalated as part of TFT-2065. Not blocking the gating efforts as for them missing test coverage should not be ignored but still useful for some use cases like tier testing. Proposing as a should-have for 1.36.

@psss psss added this to the 1.36 milestone Aug 9, 2024
@psss psss added the priority | should medium priority, should be included in the next release label Aug 9, 2024
@martinhoyer martinhoyer modified the milestones: 1.36, 1.37 Sep 4, 2024
@happz happz modified the milestones: 1.37, 1.38 Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area | results Related to how tmt stores and shares results priority | should medium priority, should be included in the next release step | execute Stuff related to the execute step
Projects
None yet
Development

No branches or pull requests

4 participants