Skip to content

[lit] show retry attempts #142413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 6, 2025
Merged

[lit] show retry attempts #142413

merged 8 commits into from
Jun 6, 2025

Conversation

kwk
Copy link
Contributor

@kwk kwk commented Jun 2, 2025

If a test took more than one attempt to complete, show the number of attempts and the maximum allowed attempts as 2 of 4 attempts inside the <progress info> (see TEST RUN OUTPUT FORMAT).

NOTE: Additionally this is a fixup for #141851 where the tests were not quite right. max-retries-per-test/allow-retries-test_retry_attempts/test.py was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.

@llvmbot
Copy link
Member

llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-testing-tools

Author: Konrad Kleine (kwk)

Changes

This introduces --show-attempts-count output option to lit.

It shows the number of attempts needed for a test to pass. It also shows the maximum number of attempts that were allowed for this test. This option is useful when you want to understand how many attempts took for a flaky test (FLAKYPASS) to pass.

This is how the test output looks like without --show-attempts-count:

PASS: your-test-suite :: your-first-test.py (1 of 3)
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2)

This is the output with --show-attempts-count:

PASS: your-test-suite :: your-first-test.py (1 of 2)
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [attempts=3,max_allowed_attempts=4]

In this case your-second-test.py was executed three times and it succeeded after the third time (attempts=3). Technically another run would have been possible (max_allowed_attempts=4).

We will only append the extra information when a test was allowed more than one attempt to succeed (i.e. see :option:--max-retries-per-test).

NOTE: Additionally this is a fixup for #141851 where the tests were not quite right. max-retries-per-test/allow-retries-test_retry_attempts/test.py was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.


Full diff: https://github.com/llvm/llvm-project/pull/142413.diff

9 Files Affected:

  • (modified) llvm/docs/CommandGuide/lit.rst (+28)
  • (modified) llvm/utils/lit/lit/Test.py (+5-1)
  • (modified) llvm/utils/lit/lit/TestRunner.py (+1-1)
  • (modified) llvm/utils/lit/lit/cl_arguments.py (+6)
  • (modified) llvm/utils/lit/lit/display.py (+15-2)
  • (modified) llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py (+1-1)
  • (modified) llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py (+1-1)
  • (modified) llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg (+1-1)
  • (modified) llvm/utils/lit/tests/allow-retries.py (+16-7)
diff --git a/llvm/docs/CommandGuide/lit.rst b/llvm/docs/CommandGuide/lit.rst
index 86879f870e06e..73219cb7bfe98 100644
--- a/llvm/docs/CommandGuide/lit.rst
+++ b/llvm/docs/CommandGuide/lit.rst
@@ -144,6 +144,34 @@ OUTPUT OPTIONS
 
  Show expectedly failed tests.
 
+.. option:: --show-attempts-count
+
+ Show the number of attempts needed for a test to pass. It also shows the
+ maximum number of attempts that were allowed for this test. This option is
+ useful when you want to understand how many attempts took for a flaky test
+ (``FLAKYPASS``) to pass.
+
+ This is how the test output looks like *without* :option:`--show-attempts-count`
+
+ .. code-block:: none
+
+  PASS: your-test-suite :: your-first-test.py (1 of 3)
+  FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2)
+
+ This is the output *with* :option:`--show-attempts-count`
+
+ .. code-block:: none
+
+  PASS: your-test-suite :: your-first-test.py (1 of 2)
+  FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [attempts=3,max_allowed_attempts=4]
+
+ In this case ``your-second-test.py`` was executed three times and it succeeded
+ after the third time (``attempts=3``). Technically another run would have been
+ possible (``max_allowed_attempts=4``).
+
+ We will only append the extra information when a test was allowed more than one
+ attempt to succeed (i.e. see :option:`--max-retries-per-test`).
+
 .. _execution-options:
 
 EXECUTION OPTIONS
diff --git a/llvm/utils/lit/lit/Test.py b/llvm/utils/lit/lit/Test.py
index 051062706f856..15097f41eb4ef 100644
--- a/llvm/utils/lit/lit/Test.py
+++ b/llvm/utils/lit/lit/Test.py
@@ -151,7 +151,7 @@ def toMetricValue(value):
 class Result(object):
     """Wrapper for the results of executing an individual test."""
 
-    def __init__(self, code, output="", elapsed=None):
+    def __init__(self, code, output="", elapsed=None, attempts=None, max_allowed_attempts=None):
         # The result code.
         self.code = code
         # The test output.
@@ -164,6 +164,10 @@ def __init__(self, code, output="", elapsed=None):
         self.metrics = {}
         # The micro-test results reported by this test.
         self.microResults = {}
+        # How often was the test run?
+        self.attempts = attempts
+        # How many attempts were allowed for this test
+        self.max_allowed_attempts = max_allowed_attempts
 
     def addMetric(self, name, value):
         """
diff --git a/llvm/utils/lit/lit/TestRunner.py b/llvm/utils/lit/lit/TestRunner.py
index 16e9c7fbf45c5..8eb79a7470307 100644
--- a/llvm/utils/lit/lit/TestRunner.py
+++ b/llvm/utils/lit/lit/TestRunner.py
@@ -2292,7 +2292,7 @@ def runOnce(
     if err:
         output += """Command Output (stderr):\n--\n%s\n--\n""" % (err,)
 
-    return lit.Test.Result(status, output)
+    return lit.Test.Result(status, output, attempts=i+1, max_allowed_attempts=attempts)
 
 
 def executeShTest(
diff --git a/llvm/utils/lit/lit/cl_arguments.py b/llvm/utils/lit/lit/cl_arguments.py
index 3292554ab5ff7..6d98bb1c5ce56 100644
--- a/llvm/utils/lit/lit/cl_arguments.py
+++ b/llvm/utils/lit/lit/cl_arguments.py
@@ -104,6 +104,12 @@ def parse_args():
         help="Do not use curses based progress bar",
         action="store_false",
     )
+    format_group.add_argument(
+        "--show-attempts-count",
+        dest="showAttemptsCount",
+        help="Show number of attempts and maximum attempts for flaky tests.",
+        action="store_true",
+    )
 
     # Note: this does not generate flags for user-defined result codes.
     success_codes = [c for c in lit.Test.ResultCode.all_codes() if not c.isFailure]
diff --git a/llvm/utils/lit/lit/display.py b/llvm/utils/lit/lit/display.py
index 7de5a298d2302..084ff15fdd21f 100644
--- a/llvm/utils/lit/lit/display.py
+++ b/llvm/utils/lit/lit/display.py
@@ -117,10 +117,23 @@ def clear(self, interrupted):
     def print_result(self, test):
         # Show the test result line.
         test_name = test.getFullName()
+
+        extra_info = ""
+        extra_attrs = []
+
+        if self.opts.showAttemptsCount:
+            if test.result.max_allowed_attempts is not None and test.result.max_allowed_attempts > 1:
+                extra_attrs.append(f"attempts={test.result.attempts}")
+                extra_attrs.append(f"max_allowed_attempts={test.result.max_allowed_attempts}")
+
+        if len(extra_attrs) > 0:
+            extra_info = " [" + ",".join(extra_attrs) + "]"
+
         print(
-            "%s: %s (%d of %d)"
-            % (test.result.code.name, test_name, self.completed, self.num_tests)
+            "%s: %s (%d of %d)%s"
+            % (test.result.code.name, test_name, self.completed, self.num_tests, extra_info)
         )
+        
 
         # Show the test failure output, if requested.
         if (test.isFailure() and self.opts.showOutput) or self.opts.showAllOutput:
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
index f2333a7de455a..4227b89a1b452 100644
--- a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
@@ -1,4 +1,4 @@
-# ALLOW_RETRIES: 3
+# ALLOW_RETRIES: 8
 # RUN: "%python" "%s" "%counter"
 
 import sys
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
index f2333a7de455a..cf3c5f849d1a3 100644
--- a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
@@ -1,4 +1,4 @@
-# ALLOW_RETRIES: 3
+# ALLOW_RETRIES: 10
 # RUN: "%python" "%s" "%counter"
 
 import sys
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
index 2279d293849a8..cf15d1f442afa 100644
--- a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
@@ -9,4 +9,4 @@ config.test_exec_root = None
 config.substitutions.append(("%python", lit_config.params.get("python", "")))
 config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
 
-config.test_retry_attempts = 3
\ No newline at end of file
+config.test_retry_attempts = 9
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/allow-retries.py b/llvm/utils/lit/tests/allow-retries.py
index ba95d34a09a4c..847dd08d0e5e5 100644
--- a/llvm/utils/lit/tests/allow-retries.py
+++ b/llvm/utils/lit/tests/allow-retries.py
@@ -72,49 +72,58 @@
 #     CHECK-TEST7: Passed With Retry: 1
 
 # This test only passes on the 4th try. Here we check that a test can be re-run when:
-#  * The "--max-retries-per-test" is specified high enough (3).
+#  * The "--max-retries-per-test" is specified high enough (7).
 #  * No ALLOW_RETRIES keyword is used in the test script.
 #  * No config.test_retry_attempts is adjusted in the test suite config file.
 # RUN: rm -f %t.counter
 # RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py \
-# RUN:   --max-retries-per-test=3 \
+# RUN:   --max-retries-per-test=7 \
+# RUN:   --show-attempts-count \
 # RUN:   -Dcounter=%t.counter \
 # RUN:   -Dpython=%{python} \
 # RUN: | FileCheck --check-prefix=CHECK-TEST8 %s
+# CHECK-TEST8: FLAKYPASS: no-allow-retries-no-test_retry_attempts :: test.py (1 of 1) [attempts=4,max_allowed_attempts=8]
 # CHECK-TEST8: Passed With Retry: 1
 
 # This test only passes on the 4th try. Here we check that a test can be re-run when:
 #  * The "--max-retries-per-test" is specified too low (2).
-#  * ALLOW_RETRIES is specified high enough (3)
+#  * ALLOW_RETRIES is specified high enough (8)
 #  * No config.test_retry_attempts is adjusted in the test suite config file.
 # RUN: rm -f %t.counter
 # RUN: %{lit} %{inputs}/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py \
 # RUN:   --max-retries-per-test=2 \
+# RUN:   --show-attempts-count \
 # RUN:   -Dcounter=%t.counter \
 # RUN:   -Dpython=%{python} \
 # RUN: | FileCheck --check-prefix=CHECK-TEST9 %s
+# CHECK-TEST9: FLAKYPASS: allow-retries-no-test_retry_attempts :: test.py (1 of 1) [attempts=4,max_allowed_attempts=9]
 # CHECK-TEST9: Passed With Retry: 1
 
 # This test only passes on the 4th try. Here we check that a test can be re-run when:
 #  * The "--max-retries-per-test" is specified too low (2).
 #  * No ALLOW_RETRIES keyword is used in the test script.
-#  * config.test_retry_attempts is set high enough (3).
+#  * config.test_retry_attempts is set high enough (9).
 # RUN: rm -f %t.counter
 # RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py \
 # RUN:   --max-retries-per-test=2 \
+# RUN:   --show-attempts-count \
 # RUN:   -Dcounter=%t.counter \
 # RUN:   -Dpython=%{python} \
 # RUN: | FileCheck --check-prefix=CHECK-TEST10 %s
+# CHECK-TEST10: FLAKYPASS: no-allow-retries-test_retry_attempts :: test.py (1 of 1) [attempts=4,max_allowed_attempts=10]
 # CHECK-TEST10: Passed With Retry: 1
 
 # This test only passes on the 4th try. Here we check that a test can be re-run when:
 #  * The "--max-retries-per-test" is specified too low (1).
-#  * ALLOW_RETRIES keyword set high enough (3).
-#  * config.test_retry_attempts is set too low enough (2).
+#  * ALLOW_RETRIES keyword is set high enough (10)
+#  * config.test_retry_attempts is set too low (2).
 # RUN: rm -f %t.counter
-# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py \
+# RUN: %{lit} %{inputs}/max-retries-per-test/allow-retries-test_retry_attempts/test.py \
 # RUN:   --max-retries-per-test=1 \
+# RUN:   --show-attempts-count \
 # RUN:   -Dcounter=%t.counter \
 # RUN:   -Dpython=%{python} \
 # RUN: | FileCheck --check-prefix=CHECK-TEST11 %s
+# CHECK-TEST11: FLAKYPASS: allow-retries-test_retry_attempts :: test.py (1 of 1) [attempts=4,max_allowed_attempts=11]
 # CHECK-TEST11: Passed With Retry: 1
+

Copy link

github-actions bot commented Jun 2, 2025

✅ With the latest revision this PR passed the Python code formatter.

This introduces `--show-attempts-count` output option to `lit`.

It shows the number of attempts needed for a test to pass. It also shows
the maximum number of attempts that were allowed for this test. This option
is useful when you want to understand how many attempts took for a flaky
test (``FLAKYPASS``) to pass.

This is how the test output looks like *without* `--show-attempts-count`:

```
PASS: your-test-suite :: your-first-test.py (1 of 3)
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2)
```

This is the output *with* `--show-attempts-count`:

```
PASS: your-test-suite :: your-first-test.py (1 of 2)
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [attempts=3,max_allowed_attempts=4]
```

In this case `your-second-test.py` was executed three times and it
succeeded after the third time (`attempts=3`). Technically another run would have
been possible (`max_allowed_attempts=4`).

We will only append the extra information when a test was allowed more
than one attempt to succeed (i.e. see :option:`--max-retries-per-test`).

NOTE: Additionally this is a fixup for llvm#141851 where the tests were not
quite right. `max-retries-per-test/allow-retries-test_retry_attempts/test.py`
was added but never used there. Now we're calling it. To correlate better between
the test output and the test script I've used higher numbers of max
allowed retries.
@kwk kwk force-pushed the lit-show-attempts branch from 2984d3a to ccffc2e Compare June 2, 2025 16:09
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need an option for this? Is there some downside for always showing it? Esp. if you make the output a bit more compact (like 1 of 4 attempts).

@kwk
Copy link
Contributor Author

kwk commented Jun 4, 2025

Why do we need an option for this? Is there some downside for always showing it? Esp. if you make the output a bit more compact (like 1 of 4 attempts).

I figured that one only wants to see it for tests that are allowed to be re-run. We simply don't need the information for other tests.

The format suggestion that you've made is good but I wouldn't allow us to extend the information (should we need this in the past). That's the reasoning behind the [key=value,key2=value2] notation. But I guess you already figured that out.

Can you be really specific for how you want to see it? I make a proposal here and you can accept it if you want:

PASS: your-test-suite :: your-first-test.py (1 of 2) [1 of 1 attempts]
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [3 of 4 attempts]

This would add the information to all tests (even if they're not allowed to be re-run). And it uses your shorter format. Want me to implement it this way? I have no strong opinion other than potential extensibility.

@nikic
Copy link
Contributor

nikic commented Jun 4, 2025

@kwk My expectation would be that this is only shown for tests that allow multiple attempts, and where more than one attempt is made. (Like you show in the PR description -- just without the need to also enable an extra option.)

@kwk
Copy link
Contributor Author

kwk commented Jun 4, 2025

@kwk My expectation would be that this is only shown for tests that allow multiple attempts, and where more than one attempt is made. (Like you show in the PR description -- just without the need to also enable an extra option.)

Let me sum it up:

  • Dump the option and make it the default behaviour
  • Show number of attempts for all tests that were allowed multiple tries AND used more than one
  • Fix the format to:
PASS: your-test-suite :: your-first-test.py (1 of 3)
FLAKYPASS: your-test-suite :: your-second-test.py (2 of 3) [3 of 4 attempts]
PASS: your-test-suite :: your-third-test.py (3 of 3)

Where

  • your-first-test.py was only allowed one attempts,
  • your-second-test.py was only allowed 4 attempts and used 3,
  • your-third-test.py was only allowed i.e. 10 attempts but succeeded on first attempt (-> no output)

Correct?

@kwk
Copy link
Contributor Author

kwk commented Jun 4, 2025

@nikic I've updated the PR. I hope this is what you had in mind?

@kwk kwk changed the title [lit] introduce --show-attempts-count [lit] show retry attempts Jun 5, 2025
@kwk kwk requested a review from nikic June 5, 2025 14:10
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

extra_info = ""
if (
test.result.max_allowed_attempts is not None
and test.result.max_allowed_attempts > 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test.result.max_allowed_attempts > 1 check is probably redundant? Can't have attempts > 1 if max_allowed_attempts is not > 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. It was overprotective and before I did print the information when a test was allowed more than one attempt but I didn't care about the actual attempts. With your new logic this is no longer needed. Simplified in 1eb0b45.

@kwk
Copy link
Contributor Author

kwk commented Jun 6, 2025

Hopefully 66ec87e fixed this issue:

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 595, in _handle_results
    cache[job]._set(i, obj)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 779, in _set
    self._callback(self._value)
  File "/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-v6lwg-1/llvm-project/github-pull-requests/llvm/utils/lit/lit/display.py", line 105, in update
    self.print_result(test)
  File "/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-v6lwg-1/llvm-project/github-pull-requests/llvm/utils/lit/lit/display.py", line 122, in print_result
    if test.result.attempts > 1:
TypeError: '>' not supported between instances of 'NoneType' and 'int'

@kwk kwk merged commit 6918314 into llvm:main Jun 6, 2025
8 checks passed
@kwk kwk deleted the lit-show-attempts branch June 6, 2025 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants