test_runner: support expecting a test-case to fail #60669
Conversation
Co-Authored-By: Alejandro Espa <98526766+vespa7@users.noreply.github.com>
There is long-established precedent for
What is the commitment level and timeline for this? I ask because I've already implemented this myself; lmk, please.
Ah, sure. I'm not particular about the name as long as it's intuitive (which it is). I expect this could land pretty quickly: quite a few collaborators had previously voiced support for it, and to my knowledge there have been no general concerns raised against the feature itself. Aside from the name (which is trivial to update in the PR), the remaining work is:
Creating competing PRs is never desirable and generally pisses off the person / people already working on it. Happy to collaborate on the feature; but aside from influencing the name, I think this is very straightforward, with very little implementation work remaining. In terms of timeline to finish this, I expect to have time this week (the past ~2 weeks were very tumultuous for me).
Additional features that my version has:
I think that's it, though I'll have to take a look at
Alex and I considered those and pushed them to v2 to get the base implementation landed sooner (many users probably won't need the v2 extras, so no need to make them wait for things they don't want/need). If you already have an implementation, I'm happy to defer to you on that once this lands 🙂
Deferring to v2 makes sense, but I think you should quickly consider how it would be added to the API, just to avoid potential API thrashing or getting painted into a corner. In the implementation, you have a string value given for the option. How would you, in v2, specify the expected failure?
Arguably this could be a new kind of assertion. OTOH, assertions of failures should maybe be reserved for testing that the code throws the correct exception under specific conditions.
That's contrary to the intention of this feature: the test should be identical to the non-failing version. This is a flag to flip.
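For illustration, a minimal sketch of that "flag to flip" idea, assuming the xfail option proposed in this PR (the helper add and the bug it contains are hypothetical stand-ins):

import { it } from 'node:test';
import assert from 'node:assert';

// Hypothetical broken behaviour the test exercises (stands in for real code):
const add = (a, b) => a - b; // bug: subtracts instead of adding

// The non-failing version of the test:
it('adds two numbers', () => {
  assert.strictEqual(add(2, 1), 3);
});

// The same test with the flag flipped while the bug is outstanding
// (xfail is the directive proposed in this PR, not yet a released API):
it('adds two numbers', { xfail: true }, () => {
  assert.strictEqual(add(2, 1), 3);
});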
We were thinking about those cases. For all of them, there is no API thrashing, as the values are all truthy / falsy, which makes them additive to the current API.
It all sounds good to me. I'm glad this is getting in. As to reporting, I think you have three options:
In my implementation, I chose option 2. A failing xfail generates a test todo event, where the todo message is prefixed with "xfail:". A passing xfail generates a test fail event, with an error indicating that it was an xfail that unexpectedly passed. This allows my xfail-aware reporter to produce a more informative report (it formats xfail results differently from both pass and todo), while degrading gracefully for legacy reporters.
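As a rough sketch of that mapping, a custom reporter could key off the "xfail:" prefix on the todo message; the prefix convention is the commenter's own, not an official node:test API, and expected failures are assumed to surface as todo results on test:pass events.

// xfail-aware-reporter.mjs — sketch only; assumes expected failures arrive as
// todo results whose message starts with 'xfail:' (a convention, not an API).
export default async function* xfailAwareReporter(source) {
  for await (const { type, data } of source) {
    const todo = typeof data?.todo === 'string' ? data.todo : '';
    if (type === 'test:pass' && todo.startsWith('xfail:')) {
      // Expected failure that did fail: report it distinctly from a plain todo.
      yield `XFAIL ${data.name} (${todo.slice('xfail:'.length).trim()})\n`;
    } else if (type === 'test:pass') {
      yield `ok ${data.name}\n`;
    } else if (type === 'test:fail') {
      // Includes an xfail that unexpectedly passed, which is a real failure.
      yield `not ok ${data.name}\n`;
    }
  }
}

It would be wired up like any custom reporter, e.g. node --test --test-reporter=./xfail-aware-reporter.mjs (file name hypothetical).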
I (and I think Alex too?) was imagining the result carrying an extra prop. Reporters that support the extra prop use it, and those that don't just behave "normally" (even the uninformed report is still correct).
Jacob, both spec and dot reporters rely on:
I know this as I spent a lot of time trying to reverse engineer the semantics of TestsStream, which isn't documented nearly enough for people to write custom reporters without inferring semantics by reading all of the built-in reporter code. I will open a discussion or issue on that topic when I get the chance. But as part of my reverse engineering, I ended up paying off some test_runner tech debt in precisely this part of the code: #59700. It seems to be falling through the cracks in PR limbo. Maybe we can get that merged? It isn't strictly necessary for your PR, but since we'll have review eyes on this area of code, it would be efficient to do both at the same time?
it.todo('sync pass todo', () => {
it.xfail('sync expect fail (method)', () => {
I think you should also add a "sync pass xfail with message" like there is with todo and skip.
This opens up the question of what that should look like for TAP. In my own reporter, I append #XFAIL to the message. The discussion on TAP Directives for XPASS and XFAIL that I linked to in the main thread debates #XFAIL vs #TODO. The latter makes a lot of sense too... see my comment that conceptually an xfail is a special case of todo.
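For illustration only, the two renderings being debated might look like the lines below; # TODO is a standard TAP directive that existing consumers already understand, while # XFAIL would be a non-standard extension (the reason text is hypothetical):

ok 3 - sync expect fail # XFAIL upstream bug
ok 3 - sync expect fail # TODO xfail: upstream bug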
Do you mean in the case of specifying it via a config object? If so, the value that would subsequently be supported would be the expected failure we talked about in #60669 (comment), so:
it('should add', { xfail: 'Expected 2 to equal 1' }, () => { … });

where the test-case must fail with an error whose message is 'Expected 2 to equal 1'.
Or as a regex:
it('should add', { xfail: /expected 2 to equal 1/i }, () => { … });

When it fails with a different error, xfail would then not "pass" (and would be marked "failed", reported as failed, etc.).
That would then be used in the test-case reporter output like
✔ should add # Expected 2 to equal 1
✔ should add # /expected 2 to equal 1/i
✖ should add # Expected 2 to equal 1
✖ should add # /expected 2 to equal 1/i
But since that's part of v1.1, I think for now it should not handle any value other than truthy/falsy.
I think both a reason (like todo and skip) and expected error should be supported. Example reasons:
✔ should add #XFAIL Issue #389
✔ should add #XFAIL Can't fix until SemVer Major increment
I'm building a system where released bugs are tracked both as a bug-tracking issue and as an XFAIL test case.
I took your comment to mean (update of the current documentation to illustrate):
- skip <boolean> | <string> If truthy, the test is skipped. If a string is provided, that string is displayed in the test results as the reason for skipping the test. Default: false.
- todo <boolean> | <string> If truthy, the test is marked as TODO. If a string is provided, that string is displayed in the test results as the reason why the test is TODO. Default: false.
- xfail <boolean> | <string> | <RegExp> If truthy, the test is marked as XFAIL. If a string is provided, that string is displayed in the test results as the reason why the test is expected to fail. If a RegExp is provided, only a failure that matches is permitted. Default: false.
Now I see my interpretation was wrong. In any case, that approach would allow EITHER a reason OR an expected error, NOT BOTH. I think both should be supported.
But as you say, we can punt until v1.1. 🤫🤐🙃
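One purely hypothetical shape (not part of this PR, sketched only to show that a reason and an expected error could coexist without API thrashing) would be to also accept an object value:

import { it } from 'node:test';

// Hypothetical object form for `xfail` (NOT in this PR): carries both a
// human-readable reason and the expected error as a string or RegExp.
it('should add', {
  xfail: {
    reason: "Issue #389, can't fix until SemVer-Major",
    expected: /expected 2 to equal 1/i,
  },
}, () => {
  throw new Error('Expected 2 to equal 1'); // matches `expected`, so the xfail "passes"
});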
Huzzah, yes, that's it. Thanks!
Codecov Report ❌

@@            Coverage Diff             @@
##             main   #60669      +/-   ##
==========================================
- Coverage   88.55%   88.53%   -0.02%
==========================================
  Files         703      703
  Lines      208077   208449     +372
  Branches    40083    40210     +127
==========================================
+ Hits       184254   184548     +294
- Misses      15841    15905      +64
- Partials     7982     7996      +14
PR is quasi-blocked by a bug in the
  switch (type) {
    case 'test:fail': {
-     yield reportTest(data.nesting, data.testNumber, 'not ok', data.name, data.skip, data.todo);
+     yield reportTest(data.nesting, data.testNumber, 'not ok', data.name, data.skip, data.todo, data.xfail);
If we decide to move forward with xfail, we will need to update all integrated reporters!
Didn't I already?
I can see only the integration for tap; did you check spec, dot, etc.?
Spec and dot use the same util, and I did Spec 🙂
puuuurfect, can we add some tests to cover this behaviour for the different embedded reporters? 😁
I think we should add the new xfail here: test/fixtures/test-runner/output/output.js.
This fixture should be used by the snapshot tests for all the different reporters
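A hedged sketch of what additional cases in that fixture might look like, mirroring the existing todo/skip entries (the fixture's exact structure and surrounding code may differ):

// Hypothetical additions to test/fixtures/test-runner/output/output.js
// (the fixture already requires 'node:test' as `test`):
test('sync expect fail', { xfail: true }, () => {
  throw new Error('expected failure');
});

test('sync expect fail with message', { xfail: 'tracked upstream' }, () => {
  throw new Error('expected failure');
});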
Are you thinking I should do it there instead of the output test I already did, or in addition? I can see reasons for either:
- More coverage to catch something.
- More things to change, which is a maintenance burden.
If I'm not mistaken, currently no tests cover the actual output of xfail across all integrated reporters.
My only concern is that any behavior that becomes clearly defined should be tested to avoid unintentional breakage.
IMHO, it's fine to remove the existing test if it no longer adds value once other tests are added.
Regarding the maintenance burden: these are snapshot tests, so they're usually quick to update.
These are tests written to otherwise pass but there is some issue (often
upstream) causing them to fail. Rather than using `skip` or `todo`, which would
effectively silence the failure, the test would instead be marked `xfail` (`x`
from "expected" + `fail`). That way, when the issue is resolved, that is
IMHO, we should provide more behaviour-first documentation for this feature, rather than focusing on a specific use case!
I was thinking of adding that in the test runner Learn article. The sentiment for API docs seems to be that they purely describe what it is. I don't have an opinion.
ava calls this "failing", which seems a bit clearer to me than "xfail"
My biggest concern about this feature isn't the feature itself, but the naming 😬
@ljharb @pmarchini See prior art: #60669 (comment)
jest and ava, so yes? 🤷🏾♂️
I think xfail is more intuitive, but I defer to you guys. More important to me are the semantics. Is failing a subclass of todo or pass? For protocols like TAP and downstream consumers that don't know about it, it should be mapped as a subclass of one or the other. Likewise for existing custom node:test reporters.
I think todo. A failing test is one that is flagged for future action. It shouldn't be easily ignored.
(My last argument for xfail is embedded above. There's gonna be a lot of confusion when people use the very common phrase "failing test".)
A failing test that fails correctly, as a todo, makes sense to me, as long as a failing test that passes incorrectly is a fail :-)
Fair point on the discussion confusion - doesn't change my position, but it's a good thing for others to consider.
a failing test that fails correctly, as a todo, make sense to me, as long as a failing test that passes incorrectly, is a fail :-)
That's how I see it exactly.
"Successfully failed" and "failed to fail"; not confusing at all 🤪
I don't particularly care about the name—I want the functionality.
I'm not sure I understand this "subclass of todo". What does todo have to do with this? They seem fairly different to me:
- A todo can pass or fail without affecting the suite's result.
- An xfail must fail correctly, and it does affect the suite's result.
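A small sketch of that distinction, with xfail as proposed in this PR (semantics per the discussion above, not yet documented behaviour):

import { it } from 'node:test';

// todo: may pass or fail; either way the suite's result is unaffected.
it.todo('flaky integration path', () => {
  throw new Error('boom');
});

// xfail: must fail; if this body ever stopped throwing, the test (and suite)
// would be reported as failed, prompting removal of the flag.
it.xfail('known upstream bug', () => {
  throw new Error('boom');
});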
it.todo('sync pass todo', () => {
it.xfail('sync expect fail (method)', () => {
I would suggest adding a test similar to test-runner-todo-skip-tests.js too, as this specific test is mainly meant to test the output 😁
avivkeller left a comment
Neato!
This PR adds support for an `xfail` directive (similar to `skip` and `todo`), whereby a user can flag a test-case as expected to fail.
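A rough sketch of the proposed usage, combining the method form and the options form discussed above (names are from this PR and may still change per the naming discussion; the failing assertion is illustrative):

import { it } from 'node:test';
import assert from 'node:assert';

// Method form, as in the fixture diff above:
it.xfail('sync expect fail (method)', () => {
  assert.strictEqual(1 + 1, 3); // expected to fail until the underlying issue is fixed
});

// Options form, analogous to { skip } and { todo }:
it('sync expect fail (option)', { xfail: true }, () => {
  assert.strictEqual(1 + 1, 3);
});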