Overbroad xfail marks will eventually make CI fail #1728

### Background

#1679 improved a number of tests that are known to fail on some platforms by marking them `xfail` instead of `skip`. That way they are still run and their status is reported, but a failing status does not cause the whole test run to fail. However, it applied [`xfail`](https://docs.pytest.org/en/7.1.x/how-to/skipping.html#xfail-mark-test-functions-as-expected-to-fail) to too many tests, due to limitations on granularity when applying [`pytest` marks](https://docs.pytest.org/en/latest/how-to/mark.html) to `unittest` test cases generated by `@ddt` parameterization.

https://github.com/gitpython-developers/GitPython/blob/340da6d39397253da8d0807179b0ecb5952effca/test/test_util.py#L221-L228

https://github.com/gitpython-developers/GitPython/blob/340da6d39397253da8d0807179b0ecb5952effca/test/test_util.py#L233-L245

### Upcoming impact

Although this was known and discussed in #1679, and FIXME comments about it were included in the code, the problem turns out to be somewhat more serious than I had anticipated: if not addressed, it will eventually lead to test failures in a future version of `pytest`. This is because the default treatment of an *unexpectedly passing* test--one that is marked `xfail` but passes--will most likely change in pytest 8. GitPython does not specify upper bounds on most of its development dependencies, and `pytest` is among those with no upper bound, so pytest 8 will be installed automatically once it is (stably) released.

Specifically, in the absence of configuration or command-line options to `pytest` that override the behavior:

- A test marked `xfail` that fails, and fails in the expected way, produces an XFAIL status, which is treated similarly to PASS. We always want this.
- A test marked `xfail` that fails in a detectably unexpected way--where a different exception results than the one that was [expected](https://docs.pytest.org/en/7.1.x/how-to/skipping.html#raises-parameter)--produces a FAIL status. We always want this.
- A test marked `xfail` that *passes* produces an XPASS status. How this status is treated is more complicated. The `xfail` mark supports [an optional `strict` parameter](https://docs.pytest.org/en/7.1.x/how-to/skipping.html#strict-parameter). Where present, it determines whether the XPASS fails the test run like a FAIL status would, or does not fail the test run (thus behaving like PASS or XFAIL). If absent, the `xfail_strict` configuration option provides the default. Currently, as of pytest 7, `xfail_strict` defaults to `False` when not specified.

As noted in https://github.com/pytest-dev/pytest/issues/11467, which was opened by a pytest maintainer and is listed for pytest's [8.0 milestone](https://github.com/pytest-dev/pytest/milestone/35), the default is planned to be changed from `False` to `True` starting in pytest 8.0. (See also https://github.com/pytest-dev/pytest/pull/11499.)

### Possible fixes

Breakage could be avoided (at least for a while, since `strict=False` [may eventually](https://github.com/pytest-dev/pytest/issues/11467#issuecomment-1733852526) be removed as a feature) by passing `strict=False` to the affected `xfail` marks or by setting `xfail_strict=false` for `pytest` in `pyproject.toml`. It is also possible to set an upper bound like `<8` for `pytest` in `test-requirements.txt`.
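As a stopgap, the configuration-based workaround might look like the fragment below. This is a sketch that assumes pytest reads its settings from a `[tool.pytest.ini_options]` table in `pyproject.toml`; if that table already exists, only the `xfail_strict` line would be added to it.

```toml
[tool.pytest.ini_options]
# Keep XPASS from failing the run even if the default becomes strict.
xfail_strict = false
```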

However, I recommend instead fixing this by reorganizing the tests in `test_util.py` so that the tests of `cygpath` and `decygpath` can be pure `pytest` tests. Those are the tests whose insufficiently precise `xfail` markings mark some generated test cases `xfail` even though they are known to pass. Because they are currently `unittest` tests, they cannot be generated by `@pytest.mark.parametrize` (hence `@ddt` is used). If they could be generated with the `parametrize` mark, they could have per-case markings, because individual cases can be wrapped in `pytest.param`, which supports an optional `marks` argument. They could then have the `xfail` mark applied to exactly the cases where failure is really expected.
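A per-case marking of this kind might look like the sketch below. It reuses the five input pairs from the existing `test_cygpath_norm_ok`, but everything else is an assumption made for illustration: the argument names `wpath` and `cpath`, the test body, and the lazy import of `cygpath` from `git.util` are not necessarily what the reorganized tests would use.

```python
import sys

import pytest


@pytest.mark.skipif(sys.platform != "cygwin", reason="Paths specifically for Cygwin.")
@pytest.mark.parametrize(
    "wpath, cpath",
    [
        (R"./bar", "bar"),
        # Only this case is expected to fail, so only it carries the mark.
        pytest.param(
            R".\bar",
            "bar",
            marks=pytest.mark.xfail(
                reason=R'Returns "./bar" instead of "bar".',
                raises=AssertionError,
            ),
        ),
        (R"../bar", "../bar"),
        (R"..\bar", "../bar"),
        (R"../bar/.\foo/../chu", "../bar/chu"),
    ],
)
def test_cygpath_norm_ok(wpath, cpath):
    # Assumption: imported here so this sketch can be loaded without GitPython.
    from git.util import cygpath

    assert cygpath(wpath) == cpath
```

With this shape, an XPASS on the marked case would mean the behavior had actually changed, and the other four cases would report plain PASS or FAIL.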

That approach – which I mentioned in #1679 itself and in https://github.com/gitpython-developers/GitPython/pull/1700#discussion_r1353654395, and more recently alluded to in #1725 and https://github.com/gitpython-developers/GitPython/pull/1726#issuecomment-1791541248 – has the following advantages over other approaches that effectively just suppress the problem:

- Any XPASS will be a sign that something has changed and should be looked into, thereby building on the improvements in [#1679](https://github.com/gitpython-developers/GitPython/pull/1679).
- Although we have FIXME comments, the test results themselves are still misleading: they report some tests as unexpectedly passing when those passes are in fact expected.
- When the default treatment of XPASS in `pytest` changes--but also even before that, once it is documented to change--the presence of expected XPASSes will be more misleading than it is already, even if GitPython is not using a version of `pytest` affected by the change. This is because that change will further solidify people's expectations about what XPASS indicates, including for people who are trying to become familiar with GitPython.
- Reorganizing the tests in `test_util.py` can also help clarify the tests of `rmtree` behavior, and help make them easier to modify. This is useful because it will allow building on [#1700](https://github.com/gitpython-developers/GitPython/pull/1700) toward an eventual complete fix for [#790](https://github.com/gitpython-developers/GitPython/issues/790). (In addition, I want to make sure the [planned](https://github.com/gitpython-developers/GitPython/pull/1654#issuecomment-1717239735) native Windows CI jobs don't have the effect of calcifying cleanup logic in `rmtree` that otherwise could or should change, or at least that this does not happen in ways that impinge on non-Windows platforms. I think such a reorganization will help with that, too.)

I have opened #1729, which fixes this issue by reorganizing tests in `test_util.py` in this way.

Excerpts from `test/test_util.py` at the two permalinks under Background:

	# FIXME: Mark only the /proc-prefixing cases xfail, somehow (or fix them).
	@pytest.mark.xfail(
	    reason="Many return paths prefixed /proc/cygdrive instead.",
	    raises=AssertionError,
	)
	@skipUnless(sys.platform == "cygwin", "Paths specifically for Cygwin.")
	@ddt.idata(_norm_cygpath_pairs + _unc_cygpath_pairs)
	def test_cygpath_ok(self, case):

	@pytest.mark.xfail(
	    reason=R'2nd example r".\bar" -> "bar" fails, returns "./bar"',
	    raises=AssertionError,
	)
	@skipUnless(sys.platform == "cygwin", "Paths specifically for Cygwin.")
	@ddt.data(
	    (R"./bar", "bar"),
	    (R".\bar", "bar"),  # FIXME: Mark only this one xfail, somehow (or fix it).
	    (R"../bar", "../bar"),
	    (R"..\bar", "../bar"),
	    (R"../bar/.\foo/../chu", "../bar/chu"),
	)
	def test_cygpath_norm_ok(self, case):
