Discussion on removing the py dependency for path handling #6130

I took another look at pytest performance and was struck again by how many `stat` calls pytest performs. This seems to boil down to using the path classes from py for checking file existence, walking directory structures, and so on. Unfortunately, instances of these classes are passed to plugins, so they aren't internal implementation details.

I suggest we consider passing plain strings as paths to plugins. This is obviously a breaking change, so it can't be done until pytest 6. It's not clear to me how we'd emit deprecation warnings for this, though :(

To take a concrete example of this being a problem, the test suite for the product I work on calls `stat` 79k times *just for the collect phase*. If I monkey patch `stat` to log the paths, there are only ~3k unique paths in the output. I can get a little performance boost by monkey patching `stat` to be cached:

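```python
import os

orig_stat = os.stat

# Results keyed by path; entries are never invalidated.
cache = {}


def monkey_stat(*args, **kwargs):
    # Only the first positional argument is used as the key, so this
    # assumes callers don't vary dir_fd or follow_symlinks per path.
    a = args[0]
    if a in cache:
        return cache[a]
    r = orig_stat(*args, **kwargs)
    cache[a] = r

    return r


# Install the patch.
os.stat = monkey_stat
```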

That this can improve the performance is pretty silly :P
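For anyone who wants to reproduce the call counts on their own suite, here is a minimal sketch of how the total and unique `stat` calls could be measured. The helper name `count_stat_calls` is hypothetical, not something pytest or py provides, and the sketch only sees calls that go through `os.stat` itself (not `os.lstat`, `os.scandir`, or C-level calls).

```python
import os
from collections import Counter
from contextlib import contextmanager


@contextmanager
def count_stat_calls():
    """Temporarily wrap os.stat and count calls per path (hypothetical helper)."""
    counts = Counter()
    orig_stat = os.stat

    def counting_stat(path, *args, **kwargs):
        # File descriptors are ints; everything else is normalised via fspath.
        key = path if isinstance(path, int) else os.fspath(path)
        counts[key] += 1
        return orig_stat(path, *args, **kwargs)

    os.stat = counting_stat
    try:
        yield counts
    finally:
        # Always restore the original, even if the wrapped code raises.
        os.stat = orig_stat


with count_stat_calls() as counts:
    for _ in range(3):
        os.stat(".")

total = sum(counts.values())
print(f"{total} stat calls, {len(counts)} unique paths")
```

Wrapping the collection phase in the same context manager should give numbers comparable to the ones quoted above, assuming the path code looks up `os.stat` at call time.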

It seems like pytest could just use `os.walk` once and then use that data for the rest of the run of the program...
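As a rough sketch of that idea (not how pytest does or would implement it), the tree could be walked once up front and existence checks answered from in-memory sets. The function name `build_tree_index` is made up for illustration, and the sketch assumes the tree doesn't change during the run.

```python
import os
import tempfile


def build_tree_index(root):
    """Walk root once and record every directory and file path found."""
    dirs, files = set(), set()
    # By default os.walk does not descend into symlinked directories,
    # so those will not show up in `dirs`.
    for dirpath, dirnames, filenames in os.walk(root):
        dirs.add(dirpath)
        files.update(os.path.join(dirpath, name) for name in filenames)
    return dirs, files


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "tests"))
    test_file = os.path.join(root, "tests", "test_example.py")
    open(test_file, "w").close()

    dirs, files = build_tree_index(root)

    # Set lookups replace per-path stat calls.
    found_file = test_file in files
    found_dir = os.path.join(root, "tests") in dirs
    missing = os.path.join(root, "tests", "conftest.py") in files
```

The obvious trade-off is staleness: anything created or deleted after the walk (by a fixture or a plugin, say) would be invisible to the index, so a real implementation would need some way to invalidate it.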
