Discussion on removing the py dependency for path handling #6130

Closed
@boxed

I had another look at pytest performance and was struck, again, by how many stat calls pytest performs. This seems to boil down to using the path classes from py for checking file existence, walking directory structures, and so on. Unfortunately, instances of these classes are passed to plugins, so they aren't internal implementation details.

I suggest we consider passing plain strings to plugins as paths. This is obviously a breaking change, so it can't be done until pytest 6. It's not clear to me how we'd emit deprecation warnings for this, though :(

To take a concrete example of the problem: the test suite for the product I work on calls stat 79k times in the collect phase alone. If I monkey patch stat to log the paths, there are only ~3k unique paths in the output. I can get a small performance boost by monkey patching stat to cache its results:

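import os

orig_stat = os.stat

# Results keyed by path only. Entries are never invalidated, so this is
# only safe while the file tree does not change (e.g. during collection).
cache = {}


def monkey_stat(*args, **kwargs):
    a = args[0]
    if a in cache:
        return cache[a]
    # Failed lookups raise here and are therefore not cached.
    r = orig_stat(*args, **kwargs)
    cache[a] = r

    return r


# Install the cached version in place of the original.
os.stat = monkey_stat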

That this can improve performance at all is pretty silly :P

It seems like pytest could just call os.walk once and then reuse that data for the rest of the run...
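A rough sketch of what "walk once, reuse the data" could look like, assuming the tree does not change during the run. This is not how pytest works today; `snapshot_tree` and `TreeSnapshot` are hypothetical names invented for this example.

```python
import os


def snapshot_tree(root):
    """Walk `root` once and return (dirs, files) as sets of absolute paths."""
    root = os.path.abspath(root)
    dirs, files = {root}, set()
    for dirpath, dirnames, filenames in os.walk(root):
        dirs.update(os.path.join(dirpath, d) for d in dirnames)
        files.update(os.path.join(dirpath, f) for f in filenames)
    return dirs, files


class TreeSnapshot:
    # Hypothetical helper: answers existence queries from the snapshot
    # instead of touching the filesystem again.
    def __init__(self, root):
        self.dirs, self.files = snapshot_tree(root)

    def isfile(self, path):
        return os.path.abspath(path) in self.files

    def isdir(self, path):
        return os.path.abspath(path) in self.dirs

    def exists(self, path):
        return self.isfile(path) or self.isdir(path)
```

The obvious trade-off is staleness: anything created or deleted after the walk is invisible to the snapshot, so this only fits phases like collection where the tree can be treated as read-only.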

    Labels

    type: backward compatibility (might present some backward compatibility issues which should be carefully noted in the changelog)
    type: proposal (proposal for a new feature, often to gather opinions or design the API around the new feature)