Description
Feature or enhancement
Proposal:
As has been mentioned in numerous issues (see #119169 for example), there are quite a few performance issues with os.scandir and os.walk.
Diving into the implementation (see os_scandir_impl in posixmodule.c), I noticed that we currently use WinAPI to list a directory. Looking into the relevant WinAPI functions (FindFirstFileW, FindNextFileW), it seems redundant to implement our Python wrapper on top of them, since they are themselves wrappers. I propose to use the native NT functions (for example NtQueryDirectoryFile) directly, as we are already implementing a wrapper ourselves.
After quite a bit of reading MS-Docs and looking at Kernel32.dll & NtDll.dll, I have reached a stable implementation of this proposal. Unfortunately, I could not significantly reduce the number of syscalls being performed; however, I did reduce the amount of memory allocation and copying by a ratio of 1:6.
Additionally, this new implementation should aid future work on Windows file system operations (see #99454 for an easy example).
This improves the performance of both os.scandir and os.walk, which is implemented on top of it.
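For anyone who wants to check the effect on their own tree, a minimal timing sketch follows. It is not the benchmark used for this proposal. The helper names (count_entries_scandir, count_entries_walk, best_time) are made up for this example, and the numbers it prints depend heavily on the directory and on file system caching.

```python
import os
import sys
import time


def count_entries_scandir(path):
    """Count all entries under path using os.scandir directly."""
    total = 0
    stack = [path]
    while stack:
        with os.scandir(stack.pop()) as entries:
            for entry in entries:
                total += 1
                # Do not descend into symlinked directories, like os.walk.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return total


def count_entries_walk(path):
    """Count all entries under path using os.walk."""
    return sum(len(dirs) + len(files) for _, dirs, files in os.walk(path))


def best_time(func, path, repeat=5):
    """Return the best wall-clock time of several runs, in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(path)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print("scandir:", best_time(count_entries_scandir, root))
    print("walk:   ", best_time(count_entries_walk, root))
```

Running the script with the same directory argument on builds with and without the change gives a rough before/after comparison for both functions.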
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response