Skip to content

migrate os.listdir() to os.scandir() to increase performance #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2025

Conversation

xsuchy
Copy link
Contributor

@xsuchy xsuchy commented Jan 17, 2025

@xsuchy
Copy link
Contributor Author

xsuchy commented Jan 17, 2025

I did very simple and trivial benchmark on walking through the kernel tree:

  • previously: 45 second
  • after this PR: 36 second

@xsuchy
Copy link
Contributor Author

xsuchy commented Jan 17, 2025

If this gets merged, I will open another PR for the remaining occurrences of listdir() in other functions.

As recommended in https://peps.python.org/pep-0471/

Signed-off-by: Miroslav Suchý <msuchy@redhat.com>
@xsuchy
Copy link
Contributor Author

xsuchy commented Jan 29, 2025

Any review comments?

@xsuchy
Copy link
Contributor Author

xsuchy commented Feb 19, 2025

@AyanSinhaMahapatra or @pombredanne can I kindly ask you for a review?

Copy link
Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xsuchy Thanks++ for the PR.

Thanks for the references, we should definitely move towards using scandir() allright in this context. I've tested out this branch with scancode-toolkit latest too, and everything looks good, and we do get some performance improvements there on the codebase collection step similarly as indicated by you, thanks for doing and including those too.

  1. Some more CI updates and other fixes were also added to main, could you rebase to latest main?
  2. Please go ahead and use scandir() in the other instances of listdir() use in commoncode too.

This looks ready otherwise, and apologies for the late review.

We might want to do the same in other places too which would have some effect on the performance, https://github.com/search?q=org%3Aaboutcode-org+os.listdir&type=code maybe this could be useful in extractcode/fetchcode/deltacode too

@AyanSinhaMahapatra AyanSinhaMahapatra merged commit de159a4 into aboutcode-org:main Mar 6, 2025
9 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants