distutils.filelist.findall falls into infinite loop over a symlink to a parent directory #332

Open
ghost opened this issue Jan 18, 2015 · 1 comment

Comments


ghost commented Jan 18, 2015

Originally reported by: xi (Bitbucket: xi, GitHub: xi)


In pull request #105, the patch to distutils.filelist.findall() was updated so that it follows symlinks. Unfortunately, it now falls into an infinite loop when it encounters a symlink to a parent directory (we use such links to maintain a CommonJS dependency graph). This could be fixed easily by skipping directories that findall() has already seen. Here is a patch:

diff -r 4b5954d5e760 setuptools/__init__.py
--- a/setuptools/__init__.py    Sat Jan 17 21:41:55 2015 +0100
+++ b/setuptools/__init__.py    Sun Jan 18 10:15:11 2015 -0500
@@ -137,8 +137,14 @@
     """Find all files under 'dir' and return the list of full filenames
     (relative to 'dir').
     """
+    seen = set()
     all_files = []
     for base, dirs, files in os.walk(dir, followlinks=True):
+        seen.add(os.path.realpath(base))
+        for dir in dirs[:]:
+            realpath = os.path.realpath(os.path.join(base, dir))
+            if realpath in seen:
+                dirs.remove(dir)
         if base==os.curdir or base.startswith(os.curdir+os.sep):
             base = base[2:]
         if base:
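
For anyone who wants to try the idea outside of setuptools, here is a minimal standalone sketch of the same approach: resolve each directory with os.path.realpath() and prune the ones already visited from os.walk()'s dirs list. The names find_all_no_cycles and demo are made up for illustration and are not part of setuptools or distutils. The demo assumes a platform where os.symlink() is permitted.

```python
import os
import tempfile


def find_all_no_cycles(root=os.curdir):
    """Return regular files under root, following symlinks.

    Hypothetical helper (not a setuptools/distutils API): each real
    directory is descended into at most once, so a symlink that points
    back to a parent directory cannot cause endless recursion.
    """
    seen = {os.path.realpath(root)}
    found = []
    for base, dirs, files in os.walk(root, followlinks=True):
        keep = []
        for name in dirs:
            real = os.path.realpath(os.path.join(base, name))
            if real not in seen:
                seen.add(real)
                keep.append(name)
        # Assign in place so os.walk() honours the pruned list.
        dirs[:] = keep
        for name in files:
            path = os.path.join(base, name)
            if os.path.isfile(path):
                found.append(path)
    return found


def demo():
    """Build a tiny tree with a symlink to its parent and walk it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = os.path.join(tmp, "pkg")
        os.mkdir(pkg)
        with open(os.path.join(pkg, "mod.py"), "w") as fh:
            fh.write("# placeholder\n")
        # pkg/loop -> ".." resolves back to tmp, creating a cycle.
        os.symlink(os.pardir, os.path.join(pkg, "loop"))
        return [os.path.relpath(p, tmp) for p in find_all_no_cycles(tmp)]


if __name__ == "__main__":
    print(demo())
```

Unlike the patch above, this sketch records a directory as seen at the moment it decides to descend into it, so two sibling symlinks that point at the same target are also walked only once. Whether that stricter behaviour is wanted in findall() is a separate question.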

ghost added the major and bug labels Mar 29, 2016

ssbarnea commented Jun 23, 2021

I can confirm this bug, with the caveat that the loop is not always infinite; sometimes it is more like 5-10 minutes of 100% CPU usage. I raised this at https://bugs.python.org/issue44497 with a proposed patch at python/cpython#26873, but my hopes are low because distutils is already marked as deprecated in py3.10, so I will also propose a fix on our vendored copy.

What is a bit mind-blowing is that I see no fewer than 2 copies of def _find_all_simple inside the codebase, and I am not sure which one I need to patch. All of them?

setuptools/__init__.py:212:5:def _find_all_simple(path):
setuptools/_distutils/filelist.py:246:5:def _find_all_simple(path):

ssbarnea added a commit to ssbarnea/setuptools that referenced this issue Jun 23, 2021
ssbarnea added a commit to ssbarnea/setuptools that referenced this issue Jun 23, 2021
ssbarnea added a commit to ssbarnea/distutils that referenced this issue Jun 23, 2021