Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore __pycache__ contents and *.pyc files in external dependencies #570

Merged
merged 2 commits into from
Nov 24, 2021

Conversation

UebelAndre
Copy link
Contributor

@UebelAndre UebelAndre commented Nov 16, 2021

PR Checklist

Please check if your PR fulfills the following requirements:

  • Does not include precompiled binaries, eg. .par files. See CONTRIBUTING.md for info
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature (please, look at the "Scope of the project" section in the README.md file)
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

I noticed in some of my builds were getting a bunch of unexpected cache misses and found that some pypi dependencies were including some __pycache__ contents and left-over *.pyc files. I don't believe these are needed or should ever be used by the Bazel rules.

Issue Number: N/A

What is the new behavior?

This PR prevents these files from being included in the pip_parse and pip_install generated py_library targets.

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

@mattem
Copy link
Collaborator

mattem commented Nov 19, 2021

Hm, this only excludes it from data, I'm finding __pycache__ files in srcs

@UebelAndre
Copy link
Contributor Author

I'm on mobile now but I think it'd make sense to exclude them from srcs as well. If you wanted to push that change to this branch.

@mattem
Copy link
Collaborator

mattem commented Nov 19, 2021

Doesn't look like I have permission to push, these were my changes:

diff --git a/python/pip_install/extract_wheels/lib/bazel.py b/python/pip_install/extract_wheels/lib/bazel.py
index d498581..6402d8f 100644
--- a/python/pip_install/extract_wheels/lib/bazel.py
+++ b/python/pip_install/extract_wheels/lib/bazel.py
@@ -103,6 +103,7 @@ def generate_build_file_contents(
     """

     data_exclude = [
+        "*dist-info/RECORD",
         "*.whl",
         "**/__pycache__/**",
         "**/*.py",
@@ -137,7 +138,7 @@ def generate_build_file_contents(

         py_library(
             name = "{name}",
-            srcs = glob(["**/*.py"], exclude=["{entry_point_prefix}*.py"], allow_empty = True),
+            srcs = glob(["**/*.py"], exclude=["{entry_point_prefix}*.py", "**/__pycache__/**"], allow_empty = True),
             data = glob(["**/*"], exclude={data_exclude}),
             # This makes this directory a top-level in the python import
             # search path for anything that depends on this.

@UebelAndre
Copy link
Contributor Author

Doesn't look like I have permission to push, these were my changes:

@mattem Ah, I thought you were a maintainer here, makes sense then.

I picked up the only the additional exclude for __pycache__. I'm wondering why *dist-info/RECORD would be ignored and not all of *dist-info/**. Could you explain the reasoning there?

@thundergolfer thundergolfer requested review from thundergolfer and removed request for brandjon and lberki November 23, 2021 23:50
Copy link
Collaborator

@thundergolfer thundergolfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me.

@thundergolfer thundergolfer merged commit 2b1d6be into bazelbuild:main Nov 24, 2021
@UebelAndre UebelAndre deleted the pyc branch November 27, 2021 16:58
@UebelAndre
Copy link
Contributor Author

Doesn't look like I have permission to push, these were my changes:

@mattem Ah, I thought you were a maintainer here, makes sense then.

I picked up the only the additional exclude for __pycache__. I'm wondering why *dist-info/RECORD would be ignored and not all of *dist-info/**. Could you explain the reasoning there?

Would still like to hear @mattem's response to this if you don't mind 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants