Conversation

@tobyh-canva commented Oct 26, 2025

PR Status: Addressing Windows compatibility issues

Context

I've recently been looking into Bazel analysis performance in my company's monorepo (predominantly Java and TypeScript), and was surprised to find that rules_python accounts for almost 60% of the CPU time during analysis across our repo (433s of 741s).

From a profile, it was clear that depset.to_list() being called from _create_zip_file() was accounting for the vast majority of the analysis time. So I focused on eliminating these depset expansions, which are expensive during analysis (for anyone who's unaware, see Bazel's Optimizing Performance page). Path normalization in _get_zip_runfiles_path was also accounting for a large chunk of the analysis time.
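
To illustrate the pattern, here's a rough Starlark sketch with made-up attribute names (not the rules_python code):

def _slow_args(ctx):
    args = ctx.actions.args()
    # Analysis-time flattening: eagerly walks the entire transitive closure.
    for f in ctx.attr.dep[DefaultInfo].default_runfiles.files.to_list():
        args.add(f.path)
    return args

def _fast_args(ctx):
    args = ctx.actions.args()
    # The depset is passed as-is and only expanded when the action's command
    # line is materialized, keeping the cost out of the analysis phase.
    args.add_all(ctx.attr.dep[DefaultInfo].default_runfiles.files)
    return args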

What was slightly less clear, but eventually apparent, was that runfiles.empty_filenames being provided as an arg to the zipper action was also causing a hit to analysis performance. I traced this back to the fact that the runfiles.empty_filenames depset in this context is actually coming from the GetInitPyFiles Java class in the Python utils built into Bazel, and this class was also causing an implicit expansion of the runfiles.files depset. You can verify this by running a profile using my PR's changes and with --incompatible_default_to_explicit_init_py=false (the default).

Note that although the Bazel-native --build_python_zip flag is disabled by default on non-Windows systems, it only affects whether the zipapp is considered a default output; the analysis cost is still incurred when the flag is disabled.

Intent

My strategy for optimizing this was basically to get rid of any depset.to_list() calls, and shift reproducible but expensive business logic (e.g. path normalization) from the analysis phase to the build phase.

Changes

  1. Introduced a wrapper for @bazel_tools//tools/zip:zipper, written in C++, which prepares the input manifest for the zipper tool, including all path rewriting.
  2. Changed the inputs to _create_zip_file to provide only what it needs (runfiles_without_exe), avoiding a depset expansion + filter (a rough sketch follows this list).
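
A rough sketch of what the Starlark side looks like after change 2 (the flag names and exact action wiring below are illustrative, not the real wrapper interface):

def _create_zip_file(ctx, output, runfiles_without_exe, zip_gen_tool):
    args = ctx.actions.args()
    args.add("--output", output)
    args.add("--workspace_name", ctx.workspace_name)
    # The depset goes straight into a params file; no to_list() during analysis.
    args.add_all(runfiles_without_exe)
    args.use_param_file("@%s", use_always = True)
    ctx.actions.run(
        outputs = [output],
        inputs = runfiles_without_exe,
        executable = zip_gen_tool,
        arguments = [args],
        mnemonic = "PythonZipper",
    )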

I chose to write the wrapper in C++ because it keeps the build action itself performant while avoiding new dependencies for rules_python (rules_cc is already a dependency).

Testing

I haven't added new tests specific to this wrapper or the affected functionality, instead relying on the existing //tests. Please let me know if you'd prefer specific shell tests or C++ tests.

Snapshot tests

To verify builds were the same, I wrote some snapshot tests, which are not integrated in this branch. Let me know if you'd like these to be brought in (they're platform-specific, and the snapshots may need to be updated somewhat frequently for innocuous changes).

Manual tests

I manually ran the snapshot tests above on macOS, Linux, and Windows, comparing a snapshot from the main branch with one from this branch.

Results

Here's the --starlark_cpu_profile flame graph before/after these changes (this is profiling the analysis of bazel build --nobuild //... in our monorepo).

(Before / after flame graph screenshots)

Overall the CPU time appears to have been reduced by at least 95% for py_binary and py_test targets.

The memory profile shows a similar 95% reduction in memory usage for py_binary and py_test.

  • All build outputs are identical after these changes.
  • Overall build performance is affected by the compilation of the new C++ wrapper.
  • The build performance of the PythonZipper action itself is effectively unchanged - the additional work done by the C++ wrapper is barely visible compared to the work already done by the zipper.

Notes

I wasn't aware that --incompatible_default_to_explicit_init_py=true made such a significant improvement to analysis performance. Is it worth mentioning this somewhere in the docs of rules_python? It smells to me like it just disables a legacy Google-specific quirk, though I could be wrong.

I initially moved the GetInitPyFiles logic from Java into the C++ wrapper, so at least it would remove the analysis performance hit. But since it's only optimising legacy behaviour, and users can just toggle a flag to disable it, I reverted that change and left the empty __init__.py generation alone.

@gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @tobyh-canva, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the analysis performance of py_binary and py_test rules in Bazel by addressing inefficiencies related to depset.to_list() expansions and path normalization. It introduces a new C++ wrapper to handle manifest preparation for the zipper tool, moving expensive logic from the analysis phase to the build phase, resulting in substantial reductions in both CPU time and memory consumption during Bazel analysis.

Highlights

  • Performance Improvement: Achieved a 95% reduction in CPU time and memory usage for py_binary and py_test targets during Bazel analysis by optimizing how Python zip manifests are generated.
  • Elimination of depset.to_list(): Replaced expensive depset.to_list() calls in Starlark with a C++ wrapper, shifting complex processing from the analysis phase to the build phase.
  • New C++ Wrapper for Zipper: Introduced py_executable_zip_gen, a C++ binary that prepares the input manifest for the zipper tool, handling path rewriting and __init__.py generation more efficiently.
  • Streamlined Starlark Logic: Simplified the _create_zip_file function in Starlark by reducing its inputs and offloading manifest generation details to the new C++ wrapper.


@tobyh-canva force-pushed the create-zip-file-analysis-optimisations-2 branch from c10ec90 to 7ef352e (October 26, 2025 22:16)

@gemini-code-assist bot left a comment

Code Review

This is a fantastic performance improvement! Moving the expensive depset manipulation and path normalization from the analysis phase to the execution phase by introducing a C++ wrapper is a great strategy. The flame graphs clearly show the massive impact of this change.

I have a few comments on the C++ implementation regarding correctness and security. Specifically, I've pointed out a potential issue with locating the zipper tool and the use of std::system, which can be a security risk.

Additionally, while you've done manual testing across platforms, this change would greatly benefit from automated tests. Adding unit tests for the new C++ wrapper (py_executable_zip_gen.cc) would ensure its logic is correct and robust. Also, adding integration tests to verify the PythonZipper action with the new wrapper would prevent future regressions. Since this is a critical path for building Python binaries, ensuring its correctness with a solid test suite is very important.

Overall, this is a very valuable contribution that will significantly improve analysis performance for many users of rules_python.

std::exit(1);
}

std::string zipper_path = runfiles->Rlocation("bazel_tools/tools/zip/zipper/zipper");
gemini-code-assist bot commented (severity: high):

The path bazel_tools/tools/zip/zipper/zipper seems incorrect. The zipper target from @bazel_tools//tools/zip:zipper should be available in runfiles at bazel_tools/tools/zip/zipper. The extra /zipper at the end is likely a typo and will cause this tool to fail to find the zipper executable.

Suggested change
std::string zipper_path = runfiles->Rlocation("bazel_tools/tools/zip/zipper/zipper");
std::string zipper_path = runfiles->Rlocation("bazel_tools/tools/zip/zipper");

Comment on lines +243 to +272
std::string cmd = zipper_path + " cC " + output + " @" + manifest_path;
int result = std::system(cmd.c_str());
gemini-code-assist bot commented (severity: high):

Using std::system is generally discouraged as it can introduce security vulnerabilities (command injection) and is not robust against paths with spaces or other special characters. Since the output path is passed from an external source (command-line arguments), it could potentially be manipulated to execute arbitrary commands.

A safer approach would be to use fork and execv on POSIX systems and CreateProcess on Windows to execute the zipper tool directly, without involving a shell.

If a full rewrite is not feasible, at a minimum, all path arguments in the command string should be properly quoted to handle spaces and prevent trivial command injection. For example:

std::string cmd = "\"" + zipper_path + "\" cC \"" + output + "\" @\"" + manifest_path + "\"";

However, replacing std::system is the recommended solution for robustness and security.

@tobyh-canva (Author) replied:

Maintainers, please let me know if this is a genuine concern in this circumstance 😅

@tobyh-canva force-pushed the create-zip-file-analysis-optimisations-2 branch 3 times, most recently from 12c96bc to 6db7a1a (October 26, 2025 22:57)
@tobyh-canva force-pushed the create-zip-file-analysis-optimisations-2 branch 2 times, most recently from 1b2dbc1 to fe6038f (October 26, 2025 23:08)
@groodt (Collaborator) commented Oct 26, 2025

I wasn't aware that --incompatible_default_to_explicit_init_py=true made such a significant improvement to analysis performance. Is it worth mentioning this somewhere in the docs of rules_python? It smells to me like it just disables a legacy Google-specific quirk, though I could be wrong.

Yes, it's a historical thing that was difficult to remove when much of the rules_python implementation was in upstream bazelbuild/bazel. I think now that the starlarkification efforts have landed, it might be possible to flip the default. The tracking issue is here: #2945

WDYT @rickeylev? Are we now in a position to flip this?

@tobyh-canva force-pushed the create-zip-file-analysis-optimisations-2 branch from 2d51dc1 to 4464bb7 (October 27, 2025 00:19)
@tobyh-canva force-pushed the create-zip-file-analysis-optimisations-2 branch from 4464bb7 to cef10a3 (October 27, 2025 00:43)
@rickeylev (Collaborator) commented

Thanks for this! Yeah, the runfiles.files.to_list() flattening was necessary to filter out the main executable file; it predates runfiles_without_exe.

What does perf look like if L853 (for artifact in runfiles.files.to_list(): ...) is moved to the execution phase using map_each (and the C++ tool isn't used)? By using runfiles_without_exe, that custom filtering logic doesn't need to occur, so the depset can be passed directly.
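
For reference, a minimal sketch of that shape (the names and the manifest entry format are illustrative, not the actual code):

def _map_zip_runfiles(file):
    # Runs when the action's command line / params file is expanded,
    # not during analysis.
    return file.short_path + "=" + file.path

def _zip_manifest_args(ctx, runfiles_without_exe):
    args = ctx.actions.args()
    # Pass the depset directly; the per-file string work is deferred to map_each.
    args.add_all(runfiles_without_exe, map_each = _map_zip_runfiles)
    args.use_param_file("@%s", use_always = True)
    return args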

Basically, I'm wondering how much of the perf improvement comes from avoiding the analysis-time depset flattening vs. C++ doing string operations instead of Starlark/Java. I'm not super keen to require on-demand building of a C++ binary (and C++ is a bit of a multi-platform PITA), but if the perf gains are this significant, it's pretty compelling.

That get_zip_runfiles_path is showing up so prominently is weird/surprising. Those should be fairly simple string manipulations. But, looking at the paths.relativize implementation, it's actually doing quite a bit of work in there (normalizing both paths, which incurs splits/appends, then some zip+looping, etc.). L876 can be replaced with a simple stripprefix() call. L882 can be replaced with e.g. "{workspace}/{path}" if not path.startswith("../") else path[3:].
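
Something along these lines, for example (simplified sketch; the legacy external-runfiles handling is ignored and the names are made up):

_ZIP_RUNFILES_DIR = "runfiles"

def _zip_runfiles_path(path, workspace_name):
    # External files arrive with a leading "../<repo>/..." that points out of
    # the main workspace; everything else gets the workspace name prepended.
    if path.startswith("../"):
        rel = path[len("../"):]
    else:
        rel = "{}/{}".format(workspace_name, path)
    return "{}/{}".format(_ZIP_RUNFILES_DIR, rel)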

Also, how big are the depsets you're working with? 40k? 100k? I ask because the experimental venv construction logic has to do depset flattening.

@tobyh-canva (Author) commented Oct 27, 2025

Hi @rickeylev :) I was also hoping I could avoid the additional C++ binary 😅

I did not think to replace the calls to path.relativize - I'll attempt your suggestions and regenerate the profile to check. Thanks! 🫡

Also, how big are the depset's you're working with? 40k? 100k? I ask because the experimental venv construction logic has to do depset flattening

I'm actually not sure, though I'd be happy to profile this experimental logic on our repo. Is there a config setting I should toggle to try it out?

@aignas (Collaborator) commented Oct 27, 2025

It is --@rules_python//python/config_settings:venv_site_packages=yes; it will use the codepaths introduced in #2156 to build a virtualenv just like the venv or uv tools would.

@tobyh-canva (Author) commented Oct 27, 2025

This is the profile after applying @rickeylev's suggestions - replacing paths.relativize and paths.normalize with simpler logic instead of using a C++ wrapper. Oddly, map_zip_runfiles is still fairly significant (15% overall) in the profile. See #3381 for the actual implementation I tested with.


I thought map_each functions were deferred to build time, so I'm a bit confused why map_zip_runfiles is showing up in the analysis profile during a --nobuild - is this expected when we're using a closure for the map function?
