Switch coverage CI targets to EngFlow #39269

Merged on May 1, 2025 (2 commits)

krinkinmu
Contributor

Commit Message:

Currently, Envoy CI uses different RBE backends for different CI targets. EngFlow is one of the available backends, and we want to migrate most, if not all, targets to EngFlow.

Besides making the overall CI setup simpler, EngFlow appears to offer more powerful build grid machines at the moment, plus some nice additional features such as automatically scaling memory when build tasks don't fit into the available memory.

One specific reason I'd like to migrate coverage targets to EngFlow is that I want to switch them to static linking to work around a bug in Clang/LLVM source-based coverage (see llvm/llvm-project#32849).

With the currently used Google RBE backend we are having issues with fuzzing coverage tests: fuzzing tests include a lot of extensions, and together with coverage instrumentation this pushes the linker memory footprint way too high, causing OOMs.

Our approach to solving this particular problem is two-fold:

  1. I want to migrate to EngFlow, which can offer bigger machines (and this aligns with the general direction for Envoy CI to migrate to EngFlow).
  2. I want to optimize our fuzzing targets a little bit by cutting out some unnecessary bits and reducing the number of libraries that the linker needs to link together (on top of reducing the amount of time it takes to build things and similar benefits).

This particular PR takes care of the first part.

Additional Description:

Some relevant discussions can be found in #39030 which prompted me to work on this in the first place. And I will use #39248 as a tracking bug for the coverage changes.

Risk Level: medium (it could break CI and cause disruption, but rolling it back should be easy)
Testing: I applied the same change in the envoy-ci-staging repo and created a test PR to see if EngFlow is used for coverage. The setup between envoy-ci-staging and envoy repo is not exactly the same, so I could have missed something.

Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

+cc @phlax

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #39269 was opened by krinkinmu.

Member

@phlax left a comment

lgtm, thanks @krinkinmu

really appreciated, thanks

@krinkinmu marked this pull request as ready for review on April 30, 2025, 10:14
@phlax merged commit e0420ee into envoyproxy:main on May 1, 2025
24 checks passed
krinkinmu added a commit to krinkinmu/envoy that referenced this pull request May 1, 2025
This reverts commit e0420ee.
It unexpectedly resulted in lower coverage numbers - we should roll back
while we investigate what's going on, to avoid disruptions.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
phlax pushed a commit that referenced this pull request May 1, 2025
Commit Message:

This reverts commit e0420ee. It unexpectedly resulted in lower coverage
numbers - we should roll back while we investigate what's going on, to
avoid disruptions.

Additional Description:

Some relevant discussions can be found in #39030 which prompted me to
work on this in the first place. And I will use #39248 as a tracking bug
for the coverage changes.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>