zetasql/2024.11.1 package update #34096

Open
wants to merge 2 commits into
base: main

octo-sts[bot]
Contributor

@octo-sts octo-sts bot commented Nov 13, 2024

Signed-off-by: wolfi-bot <121097084+wolfi-bot@users.noreply.github.com>
@octo-sts octo-sts bot added request-version-update request for a newer version of a package automated pr labels Nov 13, 2024
Contributor Author

octo-sts bot commented Nov 13, 2024

Gen AI suggestions to solve the build error:

Based on the error output, there are a few issues to address:

  1. Primary Issue - Patch Failure:
    The patch is failing because it can't find the file zetasql/jdk/BUILD to patch. This could be because:
  • The file path in the patch file doesn't match the actual repository structure
  • The repository structure has changed since the patch was created

Suggestions to fix:

  1. Verify the correct path in the repository:
# When in interactive mode
ls -R | grep BUILD  # List all BUILD files
  2. Update the patch file paths to match the actual repository structure
  • You might need to modify multi-arch-for-linux.patch to use the correct paths
  • If the file has moved, update the path in the patch file
  2. Secondary Consideration - Build Resources:
    The package requests significant resources:
resources:
    cpu: 33
    memory: 100Gi

These are quite high but might be necessary for Bazel builds.

  3. Authentication Error Note:
    There's an error about Google authentication, but it doesn't appear to be affecting the build:
Error: rpc error: code = NotFound desc = federate identity...

This can be ignored for now as it's not blocking the build process.

Next steps:

  1. Examine the repository structure to confirm the correct paths for the BUILD files
  2. Update the patch file to match the actual file locations
  3. If needed, recreate the patch file against the current version of the repository

To recreate the patch if needed:

git diff > new-multi-arch-for-linux.patch

Focus on fixing the patch file paths first, as this is the immediate cause of the build failure.
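
The path check suggested above can be sketched in a few lines of Python. This is a minimal sketch, not part of the Wolfi tooling: the function names `patch_targets` and `missing_targets` are hypothetical, and the sample patch text is invented for illustration because the real multi-arch-for-linux.patch is not shown in this thread. It reads the `---`/`+++` headers of a unified diff and reports which target files are absent from a checkout.

```python
import re
import tempfile
from pathlib import Path

# Matches the "+++ b/<path>" header that `git diff` writes for each patched file.
_TARGET_RE = re.compile(r"^\+\+\+ (?:b/)?(\S+)")


def patch_targets(patch_text: str) -> list[str]:
    """Return the paths a unified diff expects to already exist."""
    targets = []
    creates_file = False
    for line in patch_text.splitlines():
        if line.startswith("--- "):
            # "--- /dev/null" means the patch creates the file, so it need not exist yet.
            creates_file = line[4:].split("\t")[0].strip() == "/dev/null"
            continue
        match = _TARGET_RE.match(line)
        if match and not creates_file and match.group(1) != "/dev/null":
            targets.append(match.group(1))
    return targets


def missing_targets(patch_text: str, source_root: Path) -> list[str]:
    """Return the patch targets that are not regular files under source_root."""
    return [t for t in patch_targets(patch_text) if not (source_root / t).is_file()]


if __name__ == "__main__":
    # Hypothetical patch content, used only to exercise the functions.
    sample = "--- a/zetasql/jdk/BUILD\n+++ b/zetasql/jdk/BUILD\n@@ -1 +1 @@\n-old\n+new\n"
    with tempfile.TemporaryDirectory() as tmp:
        # The temporary directory is empty, so the single target is reported as missing.
        print(missing_targets(sample, Path(tmp)))
```

Paths containing spaces are not handled; for a real run, pass the patch file's text and the root of the git checkout.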

@hbh7
Member

hbh7 commented Nov 14, 2024

Important previous context here: #26865

Signed-off-by: hectorj2f <hector@chainguard.dev>
Contributor Author

octo-sts bot commented Nov 21, 2024

Gen AI suggestions to solve the build error:

• Detected Error:

Error in fail: Unsupported linux distribution and version: wolfi, 20230201

• Error Category: Build Configuration

• Failure Point: LLVM toolchain setup in Bazel build system

• Root Cause Analysis: The LLVM toolchain rules don't recognize Wolfi OS as a supported Linux distribution for building. This is common when using newer or non-mainstream distributions with build tools that have hard-coded OS checks.

• Suggested Fix:
Add the following to the environment section of the melange yaml:

environment:
  contents:
    packages:
      - llvm15
      - clang15
  environment:
    BAZEL_ARGS: "--config=g++ --sandbox_add_mount_pair=/usr/lib/llvm15"

• Explanation:

  1. This bypasses the LLVM toolchain fetch by providing system LLVM
  2. Using system LLVM avoids distribution detection issues
  3. Mount pair ensures Bazel can access system LLVM in sandbox
  4. LLVM 15 is stable and compatible with ZetaSQL requirements

• Additional Notes:

  • The error occurs because the LLVM toolchain rules expect specific Linux distributions
  • Using system LLVM is a common workaround for distribution compatibility issues
  • Alternative approach would be to patch the LLVM toolchain rules to support Wolfi
  • Long term fix would be to upstream Wolfi support to the LLVM toolchain rules
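
To make the failure mode concrete, here is a minimal Python sketch of a hard-coded distribution check. It assumes the toolchain rules derive the two values in the error message from the `ID` and `VERSION_ID` fields of /etc/os-release. The allow-list `KNOWN_DISTROS`, the function names, and the sample file contents are hypothetical; the real list lives in the toolchain rules and is not reproduced here.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines in the os-release format into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


# Hypothetical allow-list for illustration only.
KNOWN_DISTROS = {"debian", "ubuntu", "fedora"}


def check_supported(info: dict[str, str]) -> tuple[str, str]:
    """Raise ValueError for any distribution that is not on the allow-list."""
    distro = info.get("ID", "")
    version = info.get("VERSION_ID", "")
    if distro not in KNOWN_DISTROS:
        raise ValueError(
            f"Unsupported linux distribution and version: {distro}, {version}"
        )
    return distro, version


if __name__ == "__main__":
    # Assumed os-release contents, chosen to match the values in the error above.
    sample = 'ID=wolfi\nNAME="Wolfi"\nVERSION_ID="20230201"\n'
    try:
        check_supported(parse_os_release(sample))
    except ValueError as err:
        print(err)
```

Any check of this shape rejects a distribution it has never heard of, regardless of whether a compatible toolchain is installed.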

• References:

Contributor Author

octo-sts bot commented Nov 21, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Unsupported linux distribution and version: wolfi, 20230201"

• Error Category: Build Configuration

• Failure Point: LLVM toolchain repository setup during Bazel build

• Root Cause Analysis: The LLVM toolchain build rules don't recognize Wolfi OS as a supported distribution. The toolchain is attempting to detect the host OS and failing because Wolfi isn't in its list of known distributions.

• Suggested Fix:
Add the following to the environment section in the melange.yaml:

environment:
  contents:
    packages:
      # ... existing packages ...
      - llvm15
      - llvm15-dev
  environment:
    # ... existing env vars ...
    BAZEL_ARGS: "--config=g++ --sandbox_add_mount_pair=/usr/lib/llvm15 --action_env=LLVM_COMMIT=llvm-15 --action_env=DISTRIBUTION=debian --action_env=DISTRIBUTION_VERSION=11"

• Explanation:

  1. Installing LLVM directly from Wolfi packages bypasses the need for Bazel to download and build LLVM
  2. Setting DISTRIBUTION and DISTRIBUTION_VERSION tricks the build system into using Debian-compatible paths
  3. Specifying LLVM_COMMIT ensures consistent toolchain version
  4. The sandbox mount ensures the system LLVM is accessible to the build

• Additional Notes:

  • This is a common issue when building on newer/alternative distributions
  • The fix maintains build reproducibility while working around distribution detection
  • Using system LLVM is generally preferred over building from source for this package
  • Wolfi's LLVM packages are regularly updated and security patched

• References:

Contributor Author

octo-sts bot commented Nov 21, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Unsupported linux distribution and version: wolfi, 20230201"

• Error Category: Configuration/Build System

• Failure Point: LLVM toolchain repository initialization during Bazel build

• Root Cause Analysis: The LLVM toolchain build rules don't recognize Wolfi OS as a supported distribution. The toolchain is trying to detect the host OS for building LLVM components but fails because Wolfi isn't in its list of known distributions.

• Suggested Fix:
Add a bazel configuration override in the pipeline section to specify a supported distribution:

pipeline:
  - uses: git-checkout
    with:
      repository: https://github.com/google/zetasql
      tag: ${{package.version}}
      expected-commit: a516c6b26d183efc4f56293256bba92e243b7a61

  - uses: patch
    with:
      patches: multi-arch-for-linux.patch

  - runs: |
      cd zetasql
      bazel build --verbose_failures ${BAZEL_ARGS} $EXTRA_BAZEL_ARGS \
        --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 \
        --host_platform=@local_config_platform//:host \
        --platforms=@local_config_platform//:host \
        -c opt ...

• Explanation:
The fix bypasses the LLVM toolchain's OS detection by forcing Bazel to use the host toolchain directly. Since Wolfi uses glibc and has LLVM15 installed as a package, this should work as the necessary components are available in the build environment.

• Additional Notes:

  • Consider adding a patch to make the LLVM toolchain rules recognize Wolfi
  • Alternative approach would be to override the detected OS to a supported one like Ubuntu
  • The host toolchain approach is simpler but may need adjustments for cross-compilation

• References:

Contributor Author

octo-sts bot commented Nov 22, 2024

Gen AI suggestions to solve the build error:

• Detected Error:

Error: LinkageError occurred while loading main class com.google.devtools.build.lib.bazel.Bazel
java.lang.UnsupportedClassVersionError: com/google/devtools/build/lib/bazel/Bazel has been compiled by a more recent version of the Java Runtime (class file version 65.0), this version of the Java Runtime only recognizes class file versions up to 61.0

• Error Category: Dependency/Version

• Failure Point: Bazel server startup during build process

• Root Cause Analysis:
The installed JDK 17 (class file version 61.0) is too old for Bazel 7.4.1, which requires JDK 21 (class file version 65.0).

• Suggested Fix:
Update the environment section to use OpenJDK 21:

environment:
  contents:
    packages:
      - openjdk-21
      - openjdk-21-default-jvm
      # ... other packages ...
  environment:
    JAVA_HOME: /usr/lib/jvm/java-21-openjdk

• Explanation:
Bazel 7.x requires JDK 21 for its execution. The error indicates a Java class version mismatch where the Bazel binary was compiled with JDK 21 but we're trying to run it with JDK 17. Updating to OpenJDK 21 will provide the correct Java version support.

• Additional Notes:

  • Bazel 7.x made JDK 21 a requirement for running the Bazel server
  • Class file version 65.0 corresponds to JDK 21
  • Class file version 61.0 corresponds to JDK 17
  • This is a runtime requirement, not a build requirement for the actual ZetaSQL code
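
The version numbers in the error can be checked by hand: a class file starts with the magic number 0xCAFEBABE, followed by a 2-byte minor version and a 2-byte major version, and for Java 1.2 and later the release number is the major version minus 44. A minimal sketch (the function names are hypothetical):

```python
import struct


def class_file_major(header: bytes) -> int:
    """Read the major version from the first 8 bytes of a .class file."""
    # Layout: u4 magic, u2 minor_version, u2 major_version, all big-endian.
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def java_release_for_major(major: int) -> int:
    """Map a class file major version to a Java release (valid for Java 1.2+)."""
    return major - 44


if __name__ == "__main__":
    # Synthetic 8-byte header carrying major version 65, as reported for the Bazel class.
    header = struct.pack(">IHH", 0xCAFEBABE, 0, 65)
    print(java_release_for_major(class_file_major(header)))  # 21
    print(java_release_for_major(61))  # 17
```

To inspect a real class, pass the first 8 bytes of the .class file extracted from the jar.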

• References:

Contributor Author

octo-sts bot commented Nov 22, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Unsupported linux distribution and version: wolfi, 20230201"

• Error Category: Configuration/Dependency

• Failure Point: Bazel's LLVM toolchain configuration during the build process

• Root Cause Analysis:
The LLVM toolchain configuration in Bazel doesn't recognize Wolfi OS as a supported distribution. This is because the toolchain's distribution detection logic in release_name.bzl is failing to handle Wolfi.

• Suggested Fix:
Add LLVM packages directly to the environment section and modify the build configuration to use the system LLVM instead of downloading it:

environment:
  contents:
    packages:
      - llvm15
      - llvm15-dev
      - clang15
      # Add existing packages...

pipeline:
  - runs: |
      cd zetasql
      bazel build --verbose_failures ${BAZEL_ARGS} $EXTRA_BAZEL_ARGS \
        --action_env=CC=/usr/bin/clang \
        --action_env=CXX=/usr/bin/clang++ \
        --action_env=LLVM_CONFIG=/usr/bin/llvm-config \
        -c opt ...

• Explanation:
Instead of letting Bazel download and configure LLVM toolchain (which fails due to distribution detection), we provide the system LLVM packages and tell Bazel to use them through environment variables. This bypasses the problematic distribution detection while still providing the necessary toolchain.

• Additional Notes:

  1. Wolfi OS is relatively new and many build systems don't recognize it yet
  2. The error occurs in the LLVM toolchain's distribution detection code
  3. Using system LLVM packages is a common workaround for such issues
  4. Make sure to use LLVM 15 as it's known to work with ZetaSQL's requirements

• References:

Contributor Author

octo-sts bot commented Nov 22, 2024

Gen AI suggestions to solve the build error:

• Detected Error: "Unsupported linux distribution and version: wolfi, 20230201"

• Error Category: Build Configuration

• Failure Point: LLVM toolchain repository fetch during Bazel build

• Root Cause Analysis: The LLVM toolchain build rule doesn't recognize Wolfi as a supported Linux distribution. This is common with build systems that have hard-coded distribution checks.

• Suggested Fix:

  1. Add to environment section:
  environment:
    BAZEL_ARGS: "--config=g++"
    EXTRA_BAZEL_ARGS: "--tool_java_runtime_version=local_jdk"
    JAVA_HOME: /usr/lib/jvm/java-17-openjdk
    # Add these lines:
    CC: clang
    CXX: clang++
    BAZEL_LLVM: "/usr"
  2. Add to pipeline before bazel build:
  - runs: |
      cd zetasql
      # Configure to use system LLVM
      echo "build --action_env=BAZEL_LLVM=/usr" >> .bazelrc
      echo "build --action_env=CC=clang" >> .bazelrc
      echo "build --action_env=CXX=clang++" >> .bazelrc
      bazel build --verbose_failures ${BAZEL_ARGS} $EXTRA_BAZEL_ARGS -c opt ...

• Explanation:

  • The error occurs because the LLVM toolchain rules try to detect the Linux distribution but don't recognize Wolfi
  • By configuring Bazel to use the system LLVM installation (provided by llvm15 packages) instead of downloading its own, we bypass the distribution check
  • This approach is valid since Wolfi already provides LLVM 15 through the package manager

• Additional Notes:

  • This is a common issue when building on newer/custom distributions
  • The fix leverages existing system LLVM rather than downloading a new copy
  • Make sure llvm15 and related packages are properly installed in environment section

• References:

@hectorj2f hectorj2f force-pushed the wolfictl-8eca536c-587a-4631-b36a-416d8b9f0b41 branch from f7720ff to 1917b4a Compare November 25, 2024 11:06
Contributor Author

octo-sts bot commented Nov 25, 2024

Gen AI suggestions to solve the build error:

• Detected Error:

Error in fail: Unsupported linux distribution and version: wolfi, 20230201

• Error Category: Build Configuration/Dependency

• Failure Point:
LLVM toolchain repository fetch during Bazel build process

• Root Cause Analysis:
The LLVM toolchain build rules don't recognize Wolfi OS as a supported Linux distribution. The toolchain is trying to determine the appropriate LLVM distribution to download based on the host OS, but fails when it detects Wolfi.

• Suggested Fix:
Add LLVM tools directly as a build dependency in the package environment section:

environment:
  contents:
    packages:
      - bash
      - bazel-6
      - binutils
      - build-base
      - busybox
      - ca-certificates-bundle
      - gcc-12
      - git
      - openjdk-17
      - openjdk-17-default-jvm
      - openssf-compiler-options
      - patch
      - python3
      - tzdata
      - wolfi-baselayout
      - llvm15
      - llvm15-dev
      - clang15

• Explanation:
Instead of letting Bazel download and configure LLVM, we provide the system LLVM packages directly. This bypasses the distribution detection issue while still providing the required LLVM toolchain components.

• Additional Notes:

  1. Wolfi OS is relatively new and many build systems don't recognize it yet
  2. The LLVM toolchain rules in Bazel are specifically looking for known distributions like Ubuntu, Debian, etc.
  3. Using system-provided LLVM packages is a common workaround for cross-distribution compatibility issues

• References:

@cmwilson21
Member

👋 @hbh7 - As we are implementing the new interrupts/escalation process, would you mind adding and filling out the escalation template on this one?

@dannf dannf added the ai/skip-comment Stop AI from commenting on PR label Dec 19, 2024
@hbh7
Member

hbh7 commented Dec 19, 2024

Reason for Escalation / Level of Urgency

Package build is failing for an unknown reason. Medium priority, not known to be blocking anything but this has now been open for an extended period of time.

If prospect/customer issue, please provide needed by date

N/A

Short Description (Context / Steps already done)

Build is failing possibly due to the following error.

Error Messages / Logs

Error in fail: Unsupported linux distribution and version: wolfi, 20230201

Steps to Reproduce

  1. Run build
  2. Observe error

Customers / Images / SLA affected

Unknown

Possible Solution

Unknown

@dannf
Contributor

dannf commented Dec 19, 2024

I'm new to bazel builds. I had to do some hacking of the build tree to get it to try and build with the system toolchain. That's especially annoying because we're asking it to build w/ gcc, not llvm/clang. Advice on better ways to do this welcome.

  • I commented out the http_archive section for toolchains_llvm in WORKSPACE.
  • In bazel/zetasql_deps_step_1.bzl, I commented out the load("@toolchains_llvm* lines, and the bazel_toolchain_dependencies() and llvm_toolchain() calls.
  • In bazel/zetasql_deps_step_2.bzl, I commented out the load("@llvm_toolchain... line as well as the llvm_register_toolchains() line.

I then set up a bunch of symlinks:

ln -s /usr/lib/llvm16/bin $(bazel info output_base)/external/llvm_toolchain/bin
ln -s /usr/lib/llvm16/include $(bazel info output_base)/external/llvm_toolchain_llvm/include
ln -s /usr/lib/llvm16/lib $(bazel info output_base)/external/llvm_toolchain_llvm/lib
mkdir $(bazel info output_base)/external/llvm_toolchain_llvm/bin
ln -s /usr/lib/clang* $(bazel info output_base)/external/llvm_toolchain_llvm/bin
ln -s /usr/lib/llvm16/bin/* $(bazel info output_base)/external/llvm_toolchain_llvm/bin

With that, I was able to get a build going and I observed a familiar error message:

external/com_google_absl/absl/container/internal/raw_hash_set.h:4019:56:   in 'constexpr' expansion of 'absl::container_internal::hash_policy_traits<absl::container_internal::FlatHashMapPolicy<int, std::__cxx11::basic_string<char> >, void>::get_hash_slot_fn<absl::hash_internal::Hash<int> >()'
external/com_google_absl/absl/container/internal/hash_policy_traits.h:163:54: error: '(absl::container_internal::TypeErasedApplyToSlotFn<absl::hash_internal::Hash<int>, int> == 0)' is not a constant expression
  163 |     return Policy::template get_hash_slot_fn<Hash>() == nullptr
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

See: #34075

The workaround we've been implementing for this is to set GCC_SPEC_FILE=/dev/null in the build environment. But I see that bazel strips the environment to help produce hermetic builds. I tried using --action_env=GCC_SPEC_FILE=/dev/null on the build command line as well as various places in .bazelrc, but it didn't seem to fix anything.
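
As an analogy for the environment stripping described above (not a model of Bazel's actual implementation), the Python sketch below runs a child process with a scrubbed environment. A variable exported in the parent is invisible to the child unless it is forwarded explicitly, which is the role `--action_env` is meant to play. The helper name `run_action` and the short pass-through list are hypothetical, and the sketch does not explain why `--action_env` had no effect in this build.

```python
import os
import subprocess
import sys

# Variables kept only so the child interpreter can start; everything else is dropped.
_PASS_THROUGH = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def run_action(forwarded=None):
    """Run a child with a scrubbed environment and report what it sees."""
    env = {k: os.environ[k] for k in _PASS_THROUGH if k in os.environ}
    # Explicitly forwarded variables play the role of --action_env=NAME=value.
    env.update(forwarded or {})
    child_code = "import os; print(os.environ.get('GCC_SPEC_FILE', '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    os.environ["GCC_SPEC_FILE"] = "/dev/null"  # set in the parent, like a melange env var
    print(run_action())  # the scrubbed child does not inherit it
    print(run_action({"GCC_SPEC_FILE": "/dev/null"}))  # forwarded explicitly
```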

@dannf
Contributor

dannf commented Dec 28, 2024

I can avoid the symlinking and just comment out the toolchains_llvm references:

diff --git a/WORKSPACE b/WORKSPACE
index 994ce01..6d33ddb 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -37,13 +37,13 @@ workspace(name = "com_google_zetasql")
 
 load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 
-http_archive(
-    name = "toolchains_llvm",
-    canonical_id = "1.0.0",
-    sha256 = "e91c4361f99011a54814e1afbe5c436e0d329871146a3cd58c23a2b4afb50737",
-    strip_prefix = "toolchains_llvm-1.0.0",
-    url = "https://github.com/bazel-contrib/toolchains_llvm/releases/download/1.0.0/toolchains_llvm-1.0.0.tar.gz",
-)
+# http_archive(
+#     name = "toolchains_llvm",
+#     canonical_id = "1.0.0",
+#     sha256 = "e91c4361f99011a54814e1afbe5c436e0d329871146a3cd58c23a2b4afb50737",
+#     strip_prefix = "toolchains_llvm-1.0.0",
+#     url = "https://github.com/bazel-contrib/toolchains_llvm/releases/download/1.0.0/toolchains_llvm-1.0.0.tar.gz",
+# )
 
 http_archive(
     name = "rules_jvm_external",
diff --git a/bazel/zetasql_deps_step_1.bzl b/bazel/zetasql_deps_step_1.bzl
index 825bf8e..1fde594 100644
--- a/bazel/zetasql_deps_step_1.bzl
+++ b/bazel/zetasql_deps_step_1.bzl
@@ -22,25 +22,25 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 # but depend on them being something different. So we have to override them both
 # by defining the repo first.
 load("@com_google_zetasql//bazel:zetasql_bazel_version.bzl", "zetasql_bazel_version")
-load("@toolchains_llvm//toolchain:deps.bzl", "bazel_toolchain_dependencies")
-load("@toolchains_llvm//toolchain:rules.bzl", "llvm_toolchain")
+# load("@toolchains_llvm//toolchain:deps.bzl", "bazel_toolchain_dependencies")
+# load("@toolchains_llvm//toolchain:rules.bzl", "llvm_toolchain")
 
 def zetasql_deps_step_1(add_bazel_version = True):
     if add_bazel_version:
         zetasql_bazel_version()
 
-    bazel_toolchain_dependencies()
-    llvm_toolchain(
-        name = "llvm_toolchain",
-        llvm_versions = {
-            "": "16.0.0",
-            # The LLVM repo stops providing pre-built binaries for the MacOS x86_64
-            # architecture for versions >= 16.0.0: https://github.com/llvm/llvm-project/releases,
-            # but our Kokoro MacOS tests are still using x86_64 (ventura).
-            # TODO: Upgrade the MacOS version to sonoma-slcn.
-            "darwin-x86_64": "15.0.7",
-        },
-    )
+    # bazel_toolchain_dependencies()
+    # llvm_toolchain(
+    #     name = "llvm_toolchain",
+    #     llvm_versions = {
+    #         "": "16.0.0",
+    #         # The LLVM repo stops providing pre-built binaries for the MacOS x86_64
+    #         # architecture for versions >= 16.0.0: https://github.com/llvm/llvm-project/releases,
+    #         # but our Kokoro MacOS tests are still using x86_64 (ventura).
+    #         # TODO: Upgrade the MacOS version to sonoma-slcn.
+    #         "darwin-x86_64": "15.0.7",
+    #     },
+    # )
 
     http_archive(
         name = "io_bazel_rules_go",
diff --git a/bazel/zetasql_deps_step_2.bzl b/bazel/zetasql_deps_step_2.bzl
index 6873dbe..03cd8df 100644
--- a/bazel/zetasql_deps_step_2.bzl
+++ b/bazel/zetasql_deps_step_2.bzl
@@ -19,7 +19,7 @@
 load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies", "go_repository")
 load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 load("@io_bazel_rules_go//go:deps.bzl", "go_register_toolchains", "go_rules_dependencies")
-load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains")
+# load("@llvm_toolchain//:toolchains.bzl", "llvm_register_toolchains")
 load("@rules_bison//bison:bison.bzl", "bison_register_toolchains")
 load("@rules_flex//flex:flex.bzl", "flex_register_toolchains")
 load("@rules_foreign_cc//foreign_cc:repositories.bzl", "rules_foreign_cc_dependencies")
@@ -29,7 +29,7 @@ load("@rules_proto//proto:setup.bzl", "rules_proto_setup")
 load("@rules_proto//proto:toolchains.bzl", "rules_proto_toolchains")
 
 def _load_deps_from_step_1():
-    llvm_register_toolchains()
+#     llvm_register_toolchains()
     rules_foreign_cc_dependencies()
 
 def textmapper_dependencies():

As a test, I replaced the system openssf.spec with an empty file, and that avoided the build failure with absl. I tried passing an alternate specfile w/ --copt='-specs=abseil-cpp.spec' and --host_copt='-specs=abseil-cpp.spec', as well as --host_action_env="GCC_SPEC_FILE=/dev/null". Those work for the target I'm building, but they don't get passed down to dependencies, so that seems like a no-go.

With the truncated-spec-file workaround, I then hit failures with tests that use non-latin filenames in civetweb and boost. I can work around those by removing those test cases, but ick. After that, I hit a failure with icu's build, which we may be able to work around by tweaking the ARFLAGS there.
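
A small helper can locate candidate files like the ones mentioned above. This is a minimal sketch under one assumption: that "non-latin" can be approximated as "contains any non-ASCII character", which is broader than strictly necessary. The function name `non_ascii_paths` is hypothetical.

```python
import os
import sys
from pathlib import Path


def non_ascii_paths(root):
    """Return files and directories under root whose names contain non-ASCII characters."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # str.isascii() is False for any character outside the 7-bit ASCII range.
            if not name.isascii():
                hits.append(Path(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Pass the directory to scan, for example an unpacked dependency source tree.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in non_ascii_paths(root):
        print(path)
```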

But long story short, there seems to be a lot going on with this one.

Labels
ai/skip-comment Stop AI from commenting on PR automated pr eng:os help wanted Extra attention is needed interrupt request-version-update request for a newer version of a package
8 participants