Skip to content

Conversation

sitaowang1998
Copy link
Contributor

@sitaowang1998 sitaowang1998 commented Sep 21, 2025

Description

This PR

  • Adds yscope-spider-py as a dependency of job-orchestration.
  • Changes to install c++ libraries in build/deps/cpp.
  • Adds tasks to install mariadb-connector-cpp and spider.

Notice that spider and mariadb-connector-cpp are both Ubuntu only tasks because of the build issue of mariadb-connector-cpp on other platforms and that we don't need to use spider on other platforms.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • spider C++ package installed successfully in Ubuntu 24.04.
  • Running deps:core and then deps:spider or vice versa does not build libraries again in Ubuntu 24.04.
  • job-orchestration builds.
  • GitHub workflows pass.

Summary by CodeRabbit

  • New Features

    • Added Spider build/orchestration tasks and a Spider build-dir variable; added yscope-spider-py dependency.
  • Refactor

    • Migrated public/config namespace from “core” to “cpp” (paths, checksum handling, CMake settings, tasks and artifact outputs).
  • Chores

    • Standardized parallelism env var to CLP_CPP_MAX_PARALLELISM_PER_BUILD_TASK across workflows and tasks.
    • Tightened job-orchestration Python constraint to >=3.9,<4.0.

@sitaowang1998 sitaowang1998 requested a review from a team as a code owner September 21, 2025 20:55
Copy link
Contributor

coderabbitai bot commented Sep 21, 2025

Walkthrough

Migrates public/task/config surfaces from the "core" namespace to "cpp": adds G_DEPS_CPP_* variables, renames core tasks to cpp, adds spider-related tasks, updates CI env-var names to CLP_CPP_*, adjusts a CMake deps include to use build/deps/cpp, and tightens/extends a Python project's dependencies.

Changes

Cohort / File(s) Summary
Deps taskfile namespace migration (core → cpp)
taskfiles/deps/main.yaml, taskfiles/deps/utils.yaml
Added G_DEPS_CPP_* variables (DIR, CHECKSUMS_DIR, CMAKE_SETTINGS_DIR, CHECKSUM_FILE); replaced G_DEPS_CORE_* references with G_DEPS_CPP_*; renamed checksum and helper tasks from corecpp; updated CMake settings, build/install/checksum paths and public artifact references to the CPP namespace; added/wired spider-related public tasks (spider, spider-dep, spider-all-parallel).
Root taskfile vars and usage
taskfile.yaml
Added G_SPIDER_BUILD_DIR; replaced G_CORE_MAX_PARALLELISM_PER_BUILD_TASK with G_CPP_MAX_PARALLELISM_PER_BUILD_TASK and environment default reference CLP_CPP_MAX_PARALLELISM_TASK; updated core-generate/core-build and package flows to use G_DEPS_CPP_* and include spider build/rsync steps.
CI workflows env var rename & spider invocation
.github/workflows/clp-artifact-build.yaml, .github/workflows/clp-core-build-macos.yaml
Replaced CLP_CORE_MAX_PARALLELISM_PER_BUILD_TASK with CLP_CPP_MAX_PARALLELISM_PER_BUILD_TASK in run_command/env contexts; updated workflow steps that invoke deps tasks to use the CPP-named parallelism variable; artifact build workflow adjusted to invoke CPP/spider deps in relevant job(s).
CMake deps path update
components/core/CMakeLists.txt
Changed deps include path from ../../build/deps/core to ../../build/deps/cpp so optional include loads cmake-settings from the cpp deps dir.
Job orchestration Python project
components/job-orchestration/pyproject.toml
Tightened Python requirement from >= 3.9 to >= 3.9,<4.0 and added dependency yscope-spider-py = "0.1.0".

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CI as CI Workflow
  participant Tasks as Taskfiles (deps)
  participant CMake as CMake (build/deps/cpp)
  participant Spider as Spider build

  CI->>Tasks: run deps:cpp / cpp-all-parallel
  Note right of Tasks #E7F6FF: Tasks now use G_DEPS_CPP_* and\nG_CPP_MAX_PARALLELISM_PER_BUILD_TASK
  Tasks->>CMake: configure / build / install (CMAKE_SETTINGS_DIR=.../cpp)
  CMake-->>Tasks: deps installed under build/deps/cpp

  CI->>Tasks: run spider / spider-all-parallel
  Tasks->>Spider: invoke spider build with cpp CMake args
  Spider-->>Tasks: spider artifacts produced

  CI->>CoreComponent: configure components/core
  CoreComponent->>CMake: optionally include ../../build/deps/cpp/cmake-settings/all.cmake
  CoreComponent-->>CI: continue build using cpp deps if included
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title focuses solely on adding Spider and its dependencies but omits the significant refactoring that migrates the core namespace to C++ (the G_DEPS_CPP_* variables and related tasks). As a result, it does not fully summarise the primary scope of changes, which also include extensive updates to the build layout and namespace for C++ libraries. Consider revising the title to reflect both the core-to-CPP namespace migration and the addition of Spider build tasks, for example: “feat(deps): migrate core to cpp namespace and add Spider build tasks and dependencies.” This will ensure the title conveys the full scope of the changeset at a glance.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
taskfiles/lint.yaml (2)

763-765: Include spider-orchestration in UV-driven lint/type-check passes.

It’s added to the first loop (black/ruff), but not to UV_PYTHON_PROJECTS, so it will miss ruff format and mypy under uv.

   py:
     internal: true
     vars:
       UV_PYTHON_PROJECTS:
-        - "{{.G_INTEGRATION_TESTS_DIR}}"
+        - "{{.G_INTEGRATION_TESTS_DIR}}"
+        - "components/spider-orchestration"

769-779: Optional: run tools from the project root for consistency.

You currently lint the package subdir "components/spider-orchestration/spider_orchestration". Consider running from the project root to let tools read pyproject.toml configs consistently.

-          - "components/spider-orchestration/spider_orchestration"
+          - "components/spider-orchestration"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b83e3cf and d5751df.

⛔ Files ignored due to path filters (1)
  • components/spider-orchestration/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • components/spider-orchestration/README.md (1 hunks)
  • components/spider-orchestration/pyproject.toml (1 hunks)
  • taskfiles/deps/main.yaml (9 hunks)
  • taskfiles/lint.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-08-20T05:38:30.720Z
Learnt from: junhaoliao
PR: y-scope/clp#1242
File: taskfile.yaml:545-547
Timestamp: 2025-08-20T05:38:30.720Z
Learning: The python component directories in the y-scope/clp repository are kept very clean and only contain essential files: source code directories, pyproject.toml, poetry.lock, README.md, and the dist/ directory. No cache directories, virtual environments, or other transient files exist in the component directories, so additional exclude patterns beyond "dist/**/*" are unnecessary.

Applied to files:

  • taskfiles/lint.yaml
📚 Learning: 2025-01-16T16:58:43.190Z
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.

Applied to files:

  • components/spider-orchestration/README.md
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-09-15T18:53:15.844Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1226
File: taskfiles/deps/main.yaml:522-527
Timestamp: 2025-09-15T18:53:15.844Z
Learning: When ystdlib is built with -Dystdlib_BUILD_TESTING=OFF in taskfiles/deps/main.yaml, it does not require Catch2 settings to be injected via CMAKE_GEN_ARGS since testing is disabled and Catch2 is not needed during the build process.

Applied to files:

  • taskfiles/deps/main.yaml
🪛 markdownlint-cli2 (0.18.1)
components/spider-orchestration/README.md

3-3: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: musllinux_1_2-x86_64-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: ubuntu-jammy-lint
  • GitHub Check: package-image
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-14, true)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (7)
components/spider-orchestration/pyproject.toml (1)

13-22: Validate dependency pins and pre-release usage.

Confirm that yscope-spider-py "~0.0.1-beta.2" is intended (pre-release) and compatible with Python 3.10+, and that Pydantic v1 is required (v2 is common now). If not intentional, update pins accordingly.

taskfiles/deps/main.yaml (6)

21-29: LGTM: centralizing LIB_NAME variables improves maintainability.

Good move; reduces duplication and eases future renames.


84-92: Verify CI behaviour when spider/mariadb tasks are Linux/Ubuntu-gated.

They’re declared as deps of core-all-parallel. Ensure the platform gating won’t hard-fail on non-Linux runners (Task should skip cleanly). If not, gate at the caller or split platform-specific aggregates.


578-593: LGTM: ystdlib PIC enabled and LIB_NAME centralized.

Matches prior learning about testing OFF; PIC is a good default for linkers.


210-217: LGTM: adding Boost.Process target.

Likely needed by Spider; no issues.


454-471: LGTM: spdlog configured to use external fmt and centralized LIB_NAME.

Consistent with fmt dep; good.


409-421: LGTM: msgpack-cxx C++20 and CMP0074 NEW with centralized LIB_NAME.

Settings look correct.

@sitaowang1998 sitaowang1998 changed the title feat: Add project setup for spider orchestration. feat: Add project setup for spider-orchestration. Sep 21, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5751df and 8a33b1c.

⛔ Files ignored due to path filters (1)
  • components/spider-orchestration/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • components/spider-orchestration/README.md (1 hunks)
  • components/spider-orchestration/pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-01-16T16:58:43.190Z
Learnt from: haiqi96
PR: y-scope/clp#651
File: components/clp-package-utils/clp_package_utils/scripts/compress.py:0-0
Timestamp: 2025-01-16T16:58:43.190Z
Learning: In the clp-package compression flow, path validation and error handling is performed at the scheduler level rather than in the compress.py script to maintain simplicity and avoid code duplication.

Applied to files:

  • components/spider-orchestration/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: manylinux_2_28-x86_64-static-linked-bins
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (6)
components/spider-orchestration/README.md (3)

1-1: Header reads well

Title is clear and scoped to Spider usage.


3-3: Remove trailing whitespace (markdownlint MD009).

There’s one trailing space at the end of Line 3.

-This Python module contains CLP's scheduler and worker to handle distributed compression.·
+This Python module contains CLP's scheduler and worker to handle distributed compression.

3-4: Fix README scope — don't claim the package ships a scheduler/worker.
This package orchestrates CLP via Spider and does not ship the scheduler or worker; tighten wording accordingly.

-This Python module contains CLP's scheduler and worker to handle distributed compression.
-CLP's Compression Job Handler can be used to interface and submit compression jobs to the CLP scheduler.
+This Python module orchestrates CLP compression jobs via Spider, integrating with the CLP scheduler and workers.
+The Compression Job Handler interfaces with the CLP scheduler to submit compression jobs.
components/spider-orchestration/pyproject.toml (3)

11-13: Confirm minimum Python version policy (3.9 EOL in Oct 2025).

If you don’t need 3.9, consider bumping to ">=3.10,<4.0" to align with Ubuntu 22.04+ and reduce maintenance.

Would you like a follow-up patch to raise and add classifiers accordingly?


1-3: Avoid duplicate Python version sources; drop the unused [project] block.

With poetry-core, metadata comes from [tool.poetry]. Keeping [project] risks conflicting signals.

Apply:

-[project]
-requires-python = ">= 3.9, <4.0"
-

9-9: README present — no action required.
components/spider-orchestration/README.md exists (verified); pyproject.toml's readme = "README.md" is valid.

Comment on lines 4 to 9
[tool.poetry]
name = "spider-orchestration"
version = "0.5.1-dev"
description = "Scheduler and worker cluster for CLP's distributed architecture."
authors = ["YScope Inc. <dev@yscope.com>"]
readme = "README.md"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Add license metadata.

Include the project licence in package metadata.

Apply:

 readme = "README.md"
+license = "Apache-2.0"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
[tool.poetry]
name = "spider-orchestration"
version = "0.5.1-dev"
description = "Scheduler and worker cluster for CLP's distributed architecture."
authors = ["YScope Inc. <dev@yscope.com>"]
readme = "README.md"
[tool.poetry]
name = "spider-orchestration"
version = "0.5.1-dev"
description = "Scheduler and worker cluster for CLP's distributed architecture."
authors = ["YScope Inc. <dev@yscope.com>"]
readme = "README.md"
license = "Apache-2.0"
🤖 Prompt for AI Agents
In components/spider-orchestration/pyproject.toml around lines 4-9, the Poetry
project metadata is missing license information; add a license entry (use the
repository's SPDX identifier, e.g. license = "MIT") and include license-file =
"LICENSE" under the [tool.poetry] table so the package metadata exposes the
project's license and points to the LICENSE file.


[tool.poetry]
name = "spider-orchestration"
version = "0.5.1-dev"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use a PEP 440–compliant version.

"0.5.1-dev" isn’t valid per PEP 440 and can break build/publish. Use a .devN pre-release.

Apply:

-version = "0.5.1-dev"
+version = "0.5.1.dev0"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
version = "0.5.1-dev"
version = "0.5.1.dev0"
🤖 Prompt for AI Agents
In components/spider-orchestration/pyproject.toml around line 6, the version
"0.5.1-dev" is not PEP 440–compliant; replace it with a PEP 440 .dev pre-release
(for example "0.5.1.dev0" or "0.5.1.dev1") by updating the version field
accordingly so packaging and publishing tools accept it.

Comment on lines 14 to 18
# MariaDB version must be compatible with libmariadb-dev <3.3.1 installed in the runtime
# environment. The default version of libmariadb-dev in Ubuntu 22.04 is 3.2.6.
# See https://mariadb.com/docs/connectors/mariadb-connector-python/install#build-prerequisites
mariadb = "~1.0.11"
msgpack = "^1.0.7"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Gate mariadb install to Linux to prevent cross‑platform failures.

The MariaDB Python package often relies on system libmariadb; on macOS/Windows this can fail unexpectedly.

Apply:

-mariadb = "~1.0.11"
+mariadb = { version = "~1.0.11", markers = "sys_platform == 'linux'" }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# MariaDB version must be compatible with libmariadb-dev <3.3.1 installed in the runtime
# environment. The default version of libmariadb-dev in Ubuntu 22.04 is 3.2.6.
# See https://mariadb.com/docs/connectors/mariadb-connector-python/install#build-prerequisites
mariadb = "~1.0.11"
msgpack = "^1.0.7"
# MariaDB version must be compatible with libmariadb-dev <3.3.1 installed in the runtime
# environment. The default version of libmariadb-dev in Ubuntu 22.04 is 3.2.6.
# See https://mariadb.com/docs/connectors/mariadb-connector-python/install#build-prerequisites
mariadb = { version = "~1.0.11", markers = "sys_platform == 'linux'" }
msgpack = "^1.0.7"
🤖 Prompt for AI Agents
In components/spider-orchestration/pyproject.toml around lines 14 to 18, the
mariadb dependency must be gated to Linux to avoid install failures on
macOS/Windows; change the mariadb entry to be platform‑marked so it only
installs on Linux (use a PEP 508 marker such as sys_platform == "linux" on the
mariadb requirement), leaving msgpack unchanged.

Comment on lines 14 to 16
# MariaDB version must be compatible with libmariadb-dev <3.3.1 installed in the runtime
# environment. The default version of libmariadb-dev in Ubuntu 22.04 is 3.2.6.
# See https://mariadb.com/docs/connectors/mariadb-connector-python/install#build-prerequisites
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

🧩 Analysis chain

Validate the libmariadb note and keep it current.

The comment references specific libmariadb-dev versions on Ubuntu 22.04. Please verify these constraints against the current MariaDB Connector/Python 1.0.x guidance.


🌐 Web query:

What libmariadb (Connector/C) versions are compatible with MariaDB Connector/Python 1.0.11 on Ubuntu 22.04, and is there any documented upper bound (e.g., <3.3.1)?

💡 Result:

Short answer

  • MariaDB Connector/Python 1.0.11 requires MariaDB Connector/C (libmariadb) >= 3.1.5 per the official install docs for the 1.0.x line. [1][2]
  • There was a reported installer/runtime check that required >= 3.2.4 for some installs (Jira CONPY-215), so in practice you may need Connector/C ≥ 3.2.4 on some systems. [3]
  • There is no documented upper‑bound (e.g. “< 3.3.1”) listed for Connector/Python 1.0.11; later Connector/Python 1.1.x simply raised the documented minimum to Connector/C ≥ 3.3.1. [1][4]

Sources

  1. MariaDB Connector/Python installation / dependencies (shows 3.1.5 minimum for <1.1.0 and 3.3.1 for 1.1+). [1]
  2. Release notes for Connector/Python 1.0.11. [2]
  3. MariaDB Jira CONPY‑215 (installer reported requirement >= 3.2.4 for 1.0.11). [3]
  4. Connector/Python 1.1 install docs (Connector/C >= 3.3.1). [4]

If you want, I can check which libmariadb package versions are available on Ubuntu 22.04 (apt names/versions) and recommend a specific package to install.


Update libmariadb note: remove the '<3.3.1' upper bound and state the correct Connector/C minima
Connector/Python 1.0.11 documents MariaDB Connector/C (libmariadb) ≥ 3.1.5 (some installs reported needing ≥ 3.2.4); there is no documented upper bound such as "<3.3.1" — Connector/Python 1.1+ raises the minimum to ≥ 3.3.1. Update components/spider-orchestration/pyproject.toml (lines 14–16) to reflect this.

🤖 Prompt for AI Agents
In components/spider-orchestration/pyproject.toml around lines 14 to 16, remove
the erroneous "<3.3.1" upper bound and update the note to state the correct
Connector/C (libmariadb) minima: for Connector/Python 1.0.11 require libmariadb
≥ 3.1.5 (with some installs reporting need for ≥ 3.2.4), and note that
Connector/Python 1.1+ raises the minimum to ≥ 3.3.1; keep the Ubuntu 22.04
default version note and link reference intact.

Comment on lines 25 to 27
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api" No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Pin a minimum poetry-core for reproducibility.

Unpinned build backends can change behaviour subtly across environments.

Apply:

-requires = ["poetry-core"]
+requires = ["poetry-core>=1.8.0"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[build-system]
requires = ["poetry-core>=1.8.0"]
build-backend = "poetry.core.masonry.api"
🤖 Prompt for AI Agents
In components/spider-orchestration/pyproject.toml around lines 25 to 27, pin the
poetry-core build dependency to a minimum version to ensure reproducible builds;
change the requires entry from ["poetry-core"] to a minimum-pinned spec such as
["poetry-core>=1.5.0"] (keep build-backend = "poetry.core.masonry.api"
unchanged).

# CLP Job Orchestration using Spider

This Python module contains CLP's scheduler and worker to handle distributed compression.
CLP's Compression Job Handler can be used to interface and submit compression jobs to the CLP scheduler.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Add platform/build note for Spider and MariaDB Connector C++.

Readers should know these C++ deps are built only on Ubuntu as per this PR.

Suggested addition (new paragraph after Line 4):

+Note: As part of project setup, the Spider library and MariaDB Connector/C++ are built and installed on Ubuntu platforms only. Other platforms skip these native builds.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CLP's Compression Job Handler can be used to interface and submit compression jobs to the CLP scheduler.
CLP's Compression Job Handler can be used to interface and submit compression jobs to the CLP scheduler.
Note: As part of project setup, the Spider library and MariaDB Connector/C++ are built and installed on Ubuntu platforms only. Other platforms skip these native builds.
🤖 Prompt for AI Agents
In components/spider-orchestration/README.md around line 4, add a new paragraph
after the existing sentence stating that the Spider and MariaDB Connector C++
dependencies are built only on Ubuntu (per the referenced PR); briefly note the
supported OS (Ubuntu), any required build/toolchain/version constraints if
known, and recommend using Ubuntu or an Ubuntu-based Docker image for local
builds or CI; include a link or reference to the PR for details.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we proceed: do we need a new namespace under components? I thought spider related subcomponents can also be located in the current job-orchestration: the current job orchestration already contains non-celery related components.
But as I'm also new to the package side of thing, let's wait for @junhaoliao @kirkrodrigues to confirm.

@kirkrodrigues
Copy link
Member

Before we proceed: do we need a new namespace under components? I thought spider related subcomponents can also be located in the current job-orchestration: the current job orchestration already contains non-celery related components. But as I'm also new to the package side of thing, let's wait for @junhaoliao @kirkrodrigues to confirm.

Yeah, name-wise, I think it makes more sense to be under job-orchestration. I guess Sitao was making it a separate "component" since the package will only run one or the other (job-orchestration or spider-orchestration)?

@sitaowang1998
Copy link
Contributor Author

I have two reasons to use a separate "component":

  1. The task function name is already occupied. I can go with spider_compress though.
  2. It will be easier to do the switch later if we have a component that does not have celery dependency.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e223ef and b2dc621.

📒 Files selected for processing (1)
  • docs/src/dev-docs/components-core/index.md (3 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
📚 Learning: 2025-08-18T05:43:07.868Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1184
File: components/core/cmake/Modules/FindLibLZMA.cmake:21-24
Timestamp: 2025-08-18T05:43:07.868Z
Learning: In the CLP project, all supplied `<lib>_ROOT` variables will live within the `CLP_CORE_DEPS_DIR` as part of their architectural design. This means that using CLP_CORE_DEPS_DIR for library discovery in custom Find modules is the intended approach, and prioritizing individual `<lib>_ROOT` variables over CLP_CORE_DEPS_DIR is not necessary.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
📚 Learning: 2025-06-18T20:48:48.990Z
Learnt from: quinntaylormitchell
PR: y-scope/clp#968
File: docs/src/user-guide/quick-start/overview.md:53-54
Timestamp: 2025-06-18T20:48:48.990Z
Learning: CLP is designed to run on Linux systems where Python is typically pre-installed, so Python installation links are generally not needed in CLP documentation.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
PR: y-scope/clp#1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 causes issues in the clp-ffi-js project (issue #110), so CLP should avoid upgrading to v3.44.1 and stick with v3.44.0 for now.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
📚 Learning: 2025-08-29T07:31:24.032Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1271
File: components/core/tools/scripts/lib_install/centos-stream-9/install-prebuilt-packages.sh:41-43
Timestamp: 2025-08-29T07:31:24.032Z
Learning: In CLP's build tool installation scripts, uv version constraints should use lower bound constraints (uv>=0.8) rather than exact version pinning, following the same philosophy used for other build tools to accommodate platform differences while ensuring minimum required functionality.

Applied to files:

  • docs/src/dev-docs/components-core/index.md
🔇 Additional comments (1)
docs/src/dev-docs/components-core/index.md (1)

140-140: Spider reference link LGTM

Link is clear and matches usage above.

Comment on lines 31 to 35
To get all dependencies required by CLP core, run the `deps:cpp` task:

```shell
task deps:core
task deps:cpp
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

🧩 Analysis chain

deps:cpp rename looks good; add a brief migration note and sweep remaining docs/usages

Add a one‑liner noting the rename to help users updating local scripts. Also sweep the repo for any remaining references to deps:core/core-all-parallel.

Apply this diff:

-To get all dependencies required by CLP core, run the `deps:cpp` task:
+To get all dependencies required by CLP core, run the `deps:cpp` task:
+
+Note: Previously `deps:core`. Update any local scripts accordingly.

Run this script to find stale references:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Searching for deprecated task names..."
rg -n -C2 -S $'\bdeps:core\b|\bcore-all-parallel\b'

Length of output: 38


Add migration note for renamed deps:cpp task

Add this after the example to help users update local scripts:

-To get all dependencies required by CLP core, run the `deps:cpp` task:
+To get all dependencies required by CLP core, run the `deps:cpp` task:
+
+Note: Previously `deps:core`. Update any local scripts accordingly.

No stale references to deps:core or core-all-parallel remain in the repo.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
To get all dependencies required by CLP core, run the `deps:cpp` task:
```shell
task deps:core
task deps:cpp
```
To get all dependencies required by CLP core, run the `deps:cpp` task:
Note: Previously `deps:core`. Update any local scripts accordingly.
🤖 Prompt for AI Agents
In docs/src/dev-docs/components-core/index.md around lines 31 to 35, add a short
migration note immediately after the deps:cpp example explaining that the task
was renamed from deps:core (and previously core-all-parallel) to deps:cpp, show
the new command developers should use in their local scripts, and instruct them
to search the repo for and remove or update any stale references to deps:core or
core-all-parallel; keep the note brief and placed directly after the existing
example.

Comment on lines 64 to 70
Notice that this task will also download, build, and install the following libraries that are not
directly used by CLP core, but are required by some other C++ components such as [Spider]:

| Library | Version/commit | Note |
|---------------------------------------------------------------------------------------|-----------------|--------------------------------------------------------------------|
| [mariadb-connector-cpp](https://github.com/mariadb-corporation/mariadb-connector-cpp) | v1.1.5 | Required by [Spider]. Built only when the host platform is Ubuntu. |

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

🧩 Analysis chain

Clarify Ubuntu versions for mariadb-connector-cpp and verify task gating

Good call-out. Please specify supported Ubuntu releases (e.g., 22.04/24.04) to set expectations, and confirm the Task gating aligns.

Suggested wording tweak:

-| [mariadb-connector-cpp](https://github.com/mariadb-corporation/mariadb-connector-cpp) | v1.1.5          | Required by [Spider]. Built only when the host platform is Ubuntu. |
+| [mariadb-connector-cpp](https://github.com/mariadb-corporation/mariadb-connector-cpp) | v1.1.5          | Required by [Spider]. Built only when the host is Ubuntu (22.04/24.04). |

Verify gating and version consistency:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate mariadb-connector-cpp task definitions and OS gating..."
rg -n -C3 -S 'mariadb-connector-cpp|ubuntu|ID=ubuntu|lsb_release|/etc/os-release|UBUNTU' --type sh --type toml --type yaml --type make --type txt

echo "Confirm documented version matches installer scripts/config..."
rg -n -C2 -S 'mariadb-connector-cpp.*1\.1\.5|1\.1\.5.*mariadb-connector-cpp'

Length of output: 9027


Clarify Ubuntu support for mariadb-connector-cpp — document Ubuntu 22.04 (jammy) and confirm gating

Verified: docs list v1.1.5 and taskfiles/deps/main.yaml pins tags/1.1.5; the task is gated to run only on Ubuntu via /etc/os-release ID checks, but build scripts and docker images target jammy (Ubuntu 22.04) — no evidence of 24.04 support.

Files to check/update:

  • docs/src/dev-docs/components-core/index.md (existing entry)
  • taskfiles/deps/main.yaml (mariadb-connector-cpp block, ~lines 342–364)
  • components/core/tools/scripts/lib_install/mariadb-connector-c.sh (uses UBUNTU_CODENAME)
  • components/core/tools/docker-images/-ubuntu-jammy/

Suggested wording tweak:
| mariadb-connector-cpp | v1.1.5 | Required by [Spider]. Built only when the host is Ubuntu (tested on Ubuntu 22.04 "jammy"). |

🤖 Prompt for AI Agents
In docs/src/dev-docs/components-core/index.md around lines 64–70, the
mariadb-connector-cpp note is ambiguous about which Ubuntu release is supported;
update the table cell to explicitly state it is built only on Ubuntu and tested
on Ubuntu 22.04 "jammy" (suggested wording: "...Built only when the host is
Ubuntu (tested on Ubuntu 22.04 'jammy')."). Also verify and, if needed, update
taskfiles/deps/main.yaml (mariadb-connector-cpp block ~lines 342–364) to ensure
the gating logic matches the docs (OS check only runs on Ubuntu), confirm
components/core/tools/scripts/lib_install/mariadb-connector-c.sh uses
UBUNTU_CODENAME and documents jammy, and ensure any
components/core/tools/docker-images/*-ubuntu-jammy/* references remain
consistent; if 24.04 is not supported, do not claim it and keep docs pointing to
jammy.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add deps:spider to the package workflow? It's better that the changes to this type of PR can be tested immediately on the CI.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc443f0 and 98d6563.

📒 Files selected for processing (1)
  • taskfiles/deps/main.yaml (17 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
📚 Learning: 2025-08-09T04:07:27.083Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1156
File: components/core/CMakeLists.txt:772-772
Timestamp: 2025-08-09T04:07:27.083Z
Learning: In the CLP project's CMakeLists.txt, when reviewing changes related to the ${zstd_TARGET} variable usage in test linking, the team is planning a refactoring PR to improve this mechanism. Guards for undefined target variables should be deferred to that separate PR rather than being added in focused dependency migration PRs.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-05-28T18:33:30.155Z
Learnt from: anlowee
PR: y-scope/clp#925
File: taskfiles/deps/main.yaml:97-106
Timestamp: 2025-05-28T18:33:30.155Z
Learning: In the taskfiles dependency system (taskfiles/deps/main.yaml), echo commands are used to generate .cmake settings files that are consumed by the main CMake build process. These files set variables like ANTLR_RUNTIME_HEADER to point to dependency locations for use during compilation.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-06-02T18:22:24.060Z
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-09-15T18:53:15.844Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1226
File: taskfiles/deps/main.yaml:522-527
Timestamp: 2025-09-15T18:53:15.844Z
Learning: When ystdlib is built with -Dystdlib_BUILD_TESTING=OFF in taskfiles/deps/main.yaml, it does not require Catch2 settings to be injected via CMAKE_GEN_ARGS since testing is disabled and Catch2 is not needed during the build process.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-08-19T07:08:15.583Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1229
File: taskfiles/deps/main.yaml:494-504
Timestamp: 2025-08-19T07:08:15.583Z
Learning: In zlib v1.3.1's CMake build system, only ZLIB_BUILD_EXAMPLES is a valid option. Other ZLIB_BUILD_* flags (MINIZIP, SHARED, STATIC, TESTING) and ZLIB_INSTALL are not defined in the CMakeLists.txt and have no effect. Use BUILD_SHARED_LIBS=ON for shared library control and ZLIB_BUILD_EXAMPLES=OFF to disable examples.

Applied to files:

  • taskfiles/deps/main.yaml
📚 Learning: 2025-07-23T09:54:45.185Z
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.

Applied to files:

  • taskfiles/deps/main.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: ubuntu-jammy-deps-image
  • GitHub Check: musllinux_1_2-x86_64-deps-image
  • GitHub Check: centos-stream-9-deps-image
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (6)
taskfiles/deps/main.yaml (6)

10-18: LGTM! Directory structure changes are consistent.

The migration from core to cpp naming is applied consistently across all directory variables and paths. This aligns with the PR objective to reorganize the build layout.


91-111: LGTM! Spider task implementation is correct.

The spider task correctly uses preconditions instead of status, restricts to Ubuntu as intended, and includes proper dependency management through the spider-dep task. The CMake arguments are appropriate for the spider build configuration.


113-125: LGTM! Spider dependency task follows established patterns.

The spider-dep task correctly mirrors the structure of the core task, ensuring spider dependencies are built and configured properly before the main spider build.


127-139: LGTM! Spider parallel dependencies are correctly defined.

The dependency list for spider-all-parallel includes all necessary libraries for building spider, including the MariaDB connector which is appropriately Ubuntu-gated.


391-411: LGTM! MariaDB connector task correctly implements Ubuntu gating.

The task properly uses preconditions with detailed comments explaining the Ubuntu limitation and DEB install layout requirement. The implementation addresses previous review feedback effectively.


596-596: Remove unnecessary CMAKE_POSITION_INDEPENDENT_CODE flag.

Based on the past review comments indicating this flag is not needed, consider removing it unless spider specifically requires position-independent code for shared library usage.

Apply this diff to remove the unnecessary flag:

          CMAKE_GEN_ARGS:
            - "-C {{.G_DEPS_CPP_CMAKE_SETTINGS_DIR}}/{{.G_BOOST_LIB_NAME}}.cmake"
-            - "-DCMAKE_POSITION_INDEPENDENT_CODE=ON"
            - "-Dystdlib_BUILD_TESTING=OFF"

TARGETS:
- "filesystem"
- "iostreams"
- "process"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

❓ Verification inconclusive

Verify spider dependency on boost::process.

The addition of "process" to the boost targets suggests spider requires this library. Ensure this dependency is actually needed by spider's codebase.
The search results don't provide specific information about y-scope spider's boost::process dependency. Let me check for more specific information about the spider project directly.


🏁 Script executed:

#!/bin/bash
# Check if spider repository contains any references to boost process
rg -i "process" --type cpp --type hpp -C3 -g '!test*' | head -20

Length of output: 27


🏁 Script executed:

#!/bin/bash
# Search for any inclusion or direct usage of Boost.Process
rg -n "#include.*boost/process.hpp" -C2 -g '!test*'
rg -n "boost::process::" -C2 -g '!test*'
rg -n "Boost::process" -C2 -g '!test*'

Length of output: 446


🏁 Script executed:

#!/bin/bash
# Search entire repo for Boost.Process includes and namespace usages without file‐type filters
grep -RIn "#include\s*<boost/process.hpp>" .
grep -RIn "boost::process::" .

Length of output: 0


Remove unused Boost.Process dependency
Search across the repository (code and build files) shows no #include <boost/process.hpp> or boost::process usages; drop "process" from the Boost targets in taskfiles/deps/main.yaml.

🤖 Prompt for AI Agents
In taskfiles/deps/main.yaml around line 254, the Boost target list includes
"process" but there are no usages of Boost.Process in the repository; remove the
"process" entry from the Boost targets list in that file so the dependency is no
longer declared in your taskfiles/deps/main.yaml.

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

feat(clp-package): Add Spider and its dependencies to the build.

@sitaowang1998 sitaowang1998 changed the title feat: Add spider build task; Add yscope-spider-py as job-orchestration dependency. feat(clp-package): Add Spider and its dependencies to the build. Sep 24, 2025
- "core"
- "clp-package-utils"
- "clp-py-utils"
- "deps:spider"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should let deps:spider compute a checksum file and let add the checksum file to the package task's source list; otherwise, the package might not be re-run if there are changes in the artifacts generated.

alternatively, we can simply add spider_scheduler and spider_worker to the package task's source list.

@coderabbitai create an issue to track

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create GitHub issue to track spider artifact checksum problem

gh issue create \
  --title "Add checksum tracking for spider artifacts in package task" \
  --body "## Problem

The \`package\` task in \`taskfile.yaml\` currently depends on \`deps:spider\` but does not track changes to spider artifacts through checksums or direct source dependencies. This means the \`package\` task might not re-run when spider artifacts (\`spider_scheduler\` and \`spider_worker\`) change.

## Proposed Solutions

1. **Option 1**: Let \`deps:spider\` compute a checksum file and add it to the \`package\` task's \`source\` list
2. **Option 2**: Add \`spider_scheduler\` and \`spider_worker\` binaries directly to the \`package\` task's \`source\` list

## Context

- PR: https://github.com/y-scope/clp/pull/1318
- Comment: https://github.com/y-scope/clp/pull/1318#discussion_r2425015201
- Requested by: @junhaoliao" \
  --assignee junhaoliao \
  --label "enhancement"

Length of output: 42


@junhaoliao, I've created issue #1409 to track adding checksum tracking for spider artifacts in the package task: #1409

The issue includes both proposed solutions and references this PR and comment as backlinks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(first reported by @sitaowang1998 )

it was found that libmariadbcpp.so, which both spider_scheduler and spider_worker dynamically link to, are not added into the package execution environment (i.e., the clp-package Docker execution image)

@coderabbitai create an issue to track

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Create a GitHub issue for the missing libmariadbcpp.so library
gh issue create \
  --title "Missing libmariadbcpp.so in clp-package Docker execution image" \
  --body "## Description

The shared library \`libmariadbcpp.so\`, which both \`spider_scheduler\` and \`spider_worker\` dynamically link to, is not included in the package execution environment (the \`clp-package\` Docker execution image). This will cause runtime failures when attempting to execute these binaries.

## Context

- Originally reported by: @sitaowang1998
- Reported by: @junhaoliao
- Related PR: #1318
- Comment: https://github.com/y-scope/clp/pull/1318#discussion_r2425018371

## Impact

The spider scheduler and worker binaries cannot run in the Docker execution environment without this library present." \
  --assignee "junhaoliao"

Length of output: 42


@junhaoliao, I've created issue #1410 to track the missing libmariadbcpp.so library in the clp-package Docker execution environment. The issue includes references to this PR and the original reporters.

🐰✨

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants