-
Notifications
You must be signed in to change notification settings - Fork 806
chore(ci): more disk space cleanups #3322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughGlobal CI workflow changes: added environment variables for Poetry/pip, standardized Poetry install and cache paths, enabled npm cache usage/cleanup, reduced Nx parallelism, and added aggressive disk/artifact cleanup. Also updated traceloop-sdk project install cacheDir to /tmp/pypoetry-cache. Changes
Sequence Diagram(s)sequenceDiagram
participant Runner as GitHub Runner
participant Env as Workflow Env
participant Poetry as Poetry Installer
participant Npm as npm
participant Nx as Nx Executor
participant Cleanup as Cleanup Steps
Runner->>Env: apply global env (POETRY_*, PIP_NO_CACHE_DIR)
Runner->>Poetry: install/configure Poetry (venv path, create venvs)
Runner->>Npm: npm ci --cache ~/.npm --prefer-offline
Runner->>Nx: run nx tasks (parallelism=2)
Runner->>Cleanup: remove dist/egg-info, pytest caches, __pycache__
Runner->>Cleanup: docker system prune -af, remove toolchain dirs
Runner->>Npm: npm cache clean --force || true
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to dbe4f2a in 1 minute and 24 seconds. Click for details.
- Reviewed
202
lines of code in2
files - Skipped
0
files when reviewing. - Skipped posting
5
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .github/workflows/ci.yml:77
- Draft comment:
POETRY_CACHE_DIR is set to /tmp/pypoetry-cache but the cleanup command deletes ~/.cache/pypoetry/artifacts. Verify that both locations are intended and cleaned, or adjust to avoid leftover caches. - Reason this comment was not posted:
Comment was on unchanged code.
2. .github/workflows/ci.yml:42
- Draft comment:
The switch to 'docker system prune -af' is a cleaner approach than removing images individually. Ensure that no essential images are needed later in the workflow. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. .github/workflows/ci.yml:72
- Draft comment:
Adding '--cache ~/.npm --prefer-offline' to 'npm ci' can improve performance, but confirm it aligns with your overall caching strategy. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
4. .github/workflows/ci.yml:133
- Draft comment:
Parallelism for 'npx nx affected' commands has been reduced from 3 to 2. Confirm that this reduced parallel count meets performance requirements and doesn’t prolong CI execution unnecessarily. - Reason this comment was not posted:
Confidence changes required:33%
<= threshold50%
None
5. packages/traceloop-sdk/project.json:41
- Draft comment:
Updating 'cacheDir' to '/tmp/pypoetry-cache' aligns with the CI env settings. Ensure local development setups that rely on '.cache/pypoetry' are updated if needed. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
Workflow ID: wflow_E2jHofidmFkfqbd9
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
.github/workflows/ci.yml (1)
217-222
: YAML lint: Remove trailing whitespace-only line before test artifact cleanupYamllint flags trailing spaces at Line 216. Drop the whitespace-only line.
Apply:
- name: Test env: HAYSTACK_TELEMETRY_ENABLED: False - run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2 - + run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2
🧹 Nitpick comments (5)
packages/traceloop-sdk/project.json (1)
41-41
: Avoid hardcoding /tmp; prefer env-driven cacheDir for cross-platform devsHardcoding a POSIX path in project config will also affect local Windows/macOS runs. If the executor supports env interpolation, read from POETRY_CACHE_DIR; otherwise, consider reverting to the previous relative path and let CI override via env.
Proposed change (only if env interpolation is supported by @nxlv/python executor):
- "cacheDir": "/tmp/pypoetry-cache", + "cacheDir": "${POETRY_CACHE_DIR}",If interpolation isn’t supported, I recommend keeping the CI env var and documenting that local devs should rely on Poetry defaults (or a per-user config) instead of forcing /tmp here. Do you want me to open a PR note or README snippet for this?
.github/workflows/ci.yml (4)
11-16
: Fix Poetry env var name; drop unused installer varThe correct Poetry env var is POETRY_VIRTUALENVS_IN_PROJECT (not POETRY_VENV_IN_PROJECT). POETRY_INSTALLER_MODERN_INSTALLATION is not a standard Poetry env var and is likely a no-op here.
Apply:
env: POETRY_CACHE_DIR: /tmp/pypoetry-cache - POETRY_VENV_IN_PROJECT: false - POETRY_INSTALLER_MODERN_INSTALLATION: false + POETRY_VIRTUALENVS_IN_PROJECT: false PIP_NO_CACHE_DIR: 1Optionally, to align env and action inputs, also set:
+ POETRY_VIRTUALENVS_PATH: ~/.cache/pypoetry/virtualenvs
Confirm whether you actually rely on these globals outside the install-poetry action; otherwise, consider removing them to reduce confusion.
76-81
: Ensure full Poetry cache clean-up (also remove the cache dir itself)You clear Poetry caches logically, but the cache directory may still exist with stray files. Add a hard delete of $POETRY_CACHE_DIR for maximal space reclamation.
Apply:
- name: Clean Poetry caches after install run: | poetry cache clear pypi --all || true poetry cache clear _default_cache --all || true rm -rf ~/.cache/pypoetry/artifacts || true + rm -rf "${POETRY_CACHE_DIR}" || true
81-81
: Optional: Align Nx parallelism to 2 for consistency with other jobsLint still uses --parallel=3 while other jobs use 2. If the goal is controlled concurrency for CI stability/disk pressure, consider standardizing on 2 here as well.
Proposed:
- - run: npx nx affected -t lint --parallel=3 + - run: npx nx affected -t lint --parallel=2
34-49
: Space reclamation looks solid; consider pruning Docker volumes tooNice aggressive cleanup: apt caches, dev stacks, Docker images, CodeQL toolchain, imagegeneration. For a bit more headroom, include Docker volumes prune.
Apply:
- docker system prune -af || true + docker system prune -af --volumes || trueAlso applies to: 90-106, 159-175
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
.github/workflows/ci.yml
(7 hunks)packages/traceloop-sdk/project.json
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-17T15:06:48.093Z
Learnt from: CR
PR: traceloop/openllmetry#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.093Z
Learning: Use Poetry as the package manager; run all commands via `poetry run <command>`
Applied to files:
.github/workflows/ci.yml
🪛 YAMLlint (1.37.1)
.github/workflows/ci.yml
[error] 143-143: trailing spaces
(trailing-spaces)
[error] 216-216: trailing spaces
(trailing-spaces)
🔇 Additional comments (2)
.github/workflows/ci.yml (2)
114-121
: Poetry install standardization LGTMUsing snok/install-poetry with explicit virtualenv settings and central venvs path is clear and consistent with the PR goal. Combined with PIP_NO_CACHE_DIR, this should help keep footprints low.
Also applies to: 185-192, 59-66
199-205
: npm cache hygiene LGTMUsing npm ci with a localized cache and then cleaning it aligns with the disk budget goals while keeping installs deterministic.
Also applies to: 132-134, 75-76
- name: Build | ||
run: npx nx affected -t build-release --parallel=3 | ||
run: npx nx affected -t build-release --parallel=2 | ||
|
||
- name: Clean build artifacts to save space | ||
run: | | ||
find packages -name "dist" -type d -exec rm -rf {} + || true | ||
find packages -name "*.egg-info" -type d -exec rm -rf {} + || true | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YAML lint: Remove trailing whitespace-only line after Build step
Yamllint flags trailing spaces at Line 143. Drop the whitespace-only line.
Apply:
- name: Build
- run: npx nx affected -t build-release --parallel=2
-
+ run: npx nx affected -t build-release --parallel=2
Optional extra space savings after build: remove Poetry venvs if not needed afterwards in this job.
- name: Clean build artifacts to save space
run: |
find packages -name "dist" -type d -exec rm -rf {} + || true
find packages -name "*.egg-info" -type d -exec rm -rf {} + || true
+ rm -rf ~/.cache/pypoetry/virtualenvs || true
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- name: Build | |
run: npx nx affected -t build-release --parallel=3 | |
run: npx nx affected -t build-release --parallel=2 | |
- name: Clean build artifacts to save space | |
run: | | |
find packages -name "dist" -type d -exec rm -rf {} + || true | |
find packages -name "*.egg-info" -type d -exec rm -rf {} + || true | |
- name: Build | |
run: npx nx affected -t build-release --parallel=2 | |
- name: Clean build artifacts to save space | |
run: | | |
find packages -name "dist" -type d -exec rm -rf {} + || true | |
find packages -name "*.egg-info" -type d -exec rm -rf {} + || true | |
rm -rf ~/.cache/pypoetry/virtualenvs || true |
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 143-143: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 141 to 148, remove the trailing
whitespace-only blank line immediately after the "Build" step (line 143) so
yamllint no longer flags trailing spaces; ensure no trailing spaces remain on
that line and save the file. Optionally, to save extra space after the build,
add a subsequent cleanup command to remove Poetry virtualenvs (for example by
deleting Poetry's virtualenv cache or removing created venvs) in the same
cleanup block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed b4e222e in 57 seconds. Click for details.
- Reviewed
40
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
3
draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .github/workflows/ci.yml:73
- Draft comment:
Removed the Poetry cache cleanup block. Ensure that using POETRY_CACHE_DIR (/tmp) safely handles cache artifacts so leftover files won’t impact disk space. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
2. .github/workflows/ci.yml:127
- Draft comment:
Similarly, the Poetry cache cleanup step in the build-packages job was removed. Confirm that the temporary cache directory prevents disk buildup. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
3. .github/workflows/ci.yml:193
- Draft comment:
The removal of the Poetry cache cleanup in the test-packages job aligns with the temporary cache setup. Verify that no stale artifacts remain that could impact disk space. - Reason this comment was not posted:
Confidence changes required:0%
<= threshold50%
None
Workflow ID: wflow_rEGj3Jmhw6qhLpri
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (2)
.github/workflows/ci.yml (2)
50-54
: actions/checkout ref breaks on push events; add fallback to github.shaThis job also runs on push, where github.event.pull_request is undefined. Using it unconditionally can resolve to null and break checkout. Use a safe fallback.
- - uses: actions/checkout@v4 - with: - fetch-depth: 0 - ref: ${{ github.event.pull_request.head.sha }} + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + ref: ${{ github.event.pull_request.head.sha || github.sha }}
165-169
: actions/checkout ref breaks on matrix push builds; add fallbackSame issue as earlier: use a safe fallback to github.sha to support push events.
- - uses: actions/checkout@v4 - with: - fetch-depth: 0 - ref: ${{ github.event.pull_request.head.sha }} + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + ref: ${{ github.event.pull_request.head.sha || github.sha }}
♻️ Duplicate comments (5)
.github/workflows/ci.yml (5)
111-116
: Repeat: align Poetry virtualenvs-path with POETRY_CACHE_DIR- virtualenvs-path: ~/.cache/pypoetry/virtualenvs + virtualenvs-path: ${{ env.POETRY_CACHE_DIR }}/virtualenvs
123-126
: Repeat: prefer-offline is unnecessary given immediate cache clean- - run: npm ci --cache ~/.npm --prefer-offline + - run: npm ci --cache ~/.npm
175-182
: Repeat: align Poetry virtualenvs-path with POETRY_CACHE_DIR- virtualenvs-path: ~/.cache/pypoetry/virtualenvs + virtualenvs-path: ${{ env.POETRY_CACHE_DIR }}/virtualenvs
189-191
: Repeat: prefer-offline adds no value when cache is cleaned immediately- - run: npm ci --cache ~/.npm --prefer-offline + - run: npm ci --cache ~/.npm
131-134
: YAML lint: remove trailing whitespace-only line after Build stepLinters flag trailing spaces here. Remove the blank whitespace-only line.
- name: Build run: npx nx affected -t build-release --parallel=2 - +Optional: free extra space by removing Poetry venvs/caches after build (safe within the same job):
- name: Clean build artifacts to save space run: | find packages -name "dist" -type d -exec rm -rf {} + || true find packages -name "*.egg-info" -type d -exec rm -rf {} + || true + rm -rf $POETRY_CACHE_DIR || true + rm -rf ~/.cache/pypoetry/virtualenvs || true
🧹 Nitpick comments (6)
.github/workflows/ci.yml (6)
11-16
: Align Poetry cache/venv paths with project.json changeYou set POETRY_CACHE_DIR to /tmp/pypoetry-cache globally, but later configure Poetry virtualenvs to live under ~/.cache/pypoetry/virtualenvs. This is inconsistent with the new cacheDir (/tmp/pypoetry-cache) and makes cleanup less effective. Suggest pointing virtualenvs-path to ${{ env.POETRY_CACHE_DIR }}/virtualenvs across all Install Poetry steps for consistency and simpler cleanup.
Apply in each Install Poetry step (see separate comments for exact locations):
- virtualenvs-path: ${{ env.POETRY_CACHE_DIR }}/virtualenvs
71-75
: Minor: prefer-offline is moot if you immediately nuke the npm cacheSince you clean the npm cache right after install, --prefer-offline can’t help subsequent steps and may add noise. Safe to drop.
- - run: npm ci --cache ~/.npm --prefer-offline + - run: npm ci --cache ~/.npm
41-47
: Reclaim Docker volumes as welldocker system prune -af does not remove dangling volumes. Add --volumes to free additional space.
- docker system prune -af || true + docker system prune -af --volumes || trueApply the same change in the other jobs’ cleanup blocks.
60-66
: Point Poetry virtualenvs to the configured cache dirKeep Poetry’s venvs under the cache dir for consistency and easier cleanup.
- name: Install Poetry uses: snok/install-poetry@v1 with: version: latest virtualenvs-create: true virtualenvs-in-project: false - virtualenvs-path: ~/.cache/pypoetry/virtualenvs + virtualenvs-path: ${{ env.POETRY_CACHE_DIR }}/virtualenvs
88-101
: DRY: factor the cleanup block into a reusable workflow or composite actionThe disk cleanup script is duplicated across jobs. Consider a reusable workflow (workflow_call) or a local composite action to centralize, test once, and keep changes consistent.
152-163
: Also prune Docker volumes in this cleanup blockSame as earlier: add --volumes to reclaim more disk.
- docker system prune -af || true + docker system prune -af --volumes || true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
.github/workflows/ci.yml
(7 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-17T15:06:48.093Z
Learnt from: CR
PR: traceloop/openllmetry#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-17T15:06:48.093Z
Learning: Use Poetry as the package manager; run all commands via `poetry run <command>`
Applied to files:
.github/workflows/ci.yml
🪛 YAMLlint (1.37.1)
.github/workflows/ci.yml
[error] 133-133: trailing spaces
(trailing-spaces)
[error] 201-201: trailing spaces
(trailing-spaces)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
🔇 Additional comments (2)
.github/workflows/ci.yml (2)
198-201
: Nx test tasks already invoke pytest via Poetry
All Python projects’project.json
files define their test commands as"poetry run pytest tests/"
, so the CI steprun: npx nx affected -t test …will execute tests under Poetry as per the team convention. No further changes are required.
35-49
: Double-check Selenium/LLVM before aggressive cleanupScan results show Selenium declared in sample-app and opentelemetry-instrumentation-crewai lockfiles—and numba/llvmlite in performance extras—while all Playwright hits were false positives in YAML cassettes. No Android or Gradle usage detected. To avoid breaking any jobs:
• In .github/workflows/ci.yml (Free up disk space step), only remove
google-chrome-stable
/firefox
in jobs that do not run Selenium tests.
• Verify no CI task dynamically compiles with llvmlite/numba that requires a system-provided LLVM (^llvm-.*
), even though most wheels vendor LLVM.
• If sample-app or crewai tests ever invoke Selenium WebDriver, scope browser removals behind a conditional or separate cleanup step.
env: | ||
HAYSTACK_TELEMETRY_ENABLED: False | ||
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=3 | ||
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YAML lint: remove trailing whitespace-only line after Test step
Static analysis flags trailing spaces at Line 201. Remove the whitespace-only line.
- name: Test
env:
HAYSTACK_TELEMETRY_ENABLED: False
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2
-
+
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
env: | |
HAYSTACK_TELEMETRY_ENABLED: False | |
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=3 | |
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2 | |
- name: Test | |
env: | |
HAYSTACK_TELEMETRY_ENABLED: False | |
run: npx nx affected -t test --exclude='sample-app' --exclude='opentelemetry-instrumentation-haystack' --parallel=2 | |
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 201-201: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 198 to 201 there is a trailing
whitespace-only blank line after the Test step which triggers YAML lint; remove
that empty/whitespace-only line so the run block ends immediately after the
command (ensure no extra spaces or blank line remains) to satisfy the linter and
keep the YAML valid.
Important
Optimized CI workflows by cleaning disk space, adjusting caching, and reducing parallelism to improve performance and reliability.
.github/workflows/ci.yml
, added disk space cleanup steps, including removing unnecessary software, Docker images, and system files./tmp/pypoetry-cache
and clean npm cache post-installation.project.json
to setcacheDir
to/tmp/pypoetry-cache
for Poetry installations.This description was created by
for b4e222e. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit