Backport upstream #1290: Run ingest service python workers as abc, not root#37
Merged
Conversation
With PUID/PGID set (the documented configuration), cps.py already drops to abc via s6-setuidgid in svc-calibre-web-automated/run, and the inotifywait call in cwa-ingest-service/run is also prefixed with s6-setuidgid abc. The three other python invocations in the same run script (watch_fallback.py on the polling path and the two ingest_processor.py calls) run as root, so every auto-imported book lands in /calibre-library as root:root. The web UI, running as abc, then fails to delete these books with Permission denied on metadata.opf. Prefix each of those python3 invocations with s6-setuidgid abc so the polling watcher and the ingest processor run with the same uid and gid as the rest of the stack, producing abc-owned files that the web UI can manage.
new-usemame
added a commit
that referenced
this pull request
May 4, 2026
…orough validation (#54) * Revert "fix: use subquery loading for Books relationships to prevent DetachedInstanceError (#40)" This reverts commit bc9386c. * Revert "Run ingest service python workers as abc, not root (#37)" This reverts commit f2616d6. --------- Co-authored-by: new-usemame <248195428+new-usemame@users.noreply.github.com>
new-usemame
added a commit
that referenced
this pull request
May 4, 2026
* Run ingest service python workers as abc, not root (#37) With PUID/PGID set (the documented configuration), cps.py already drops to abc via s6-setuidgid in svc-calibre-web-automated/run, and the inotifywait call in cwa-ingest-service/run is also prefixed with s6-setuidgid abc. The three other python invocations in the same run script (watch_fallback.py on the polling path and the two ingest_processor.py calls) run as root, so every auto-imported book lands in /calibre-library as root:root. The web UI, running as abc, then fails to delete these books with Permission denied on metadata.opf. Prefix each of those python3 invocations with s6-setuidgid abc so the polling watcher and the ingest processor run with the same uid and gid as the rest of the stack, producing abc-owned files that the web UI can manage. Co-authored-by: new-usemame <248195428+new-usemame@users.noreply.github.com> * feat(s6): one-time chown-to-abc migration alongside #37 re-land #37 (held by #54 from v4.0.16) prefixes the ingest-service python workers with s6-setuidgid abc. Architecturally correct — the web UI runs as abc and was failing to delete books that the ingest worker had created as root. But for users whose libraries already contain root:root-owned author/series subdirectories from before the patch landed, post-#37 ingest of a new book targeting an existing root-owned author dir fails with EACCES. Empirically demonstrated by tests/integration/test_ingest_setuidgid_permission.py: | Scenario | Result | |---------------------------------------------------------|---------------| | Pre-#37 root worker writes into root-owned subdir | PASS | | Post-#37 abc worker writes into root-owned subdir | EACCES | | Post-#37 abc worker after chown -R abc:abc /cal-lib | PASS | | Post-#37 abc worker into clean abc-owned library | PASS | Fix: a new oneshot s6 service cwa-chown-library-migration that runs once per installation: - Gated by /config/.cwa-chown-library-done sentinel (idempotent across reboots; second invocation no-ops) - Skipped when NETWORK_SHARE_MODE is set — chown over NFS is slow and may be refused by the export's policy. Logs guidance for manual run; still writes the sentinel so the skip is recorded. - Best-effort: chown failures are logged but never block boot. cwa-ingest-service is wired to depend on the migration via dependencies.d/cwa-chown-library-migration so the chown completes before any ingest worker starts. 8 integration tests in test_ingest_setuidgid_permission.py exercise the real artifacts inside the v4.0.15 production image (skipped when Docker is unavailable): - baseline pre-#37 behavior - post-#37 regression visibility - migration script chowns + writes sentinel - migration script is idempotent - migration script honors NETWORK_SHARE_MODE - end-to-end: post-migration ingest into formerly root-owned dir works Co-author: @haraldpdl (upstream PR #1290). --------- Co-authored-by: new-usemame <248195428+new-usemame@users.noreply.github.com>
new-usemame
added a commit
that referenced
this pull request
May 18, 2026
…follow-up (#231) Backport of upstream crocodilestick/Calibre-Web-Automated#1349 by @navels into our community-maintained CWA build. Original PR has been open since 2026-05-13 with no upstream review activity; users get the ingest-reliability improvements today rather than waiting for upstream pace. The PR adds five orthogonal improvements: (1) stale ingest events for already-moved/deleted files fast-exit before the heavy startup path, eliminating the NFS/polling-fallback re-emit loop; (2) the s6 ingest-service script tracks per-file processing markers so the watcher can't reprocess the same path while it's in-flight; (3) module-import-time work in scripts/ingest_processor.py defers behind initialize_runtime() so the script can fast-exit without paying cps/CWA_DB/EPUBFixer/audiobook/requests import cost; (4) per-book refresh_cwa_session + invalidate_duplicate_cache + schedule_debounced_duplicate_scan calls consolidate into a single run_post_batch_follow_up() that fires once after the dirty-marker batch goes quiet; (5) bounded retries on the cwa-internal POST endpoints replace single-shot best-effort calls. Reconciliation against fork-original work was non-trivial — full per-file rationale in the fixup commit body. The /cwa-internal/reconnect-db endpoint keeps fork PR #199's synchronous CalibreDB.refresh_for_new_data() (going back to TaskReconnectDatabase via WorkerThread.add would re-introduce the engine-disposal race that caused fork #192); ingest_processor.py keeps PR #199's _run_calibredb_add_with_retry + metadata_db_write_lock() and PR #208's deferred lock-acquire; the s6 service script preserves PR #37/#56's s6-setuidgid abc privilege drop in the new run_processor_with_timeout() helper; PR #210's debounce-default tests updated to no-op the per-script pins for ingest_processor (the value lives server-side now at /cwa-internal/queue-duplicate-scan). Live-verified end-to-end on cwn-local: missing-target fast-exit doesn't initialize runtime or acquire lock; post-batch follow-up fires exactly once after the dirty-marker batch quiets and refreshes the duplicate cache cleanly (groups=2 max_book_id=12 logged); real ingest of alice-in-wonderland.epub triggered the s6 service Post-batch follow-up exactly once (not per book), no errors. 6 new unit tests + 35 smoke tests + 165 shell tests added by @navels all pass; 65 adjacent regression tests stay green. Credit: @navels — both upstream commits authored by Lee Nave <navels@gmail.com> are preserved on the squash; the reconciliation commit landed under new-usemame identity. Dual-audience credit comment will be posted on CWA#1349 after the release tag publishes.
new-usemame
added a commit
that referenced
this pull request
May 18, 2026
) Replaces the O(N) full-library duplicate scan with an O(1) maintained index. Each book import / metadata edit / delete updates the index incrementally instead of triggering a full re-scan; existing /duplicates page becomes the user-facing entry point for the one-time post-upgrade baseline scan. Backport of CWA upstream #1353 by @navels, reconciled with fork-divergent work in scripts/ingest_processor.py and cps/editbooks.py: preserves PR #199 metadata.db flock + _run_calibredb_add_with_retry, PR #210 debounce default 30 + clamp floor 10, PR #212 SQLite UDF connect-event listener, PR #37/#56 s6-setuidgid abc privilege drop on the ingest worker, v4.0.75 fast-exit pattern with _load_fork_cps_imports, and PR #100 batched cache-invalidation semantics (mechanism migrated from per-batch invalidate_duplicate_cache() to per-batch _queue_duplicate_scan_after_change()). Verified on cwn-local: schema migration applied idempotently, 15-book baseline scan populates cwa_duplicate_book_keys table with all rows, the existing duplicate groups (Alice's Adventures in Wonderland ×5, The Republic ×2) render from the index, Resolution Preview Bootstrap modal replaces the old alert() dialog, per-book incremental scan endpoint returns {"success":true,"result_count":N,"message":"..."} in <50ms, container restart clean with no traceback. 173 new+updated unit tests pass. Inspired-by @navels in crocodilestick/Calibre-Web-Automated#1353.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Cherry-pick of upstream crocodilestick/Calibre-Web-Automated#1290 by @haraldpdl.
Original title: Run ingest service python workers as abc, not root
Original PR: crocodilestick/Calibre-Web-Automated#1290
Verification
Walk through
notes/merge/upstream-pr-1290-verify.mdbefore merging.Diff snapshot
See
notes/cherry-pick-1290.diff(captured at prep time; rebase if the upstream PR has moved).(Prepared by an automation pipeline. Do not merge until the verification checklist is signed off.)