Skip to content

feat(pqlmanifest): implement S3 polling support for PQL manifest#2726

Merged
StarpTech merged 6 commits intomainfrom
dustin/support-s3-poling-for-pql-manifest
Apr 2, 2026
Merged

feat(pqlmanifest): implement S3 polling support for PQL manifest#2726
StarpTech merged 6 commits intomainfrom
dustin/support-s3-poling-for-pql-manifest

Conversation

@StarpTech
Copy link
Copy Markdown
Contributor

@StarpTech StarpTech commented Apr 1, 2026

Summary by CodeRabbit

  • New Features
    • Added a provider registry for unified storage lookups and registry-backed config/manifest resolution.
  • Improvements
    • APQ and config/manifest polling now use the registry; manifest mode can be backed by storage providers, supports .gz/.zst compressed manifests, and uses conditional fetches.
  • Bug Fixes
    • Poller skips redundant updates when manifests are unchanged and errors for unsupported filesystem manifest usage.
  • Documentation
    • Added S3-backed manifest configuration guide.
  • Tests
    • Added tests covering the registry, storage-backed manifest fetcher/poller lifecycle, and decompression logic.

Checklist

  • I have discussed my proposed changes in an issue and have received approval to proceed.
  • I have followed the coding standards of the project.
  • Tests or benchmarks have been added or updated.
  • Documentation has been updated on https://github.com/wundergraph/docs-website.
  • I have read the Contributors Guide.

Open Source AI Manifesto

This project follows the principles of the Open Source AI Manifesto. Please ensure your contribution aligns with its principles.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 1, 2026

Router-nonroot image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-1ad65d3617c27d9ba5ab0ae55cbd5cbde8e66627-nonroot

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

Walkthrough

Adds ProviderRegistry, introduces ManifestFetcher and StorageFetcher, refactors router bootstrap to use the registry, reworks persisted-operations manifest polling (conditional reads, decompression, changed detection), updates storage client signatures, and adds tests and docs.

Changes

Cohort / File(s) Summary
Router core & bootstrap
router/core/router.go, router/core/router_config.go, router/core/provider_registry.go, router/core/provider_registry_test.go, router/core/init_config_poller.go
Adds ProviderRegistry, stores it on Router config, constructs it at bootstrap, and updates config-poller/client initialization to accept a *ProviderRegistry.
Persisted-ops manifest fetcher & poller
router/internal/persistedoperation/pqlmanifest/storage_fetcher.go, router/internal/persistedoperation/pqlmanifest/storage_fetcher_test.go, router/internal/persistedoperation/pqlmanifest/poller.go
Adds ManifestFetcher interface and StorageFetcher implementation (last-modified tracking), updates Poller to use the interface, and refactors fetch/change-detection and poll logging; includes tests for fetch semantics and polling lifecycle.
Persisted-ops storage clients & interface
router/internal/persistedoperation/client.go, router/internal/persistedoperation/operationstorage/fs/client.go, router/internal/persistedoperation/operationstorage/cdn/client.go, router/internal/persistedoperation/operationstorage/s3/client.go, router/internal/persistedoperation/operationstorage/s3/client_test.go
Removes ReadManifest from StorageClient interface; CDN/S3 ReadManifest now accept modifiedSince; S3 treats 304 as not-modified and supports .gz/.zst decompression; filesystem client no longer exposes manifest reads; adds S3 decompression tests.
Config & fixtures
router/pkg/config/config.go, router/pkg/config/config.schema.json, router/pkg/config/fixtures/full.yaml, router/pkg/config/testdata/config_defaults.json, router/pkg/config/testdata/config_full.json
Adds PQLManifestConfig.FileName defaulting to manifest.json and updates schema/fixtures/testdata to include the new file_name field.
Docs & router tests
router/internal/persistedoperation/PO_MANIFEST_S3_GUIDE.md, router-tests/operations/pql_manifest_test.go
Adds S3 manifest configuration guide and adjusts router startup log assertion to match changed log text.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(pqlmanifest): implement S3 polling support for PQL manifest' directly and clearly summarizes the main change: adding S3 polling support for PQL manifest. It is specific, concise, and accurately reflects the core functionality introduced across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 47.54717% with 139 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.86%. Comparing base (91e7f46) to head (bd0c422).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
router/core/router.go 29.34% 103 Missing and 15 partials ⚠️
...l/persistedoperation/operationstorage/s3/client.go 58.06% 12 Missing and 1 partial ⚠️
router/core/init_config_poller.go 16.66% 5 Missing ⚠️
.../persistedoperation/pqlmanifest/storage_fetcher.go 88.23% 1 Missing and 1 partial ⚠️
.../persistedoperation/operationstorage/cdn/client.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2726      +/-   ##
==========================================
- Coverage   63.35%   60.86%   -2.50%     
==========================================
  Files         249      251       +2     
  Lines       26671    26763      +92     
==========================================
- Hits        16898    16289     -609     
- Misses       8408     9117     +709     
+ Partials     1365     1357       -8     
Files with missing lines Coverage Δ
router/core/provider_registry.go 100.00% <100.00%> (ø)
router/core/router_config.go 88.12% <ø> (-5.63%) ⬇️
router/internal/persistedoperation/client.go 55.93% <ø> (-35.60%) ⬇️
...l/persistedoperation/operationstorage/fs/client.go 0.00% <ø> (-54.84%) ⬇️
.../internal/persistedoperation/pqlmanifest/poller.go 86.36% <100.00%> (ø)
router/pkg/config/config.go 80.51% <ø> (ø)
.../persistedoperation/operationstorage/cdn/client.go 62.24% <0.00%> (ø)
.../persistedoperation/pqlmanifest/storage_fetcher.go 88.23% <88.23%> (ø)
router/core/init_config_poller.go 3.09% <16.66%> (ø)
...l/persistedoperation/operationstorage/s3/client.go 20.22% <58.06%> (+20.22%) ⬆️
... and 1 more

... and 24 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@StarpTech StarpTech enabled auto-merge (squash) April 1, 2026 19:45
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@router/core/router.go`:
- Around line 1277-1301: The code can panic if storageProviderID is set but
pClient is nil, so before constructing the storageFetcher
(pqlmanifest.NewStorageFetcher) check pClient for nil and return a clear error
(or fallback) if it is nil; specifically verify pClient != nil before calling
pClient.ReadManifest and before creating storageFetcher/poller (references:
storageProviderID, pClient, pClient.ReadManifest, storageFetcher,
pqlmanifest.NewStorageFetcher, pqlmanifest.NewPoller, poller.FetchInitial,
r.pqlPoller) and fail fast with a descriptive fmt.Errorf when misconfigured.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0939e58f-06bc-42cb-9bb2-88b8a6432714

📥 Commits

Reviewing files that changed from the base of the PR and between 6002e5b and 54e0f0b.

📒 Files selected for processing (4)
  • router/core/router.go
  • router/internal/persistedoperation/pqlmanifest/poller.go
  • router/internal/persistedoperation/pqlmanifest/storage_fetcher.go
  • router/internal/persistedoperation/pqlmanifest/storage_fetcher_test.go

Copy link
Copy Markdown
Contributor

@SkArchon SkArchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments

@StarpTech StarpTech requested a review from SkArchon April 2, 2026 00:03
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
router/core/router.go (1)

1235-1253: ⚠️ Potential issue | 🟠 Major

Reject bad explicit provider IDs even without a graph token.

An explicit Storage.ProviderID is only rejected inside the r.graphApiToken != "" branch. With a misspelled provider ID and no token, this falls through to return nil, nil, nil, which silently disables persisted-operation storage and leaves the manifest path without a reader instead of failing fast on the config error.

🔧 Suggested fix
-	if r.graphApiToken != "" {
-		if providerID != "" {
-			return nil, nil, fmt.Errorf("unknown storage provider id '%s' for persisted operations", providerID)
-		}
+	if providerID != "" {
+		return nil, nil, fmt.Errorf("unknown storage provider id %q for persisted operations", providerID)
+	}
+
+	if r.graphApiToken != "" {
 
 		c, err := cdn.NewClient(r.cdnConfig.URL, r.graphApiToken, cdn.Options{
 			Logger: r.logger,
 		})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/core/router.go` around lines 1235 - 1253, The code only rejects an
explicit providerID inside the r.graphApiToken != "" branch causing a misspelled
Storage.ProviderID to silently disable persisted operations when no token is
present; add a guard at the start (check providerID != "" before the
graphApiToken branch) that returns an explicit error (same format used
elsewhere, e.g. fmt.Errorf("unknown storage provider id '%s' for persisted
operations", providerID)) so the function fails fast when
Storage.ProviderID/providerID is set but unknown; keep the existing CDN client
creation (r.graphApiToken, cdn.NewClient, r.logger.Debug, return c,
c.ReadManifest) logic unchanged for the token-present case.
🧹 Nitpick comments (1)
router/core/router.go (1)

1258-1277: Return early before resolving Redis storage when APQ is disabled.

The disabled path currently still resolves/logs the Redis provider and may allocate a client that gets discarded immediately. Moving the Enabled guard to the top keeps disabled APQ side-effect free.

♻️ Suggested refactor
 func (r *Router) buildAPQClient(registry *ProviderRegistry) (apq.Client, error) {
+	if !r.automaticPersistedQueriesConfig.Enabled {
+		return nil, nil
+	}
+
 	var kvClient apq.KVClient
 	if provider, ok := registry.Redis(r.automaticPersistedQueriesConfig.Storage.ProviderID); ok {
 		c, err := apq.NewRedisClient(&apq.RedisOptions{
 			Logger:        r.logger,
 			StorageConfig: &provider,
@@
 		)
 	}
-
-	if !r.automaticPersistedQueriesConfig.Enabled {
-		return nil, nil
-	}
 
 	apqClient, err := apq.NewClient(&apq.Options{
 		Logger:    r.logger,
 		ApqConfig: &r.automaticPersistedQueriesConfig,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router/core/router.go` around lines 1258 - 1277, In buildAPQClient, the APQ
disabled path still resolves and logs the Redis provider; move the guard
checking r.automaticPersistedQueriesConfig.Enabled to the top of
Router.buildAPQClient so you return early when APQ is disabled before calling
registry.Redis or apq.NewRedisClient, preventing allocation of a Redis client
and avoiding r.logger.Info side effects when not needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@router/core/provider_registry.go`:
- Around line 20-55: NewProviderRegistry currently only checks per-type maps
(s3, cdn, redis, fileSystem) for duplicate IDs, allowing the same ID to exist in
multiple provider types; update NewProviderRegistry to enforce global uniqueness
by tracking a single set of seen IDs (e.g., a map[string]struct{}) and checking
it before inserting into any per-type map so that any duplicate ID across types
returns an explicit error (ambiguity) referencing the conflicting ID; apply the
same global-check logic used when populating s3, cdn, redis, and fileSystem (and
the similar code region later at the other occurrence) to ensure provider IDs
are globally unique.

In `@router/internal/persistedoperation/operationstorage/s3/client.go`:
- Around line 121-145: ReadManifest currently uses the caller-supplied
modifiedSince and local time for re-polling which causes races; change
ReadManifest to return object metadata (Last-Modified timestamp and/or ETag)
along with the *pqlmanifest.Manifest so callers can use storage-provided
timestamps for If-Modified-Since/ETag checks. Concretely: update the
ReadManifest signature to return (*pqlmanifest.Manifest, time.Time, string,
error) (or similar), call the S3 API to fetch metadata (use c.client.StatObject
or minioReader.Stat()/GetObjectInfo after GetObject) to read LastModified and
ETag, keep the existing 304 handling, return the parsed manifest plus the
object's LastModified and ETag, and then update the higher-level fetcher to
store and reuse that LastModified/ETag instead of time.Now() for subsequent
modifiedSince checks.

In `@router/internal/persistedoperation/PO_MANIFEST_S3_GUIDE.md`:
- Around line 12-18: The docs currently say the router resolves the manifest at
<object_prefix>/manifest.json but mention support for .gz/.zst without stating
how to select a compressed manifest; update the PO_MANIFEST_S3_GUIDE.md to
document the configuration option or pattern for choosing a compressed manifest
(e.g., explain that the router will attempt <object_prefix>/manifest.json, then
<object_prefix>/manifest.json.gz and/or <object_prefix>/manifest.json.zst based
on the object key suffix or provide the explicit config field for the compressed
manifest path), and link this behavior to the existing `object_prefix`/manifest
lookup and `provider_id: s3` so readers know how to specify a compressed
manifest or remove the compression note if no config/behavior exists.
- Around line 49-53: Clarify that "manifest is authoritative" applies only to
storage-backed, hash-only lookups: when a request contains only an operation
hash and relies on S3/CDN-backed storage, missing hashes in the manifest cause
immediate rejection and per-request fetches are disabled; explicitly call out
the exceptions — if LogUnknown is enabled or if APQ receives a full query body
(i.e., not a hash-only request), the router will still accept/record or fetch
the operation as before — and mention the relevant symbols LogUnknown and APQ so
readers can locate the runtime/config behavior.

---

Duplicate comments:
In `@router/core/router.go`:
- Around line 1235-1253: The code only rejects an explicit providerID inside the
r.graphApiToken != "" branch causing a misspelled Storage.ProviderID to silently
disable persisted operations when no token is present; add a guard at the start
(check providerID != "" before the graphApiToken branch) that returns an
explicit error (same format used elsewhere, e.g. fmt.Errorf("unknown storage
provider id '%s' for persisted operations", providerID)) so the function fails
fast when Storage.ProviderID/providerID is set but unknown; keep the existing
CDN client creation (r.graphApiToken, cdn.NewClient, r.logger.Debug, return c,
c.ReadManifest) logic unchanged for the token-present case.

---

Nitpick comments:
In `@router/core/router.go`:
- Around line 1258-1277: In buildAPQClient, the APQ disabled path still resolves
and logs the Redis provider; move the guard checking
r.automaticPersistedQueriesConfig.Enabled to the top of Router.buildAPQClient so
you return early when APQ is disabled before calling registry.Redis or
apq.NewRedisClient, preventing allocation of a Redis client and avoiding
r.logger.Info side effects when not needed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 605e5649-4675-4578-a4ce-3066422998f6

📥 Commits

Reviewing files that changed from the base of the PR and between 54e0f0b and 590de81.

📒 Files selected for processing (15)
  • router-tests/operations/pql_manifest_test.go
  • router/core/init_config_poller.go
  • router/core/provider_registry.go
  • router/core/provider_registry_test.go
  • router/core/router.go
  • router/core/router_config.go
  • router/internal/persistedoperation/PO_MANIFEST_S3_GUIDE.md
  • router/internal/persistedoperation/client.go
  • router/internal/persistedoperation/operationstorage/cdn/client.go
  • router/internal/persistedoperation/operationstorage/fs/client.go
  • router/internal/persistedoperation/operationstorage/s3/client.go
  • router/internal/persistedoperation/operationstorage/s3/client_test.go
  • router/internal/persistedoperation/pqlmanifest/poller.go
  • router/internal/persistedoperation/pqlmanifest/storage_fetcher.go
  • router/internal/persistedoperation/pqlmanifest/storage_fetcher_test.go
💤 Files with no reviewable changes (2)
  • router/internal/persistedoperation/operationstorage/fs/client.go
  • router/internal/persistedoperation/client.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • router/internal/persistedoperation/pqlmanifest/poller.go
  • router/internal/persistedoperation/pqlmanifest/storage_fetcher.go

Copy link
Copy Markdown
Contributor

@SkArchon SkArchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved, but make sure to resolve / dismiss the coderabbit issues highlighted.

@wundergraph wundergraph deleted a comment from coderabbitai bot Apr 2, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@router/internal/persistedoperation/PO_MANIFEST_S3_GUIDE.md`:
- Line 61: Update the documentation sentence that currently refers to a fixed
"manifest.json" to use a generic, configurable filename reference: mention that
the manifest file is given by the configured manifest.file_name (which may
include compression suffixes) instead of hardcoding "manifest.json"; adjust the
schema section wording to say "the manifest file (configured via
manifest.file_name)" or similar so it accurately reflects that the filename is
configurable and may be compressed.
- Line 80: Update the `revision` field description to replace the fragment "Can
be any non-empty string" with a complete sentence; e.g., change it to "It can be
any non-empty string (e.g. a git SHA, timestamp, or UUID)." Ensure the full
sentence reads smoothly with the preceding sentence and retains the requirement
that the value must be changed whenever operations are updated.

In `@router/pkg/config/config.schema.json`:
- Around line 206-210: The "file_name" string property in config.schema.json
currently allows empty values which can break manifest key resolution; update
the "file_name" schema (the property named "file_name") to disallow empty
strings by adding "minLength": 1 to its definition so validation fails for ""
while keeping the existing "default": "manifest.json".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d46966a6-dc6d-4338-8d4a-c8a5c87fc2ea

📥 Commits

Reviewing files that changed from the base of the PR and between 062c098 and bd0c422.

📒 Files selected for processing (7)
  • router/core/router.go
  • router/internal/persistedoperation/PO_MANIFEST_S3_GUIDE.md
  • router/pkg/config/config.go
  • router/pkg/config/config.schema.json
  • router/pkg/config/fixtures/full.yaml
  • router/pkg/config/testdata/config_defaults.json
  • router/pkg/config/testdata/config_full.json
✅ Files skipped from review due to trivial changes (3)
  • router/pkg/config/fixtures/full.yaml
  • router/pkg/config/testdata/config_defaults.json
  • router/pkg/config/testdata/config_full.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • router/core/router.go

@StarpTech StarpTech merged commit b5cfa92 into main Apr 2, 2026
37 of 38 checks passed
@StarpTech StarpTech deleted the dustin/support-s3-poling-for-pql-manifest branch April 2, 2026 09:20
StarpTech added a commit that referenced this pull request Apr 2, 2026
return fmt.Errorf("failed to create CDN client: %w", err)
}
pClient = c
providerID := r.persistedOperationsConfig.Storage.ProviderID
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question, are empty "" storage provider IDs allowed? At the end of the function we are returning nil, nil, nil when nothing was found. If providerID != "" should we log a warning at least?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants