Scripts like GTM and GA4 are often blocked by ad blockers and privacy extensions when loaded from third-party domains, leading to data loss. Third-party cookie deprecation further limits tracking durability.
This change proxies GTM scripts and analytics beacons through the Trusted Server, establishing a first-party context. It automatically rewrites HTML tags and script content to point to local proxy endpoints, bypassing blockers and extending cookie life.
Includes:
- Proxy endpoints for gtm.js and /collect
- Content rewriting for redirecting internal GTM calls
- Configuration and integration tests
Resolves: #224
Adds comprehensive tests for:
- GTM configuration parsing and default values
- HTML processor pipeline integration
- Response body rewriting logic
aram356
🔧 Please make sure checks pass before assigning to review.
aram356
Good start. Need to address specific items and the following.
Duplicated rewrite logic across three places
The GTM URL rewriting logic exists in three separate methods: rewrite_gtm_script(), IntegrationAttributeRewriter::rewrite(), and IntegrationScriptRewriter::rewrite(). Each handles a slightly different set of patterns. This is error-prone — a new URL pattern needs to be added in multiple places.
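One way to address this is to keep the host list and the rewrite in a single helper and have every rewriter delegate to it. The sketch below shows the idea in TypeScript for illustration only; the integration itself is Rust, and `PROXIED_HOSTS`, `PROXY_PREFIX`, `rewriteGoogleUrls`, `rewriteAttribute`, and `rewriteScriptBody` are hypothetical names, not the PR's actual API.

```typescript
// Hypothetical: the single list of upstream hosts that have proxy routes.
const PROXIED_HOSTS: readonly string[] = [
  "www.googletagmanager.com",
  "www.google-analytics.com",
];

// Assumed first-party prefix for the proxied routes.
const PROXY_PREFIX = "/integrations/google_tag_manager";

// Escape regex metacharacters so hostnames are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Built once from the host list. The lookahead requires a URL delimiter (or
// end of input) after the host, so longer hostnames are left alone.
const HOST_PATTERN = new RegExp(
  `(?:https?:)?//(?:${PROXIED_HOSTS.map(escapeRegExp).join("|")})(?=[/?#"'\\s]|$)`,
  "g",
);

// The only place that knows how a Google URL maps to a proxy URL.
function rewriteGoogleUrls(text: string): string {
  return text.replace(HOST_PATTERN, PROXY_PREFIX);
}

// Attribute rewriter: returns null when nothing changed.
function rewriteAttribute(value: string): string | null {
  const rewritten = rewriteGoogleUrls(value);
  return rewritten === value ? null : rewritten;
}

// Script rewriter: same helper, applied to a whole script body.
function rewriteScriptBody(body: string): string {
  return rewriteGoogleUrls(body);
}
```

With this shape, supporting a new upstream host means adding one entry to `PROXIED_HOSTS` rather than touching three methods.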
…, set default enablement to false, and update documentation for handling
…ID forwarding for GTM proxy requests
…nctions, improving request configuration for beacons and scripts
…ogle Tag Manager integration
@prk-Jr Please add a (manual) test plan.
- Widen IntegrationAttributeRewriter to rewrite href/src for gtag/js and google-analytics.com URLs (not just gtm.js), fixing <link rel=preload> tags not being rewritten on Next.js sites
- Add client-side script guard for dynamically inserted GTM/GA scripts using the shared createScriptGuard factory (matches DataDome pattern)
- Harden URL regex with delimiter capture group to prevent subdomain spoofing (e.g., www.googletagmanager.com.evil.com)
- Add is_rewritable_url helper to selectively rewrite only URLs with corresponding proxy routes (excludes ns.html)
- Document gtag/js endpoint in integration guide
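The delimiter capture group and the route check can be illustrated roughly as follows. This is a TypeScript sketch of logic that lives in Rust in the PR (Rust's regex crate has no look-around, which is why a capture group is re-emitted instead). The pattern, the `PROXY_PREFIX` value, the `PROXIED_PATHS` list, and the function names are assumptions for illustration and not the actual GTM_URL_PATTERN or is_rewritable_url.

```typescript
// Assumed first-party prefix for the proxied routes (hypothetical).
const PROXY_PREFIX = "/integrations/google_tag_manager";

// Group 1: the host. Group 2: the delimiter that must follow the host.
// Requiring the delimiter means "www.googletagmanager.com.evil.com" does not
// match, because the host is followed by "." rather than a URL delimiter.
const GTM_URL =
  /(?:https?:)?\/\/(www\.googletagmanager\.com|www\.google-analytics\.com)([\/?#"'\s]|$)/g;

// Paths assumed to have a proxy route; ns.html is deliberately absent.
const PROXIED_PATHS: readonly string[] = ["/gtm.js", "/gtag/js", "/collect", "/g/collect"];

// True only when the URL's path has a corresponding proxy route.
function isRewritableUrl(url: string): boolean {
  const m = /^(?:https?:)?\/\/[^\/?#]+([^?#]*)/.exec(url);
  if (m === null) return false;
  const path = m[1];
  return PROXIED_PATHS.some((p) => p === path);
}

// Rewrites the host to the proxy prefix and puts the captured delimiter back.
function rewriteUrl(url: string): string {
  if (!isRewritableUrl(url)) return url;
  return url.replace(
    GTM_URL,
    (_match: string, _host: string, delim: string) => PROXY_PREFIX + delim,
  );
}
```

For example, `rewriteUrl("https://www.googletagmanager.com/gtag/js?id=G-123")` yields `/integrations/google_tag_manager/gtag/js?id=G-123`, while the spoofed host and the ns.html URL are returned unchanged.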
gtag.js constructs beacon URLs dynamically from bare domain strings, so rewriting them at the script level produces broken URLs. Instead, add a shared beacon_guard that patches navigator.sendBeacon and window.fetch at runtime to intercept requests to google-analytics.com and analytics.google.com, rewriting them to the first-party proxy.
- Add shared beacon_guard.ts factory (sendBeacon + fetch interception)
- Wire GTM integration to install beacon guard on init
- Require // prefix in Rust GTM_URL_PATTERN to prevent bare domain rewrites
- Add tests for both shared factory and GTM-specific beacon interception
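A minimal sketch of the runtime interception idea, assuming a simplified target object so that it runs outside a browser. In a real page sendBeacon lives on navigator and fetch on window, and fetch may also receive a Request or URL object; this sketch handles string URLs only. `BeaconTarget`, `rewriteBeaconUrl`, `installBeaconGuard`, and the proxy base used in the usage note are hypothetical and are not the contents of beacon_guard.ts.

```typescript
// Minimal shape of the two functions the guard patches.
interface BeaconTarget {
  sendBeacon(url: string, data?: unknown): boolean;
  fetch(input: string, init?: unknown): Promise<unknown>;
}

// Hosts named in the commit message; subdomains are matched as well.
const GA_HOSTS: readonly string[] = ["google-analytics.com", "analytics.google.com"];

// Maps an absolute GA beacon URL onto the first-party proxy base,
// leaving every other URL unchanged.
function rewriteBeaconUrl(url: string, proxyBase: string): string {
  const m = /^https?:\/\/([^\/?#:]+)(?::\d+)?([^#]*)/.exec(url);
  if (m === null) return url;
  const host = m[1].toLowerCase();
  const isGa = GA_HOSTS.some((h) => host === h || host.endsWith("." + h));
  return isGa ? proxyBase + m[2] : url;
}

// Patches both functions and returns a cleanup callback that restores them.
function installBeaconGuard(target: BeaconTarget, proxyBase: string): () => void {
  const originalSendBeacon = target.sendBeacon;
  const originalFetch = target.fetch;

  target.sendBeacon = (url: string, data?: unknown): boolean =>
    originalSendBeacon.call(target, rewriteBeaconUrl(url, proxyBase), data);

  target.fetch = (input: string, init?: unknown): Promise<unknown> =>
    originalFetch.call(target, rewriteBeaconUrl(input, proxyBase), init);

  return () => {
    target.sendBeacon = originalSendBeacon;
    target.fetch = originalFetch;
  };
}
```

Calling `installBeaconGuard(target, "/integrations/google_tag_manager")` once at init would route matching beacons through the proxy; the returned callback undoes the patch, which is convenient in tests.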
- Use status 200 instead of 204 (jsdom rejects 204 as null-body status)
- Use absolute URLs in test rewriteUrl to satisfy jsdom's Request constructor
From @jevansnyc: There are some differences. For example, the "Request Payload" data shows up on autoblog but not on ts.autoblog.com.
|
I looked into it and found something like this. @jevansnyc, can you please point me to the request?
- Add container ID format validation (GTM-XXXXXX pattern)
- Fix fragile beacon type detection using ends_with()
- Improve error handling in TypeScript guards
- Add beacon guard cleanup mechanism for testing
- Document IP stripping privacy tradeoff
- Document cache behavior and stale handling
- Simplify upstream_url() logic (rely on serde defaults)
All tests passing. No breaking changes.
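The first two items can be pictured with the short sketch below, written in TypeScript for illustration although the PR's versions are in Rust. The exact container ID pattern is not shown in this conversation, so the regex here is an assumption, as are the names `isValidContainerId` and `beaconKind` and the reading that /g/collect carries GA4 hits while /collect carries legacy ones.

```typescript
// Assumed format: "GTM-" followed by uppercase letters and digits only.
const CONTAINER_ID = /^GTM-[A-Z0-9]+$/;

// Rejects anything that could smuggle extra path or query characters
// into the upstream URL.
function isValidContainerId(id: string): boolean {
  return CONTAINER_ID.test(id);
}

type BeaconKind = "ga4" | "legacy" | null;

// Suffix-based detection. "/g/collect" also ends with "/collect",
// so the more specific suffix has to be checked first.
function beaconKind(path: string): BeaconKind {
  if (path.endsWith("/g/collect")) return "ga4";
  if (path.endsWith("/collect")) return "legacy";
  return null;
}
```

Checking suffixes instead of substrings avoids misclassifying paths that merely contain "collect" somewhere, such as "/collector".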



Scripts like GTM and GA4 are often blocked by ad blockers or privacy extensions when loaded from third-party domains, leading to data loss. Third-party cookie deprecation further limits tracking durability.
This change transparently proxies GTM/GA4 scripts and analytics beacons through the Trusted Server, establishing a first-party context. It automatically rewrites HTML tags (including <link rel="preload">) and script content to point to local proxy endpoints, bypassing blockers and extending cookie life.
Includes:
- Proxy endpoints for gtm.js, gtag/js, /collect, and /g/collect with configurable caching and strict validation
- Rewriting of src and href attributes targeting GTM/GA domains
- Protection against subdomain spoofing (e.g., www.googletagmanager.com.evil.com)
Manual Test Plan
Prerequisites: Configure .env with GTM enabled and a valid container ID, then start the local server.
1. Script proxy: gtm.js rewritten
Expected: count > 0
2. Script proxy: gtag/js returns 200
Expected: 200
3. gtag/js content rewritten (no Google domains)
Expected: 0
4. Beacon proxy: POST /g/collect
Expected: 204 or 200
5. Beacon proxy: GET /collect
Expected: 204 or 200
6. Cache headers present
Expected: cache-control: public, max-age=900
Resolves: #224