Conversation
WalkthroughAdds backend support for a new referrer-name filter ( Changes
Sequence Diagram(s)sequenceDiagram
participant U as User
participant FE as Frontend (ViewProject)
participant Filter as filters.tsx
participant API as Backend Analytics
participant Map as referrers.map (backend)
participant RefUtil as web/app/utils/referrers.ts
U->>FE: Apply filter "refn=Google"
FE->>Filter: Validate key 'refn'
Filter-->>FE: Key accepted
FE->>API: Send analytics query with refn="Google"
API->>Map: getDomainsForRefName("Google")
Map-->>API: Return patterns (e.g., "google.com", "https://search.google...")
API->>API: Build matching clause (scheme-prefix OR domain equals/endsWith)
API-->>FE: Return filtered analytics results
FE->>RefUtil: groupRefEntries(results) / getFaviconHost(ref) per row
RefUtil-->>FE: Grouped entries and favicon hosts
FE-->>U: Display grouped referrer rows with favicons
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Comment |
|
@coderabbitai help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
CodeRabbit configuration file (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
backend/tsconfig.json (1)
17-17: OK to enable JSON imports; consider using it or drop the flagYou’re not importing JSON in backend (you read via fs). Either switch referrers.map.ts to import the JSON (benefits: types, bundling) or remove this flag to avoid drift.
web/app/pages/Project/View/Panels.tsx (1)
986-993: Null-safe name sorting added; consider nulls-last for UXReturning '' avoids errors. Prefer explicit nulls-last to keep unknowns after real names.
- if (label === 'name') return entry.name || '' + if (label === 'name') return entry.name ?? '\uffff' // sorts null/empty last
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
backend/apps/cloud/src/analytics/analytics.service.ts(3 hunks)backend/apps/cloud/src/analytics/referrers.map.ts(1 hunks)backend/tsconfig.json(1 hunks)web/app/lib/models/Entry.ts(1 hunks)web/app/pages/Project/View/Panels.tsx(1 hunks)web/app/pages/Project/View/ViewProject.helpers.tsx(1 hunks)web/app/pages/Project/View/ViewProject.tsx(7 hunks)web/app/pages/Project/View/components/RefRow.tsx(2 hunks)web/app/pages/Project/View/utils/filters.tsx(1 hunks)web/app/utils/referrers.map.json(1 hunks)web/app/utils/referrers.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
backend/apps/cloud/src/analytics/referrers.map.ts (1)
getDomainsForRefName(29-34)
web/app/pages/Project/View/components/RefRow.tsx (1)
web/app/utils/referrers.ts (2)
extractHostname(18-30)REFERRER_MAP(16-16)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
Entry(1-6)
web/app/pages/Project/View/ViewProject.tsx (3)
web/app/utils/generic.ts (1)
getLocaleDisplayName(134-144)web/app/lib/models/Entry.ts (1)
Entry(1-6)web/app/utils/referrers.ts (2)
Entry(62-62)groupRefEntries(65-87)
🪛 GitHub Actions: Pull Request Validation Workflow
web/app/utils/referrers.map.json
[error] 1-1: Prettier formatting check failed for app/utils/referrers.map.json. Run 'prettier --write' to fix. Command 'prettier --check --ignore-unknown ./app ./public' exited with code 1.
🔇 Additional comments (6)
web/app/pages/Project/View/ViewProject.helpers.tsx (1)
1562-1563: Add translation mapping for refn — LGTMMaps refn to the ref label consistently. Verify the i18n key exists in all locales.
web/app/pages/Project/View/utils/filters.tsx (1)
28-29: Accepting ‘refn’ as a valid filter — goodFront-end will now parse refn from URL. Ensure backend getFiltersQuery handles it (special path) for all endpoints you route to.
backend/apps/cloud/src/analytics/analytics.service.ts (3)
61-61: New dependency on getDomainsForRefNameImport is fine. Be mindful the loader must resolve the JSON at runtime in the backend container (see separate comment).
1016-1043: refn SQL generation: good matching logic; add small guards and tests
- Good: parameterized patterns, scheme-aware startsWith, domain/subdomain coverage.
- Add COALESCE to avoid NULL pitfalls on ref: domain(ref) may be NULL; explicit coalesce keeps evaluation predictable.
- Unit/integ tests would be valuable here.
- parts.push( - `(domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) OR (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String}))))`, - ) + parts.push( + `(coalesce(domain(ref), '') != '' AND (lower(coalesce(domain(ref), '')) = {${dp}:String} OR endsWith(lower(coalesce(domain(ref), '')), concat('.', {${dp}:String})))) + OR (coalesce(domain(ref), '') = '' AND (lower(coalesce(ref, '')) = {${dp}:String} OR endsWith(lower(coalesce(ref, '')), concat('.', {${dp}:String}))))`, + )Would you like me to draft tests for getFiltersQuery(refn) with URLs, subdomains, and scheme-prefixed patterns?
949-957: Now I need to directly examine the codebase to determine if the concern is valid:Based on the script output provided and code analysis, I can now determine the facts needed to resolve the review comment. Let me examine the key findings:From the script output:
- PERFORMANCE_COLUMNS =
['cc', 'rg', 'ct', 'host', 'pg', 'dv', 'br', 'brv']— does NOT include 'ref'- ERROR_COLUMNS — shown in output from constants.ts (line 122-124)
- Both
performanceanderrorstables are created ininitialise_database.js(lines 87 and 114 respectively)- The
refcolumn IS present inanalyticsandcustomEVtables (shown multiple times in migrations)The original concern states that
refnbypasses SUPPORTED_COLUMNS check and will causefiltersQueryfor PERFORMANCE/ERRORS to referenceref/domain(ref). However:
PERFORMANCE_COLUMNSexplicitly excludesref- The code at lines 949-957 allows
refnto bypass the SUPPORTED_COLUMNS check for analytics- If
refis not part of PERFORMANCE_COLUMNS or ERROR_COLUMNS, queries filtering by ref for those tables would failThe issue is valid IF:
- The filter logic applies the
refcolumn to non-analytics tables without checking the DataType- The performance/errors tables don't have a
refcolumnFrom the evidence shown, this appears to be a legitimate concern about allowing
refnbypass without data type validation, but I need definitive confirmation that performance/errors tables lack therefcolumn (not fully visible in truncated output).
Restrict
refnbypass to analytics queries only or verify schema consistencyThe concern has merit:
PERFORMANCE_COLUMNSexplicitly excludesref, suggesting performance and error tables may not have this column. Allowingrefnto bypass SUPPORTED_COLUMNS checks (lines 949-957) creates risk if filter logic later attempts to usedomain(ref)for PERFORMANCE/ERRORS DataTypes. Recommend either (1) restrict therefnbypass toDataType.TRAFFIConly, or (2) verify performance/errors table schemas includerefand adapt queries accordingly. Cannot fully verify schema from current output—the performance and errors table definitions were truncated.web/app/lib/models/Entry.ts (1)
2-2: Entry.name now nullable — audit callsitesType change is reasonable for “unknown” rows. Please verify all consumers that call string methods on name handle null (e.g., toLowerCase, localeCompare, building URLs). I can help run a repo-wide scan.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
backend/blog-posts(1 hunks)web/app/pages/Project/View/Panels.tsx(4 hunks)web/app/pages/Project/View/ViewProject.helpers.tsx(2 hunks)web/app/pages/Project/View/components/RefRow.tsx(2 hunks)web/app/referrers.map.json(1 hunks)web/app/utils/referrers.ts(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/blog-posts
🚧 Files skipped from review as they are similar to previous changes (2)
- web/app/pages/Project/View/components/RefRow.tsx
- web/app/pages/Project/View/ViewProject.helpers.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
Entry(1-6)
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
web/app/utils/referrers.ts (2)
38-48: Wildcard patterns still compared literally.This is the same issue raised in the previous review: patterns like
*.wikipedia.orgare compared as literal strings and will never match actual hostnames. The pattern normalization suggested in the earlier review comment (stripping*.prefix before comparison) still needs to be applied.
50-64: Non-http(s) schemes still filtered before hostname extraction.This is the same issue raised in the previous review: the function returns
nullfor non-http(s) schemes (lines 54-55) before attempting hostname extraction, preventingandroid-app://URLs from being grouped even though the referrer map supposedly includes Android intent patterns.
🧹 Nitpick comments (1)
web/app/utils/referrers.ts (1)
11-34: Consider stricter IPv4 validation (optional).The IPv4 pattern
/^\d{1,3}(\.\d{1,3}){3}$/accepts invalid addresses like999.999.999.999. While unlikely to appear in referrer data, a stricter check would prevent false positives.Example stricter pattern:
const ipv4Pattern = /^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}$/
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
web/app/utils/referrers.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
Entry(1-6)
🔇 Additional comments (3)
web/app/utils/referrers.ts (3)
1-9: LGTM: Clean type definitions and JSON import.The type definition and JSON import are straightforward and appropriate for the referrer mapping feature.
68-91: Grouping logic is sound, pending type fix.The aggregation and sorting logic correctly:
- Groups entries by canonical name or hostname
- Handles null names explicitly
- Sorts by descending count for UI display
However, the function's return type depends on the
Entrytype issue flagged above.
50-50: ****The function
getCanonicalRefGroupis correctly left without anexportkeyword. It is a private helper function used only internally withingroupRefEntries(line 76), which is the exported function in this module's public API. No other modules importgetCanonicalRefGroup, confirming it is not part of the public API surface. The current code structure is appropriate.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
web/app/utils/referrers.ts (2)
39-47: Wildcard patterns never matchPatterns like
*.wikipedia.orgare still compared literally, so grouped referrers that rely on wildcards never resolve. Normalise each pattern (strip leading*./.and lower-case) before the equality/suffix check so wildcard entries actually hit.-const matchByMap = (host: string): string | null => { - for (const { name, patterns } of REFERRER_MAP) { - for (const p of patterns) { - // Exact or subdomain match only - if (host === p || host.endsWith(`.${p}`)) { +const matchByMap = (host: string): string | null => { + const normalizedHost = host.toLowerCase() + for (const { name, patterns } of REFERRER_MAP) { + for (const rawPattern of patterns) { + const pattern = rawPattern.toLowerCase().replace(/^\*\./, '').replace(/^\./, '') + if (!pattern) continue + if (normalizedHost === pattern || normalizedHost.endsWith(`.${pattern}`)) { return name } } } return null }
54-58: android-app referrals never groupWe still bail out the moment we see a non-HTTP scheme, so intent URLs like
android-app://com.reddit.frontpage/never reachextractHostnameor the map—yet the JSON includes those entries. Whitelist the supported schemes (e.g.android-app) instead of blanket returning null so these referrals can be grouped.- if (scheme && !/^https?$/i.test(scheme)) { - return null - } + if (scheme && !/^https?$/i.test(scheme)) { + if (!/^android-app$/i.test(scheme)) { + return null + } + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
web/app/lib/models/Entry.ts(1 hunks)web/app/pages/Project/View/Panels.tsx(12 hunks)web/app/pages/Project/View/ViewProject.tsx(10 hunks)web/app/pages/Project/View/components/RefRow.tsx(2 hunks)web/app/referrers.map.json(1 hunks)web/app/utils/generic.ts(1 hunks)web/app/utils/referrers.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- web/app/pages/Project/View/components/RefRow.tsx
- web/app/referrers.map.json
🧰 Additional context used
🧬 Code graph analysis (2)
web/app/pages/Project/View/ViewProject.tsx (3)
web/app/utils/generic.ts (1)
getLocaleDisplayName(134-146)web/app/lib/models/Entry.ts (1)
Entry(1-6)web/app/utils/referrers.ts (1)
groupRefEntries(68-84)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
Entry(1-6)
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (1)
8-27: Previously flagged: Runtime FS dependency needs configuration.A past review already identified that reading from
web/app/referrers.map.jsonat runtime will fail in separate deployments. The recommended fix is to add an environment variable override and copy the JSON into the backend image during build.
🧹 Nitpick comments (6)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (2)
18-20: Add JSON schema validation after parsing.The parsed JSON is cast to
ReferrerJson[]without validation. If the file structure is incorrect, this could cause runtime errors downstream when accessingnameorpatternsproperties.Consider adding a validation function:
+const isValidReferrerJson = (data: unknown): data is ReferrerJson[] => { + return Array.isArray(data) && data.every( + item => typeof item === 'object' && + item !== null && + typeof item.name === 'string' && + Array.isArray(item.patterns) && + item.patterns.every((p: unknown) => typeof p === 'string') + ) +} + const loadMap = (): ReferrerJson[] => { if (cachedMap) return cachedMap const candidates = [ path.resolve(__dirname, '../../../../../web/app/referrers.map.json'), path.resolve(__dirname, '../../../../web/app/referrers.map.json'), ] for (const p of candidates) { try { const raw = fs.readFileSync(p, 'utf8') - cachedMap = JSON.parse(raw) as ReferrerJson[] + const parsed = JSON.parse(raw) + if (!isValidReferrerJson(parsed)) { + console.error(`[ERROR] Invalid referrers map structure in ${p}`) + continue + } + cachedMap = parsed return cachedMap } catch { // try next location } } cachedMap = [] return cachedMap }
21-23: Silent error swallowing hinders debugging.File I/O errors are caught but not logged, making it difficult to diagnose why the map failed to load. At minimum, log when all candidates fail.
} catch { // try next location } } + console.warn('[WARN] Failed to load referrers map from all candidate paths, using empty map') cachedMap = [] return cachedMap }backend/apps/community/src/analytics/analytics.service.ts (1)
1014-1016: Consider simplifying the nested OR condition for readability.The domain matching logic is functionally correct but has deeply nested conditions that are hard to parse. Consider extracting the matching logic into a helper or restructuring for clarity.
- parts.push( - `(domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) OR (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String}))))`, - ) + // Match domain or subdomain (handles both extracted domain and bare ref) + parts.push( + `( + (domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) + OR + (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String})))) + )` + )Alternatively, consider a ClickHouse function that encapsulates the logic if this pattern repeats elsewhere.
backend/apps/community/src/analytics/utils/referrers.map.ts (3)
8-27: Code duplication with cloud version – consider shared utility.This module is nearly identical to
backend/apps/cloud/src/analytics/utils/referrers.map.ts. The only difference is the file location, but the paths resolved are actually the same. Consider extracting this into a shared utility module to maintain DRY principles and reduce maintenance overhead.Create a shared module at
backend/libs/common/src/utils/referrers.map.tsand import it in both cloud and community services. This ensures consistent behavior and simplifies future updates.
18-20: Add JSON schema validation after parsing.Same issue as in the cloud version: parsed JSON lacks validation. If the file structure is incorrect, downstream code will fail when accessing properties.
Apply the same validation pattern suggested for the cloud version (see review comment on backend/apps/cloud/src/analytics/utils/referrers.map.ts lines 18-20).
21-23: Silent error swallowing hinders debugging.Same issue as the cloud version: errors are caught but not logged. Add a warning when all candidate paths fail.
} catch { // try next location } } + console.warn('[WARN] Failed to load referrers map from all candidate paths, using empty map') cachedMap = [] return cachedMap }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/apps/cloud/src/analytics/analytics.service.ts(3 hunks)backend/apps/cloud/src/analytics/utils/referrers.map.ts(1 hunks)backend/apps/community/src/analytics/analytics.service.ts(5 hunks)backend/apps/community/src/analytics/utils/referrers.map.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/apps/cloud/src/analytics/analytics.service.ts
🧰 Additional context used
🧬 Code graph analysis (2)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (1)
backend/apps/community/src/analytics/utils/referrers.map.ts (1)
getDomainsForRefName(29-34)
backend/apps/community/src/analytics/analytics.service.ts (1)
backend/apps/community/src/analytics/utils/referrers.map.ts (1)
getDomainsForRefName(29-34)
🔇 Additional comments (4)
backend/apps/community/src/analytics/analytics.service.ts (3)
54-54: LGTM!Import correctly references the new referrers.map utility module.
931-931: LGTM!Correctly removes the exclusion of 'refn' from the filter validation, allowing it as a valid filter column alongside other analytics columns.
995-1021: The fallback logic for emptydomain(ref)is correct—no fixes needed.After examining the code and referrer mapping logic, the refn filter implementation is sound:
Pattern generation:
getDomainsForRefName()returns known referrer patterns fromreferrers.map.json, or falls back to the filter value itself. Patterns are guaranteed to be non-empty strings from known sources.Fallback logic is intentional: When
domain(ref) = '', the code checks if therefstring itself matches the pattern. This is the correct behavior for malformed or unparseable URLs—treating the raw ref as a bare domain string is appropriate.No false positive risk: The conditions use strict equality and suffix matching, so even empty refs are handled correctly (they would only match if the pattern is also empty, which is not a realistic scenario given the referrer mapping).
The logic correctly delegates to ClickHouse's
domain()function behavior (returns empty string for malformed URLs) and handles the fallback appropriately.backend/apps/community/src/analytics/utils/referrers.map.ts (1)
29-34: LGTM!The
getDomainsForRefNamefunction correctly performs case-insensitive lookup and returnsnullwhen no match is found, providing a clear API contract.
Changes
If applicable, please describe what changes were made in this pull request.
Community Edition support
Database migrations
Documentation
Summary by CodeRabbit
New Features
Bug Fixes
Chores