feat: Referrer grouping#418

Merged
Blaumaus merged 16 commits into main from feature/refs-grouping
Nov 8, 2025
@Blaumaus
Member

@Blaumaus Blaumaus commented Nov 6, 2025

Changes

If applicable, please describe what changes were made in this pull request.

Community Edition support

  • Your feature is implemented for the Swetrix Community Edition
  • This PR only updates the Cloud (Enterprise) Edition code (e.g. Paddle webhooks, blog, payouts, etc.)

Database migrations

  • Clickhouse / MySQL migrations added for this PR
  • No table schemas changed in this PR

Documentation

  • You have updated the documentation according to your PR
  • This PR did not change any publicly documented endpoints

Summary by CodeRabbit

  • New Features

    • Referrer-name filtering for traffic analytics
    • Grouping/aggregation of referrers in traffic sources
    • Favicon display for referrer entries
  • Bug Fixes

    • Improved null-safety in filter link generation, rendering and CSV export lookups
    • More robust handling when referrer names are missing to avoid crashes
  • Chores

    • TypeScript target updated (ES2022) and JSON module resolution enabled
    • Added static referrer mapping data for improved recognition

@Blaumaus Blaumaus self-assigned this Nov 6, 2025
@coderabbitai
coderabbitai bot commented Nov 7, 2025

Walkthrough

Adds backend support for a new referrer-name filter (refn) with domain/scheme pattern matching, introduces frontend referrer utilities and data map, widens several frontend types to accept nulls, and upgrades backend TypeScript settings to ES2022 with JSON module resolution.

Changes

Cohort / File(s) Summary
Backend: analytics filter changes
backend/apps/cloud/src/analytics/analytics.service.ts, backend/apps/community/src/analytics/analytics.service.ts
Added special-case handling for refn filters: resolve patterns via getDomainsForRefName, treat scheme-containing patterns as prefix matches, otherwise build domain equality/endsWith checks and inject a bespoke matching clause into query generation.
Backend: referrer map utility
backend/apps/cloud/src/analytics/utils/referrers.map.ts, backend/apps/community/src/analytics/utils/referrers.map.ts
New modules that load referrers.map.json (try dev/compiled paths), cache the map, and export `getDomainsForRefName(name: string): string[]
Backend: TS config & submodule
backend/tsconfig.json, backend/blog-posts
tsconfig target set to es2022 and resolveJsonModule: true; blog-posts submodule pointer updated.
Frontend: referrer utilities & map
web/app/utils/referrers.ts, web/app/referrers.map.json
New referrers.ts exports extractHostname, groupRefEntries, getFaviconHost for hostname extraction, canonical grouping, and favicon host derivation. Added static referrers.map.json.
Frontend: filter recognition
web/app/pages/Project/View/utils/filters.tsx
Added refn to validFilters so it’s recognized as a non-dynamic filter key.
Frontend: models & utils (nullable support)
web/app/lib/models/Entry.ts, web/app/utils/generic.ts
Entry.name widened to `string
Frontend: panels & views null-safety
web/app/pages/Project/View/Panels.tsx, web/app/pages/Project/View/ViewProject.tsx, web/app/pages/Project/View/ViewProject.helpers.tsx
Multiple prop signatures expanded to accept null (e.g., getFilterLink, onClick, getVersionFilterLink, DetailsTableProps.getFilterLink). Internal logic updated with guards/fallbacks (use `entry.name
Frontend: referrer row display
web/app/pages/Project/View/components/RefRow.tsx
rowName prop now `string
Frontend: captcha view filter link
web/app/pages/Captcha/View/ViewCaptcha.tsx
getFilterLink updated to accept `value: string

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant FE as Frontend (ViewProject)
    participant Filter as filters.tsx
    participant API as Backend Analytics
    participant Map as referrers.map (backend)
    participant RefUtil as web/app/utils/referrers.ts

    U->>FE: Apply filter "refn=Google"
    FE->>Filter: Validate key 'refn'
    Filter-->>FE: Key accepted
    FE->>API: Send analytics query with refn="Google"
    API->>Map: getDomainsForRefName("Google")
    Map-->>API: Return patterns (e.g., "google.com", "https://search.google...")
    API->>API: Build matching clause (scheme-prefix OR domain equals/endsWith)
    API-->>FE: Return filtered analytics results
    FE->>RefUtil: groupRefEntries(results) / getFaviconHost(ref) per row
    RefUtil-->>FE: Grouped entries and favicon hosts
    FE-->>U: Display grouped referrer rows with favicons

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas needing extra attention:
    • Backend refn condition construction (scheme vs domain handling, exclusivity / OR grouping) in both analytics services.
    • Map-loading paths, caching, and failure behavior in referrers.map.ts.
    • Hostname extraction and grouping edge cases in web/app/utils/referrers.ts.
    • Prop signature changes in panels/views and propagation of nullable values through renders and link builders.

Poem

🐇 I hopped through maps of ref and name,
Matching hosts and schemes — that's my game.
Favicons peep where patterns align,
Null-safe hops keep the render fine.
ES2022 gives my code a springtime frame.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Description check
Status: ⚠️ Warning
Explanation: The PR description follows the template structure but contains only placeholder text and minimal implementation details in the 'Changes' section, with no actual description of the feature implementation.
Resolution: Expand the 'Changes' section with a concrete explanation of what referrer grouping does, how it works, and which files were modified to implement this feature.
✅ Passed checks (2 passed)
Check name: Title check
Status: ✅ Passed
Explanation: The title 'feat: Referrer grouping' is concise and accurately describes the main feature added across the changeset, which introduces referrer grouping logic for traffic sources.

Check name: Docstring Coverage
Status: ✅ Passed
Explanation: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/refs-grouping

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9701a6b and 9b80f01.

📒 Files selected for processing (2)
  • backend/apps/cloud/src/analytics/utils/referrers.map.ts (1 hunks)
  • backend/apps/community/src/analytics/utils/referrers.map.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/community/src/analytics/utils/referrers.map.ts
  • backend/apps/cloud/src/analytics/utils/referrers.map.ts

Comment @coderabbitai help to get the list of available commands and usage tips.

@Blaumaus Blaumaus marked this pull request as ready for review November 7, 2025 20:27
@Blaumaus
Member Author

Blaumaus commented Nov 7, 2025

@coderabbitai help

@coderabbitai
coderabbitai bot commented Nov 7, 2025

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

CodeRabbit commands

These commands are invoked using PR/Issue comments.

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai evaluate custom pre-merge check --instructions <custom-checks-instructions> --name <custom-checks-title> [--mode <error|warning>] to validate the custom pre-merge checks instructions. Defaults to error when --mode is omitted.
    • @coderabbitai ignore pre-merge checks to override pre-merge checks and get an approval on PR.
    • @coderabbitai run pre-merge checks to run pre-merge checks on the pull request.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit configuration file (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, support, documentation and community

  • Visit our status page to check the current availability of CodeRabbit.
  • Create a ticket on our support page for assistance with any issues or questions.
  • Visit our documentation site for detailed information on how to use CodeRabbit.
  • Join our Discord community to connect with other users and get help from the community.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
backend/tsconfig.json (1)

17-17: OK to enable JSON imports; consider using it or drop the flag

You’re not importing JSON in backend (you read via fs). Either switch referrers.map.ts to import the JSON (benefits: types, bundling) or remove this flag to avoid drift.

web/app/pages/Project/View/Panels.tsx (1)

986-993: Null-safe name sorting added; consider nulls-last for UX

Returning '' avoids errors. Prefer explicit nulls-last to keep unknowns after real names.

-      if (label === 'name') return entry.name || ''
+      if (label === 'name') return entry.name ?? '\uffff' // sorts null/empty last
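As an alternative to a sentinel string, the nulls-last ordering could be made explicit in a comparator. Note that `??` only replaces null or undefined, so an empty name would still sort first under the suggestion above. The following is a minimal sketch, not code from Panels.tsx; `Row` and `compareByName` are hypothetical names used only for illustration.

```typescript
// Hypothetical row shape for illustration; the real Entry type lives in web/app/lib/models/Entry.ts.
interface Row {
  name: string | null
  count: number
}

// Sort by name ascending, placing null or empty names after all real names.
const compareByName = (a: Row, b: Row): number => {
  const aMissing = a.name === null || a.name === ''
  const bMissing = b.name === null || b.name === ''
  if (aMissing && bMissing) return 0
  if (aMissing) return 1
  if (bMissing) return -1
  return (a.name as string).localeCompare(b.name as string)
}

// Illustrative data only.
const rows: Row[] = [
  { name: null, count: 3 },
  { name: 'Google', count: 10 },
  { name: '', count: 1 },
  { name: 'Bing', count: 4 },
]
const sortedNames = rows.slice().sort(compareByName).map((r) => r.name)
```

For a descending sort, only the final `localeCompare` result would be negated, so missing names still land at the end.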
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57daae5 and 7166e7d.

📒 Files selected for processing (11)
  • backend/apps/cloud/src/analytics/analytics.service.ts (3 hunks)
  • backend/apps/cloud/src/analytics/referrers.map.ts (1 hunks)
  • backend/tsconfig.json (1 hunks)
  • web/app/lib/models/Entry.ts (1 hunks)
  • web/app/pages/Project/View/Panels.tsx (1 hunks)
  • web/app/pages/Project/View/ViewProject.helpers.tsx (1 hunks)
  • web/app/pages/Project/View/ViewProject.tsx (7 hunks)
  • web/app/pages/Project/View/components/RefRow.tsx (2 hunks)
  • web/app/pages/Project/View/utils/filters.tsx (1 hunks)
  • web/app/utils/referrers.map.json (1 hunks)
  • web/app/utils/referrers.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
backend/apps/cloud/src/analytics/analytics.service.ts (1)
backend/apps/cloud/src/analytics/referrers.map.ts (1)
  • getDomainsForRefName (29-34)
web/app/pages/Project/View/components/RefRow.tsx (1)
web/app/utils/referrers.ts (2)
  • extractHostname (18-30)
  • REFERRER_MAP (16-16)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)
web/app/pages/Project/View/ViewProject.tsx (3)
web/app/utils/generic.ts (1)
  • getLocaleDisplayName (134-144)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)
web/app/utils/referrers.ts (2)
  • Entry (62-62)
  • groupRefEntries (65-87)
🪛 GitHub Actions: Pull Request Validation Workflow
web/app/utils/referrers.map.json

[error] 1-1: Prettier formatting check failed for app/utils/referrers.map.json. Run 'prettier --write' to fix. Command 'prettier --check --ignore-unknown ./app ./public' exited with code 1.

🔇 Additional comments (6)
web/app/pages/Project/View/ViewProject.helpers.tsx (1)

1562-1563: Add translation mapping for refn — LGTM

Maps refn to the ref label consistently. Verify the i18n key exists in all locales.

web/app/pages/Project/View/utils/filters.tsx (1)

28-29: Accepting ‘refn’ as a valid filter — good

Front-end will now parse refn from URL. Ensure backend getFiltersQuery handles it (special path) for all endpoints you route to.

backend/apps/cloud/src/analytics/analytics.service.ts (3)

61-61: New dependency on getDomainsForRefName

Import is fine. Be mindful the loader must resolve the JSON at runtime in the backend container (see separate comment).


1016-1043: refn SQL generation: good matching logic; add small guards and tests

  • Good: parameterized patterns, scheme-aware startsWith, domain/subdomain coverage.
  • Add COALESCE to avoid NULL pitfalls on ref: domain(ref) may be NULL; explicit coalesce keeps evaluation predictable.
  • Unit/integ tests would be valuable here.
- parts.push(
-   `(domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) OR (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String}))))`,
- )
+ parts.push(
+   `(coalesce(domain(ref), '') != '' AND (lower(coalesce(domain(ref), '')) = {${dp}:String} OR endsWith(lower(coalesce(domain(ref), '')), concat('.', {${dp}:String}))))
+    OR (coalesce(domain(ref), '') = '' AND (lower(coalesce(ref, '')) = {${dp}:String} OR endsWith(lower(coalesce(ref, '')), concat('.', {${dp}:String}))))`,
+ )

Would you like me to draft tests for getFiltersQuery(refn) with URLs, subdomains, and scheme-prefixed patterns?
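To make the shape of such tests concrete, here is a simplified sketch of the clause construction described in this comment: scheme-carrying patterns become prefix matches, bare domains become equality or subdomain-suffix matches, and every pattern is passed as a named parameter. `buildRefnClause`, the `refn_N` parameter names, and the `0` fallback for an empty pattern list are assumptions for illustration; the actual getFiltersQuery code also has a fallback branch for an empty domain(ref), which is omitted here.

```typescript
interface RefnClause {
  sql: string
  params: Record<string, string>
}

// Hypothetical helper: builds one OR-grouped condition for a refn filter.
const buildRefnClause = (patterns: string[], prefix = 'refn'): RefnClause => {
  const params: Record<string, string> = {}
  const parts = patterns.map((raw, i) => {
    const key = `${prefix}_${i}`
    const pattern = raw.toLowerCase()
    params[key] = pattern
    // Patterns that carry a scheme are matched as prefixes of the raw referrer.
    if (pattern.includes('://')) {
      return `startsWith(lower(ref), {${key}:String})`
    }
    // Bare domains match the domain itself or any of its subdomains.
    return `(lower(domain(ref)) = {${key}:String} OR endsWith(lower(domain(ref)), concat('.', {${key}:String})))`
  })
  // Assumption: an empty pattern list should match nothing rather than yield invalid SQL.
  const sql = parts.length > 0 ? `(${parts.join(' OR ')})` : '0'
  return { sql, params }
}

// Illustrative patterns; the second one reuses the android-app example from this review.
const exampleClause = buildRefnClause(['Google.com', 'android-app://com.reddit.frontpage'])
```

Wrapping the whole group in parentheses keeps the OR conditions from leaking into neighbouring AND-joined filters.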


949-957: Now I need to directly examine the codebase to determine if the concern is valid: Based on the script output provided and code analysis, I can now determine the facts needed to resolve the review comment. Let me examine the key findings:

From the script output:

  1. PERFORMANCE_COLUMNS = ['cc', 'rg', 'ct', 'host', 'pg', 'dv', 'br', 'brv'] does NOT include 'ref'
  2. ERROR_COLUMNS — shown in output from constants.ts (line 122-124)
  3. Both performance and errors tables are created in initialise_database.js (lines 87 and 114 respectively)
  4. The ref column IS present in analytics and customEV tables (shown multiple times in migrations)

The original concern states that refn bypasses SUPPORTED_COLUMNS check and will cause filtersQuery for PERFORMANCE/ERRORS to reference ref/domain(ref). However:

  • PERFORMANCE_COLUMNS explicitly excludes ref
  • The code at lines 949-957 allows refn to bypass the SUPPORTED_COLUMNS check for analytics
  • If ref is not part of PERFORMANCE_COLUMNS or ERROR_COLUMNS, queries filtering by ref for those tables would fail

The issue is valid IF:

  1. The filter logic applies the ref column to non-analytics tables without checking the DataType
  2. The performance/errors tables don't have a ref column

From the evidence shown, this appears to be a legitimate concern about allowing refn bypass without data type validation, but I need definitive confirmation that performance/errors tables lack the ref column (not fully visible in truncated output).


Restrict refn bypass to analytics queries only or verify schema consistency

The concern has merit: PERFORMANCE_COLUMNS explicitly excludes ref, suggesting performance and error tables may not have this column. Allowing refn to bypass SUPPORTED_COLUMNS checks (lines 949-957) creates risk if filter logic later attempts to use domain(ref) for PERFORMANCE/ERRORS DataTypes. Recommend either (1) restrict the refn bypass to DataType.TRAFFIC only, or (2) verify performance/errors table schemas include ref and adapt queries accordingly. Cannot fully verify schema from current output—the performance and errors table definitions were truncated.
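Option (1) could look roughly like the guard below. This is a sketch under stated assumptions: `DataKind` and `isFilterAllowed` are hypothetical names, and the real services may structure the SUPPORTED_COLUMNS check differently. Only the PERFORMANCE_COLUMNS list is taken from the output quoted above.

```typescript
// Hypothetical data-type tags standing in for the DataType enum mentioned above.
type DataKind = 'traffic' | 'performance' | 'errors'

// Column list as quoted in this review comment.
const PERFORMANCE_COLUMNS = ['cc', 'rg', 'ct', 'host', 'pg', 'dv', 'br', 'brv']

// Allow refn only for traffic queries; every other column goes through the normal allow-list.
const isFilterAllowed = (
  column: string,
  kind: DataKind,
  supported: readonly string[],
): boolean => {
  if (column === 'refn') return kind === 'traffic'
  return supported.indexOf(column) !== -1
}
```

With this shape, a `refn` filter on a performance or errors query is rejected before any SQL referencing `ref` is generated.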

web/app/lib/models/Entry.ts (1)

2-2: Entry.name now nullable — audit callsites

Type change is reasonable for “unknown” rows. Please verify all consumers that call string methods on name handle null (e.g., toLowerCase, localeCompare, building URLs). I can help run a repo-wide scan.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7166e7d and 80370f2.

📒 Files selected for processing (6)
  • backend/blog-posts (1 hunks)
  • web/app/pages/Project/View/Panels.tsx (4 hunks)
  • web/app/pages/Project/View/ViewProject.helpers.tsx (2 hunks)
  • web/app/pages/Project/View/components/RefRow.tsx (2 hunks)
  • web/app/referrers.map.json (1 hunks)
  • web/app/utils/referrers.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • backend/blog-posts
🚧 Files skipped from review as they are similar to previous changes (2)
  • web/app/pages/Project/View/components/RefRow.tsx
  • web/app/pages/Project/View/ViewProject.helpers.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
web/app/utils/referrers.ts (2)

38-48: Wildcard patterns still compared literally.

This is the same issue raised in the previous review: patterns like *.wikipedia.org are compared as literal strings and will never match actual hostnames. The pattern normalization suggested in the earlier review comment (stripping *. prefix before comparison) still needs to be applied.


50-64: Non-http(s) schemes still filtered before hostname extraction.

This is the same issue raised in the previous review: the function returns null for non-http(s) schemes (lines 54-55) before attempting hostname extraction, preventing android-app:// URLs from being grouped even though the referrer map supposedly includes Android intent patterns.

🧹 Nitpick comments (1)
web/app/utils/referrers.ts (1)

11-34: Consider stricter IPv4 validation (optional).

The IPv4 pattern /^\d{1,3}(\.\d{1,3}){3}$/ accepts invalid addresses like 999.999.999.999. While unlikely to appear in referrer data, a stricter check would prevent false positives.

Example stricter pattern:

const ipv4Pattern = /^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}$/
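If the long regex feels hard to maintain, an equivalent check can be written by splitting on dots and range-checking each octet. This is a sketch only; `isValidIPv4` is a hypothetical helper, not part of referrers.ts. Like the stricter pattern above, it rejects octets above 255 and octets with leading zeros.

```typescript
// Hypothetical helper: true only for dotted-quad IPv4 addresses with octets 0-255.
const isValidIPv4 = (host: string): boolean => {
  const parts = host.split('.')
  if (parts.length !== 4) return false
  return parts.every((part) => {
    // One to three digits, nothing else.
    if (!/^\d{1,3}$/.test(part)) return false
    // Reject leading zeros such as "01", matching the stricter regex above.
    if (part.length > 1 && part[0] === '0') return false
    return Number(part) <= 255
  })
}
```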
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52b7b8a and 02c1d32.

📒 Files selected for processing (1)
  • web/app/utils/referrers.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)
🔇 Additional comments (3)
web/app/utils/referrers.ts (3)

1-9: LGTM: Clean type definitions and JSON import.

The type definition and JSON import are straightforward and appropriate for the referrer mapping feature.


68-91: Grouping logic is sound, pending type fix.

The aggregation and sorting logic correctly:

  • Groups entries by canonical name or hostname
  • Handles null names explicitly
  • Sorts by descending count for UI display

However, the function's return type depends on the Entry type issue flagged above.
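For reference, the aggregation pattern described in this comment can be sketched as follows. The `Entry` shape here is an assumption (name plus count), `groupEntries` is a hypothetical stand-in for groupRefEntries, and the `canonicalize` callback replaces the real map lookup, which is not reproduced.

```typescript
// Assumed minimal shape; the real Entry model may carry more fields.
interface Entry {
  name: string | null
  count: number
}

// Hypothetical stand-in for groupRefEntries: sums counts per canonical group.
const groupEntries = (
  entries: Entry[],
  canonicalize: (name: string) => string | null,
): Entry[] => {
  const totals = new Map<string | null, number>()
  for (const { name, count } of entries) {
    // Null names stay in their own bucket; unmapped names keep their original value.
    const key = name === null ? null : canonicalize(name) ?? name
    totals.set(key, (totals.get(key) ?? 0) + count)
  }
  const grouped: Entry[] = []
  totals.forEach((count, name) => grouped.push({ name, count }))
  // Descending by count for display.
  return grouped.sort((a, b) => b.count - a.count)
}

// Illustrative data and mapping only.
const groupedSample = groupEntries(
  [
    { name: 'google.com', count: 5 },
    { name: 'www.google.com', count: 3 },
    { name: null, count: 2 },
    { name: 'example.org', count: 4 },
  ],
  (name) => (name === 'google.com' || name.endsWith('.google.com') ? 'Google' : null),
)
```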


50-50: ****

The function getCanonicalRefGroup is correctly left without an export keyword. It is a private helper function used only internally within groupRefEntries (line 76), which is the exported function in this module's public API. No other modules import getCanonicalRefGroup, confirming it is not part of the public API surface. The current code structure is appropriate.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
web/app/utils/referrers.ts (2)

39-47: Wildcard patterns never match

Patterns like *.wikipedia.org are still compared literally, so grouped referrers that rely on wildcards never resolve. Normalise each pattern (strip leading *./. and lower-case) before the equality/suffix check so wildcard entries actually hit.

-const matchByMap = (host: string): string | null => {
-  for (const { name, patterns } of REFERRER_MAP) {
-    for (const p of patterns) {
-      // Exact or subdomain match only
-      if (host === p || host.endsWith(`.${p}`)) {
+const matchByMap = (host: string): string | null => {
+  const normalizedHost = host.toLowerCase()
+  for (const { name, patterns } of REFERRER_MAP) {
+    for (const rawPattern of patterns) {
+      const pattern = rawPattern.toLowerCase().replace(/^\*\./, '').replace(/^\./, '')
+      if (!pattern) continue
+      if (normalizedHost === pattern || normalizedHost.endsWith(`.${pattern}`)) {
         return name
       }
     }
   }
   return null
 }

54-58: android-app referrals never group

We still bail out the moment we see a non-HTTP scheme, so intent URLs like android-app://com.reddit.frontpage/ never reach extractHostname or the map—yet the JSON includes those entries. Whitelist the supported schemes (e.g. android-app) instead of blanket returning null so these referrals can be grouped.

-  if (scheme && !/^https?$/i.test(scheme)) {
-    return null
-  }
+  if (scheme && !/^https?$/i.test(scheme)) {
+    if (!/^android-app$/i.test(scheme)) {
+      return null
+    }
+  }
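A self-contained way to see the intended behaviour is a small host extractor with a scheme allow-list. This is a sketch, not the code in referrers.ts: `hostFromReferrer` and `ALLOWED_SCHEMES` are hypothetical names, and the regex-based parsing is an assumption chosen to keep the example independent of how the real extractHostname works.

```typescript
// Assumed allow-list: http(s) plus the android-app scheme discussed above.
const ALLOWED_SCHEMES = new Set(['http', 'https', 'android-app'])

// Hypothetical helper: returns the lower-cased host for allowed schemes, else null.
const hostFromReferrer = (ref: string): string | null => {
  const match = /^([a-z][a-z0-9+.-]*):\/\/([^/?#]+)/i.exec(ref.trim())
  if (!match) return null
  const scheme = match[1].toLowerCase()
  if (!ALLOWED_SCHEMES.has(scheme)) return null
  // Drop userinfo and a trailing port, keeping only the host part.
  const authority = match[2]
  const host = authority.slice(authority.lastIndexOf('@') + 1).replace(/:\d+$/, '')
  return host.toLowerCase() || null
}
```

With an allow-list like this, `android-app://com.reddit.frontpage/` yields `com.reddit.frontpage`, which can then be looked up in the referrer map like any other host.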
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02c1d32 and b540234.

📒 Files selected for processing (7)
  • web/app/lib/models/Entry.ts (1 hunks)
  • web/app/pages/Project/View/Panels.tsx (12 hunks)
  • web/app/pages/Project/View/ViewProject.tsx (10 hunks)
  • web/app/pages/Project/View/components/RefRow.tsx (2 hunks)
  • web/app/referrers.map.json (1 hunks)
  • web/app/utils/generic.ts (1 hunks)
  • web/app/utils/referrers.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • web/app/pages/Project/View/components/RefRow.tsx
  • web/app/referrers.map.json
🧰 Additional context used
🧬 Code graph analysis (2)
web/app/pages/Project/View/ViewProject.tsx (3)
web/app/utils/generic.ts (1)
  • getLocaleDisplayName (134-146)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)
web/app/utils/referrers.ts (1)
  • groupRefEntries (68-84)
web/app/utils/referrers.ts (1)
web/app/lib/models/Entry.ts (1)
  • Entry (1-6)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (1)

8-27: Previously flagged: Runtime FS dependency needs configuration.

A past review already identified that reading from web/app/referrers.map.json at runtime will fail in separate deployments. The recommended fix is to add an environment variable override and copy the JSON into the backend image during build.
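The override could be as small as prepending one configured path to the existing candidates. The sketch below is an assumption-heavy illustration: `REFERRERS_MAP_PATH` is a hypothetical variable name, `candidateMapPaths` is a hypothetical helper, and the environment is passed in as a plain object so the example stays free of Node-specific imports. The two default paths are the ones resolved in the current loader.

```typescript
// Hypothetical helper: returns the paths to try, in order, when loading the referrer map.
const candidateMapPaths = (
  env: Record<string, string | undefined>,
  baseDir: string,
): string[] => {
  const defaults = [
    `${baseDir}/../../../../../web/app/referrers.map.json`,
    `${baseDir}/../../../../web/app/referrers.map.json`,
  ]
  // REFERRERS_MAP_PATH is an assumed name, not an existing setting.
  const override = env.REFERRERS_MAP_PATH?.trim()
  return override ? [override, ...defaults] : defaults
}
```

In a deployment where the JSON is copied into the backend image, the variable would point at the copied file and the relative fallbacks would only matter in local development.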

🧹 Nitpick comments (6)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (2)

18-20: Add JSON schema validation after parsing.

The parsed JSON is cast to ReferrerJson[] without validation. If the file structure is incorrect, this could cause runtime errors downstream when accessing name or patterns properties.

Consider adding a validation function:

+const isValidReferrerJson = (data: unknown): data is ReferrerJson[] => {
+  return Array.isArray(data) && data.every(
+    item => typeof item === 'object' && 
+            item !== null &&
+            typeof item.name === 'string' && 
+            Array.isArray(item.patterns) &&
+            item.patterns.every((p: unknown) => typeof p === 'string')
+  )
+}
+
 const loadMap = (): ReferrerJson[] => {
   if (cachedMap) return cachedMap
   const candidates = [
     path.resolve(__dirname, '../../../../../web/app/referrers.map.json'),
     path.resolve(__dirname, '../../../../web/app/referrers.map.json'),
   ]
   for (const p of candidates) {
     try {
       const raw = fs.readFileSync(p, 'utf8')
-      cachedMap = JSON.parse(raw) as ReferrerJson[]
+      const parsed = JSON.parse(raw)
+      if (!isValidReferrerJson(parsed)) {
+        console.error(`[ERROR] Invalid referrers map structure in ${p}`)
+        continue
+      }
+      cachedMap = parsed
       return cachedMap
     } catch {
       // try next location
     }
   }
   cachedMap = []
   return cachedMap
 }

21-23: Silent error swallowing hinders debugging.

File I/O errors are caught but not logged, making it difficult to diagnose why the map failed to load. At minimum, log when all candidates fail.

     } catch {
       // try next location
     }
   }
+  console.warn('[WARN] Failed to load referrers map from all candidate paths, using empty map')
   cachedMap = []
   return cachedMap
 }
backend/apps/community/src/analytics/analytics.service.ts (1)

1014-1016: Consider simplifying the nested OR condition for readability.

The domain matching logic is functionally correct but has deeply nested conditions that are hard to parse. Consider extracting the matching logic into a helper or restructuring for clarity.

-            parts.push(
-              `(domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) OR (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String}))))`,
-            )
+            // Match domain or subdomain (handles both extracted domain and bare ref)
+            parts.push(
+              `(
+                (domain(ref) != '' AND (lower(domain(ref)) = {${dp}:String} OR endsWith(lower(domain(ref)), concat('.', {${dp}:String})))) 
+                OR 
+                (domain(ref) = '' AND (lower(ref) = {${dp}:String} OR endsWith(lower(ref), concat('.', {${dp}:String}))))
+              )`
+            )

Alternatively, consider a ClickHouse function that encapsulates the logic if this pattern repeats elsewhere.

backend/apps/community/src/analytics/utils/referrers.map.ts (3)

8-27: Code duplication with cloud version – consider shared utility.

This module is nearly identical to backend/apps/cloud/src/analytics/utils/referrers.map.ts. The only difference is the file location, but the paths resolved are actually the same. Consider extracting this into a shared utility module to maintain DRY principles and reduce maintenance overhead.

Create a shared module at backend/libs/common/src/utils/referrers.map.ts and import it in both cloud and community services. This ensures consistent behavior and simplifies future updates.


18-20: Add JSON schema validation after parsing.

Same issue as in the cloud version: parsed JSON lacks validation. If the file structure is incorrect, downstream code will fail when accessing properties.

Apply the same validation pattern suggested for the cloud version (see review comment on backend/apps/cloud/src/analytics/utils/referrers.map.ts lines 18-20).


21-23: Silent error swallowing hinders debugging.

Same issue as the cloud version: errors are caught but not logged. Add a warning when all candidate paths fail.

     } catch {
       // try next location
     }
   }
+  console.warn('[WARN] Failed to load referrers map from all candidate paths, using empty map')
   cachedMap = []
   return cachedMap
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b540234 and c8f253f.

📒 Files selected for processing (4)
  • backend/apps/cloud/src/analytics/analytics.service.ts (3 hunks)
  • backend/apps/cloud/src/analytics/utils/referrers.map.ts (1 hunks)
  • backend/apps/community/src/analytics/analytics.service.ts (5 hunks)
  • backend/apps/community/src/analytics/utils/referrers.map.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/apps/cloud/src/analytics/analytics.service.ts
🧰 Additional context used
🧬 Code graph analysis (2)
backend/apps/cloud/src/analytics/utils/referrers.map.ts (1)
backend/apps/community/src/analytics/utils/referrers.map.ts (1)
  • getDomainsForRefName (29-34)
backend/apps/community/src/analytics/analytics.service.ts (1)
backend/apps/community/src/analytics/utils/referrers.map.ts (1)
  • getDomainsForRefName (29-34)
🔇 Additional comments (4)
backend/apps/community/src/analytics/analytics.service.ts (3)

54-54: LGTM!

Import correctly references the new referrers.map utility module.


931-931: LGTM!

Correctly removes the exclusion of 'refn' from the filter validation, allowing it as a valid filter column alongside other analytics columns.


995-1021: The fallback logic for empty domain(ref) is correct—no fixes needed.

After examining the code and referrer mapping logic, the refn filter implementation is sound:

  1. Pattern generation: getDomainsForRefName() returns known referrer patterns from referrers.map.json, or falls back to the filter value itself. Patterns are guaranteed to be non-empty strings from known sources.

  2. Fallback logic is intentional: When domain(ref) = '', the code checks if the ref string itself matches the pattern. This is the correct behavior for malformed or unparseable URLs—treating the raw ref as a bare domain string is appropriate.

  3. No false positive risk: The conditions use strict equality and suffix matching, so even empty refs are handled correctly (they would only match if the pattern is also empty, which is not a realistic scenario given the referrer mapping).

The logic correctly delegates to ClickHouse's domain() function behavior (returns empty string for malformed URLs) and handles the fallback appropriately.
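The branch structure analysed above can be mirrored in TypeScript to reason about expected matches before writing integration tests. This is only an approximation: `approxDomain` is a rough stand-in and does not reproduce ClickHouse's actual domain() parsing rules, and `matchesDomainPattern` is a hypothetical name.

```typescript
// Rough stand-in for domain(): host of a URL that has a scheme, otherwise ''.
const approxDomain = (ref: string): string => {
  const m = /^[a-z][a-z0-9+.-]*:\/\/([^/?#:]+)/i.exec(ref)
  return m ? m[1] : ''
}

// Mirrors the clause: use the extracted domain when present, else fall back to the raw ref.
const matchesDomainPattern = (ref: string, pattern: string): boolean => {
  const p = pattern.toLowerCase()
  const d = approxDomain(ref).toLowerCase()
  const subject = d !== '' ? d : ref.toLowerCase()
  // Exact match or subdomain match; the leading dot prevents "notgoogle.com" from matching.
  return subject === p || subject.endsWith(`.${p}`)
}
```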

backend/apps/community/src/analytics/utils/referrers.map.ts (1)

29-34: LGTM!

The getDomainsForRefName function correctly performs case-insensitive lookup and returns null when no match is found, providing a clear API contract.
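The contract described in this comment can be sketched as below. It follows the null-on-miss behaviour stated here; the walkthrough summary lists the return type as beginning with `string[]`, so the real signature may differ. `findPatternsByName` and `SAMPLE_MAP` are hypothetical, and the sample entries are illustrative rather than copied from referrers.map.json.

```typescript
interface ReferrerJson {
  name: string
  patterns: string[]
}

// Illustrative data only.
const SAMPLE_MAP: ReferrerJson[] = [
  { name: 'Google', patterns: ['google.com'] },
  { name: 'Reddit', patterns: ['reddit.com', 'android-app://com.reddit.frontpage'] },
]

// Hypothetical lookup: case-insensitive match on the group name, null when unknown.
const findPatternsByName = (map: ReferrerJson[], name: string): string[] | null => {
  const wanted = name.trim().toLowerCase()
  const hit = map.find((entry) => entry.name.toLowerCase() === wanted)
  return hit ? hit.patterns : null
}
```

A caller that wants to fall back to the raw filter value can do so explicitly, for example `findPatternsByName(map, value) ?? [value]`.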

@Blaumaus Blaumaus merged commit fbe0b9c into main Nov 8, 2025
7 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Nov 30, 2025
6 tasks
@coderabbitai coderabbitai bot mentioned this pull request Dec 10, 2025
6 tasks