Skip to content

Comments

Added materialized view and duplicated v2 Tinybird endpoints#37

Open
tomerqodo wants to merge 13 commits intocoderabbit_combined_20260121_augment_sentry_coderabbit_1_base_added_materialized_view_and_duplicated_v2_tinybird_endpoints_pr232from
coderabbit_combined_20260121_augment_sentry_coderabbit_1_head_added_materialized_view_and_duplicated_v2_tinybird_endpoints_pr232
Open

Added materialized view and duplicated v2 Tinybird endpoints#37
tomerqodo wants to merge 13 commits intocoderabbit_combined_20260121_augment_sentry_coderabbit_1_base_added_materialized_view_and_duplicated_v2_tinybird_endpoints_pr232from
coderabbit_combined_20260121_augment_sentry_coderabbit_1_head_added_materialized_view_and_duplicated_v2_tinybird_endpoints_pr232

Conversation

@tomerqodo
Copy link

@tomerqodo tomerqodo commented Jan 21, 2026

Benchmark PR from qodo-benchmark#232

Summary by CodeRabbit

Release Notes

  • New Features

    • Added new versioned analytics endpoints (v2) for improved performance tracking, including active visitors, KPIs, device analytics, traffic sources, pages, locations, and UTM campaign tracking.
  • Documentation

    • Updated configuration documentation to support optional version specification for analytics endpoints.
  • Refactor

    • Improved endpoint versioning to use configuration-based approach instead of per-call overrides.
  • Tests

    • Added comprehensive test fixtures for new analytics endpoints.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 21, 2026

Walkthrough

This pull request introduces v2 versions of Tinybird analytics endpoints with infrastructure supporting config-driven versioning, adds new session-data aggregation pipes, removes per-call version overrides, and includes comprehensive test fixtures for all new endpoints.

Changes

Cohort / File(s) Summary
Framework & Config Updates
apps/admin-x-framework/src/providers/framework-provider.tsx, apps/admin-x-framework/src/utils/stats-config.ts
Added optional version property to StatsConfig interface; refactored getStatEndpointUrl to construct endpoint paths using config.version as prefix (e.g., api_kpis_v2) instead of per-call overrides
Core API Endpoint
ghost/core/core/server/api/endpoints/stats.js
Removed tb_version from topContent endpoint options; reduces accepted query surface
Tinybird Data Source
ghost/core/core/server/data/tinybird/datasources/_mv_session_data_v2.datasource
New materialized view data source with AggregatingMergeTree engine; defines aggregated session-level schema with pageviews and attribution fields
Tinybird Infrastructure Pipes
ghost/core/core/server/data/tinybird/pipes/mv_session_data_v2.pipe, ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe
New aggregation and filtering pipes; mv_session_data_v2 transforms raw hits into session aggregates; filtered_sessions_v2 provides hit-level and session-level filtering with date/timezone handling
Tinybird v2 Endpoint Pipes
ghost/core/core/server/data/tinybird/endpoints/api_active_visitors_v2.pipe, api_kpis_v2.pipe, api_post_visitor_counts_v2.pipe, api_top_devices_v2.pipe, api_top_locations_v2.pipe, api_top_pages_v2.pipe, api_top_sources_v2.pipe, api_top_utm_campaigns_v2.pipe, api_top_utm_contents_v2.pipe, api_top_utm_mediums_v2.pipe, api_top_utm_sources_v2.pipe, api_top_utm_terms_v2.pipe
Twelve new analytics endpoint definitions using v2 session data and filtering infrastructure; support optional filters (member_status, location, pathname, post_uuid, UTM params) with pagination
Tinybird Test Fixtures
ghost/core/core/server/data/tinybird/tests/api_*.yaml (14 files)
Comprehensive YAML test datasets covering date ranges, filters, timezones, and combined filter scenarios for validation of all v2 endpoints
Tinybird Documentation
ghost/core/core/server/data/tinybird/README.md, ghost/core/core/server/data/tinybird/endpoints/README.md
Documentation updates reflecting version field in config and new Merge-based aggregation approach for session-level attributes
Service Layer Updates
ghost/core/core/server/services/stats/ContentStatsService.js, ghost/core/core/server/services/stats/utils/tinybird.js, ghost/core/core/server/services/tinybird/TinybirdService.js
Removed tbVersion per-call override handling; refactored URL construction to use config.version; expanded pipe whitelist for v2 endpoints
Service Tests
ghost/core/test/unit/server/services/stats/utils/tinybird.test.js
Updated test to expect version sourcing from config instead of tbVersion option; removed local endpoint version override test

Sequence Diagram

sequenceDiagram
    participant Client
    participant StatsService
    participant Config as StatsConfig
    participant TinybirdUtil as tinybird.js
    participant TinybirdAPI as Tinybird Backend

    Client->>StatsService: request analytics data
    StatsService->>Config: read version setting
    Config-->>StatsService: return version (e.g., "v2")
    StatsService->>TinybirdUtil: buildRequest(pipeName, options)
    TinybirdUtil->>TinybirdUtil: check Config.version
    alt version present
        TinybirdUtil->>TinybirdUtil: construct URL: /v0/pipes/{pipeName}_{version}.json
    else version absent
        TinybirdUtil->>TinybirdUtil: construct URL: /v0/pipes/{pipeName}.json
    end
    TinybirdUtil-->>TinybirdUtil: filter options (remove tbVersion)
    TinybirdUtil->>TinybirdAPI: POST URL with query params
    TinybirdAPI-->>TinybirdUtil: analytics result
    TinybirdUtil-->>StatsService: formatted response
    StatsService-->>Client: return data
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hopping through v2 with joy and delight,
New sessions aggregated, filters burning bright,
Config-driven versioning, no more per-call fuss,
Twelve endpoints ready—data flowing for us!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: introducing materialized views (_mv_session_data_v2) and creating v2 versions of multiple Tinybird endpoints and pipes. The title is specific and directly reflects the primary additions in the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ghost/core/core/server/services/stats/ContentStatsService.js (1)

75-83: Source filter dropped from Tinybird options.

options.source is still documented but no longer forwarded, so source-based filtering is ignored. Re-add the mapping or remove the option from the API contract and callers.

🔧 Proposed fix
         // Only add location if defined
         if (options.location) {
             tinybirdOptions.location = options.location;
         }
+
+        // Only add source if defined
+        if (options.source) {
+            tinybirdOptions.source = options.source;
+        }
🤖 Fix all issues with AI agents
In `@apps/admin-x-framework/src/providers/framework-provider.tsx`:
- Line 90: The FrameworkContext.Provider is passing children via the children
prop which trips react/no-children-prop; change the JSX to render
FrameworkContext.Provider with nested children (i.e., <FrameworkContext.Provider
value={props}> {children} </FrameworkContext.Provider>) instead of using the
children prop; update the file's FrameworkContext.Provider usage to remove the
children prop and place the children between the opening and closing Provider
tags so the context value remains value={props}.

In `@apps/admin-x-framework/src/utils/stats-config.ts`:
- Around line 17-20: When building the endpoint string in stats-config (using
config.version and endpointName), guard against endpointName being undefined so
you don't produce names like `${config.version}_undefined`; update the logic
that computes finalEndpointName to check endpointName (the variable used in the
template) and either default it to a safe value (e.g., '') or throw/return an
error—e.g., compute finalEndpointName from endpointName ?
`${config.version}_${endpointName}` : endpointName and ensure the returned URL
uses that sanitized finalEndpointName.

In `@ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe`:
- Around line 104-109: The SELECT uses invalid ClickHouse syntax uniq(distinct
s.session_id); replace that expression in the SELECT list so visits is computed
correctly by using either uniqExact(s.session_id) or COUNT(DISTINCT
s.session_id) (keep the alias visits), i.e. update the select clause where
uniq(distinct s.session_id) appears in the api_kpis_v2.pipe to one of those
valid ClickHouse functions.
- Line 8: The _single_day assignment calls day_diff(date_from, date_to) without
ensuring date_to is defined; update the condition that sets _single_day to first
check both date_from and date_to are defined (e.g., defined(date_from) and
defined(date_to)) before calling day_diff(date_from, date_to), and apply the
same guard fix to the other occurrences of this pattern in this file (the lines
using _single_day / day_diff).

In `@ghost/core/core/server/data/tinybird/endpoints/api_top_devices_v2.pipe`:
- Around line 18-26: The ORDER BY clause in the query (select device, count() as
visits from session_data sd inner join filtered_sessions_v2 fs on fs.session_id
= sd.session_id group by device order by visits desc limit ...) lacks a stable
secondary key, causing non-deterministic pagination when visits tie; update the
ORDER BY to include device as a tie-breaker (e.g., order by visits desc, device
asc) so rows with equal visits are deterministically ordered for OFFSET/LIMIT
pagination.

In `@ghost/core/core/server/data/tinybird/endpoints/api_top_locations_v2.pipe`:
- Around line 24-26: The endpoint template is missing a device filter so tests
expecting device=desktop will fail; add a conditional block similar to the
existing ones for location/pathname/post_uuid: detect the template variable
"device" (e.g. {% if defined(device) %}) and add a filter clause that binds the
"device" parameter using the same String(...) pattern (matching description and
required=False) so the endpoint filters rows by device when provided.

In `@ghost/core/core/server/data/tinybird/endpoints/api_top_sources_v2.pipe`:
- Around line 18-26: The ORDER BY in the query currently only sorts by visits
(count) which can yield non-deterministic pagination when counts tie; modify the
ORDER BY clause in the API query (the select grouping that produces visits and
source) to add a secondary sort key "source asc" after "visits desc" so ties are
consistently broken by source, ensuring stable pagination across requests.

In `@ghost/core/core/server/data/tinybird/endpoints/api_top_utm_mediums_v2.pipe`:
- Around line 18-28: The ORDER BY clause currently sorts only by visits which
allows non-deterministic ordering for ties; update the query's ORDER BY to
include a stable secondary key (for example utm_medium ASC) so rows with equal
count() (visits) are consistently ordered across pages—modify the ORDER BY
visits desc to ORDER BY visits DESC, utm_medium ASC (referencing utm_medium,
visits, session_data sd and filtered_sessions_v2 fs in the pipe).

In `@ghost/core/core/server/data/tinybird/endpoints/README.md`:
- Line 10: The README incorrectly conflates v1 and v2 datasources and the
ClickHouse function names: update the text around _mv_session_data to explicitly
state it refers to the v1 datasource and uses argMin() (not argMinState()), and
add a short note that v2 endpoints use _mv_session_data_v2 and use argMinState()
aggregates; ensure both symbols _mv_session_data, _mv_session_data_v2 and
functions argMin, argMinState are mentioned so readers know which view and
function apply to v1 vs v2.
🧹 Nitpick comments (5)
ghost/core/core/server/data/tinybird/tests/api_active_visitors_v2.yaml (1)

2-13: Deduplicate or differentiate the first two test cases.
default_site_uuid and active_visitors_default are identical, so one is redundant unless you intend to validate a distinct default-parameter path.

♻️ Suggested cleanup (remove the duplicate test)
-- name: default_site_uuid
-  description: Test active visitors with default site UUID
-  parameters: site_uuid=mock_site_uuid
-  expected_result: |
-    {"active_visitors":16}
-
ghost/core/core/server/data/tinybird/tests/api_top_devices_v2.yaml (1)

1-62: Good test coverage for filter scenarios.

The test suite covers essential filter combinations (device, location, member_status, pathname, source, and combined filters). Consider adding test cases for:

  1. Pagination: The endpoint supports skip and limit parameters, but no tests verify this behavior.
  2. Edge cases: Empty results scenario, boundary dates.
Example pagination test case
- name: Pagination - skip and limit
  description: Test pagination with skip=1 and limit=1
  parameters: site_uuid=mock_site_uuid&date_from=2100-01-01&date_to=2100-01-07&timezone=Etc/UTC&skip=1&limit=1
  expected_result: |
    {"device":"bot","visits":1}
ghost/core/core/server/data/tinybird/tests/api_top_utm_campaigns_v2.yaml (1)

49-62: Consider adding member_status=free test case.

The test suite includes member_status=paid (line 49-54) and member_status=undefined (line 56-62), but lacks coverage for member_status=free. For completeness with the devices v2 tests which include this case, consider adding a free member status test.

ghost/core/core/server/data/tinybird/tests/api_top_sources_v2.yaml (1)

1-131: Test coverage looks comprehensive.

The test fixture covers a good range of filter combinations including location, pathname, post_uuid, source, member_status, timezone, and UTM parameters. The expected results are well-structured.

Consider adding test cases for:

  • Pagination (skip and limit parameters) to verify the endpoint handles pagination correctly
  • Edge cases such as empty result sets or boundary conditions
ghost/core/core/server/data/tinybird/tests/api_top_pages_v2.yaml (1)

1-135: Comprehensive test coverage for the top pages endpoint.

The test fixture covers the full range of filters including post_type and device filters specific to this endpoint. The description on line 129 helpfully notes that the desktop filter excludes bot sessions.

As with the sources tests, consider adding pagination test cases (skip/limit) to ensure the endpoint handles offset-based pagination correctly.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94c132b and 4bc9eeb.

📒 Files selected for processing (36)
  • apps/admin-x-framework/src/providers/framework-provider.tsx
  • apps/admin-x-framework/src/utils/stats-config.ts
  • ghost/core/core/server/api/endpoints/stats.js
  • ghost/core/core/server/data/tinybird/README.md
  • ghost/core/core/server/data/tinybird/datasources/_mv_session_data_v2.datasource
  • ghost/core/core/server/data/tinybird/endpoints/README.md
  • ghost/core/core/server/data/tinybird/endpoints/api_active_visitors_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_post_visitor_counts_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_devices_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_locations_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_pages_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_sources_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_utm_campaigns_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_utm_contents_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_utm_mediums_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_utm_sources_v2.pipe
  • ghost/core/core/server/data/tinybird/endpoints/api_top_utm_terms_v2.pipe
  • ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe
  • ghost/core/core/server/data/tinybird/pipes/mv_session_data_v2.pipe
  • ghost/core/core/server/data/tinybird/tests/api_active_visitors_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_kpis_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_post_visitor_counts_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_devices_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_locations_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_pages_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_sources_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_utm_campaigns_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_utm_contents_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_utm_mediums_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_utm_sources_v2.yaml
  • ghost/core/core/server/data/tinybird/tests/api_top_utm_terms_v2.yaml
  • ghost/core/core/server/services/stats/ContentStatsService.js
  • ghost/core/core/server/services/stats/utils/tinybird.js
  • ghost/core/core/server/services/tinybird/TinybirdService.js
  • ghost/core/test/unit/server/services/stats/utils/tinybird.test.js
💤 Files with no reviewable changes (1)
  • ghost/core/core/server/api/endpoints/stats.js
🧰 Additional context used
🧬 Code graph analysis (2)
ghost/core/core/server/services/stats/ContentStatsService.js (1)
ghost/core/test/unit/server/services/stats/content.test.js (6)
  • options (64-67)
  • options (324-327)
  • options (352-355)
  • options (399-399)
  • options (458-463)
  • options (494-507)
ghost/core/core/server/services/stats/utils/tinybird.js (1)
ghost/core/core/server/services/public-config/config.js (1)
  • statsConfig (33-33)
🪛 Biome (2.1.2)
apps/admin-x-framework/src/providers/framework-provider.tsx

[error] 90-90: Avoid passing children using a prop

The canonical way to pass children in React is to use JSX elements

(lint/correctness/noChildrenProp)

🔇 Additional comments (32)
ghost/core/core/server/data/tinybird/tests/api_post_visitor_counts_v2.yaml (1)

1-19: Good test coverage for the v2 endpoint.

The test file covers the essential scenarios: single post, multiple posts (validating descending order by visits), and the no-data edge case. The expected results correctly align with the endpoint's ORDER BY visits desc behavior.

Minor: Line 1 has an unnecessary leading blank line that could be removed for cleanliness.

ghost/core/core/server/data/tinybird/endpoints/api_post_visitor_counts_v2.pipe (2)

1-2: Token definitions follow established patterns.

The stats_page and axis read tokens are consistent with other v2 endpoints in this PR.


4-15: Well-structured endpoint implementation.

The SQL logic correctly counts unique sessions per post using uniqExact(session_id), which provides exact counts. The required parameters and ordering are appropriate. The _mv_hits materialized view contains all required columns (site_uuid, post_uuid, session_id).

ghost/core/core/server/data/tinybird/endpoints/api_active_visitors_v2.pipe (1)

1-15: LGTM — clean, minimal endpoint definition.

ghost/core/core/server/services/tinybird/TinybirdService.js (1)

59-72: The v2 pipes are properly whitelisted—no issues found.

The added v2 endpoints match their v1 counterparts correctly. Verification confirms that while two additional pipes exist in the codebase (api_monitoring_ingestion and api_monitoring_ingestion_aggregated), they are not referenced in server code and appear to be internal monitoring pipes intentionally excluded from the JWT scope whitelist.

ghost/core/core/server/data/tinybird/README.md (1)

81-83: Doc update for versioned endpoint config looks good.

Clear explanation of the optional version suffix.

apps/admin-x-framework/src/providers/framework-provider.tsx (1)

13-13: Optional stats version field is fine.

No concerns with making this config optional.

ghost/core/core/server/data/tinybird/tests/api_top_utm_terms_v2.yaml (1)

2-82: Good coverage across filter combinations.

Nice set of cases (location/pathname/source/member status/timezone/combined).

ghost/core/core/server/data/tinybird/tests/api_top_utm_mediums_v2.yaml (1)

1-79: LGTM!

The test fixture provides comprehensive coverage for UTM mediums filtering scenarios. The structure is consistent with other v2 test fixtures in this PR.

ghost/core/core/server/data/tinybird/tests/api_top_utm_contents_v2.yaml (1)

1-81: LGTM!

The test fixture is well-structured and consistent with other v2 UTM test fixtures. The variety of utm_content values provides good coverage for realistic scenarios.

ghost/core/test/unit/server/services/stats/utils/tinybird.test.js (2)

73-87: LGTM - Config-driven versioning is cleaner.

The shift from per-call tbVersion to config-based version is a good design decision. It centralizes version control and reduces API surface area.


102-118: The test correctly reflects current behavior, but local.token in config is unused.

The implementation unconditionally uses tinybirdService.getToken() for authorization (line 85 of tinybird.js) regardless of local mode. Local mode only switches the endpoint (line 31). The local.token field in the config (line 110) is never used—this appears to be an incomplete feature implementation. The test is correct for the current code, but consider either: (1) implementing local token support if intended, or (2) removing local.token from the config structure to avoid confusion. The comment "temporary until we have a proper way to use mock data locally" (tinybird.js, lines 26-28) suggests this feature may still be in progress.

ghost/core/core/server/data/tinybird/endpoints/api_top_utm_campaigns_v2.pipe (1)

21-23: No action required—tenant isolation is already enforced.

The join on session_id alone is safe because both session_data and filtered_sessions_v2 are pre-filtered by site_uuid in their WHERE clauses before any joins occur. Each pipe returns only records for the specified tenant, making cross-tenant collisions impossible. Adding site_uuid to the join condition would be redundant.

Likely an incorrect or invalid review comment.

ghost/core/core/server/data/tinybird/endpoints/api_top_utm_contents_v2.pipe (2)

4-13: LGTM on the session_data aggregation logic.

The use of argMinMerge(utm_content) correctly retrieves the first UTM content value per session from the materialized view state.


21-23: No action required—data isolation is maintained through parameter passing.

The join between session_data and filtered_sessions_v2 on session_id alone is safe. Both nodes receive the same site_uuid parameter, and each filters to that tenant before the join occurs:

  • session_data explicitly filters by site_uuid in its WHERE clause
  • filtered_sessions_v2 is designed to accept the site_uuid parameter and returns only session_ids belonging to that site

This parameter-level tenant isolation is sufficient and is the consistent pattern across all v2 UTM endpoint pipes (campaigns, contents, mediums, sources, terms).

ghost/core/core/server/data/tinybird/endpoints/api_top_locations_v2.pipe (2)

11-13: Consider adding site_uuid to the join condition.

Similar to the UTM contents endpoint, the join with filtered_sessions_v2 only matches on session_id. Adding site_uuid to the join would provide explicit tenant isolation:

         from _mv_hits h
         inner join filtered_sessions_v2 fs
-            on fs.session_id = h.session_id
+            on fs.session_id = h.session_id and fs.site_uuid = h.site_uuid

16-23: Member status filtering logic is well-implemented.

The array expansion with conditional inclusion of 'comped' when 'paid' is selected correctly implements the business rule that comped members should be included in paid member filters.

ghost/core/core/server/data/tinybird/pipes/mv_session_data_v2.pipe (1)

3-22: Well-designed materialized view for session data aggregation.

The use of argMinState(field, timestamp) correctly implements first-touch attribution by capturing the value of each field at the earliest timestamp within each session. This is an efficient pattern for pre-aggregating session-level data.

The countState(), minState(), and maxState() functions appropriately capture pageview counts and session time boundaries.

ghost/core/core/server/data/tinybird/tests/api_top_locations_v2.yaml (2)

78-86: Device filter test depends on endpoint implementation fix.

This test case expects the device=desktop filter to work. As noted in the endpoint review, the api_top_locations_v2.pipe is currently missing the device filter condition. This test will not produce the expected results until the endpoint is updated to support the device filter.


2-76: Test coverage for core filters is comprehensive.

The test cases cover the essential filter combinations: date range, location, pathname, post_uuid, source, member_status (including both 'paid' and 'undefined'), timezone, and combined filters. The expected results are well-documented.

ghost/core/core/server/services/stats/utils/tinybird.js (2)

35-42: LGTM! Clean config-driven versioning implementation.

The URL construction logic correctly appends the version suffix when configured, producing clean URLs like /v0/pipes/api_kpis_v2.json. The optional chaining on statsConfig?.version safely handles the case when version is not configured.


66-77: LGTM! Removal of tbVersion from options handling aligns with config-driven versioning.

The change correctly removes tbVersion from the exclusion list since per-call version overrides are no longer supported. No remaining callers in the codebase pass tbVersion in options.

ghost/core/core/server/data/tinybird/endpoints/api_top_pages_v2.pipe (3)

17-24: LGTM! Member status filter correctly handles paid/comped expansion.

The logic appropriately includes comped members when filtering for paid status, which is the expected business behavior.


28-34: LGTM! Post type filtering logic handles null values correctly.

The conditional properly distinguishes between posts and non-posts (including null values for pages/other content types).


12-14: The join is safe from cross-tenant data leakage. Both _mv_hits and filtered_sessions_v2 are already filtered by the same site_uuid parameter before the join occurs:

  • _mv_hits is filtered in the WHERE clause: site_uuid = {{String(site_uuid, ...)}}
  • filtered_sessions_v2 internally filters _mv_hits by the same site_uuid parameter in its first node

Since both join branches are tenant-scoped before the join, the session_id-only join condition is safe regardless of whether session IDs are globally unique across all tenants.

ghost/core/core/server/data/tinybird/tests/api_kpis_v2.yaml (1)

1-13: LGTM! Comprehensive test fixture coverage.

The test suite provides excellent coverage including date ranges, various filters (location, pathname, post_uuid, source, member_status), timezone handling, UTM parameters, and device filtering. The use of future dates (2100-01-*) ensures test isolation from production data.

ghost/core/core/server/data/tinybird/tests/api_top_utm_sources_v2.yaml (1)

1-14: LGTM! Good test coverage for UTM sources endpoint.

The fixtures appropriately test various filter combinations and verify the expected ordering by visits. The combined filter test case ensures proper AND logic in query construction.

ghost/core/core/server/data/tinybird/endpoints/api_top_utm_sources_v2.pipe (2)

21-23: Same concern: Consider adding site_uuid to the join condition.

Similar to api_top_pages_v2.pipe, the join between session_data and filtered_sessions_v2 is only on session_id. While session_data already filters by site_uuid, adding it to the join provides defense-in-depth against cross-tenant data issues.


4-13: LGTM! Correct use of argMinMerge for first-hit attribution.

The session_data node properly extracts the first utm_source value per session from the materialized view, which is the correct approach for session-based attribution.

ghost/core/core/server/data/tinybird/endpoints/api_top_utm_terms_v2.pipe (1)

1-30: LGTM! Consistent implementation with other UTM endpoints.

The structure correctly mirrors api_top_utm_sources_v2.pipe, with appropriate substitution of utm_term for the grouping and filtering. The same join pattern consideration mentioned for other endpoints applies here as well.

ghost/core/core/server/data/tinybird/datasources/_mv_session_data_v2.datasource (1)

1-16: LGTM — schema/engine look consistent.
No concerns with the AggregatingMergeTree layout or column definitions in this context.

ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe (1)

3-85: Looks good — clear separation of hit vs. session filters.
The split between hit-level and first-hit session attributes is well structured.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

<FrameworkContext.Provider value={props}>
{children}
</FrameworkContext.Provider>
<FrameworkContext.Provider children={children} value={props} />
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the file
find . -name "framework-provider.tsx" -type f

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 224


🏁 Script executed:

# Check file size to understand how to read it
wc -l apps/admin-x-framework/src/providers/framework-provider.tsx

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 146


🏁 Script executed:

# Examine the area around line 90
sed -n '80,100p' apps/admin-x-framework/src/providers/framework-provider.tsx

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 547


🏁 Script executed:

# Look for Biome configuration in the repository
find . -name "biome.json" -o -name "biome.jsonc" -o -name ".biomeignore" 2>/dev/null | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 96


🏁 Script executed:

# Examine the Biome configuration
cat biome.json

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1858


🏁 Script executed:

# Search for any Biome-related linting rules about children
rg "children" biome.json -A 2 -B 2

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 66


🏁 Script executed:

# Let's check the React rules in Biome documentation or configuration more broadly
rg "a11y|react|children" biome.json -i

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 99


🏁 Script executed:

# Search for any ESLint or lint configuration that might flag this
find . -name ".eslintrc*" -o -name "eslint.config.*" -o -name "package.json" | head -10

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 478


🏁 Script executed:

# Let's check the actual imports and React version to understand context
head -30 apps/admin-x-framework/src/providers/framework-provider.tsx

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 981


🏁 Script executed:

# Search for Biome rules related to children or JSX in the codebase
rg "children" apps/admin-x-framework/src/providers/ -B 2 -A 2

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 8166


🏁 Script executed:

# Check the ESLint configuration for the admin-x-framework
cat apps/admin-x-framework/.eslintrc.cjs

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1066


🏁 Script executed:

# Let's also check for React-specific linting rules in Biome
# Search for any references to react/no-children-prop or similar
rg "react" apps/admin-x-framework/.eslintrc.cjs -i

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 531


🏁 Script executed:

# Check if there are any TypeScript or React type definitions that might validate this
rg "FrameworkContext.Provider" apps/admin-x-framework/src/ -B 2 -A 2

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 609


Use nested children instead of passing via prop.

ESLint's react/no-children-prop rule (from the react/recommended configuration) flags this pattern. Pass children as nested JSX content instead of as a prop.

🔧 Proposed fix
-                <FrameworkContext.Provider children={children} value={props} />
+                <FrameworkContext.Provider value={props}>
+                    {children}
+                </FrameworkContext.Provider>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<FrameworkContext.Provider children={children} value={props} />
<FrameworkContext.Provider value={props}>
{children}
</FrameworkContext.Provider>
🧰 Tools
🪛 Biome (2.1.2)

[error] 90-90: Avoid passing children using a prop

The canonical way to pass children in React is to use JSX elements

(lint/correctness/noChildrenProp)

🤖 Prompt for AI Agents
In `@apps/admin-x-framework/src/providers/framework-provider.tsx` at line 90, The
FrameworkContext.Provider is passing children via the children prop which trips
react/no-children-prop; change the JSX to render FrameworkContext.Provider with
nested children (i.e., <FrameworkContext.Provider value={props}> {children}
</FrameworkContext.Provider>) instead of using the children prop; update the
file's FrameworkContext.Provider usage to remove the children prop and place the
children between the opening and closing Provider tags so the context value
remains value={props}.

Comment on lines +17 to +20
// Append version suffix if provided (e.g., "v2" -> "api_kpis_v2")
const finalEndpointName = config.version ? `${config.version}_${endpointName}` : endpointName;

return `${baseUrl}/v0/pipes/${finalEndpointName}.json?${params}`;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Guard against undefined endpoint names when version is set.

With config.version and a missing endpointName, the URL becomes v2_undefined. Consider a simple guard.

🔧 Proposed fix
-    // Append version suffix if provided (e.g., "v2" -> "api_kpis_v2")
-    const finalEndpointName = config.version ? `${config.version}_${endpointName}` : endpointName;
-
-    return `${baseUrl}/v0/pipes/${finalEndpointName}.json?${params}`;
+    if (!endpointName) {
+        return '';
+    }
+
+    // Append version suffix if provided (e.g., "v2" -> "api_kpis_v2")
+    const finalEndpointName = config.version ? `${config.version}_${endpointName}` : endpointName;
+
+    return `${baseUrl}/v0/pipes/${finalEndpointName}.json?${params}`;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Append version suffix if provided (e.g., "v2" -> "api_kpis_v2")
const finalEndpointName = config.version ? `${config.version}_${endpointName}` : endpointName;
return `${baseUrl}/v0/pipes/${finalEndpointName}.json?${params}`;
if (!endpointName) {
return '';
}
// Append version suffix if provided (e.g., "v2" -> "api_kpis_v2")
const finalEndpointName = config.version ? `${config.version}_${endpointName}` : endpointName;
return `${baseUrl}/v0/pipes/${finalEndpointName}.json?${params}`;
🤖 Prompt for AI Agents
In `@apps/admin-x-framework/src/utils/stats-config.ts` around lines 17 - 20, When
building the endpoint string in stats-config (using config.version and
endpointName), guard against endpointName being undefined so you don't produce
names like `${config.version}_undefined`; update the logic that computes
finalEndpointName to check endpointName (the variable used in the template) and
either default it to a safe value (e.g., '') or throw/return an error—e.g.,
compute finalEndpointName from endpointName ?
`${config.version}_${endpointName}` : endpointName and ensure the returned URL
uses that sanitized finalEndpointName.

SQL >

%
{% set _single_day = defined(date_from) and day_diff(date_from, date_to) == 0 %}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "api_kpis_v2.pipe" -o -name "*.pipe" | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 224


🏁 Script executed:

git ls-files | grep -E "(tinybird|pipe)" | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1267


🏁 Script executed:

git ls-files | grep "api_kpis_v2.pipe"

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 147


🏁 Script executed:

cat -n ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe | head -50

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 2196


🏁 Script executed:

cat -n ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 7570


🏁 Script executed:

# Search for day_diff usage in tinybird files to understand its behavior
rg "day_diff" --type-list | head -5

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 473


🏁 Script executed:

rg "day_diff" ghost/core/core/server/data/tinybird/ -A 2 -B 2

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 4660


Guard _single_day against undefined date_to.

Line 8: Since both date_from and date_to are optional parameters (required=False), calling day_diff(date_from, date_to) when date_to is undefined will fail. The condition should check that both are defined before invoking day_diff.

Suggested change
-        {% set _single_day = defined(date_from) and day_diff(date_from, date_to) == 0 %}
+        {% set _single_day = defined(date_from) and defined(date_to) and day_diff(date_from, date_to) == 0 %}

Note: The same issue appears in multiple locations in this file (lines 84, 124, 132).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% set _single_day = defined(date_from) and day_diff(date_from, date_to) == 0 %}
{% set _single_day = defined(date_from) and defined(date_to) and day_diff(date_from, date_to) == 0 %}
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe` at line 8,
The _single_day assignment calls day_diff(date_from, date_to) without ensuring
date_to is defined; update the condition that sets _single_day to first check
both date_from and date_to are defined (e.g., defined(date_from) and
defined(date_to)) before calling day_diff(date_from, date_to), and apply the
same guard fix to the other occurrences of this pattern in this file (the lines
using _single_day / day_diff).

Comment on lines +104 to +109
select
a.date,
uniq(distinct s.session_id) as visits,
sum(s.pageviews) as pageviews,
truncate(avg(s.is_bounce), 2) as bounce_rate,
truncate(avg(s.session_sec), 2) as avg_session_sec
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check if the file exists and examine the context
find . -name "api_kpis_v2.pipe" -type f

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 224


🌐 Web query:

ClickHouse uniq function syntax distinct parameter

💡 Result:

Short answer: uniq itself does not take a DISTINCT keyword parameter. Use uniq(x[, ...]) (or an alternative uniq* variant) to count distinct values; ClickHouse supports COUNT(DISTINCT ...) (which is implemented by a chosen uniq* function) and also aggregate-function DISTINCT combinator for other aggregates (e.g., sum(DISTINCT x)). See ClickHouse docs. [1][2][3]

Sources:

  • uniq function docs. [1]
  • Aggregate-function combinators (includes -Distinct / DISTINCT usage). [2]
  • count / COUNT(DISTINCT) behavior (maps to uniq* via count_distinct_implementation). [3]

🏁 Script executed:

# Also search for similar patterns in the codebase to understand how distinct counts are being done
rg "uniq|countDistinct" --type sql --max-count 20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 224


uniq(distinct ...) is invalid ClickHouse/Tinybird SQL syntax.

Line 106: Replace with uniqExact(s.session_id) or COUNT(DISTINCT s.session_id). ClickHouse's uniq() function does not accept DISTINCT as a parameter; use a uniq* variant or standard COUNT(DISTINCT) syntax to count distinct session IDs.

Suggested change
-        uniq(distinct s.session_id) as visits,
+        uniqExact(s.session_id) as visits,
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_kpis_v2.pipe` around lines
104 - 109, The SELECT uses invalid ClickHouse syntax uniq(distinct
s.session_id); replace that expression in the SELECT list so visits is computed
correctly by using either uniqExact(s.session_id) or COUNT(DISTINCT
s.session_id) (keep the alias visits), i.e. update the select clause where
uniq(distinct s.session_id) appears in the api_kpis_v2.pipe to one of those
valid ClickHouse functions.

Comment on lines +18 to +26
select
device,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by device
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "api_top_devices_v2.pipe" -type f

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 224


🌐 Web query:

Tinybird pagination ORDER BY stability offset limit deterministic ordering

💡 Result:

Short answer: use a stable, deterministic ORDER BY (include a unique tie‑breaker) or switch to keyset (cursor) pagination. OFFSET+LIMIT without a deterministic ORDER BY can return inconsistent pages.

Recommended options:

  • Deterministic OFFSET/LIMIT: ORDER BY your desired keys plus a unique column (e.g., id). Example:
    SELECT * FROM my_table
    ORDER BY created_at DESC, id DESC
    LIMIT 50 OFFSET 100.
    This guarantees stable ordering across pages. [1][2]
  • Better at scale: use keyset (cursor) pagination (WHERE (created_at, id) < (last_created_at, last_id)) and return an opaque next_cursor. This is more efficient and avoids OFFSET scan costs. [3]

References:

  • Tinybird docs: pagination with LIMIT/OFFSET and note to add ORDER BY for consistent pagination. [1]
  • Explanation and example of adding a modulus/secondary tie‑breaker and why tie‑breakers are needed for stable pagination. [2]
  • Best-practice recommendation for keyset pagination over offset pagination for stability and performance. [3]

🌐 Web query:

Tinybird .pipe SQL ORDER BY tie-breaker secondary key required

💡 Result:

Short answer: Yes — if you need deterministic ordering from a Tinybird .pipe (SELECT ... ORDER BY), include a unique secondary/tie-breaker key (e.g. a unique id or monotonic timestamp). Tinybird pipes run on ClickHouse semantics, and ClickHouse returns rows with identical ORDER BY values in arbitrary/non‑deterministic order unless you break ties with additional expressions. [1][2]

Recommendation: ORDER BY primary_col, tie_breaker_col (e.g. ORDER BY event_time DESC, id ASC) to guarantee stable results. [2]

Sources:

  • Tinybird Pipes / Query API docs. [1]
  • ClickHouse ORDER BY: rows with identical sort expressions are returned in arbitrary/non-deterministic order. [2]

Add tie-breaker to stabilize pagination ordering for deterministic results.

Line 25: When multiple devices have the same visit count, rows are returned in arbitrary order without a deterministic secondary sort key. This causes inconsistent pagination across pages when using OFFSET/LIMIT. Add device as a secondary ordering column to guarantee stable results.

Suggested change
-    order by visits desc
+    order by visits desc, device asc
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
select
device,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by device
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
select
device,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by device
order by visits desc, device asc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_top_devices_v2.pipe`
around lines 18 - 26, The ORDER BY clause in the query (select device, count()
as visits from session_data sd inner join filtered_sessions_v2 fs on
fs.session_id = sd.session_id group by device order by visits desc limit ...)
lacks a stable secondary key, causing non-deterministic pagination when visits
tie; update the ORDER BY to include device as a tie-breaker (e.g., order by
visits desc, device asc) so rows with equal visits are deterministically ordered
for OFFSET/LIMIT pagination.

Comment on lines +24 to +26
{% if defined(location) %} and location = {{ String(location, description="Location to filter on", required=False) }} {% end %}
{% if defined(pathname) %} and pathname = {{ String(pathname, description="Pathname to filter on", required=False) }} {% end %}
{% if defined(post_uuid) %} and post_uuid = {{ String(post_uuid, description="Post UUID to filter on", required=False) }} {% end %}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing device filter - test expects it but endpoint doesn't implement it.

The test file api_top_locations_v2.yaml includes a test case for filtering by device=desktop (lines 78-86), but this endpoint doesn't have a condition to filter by device. This will cause the test to fail or return unexpected results.

Add the device filter condition:

             {% if defined(post_uuid) %} and post_uuid = {{ String(post_uuid, description="Post UUID to filter on", required=False) }} {% end %}
+            {% if defined(device) %} and device = {{ String(device, description="Device type to filter on", required=False) }} {% end %}
         group by location
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% if defined(location) %} and location = {{ String(location, description="Location to filter on", required=False) }} {% end %}
{% if defined(pathname) %} and pathname = {{ String(pathname, description="Pathname to filter on", required=False) }} {% end %}
{% if defined(post_uuid) %} and post_uuid = {{ String(post_uuid, description="Post UUID to filter on", required=False) }} {% end %}
{% if defined(location) %} and location = {{ String(location, description="Location to filter on", required=False) }} {% end %}
{% if defined(pathname) %} and pathname = {{ String(pathname, description="Pathname to filter on", required=False) }} {% end %}
{% if defined(post_uuid) %} and post_uuid = {{ String(post_uuid, description="Post UUID to filter on", required=False) }} {% end %}
{% if defined(device) %} and device = {{ String(device, description="Device type to filter on", required=False) }} {% end %}
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_top_locations_v2.pipe`
around lines 24 - 26, The endpoint template is missing a device filter so tests
expecting device=desktop will fail; add a conditional block similar to the
existing ones for location/pathname/post_uuid: detect the template variable
"device" (e.g. {% if defined(device) %}) and add a filter clause that binds the
"device" parameter using the same String(...) pattern (matching description and
required=False) so the endpoint filters rows by device when provided.

Comment on lines +18 to +26
select
source,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by source
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the specific file
find . -name "api_top_sources_v2.pipe" -type f

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 156


🏁 Script executed:

# Look for similar top_* endpoints to check for consistency
find . -path "*/tinybird/endpoints/*" -name "*top_*.pipe" -type f

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1412


🏁 Script executed:

# Get the actual content of the file to verify the code snippet
cat -n ghost/core/core/server/data/tinybird/endpoints/api_top_sources_v2.pipe 2>/dev/null | head -40

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 913


🏁 Script executed:

# Search for order by patterns in tinybird endpoints to understand conventions
rg "order by" . --glob="*/tinybird/endpoints/*.pipe" -A 2 -B 2

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 66


🏁 Script executed:

# Let me examine several similar endpoint files to check for ordering patterns
for file in ./ghost/core/core/server/data/tinybird/endpoints/api_top_devices_v2.pipe ./ghost/core/core/server/data/tinybird/endpoints/api_top_pages_v2.pipe ./ghost/core/core/server/data/tinybird/endpoints/api_top_locations_v2.pipe; do
  echo "=== $file ==="
  cat -n "$file" | grep -A 5 "order by"
done

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 700


🏁 Script executed:

# Also check the older v1 versions for comparison
for file in ./ghost/core/core/server/data/tinybird/endpoints/api_top_sources.pipe ./ghost/core/core/server/data/tinybird/endpoints/api_top_devices.pipe; do
  echo "=== $file ==="
  cat -n "$file" | grep -A 5 "order by"
done

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 483


🏁 Script executed:

# Let's get a broader view - show all the query sections with order by
grep -r "order by" ./ghost/core/core/server/data/tinybird/endpoints/ --include="*.pipe" -B 3 -A 3

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 17817


Add secondary sort key to stabilize pagination ordering.
When multiple sources have equal visit counts, pagination can return different row orderings across successive requests. Add source asc as a tie-breaker to ensure deterministic results.

Suggested change
-    order by visits desc
+    order by visits desc, source asc
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
select
source,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by source
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
select
source,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
group by source
order by visits desc, source asc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_top_sources_v2.pipe`
around lines 18 - 26, The ORDER BY in the query currently only sorts by visits
(count) which can yield non-deterministic pagination when counts tie; modify the
ORDER BY clause in the API query (the select grouping that produces visits and
source) to add a secondary sort key "source asc" after "visits desc" so ties are
consistently broken by source, ensuring stable pagination across requests.

Comment on lines +18 to +28
select
utm_medium,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
where
utm_medium != ''
group by utm_medium
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*api_top_utm_mediums_v2.pipe" -o -name "*api_top_utm_mediums*"

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 377


🌐 Web query:

Tinybird pagination deterministic ordering stability ClickHouse

💡 Result:

Short answer: Use ORDER BY with a unique tie-breaker (or use cursor/seek pagination) — otherwise LIMIT/OFFSET can return non‑deterministic/stable pages on ClickHouse-backed Tinybird endpoints.

Why / how (concise):

  • Tinybird supports LIMIT/OFFSET pagination and explicitly recommends adding ORDER BY for consistent pagination. [1][2]
  • ClickHouse returns rows with identical ORDER BY values in arbitrary (non‑deterministic) order; so ORDER BY alone is not enough unless the ORDER BY expression(s) produce a unique ordering. Add a stable unique column (e.g., an immutable id or (timestamp,id) tuple) as the final tie‑breaker. [3][4]
  • Alternatives: use cursor/seek-style pagination (WHERE (key, id) > (last_key, last_id) or equivalent) to avoid OFFSET performance and stability issues, or use LIMIT ... WITH TIES deliberately if you want to include tied rows. [3][4]

Practical checklist:

  • Add an explicit ORDER BY that includes a unique column (id or (ts, id)). [3]
  • Prefer cursor/seek pagination for large offsets or strict stability. [3][4]
  • If using Tinybird endpoints, parameterize LIMIT/OFFSET or cursor values as query params. [1][2]

Sources:
[1] Tinybird Endpoints (pagination & LIMIT/OFFSET).
[2] Tinybird query-parameters (pagination guidance).
[3] ClickHouse ORDER BY (rows with identical ORDER BY values are arbitrary).
[4] ClickHouse LIMIT docs (LIMIT ... WITH TIES and non-determinism without ORDER BY).


🏁 Script executed:

cat -n ./ghost/core/core/server/data/tinybird/endpoints/api_top_utm_mediums_v2.pipe

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 982


Stabilize pagination ordering for deterministic results.

Line 27: ordering only by visits can shuffle equally-ranked rows between pages when multiple utm_mediums have the same visit count. Add a secondary key for stable pagination.

Suggested change
-    order by visits desc
+    order by visits desc, utm_medium asc
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
select
utm_medium,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
where
utm_medium != ''
group by utm_medium
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
select
utm_medium,
count() as visits
from session_data sd
inner join filtered_sessions_v2 fs
on fs.session_id = sd.session_id
where
utm_medium != ''
group by utm_medium
order by visits desc, utm_medium asc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/api_top_utm_mediums_v2.pipe`
around lines 18 - 28, The ORDER BY clause currently sorts only by visits which
allows non-deterministic ordering for ties; update the query's ORDER BY to
include a stable secondary key (for example utm_medium ASC) so rows with equal
count() (visits) are consistently ordered across pages—modify the ORDER BY
visits desc to ORDER BY visits DESC, utm_medium ASC (referencing utm_medium,
visits, session_data sd and filtered_sessions_v2 fs in the pipe).


#### Session-Level Attributes
These are captured from the **first hit** (earliest timestamp) in a session using `argMin(field, timestamp)` in the `mv_session_data` materialized view:
These are captured from the **first hit** (earliest timestamp) in a session using `argMinState(field, timestamp)` in the `_mv_session_data` materialized view (an `AggregatingMergeTree` table):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "_mv_session_data_v2|_mv_session_data\b" ghost/core/core/server/data/tinybird -g '*.pipe' -g '*.datasource'

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1127


🏁 Script executed:

cat -n ghost/core/core/server/data/tinybird/endpoints/README.md | head -50

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 2406


🏁 Script executed:

find ghost/core/core/server/data/tinybird -name "*.pipe" | grep -E "(filtered_sessions|mv_session_data)" | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 349


🏁 Script executed:

head -30 ghost/core/core/server/data/tinybird/pipes/filtered_sessions.pipe

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1672


🏁 Script executed:

head -30 ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1664


🏁 Script executed:

grep -A 10 "sessions_filtered_by_session_attributes" ghost/core/core/server/data/tinybird/pipes/filtered_sessions.pipe | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 474


🏁 Script executed:

grep -A 10 "sessions_filtered_by_session_attributes" ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe | head -20

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 459


🏁 Script executed:

grep -A 20 "sessions_filtered_by_session_attributes" ghost/core/core/server/data/tinybird/pipes/filtered_sessions_v2.pipe | head -30

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 914


🏁 Script executed:

cat -n ghost/core/core/server/data/tinybird/pipes/mv_session_data.pipe | head -30

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 1424


🏁 Script executed:

cat -n ghost/core/core/server/data/tinybird/pipes/mv_session_data_v2.pipe | head -30

Repository: qodo-benchmark/Ghost-combined-coderabbit

Length of output: 960


Clarify v1 vs v2 datasources and correct function name.

Lines 10 and 42 describe _mv_session_data, which is the v1 datasource. However, v2 endpoints use _mv_session_data_v2. Additionally, line 10 states argMinState(field, timestamp), but v1 actually uses argMin() (not State), while v2 uses argMinState(). Update the documentation to either specify it covers v1 only, or add a note clarifying v2 uses _mv_session_data_v2 with argMinState() functions.

🤖 Prompt for AI Agents
In `@ghost/core/core/server/data/tinybird/endpoints/README.md` at line 10, The
README incorrectly conflates v1 and v2 datasources and the ClickHouse function
names: update the text around _mv_session_data to explicitly state it refers to
the v1 datasource and uses argMin() (not argMinState()), and add a short note
that v2 endpoints use _mv_session_data_v2 and use argMinState() aggregates;
ensure both symbols _mv_session_data, _mv_session_data_v2 and functions argMin,
argMinState are mentioned so readers know which view and function apply to v1 vs
v2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants