Skip to content

[OPIK-6128] [QA] feat: Trace Explore smoke (SDK-create + UI-verify)#6823

Open
AndreiCautisanu wants to merge 15 commits into
mainfrom
andreic/OPIK-6128-trace-explore
Open

[OPIK-6128] [QA] feat: Trace Explore smoke (SDK-create + UI-verify)#6823
AndreiCautisanu wants to merge 15 commits into
mainfrom
andreic/OPIK-6128-trace-explore

Conversation

@AndreiCautisanu
Copy link
Copy Markdown
Contributor

Details

First CUJ smoke test in the Opik 2.0 E2E suite — slot 1 of 6 in the data-plane test sequence. Seeds traces via the Python SDK code path (@opik.track → bridge route) and verifies they render correctly in the Logs page and Trace detail panel. Sets the pattern (bridge route → SDK client method → fixture → POMs → test) the next five CUJ tickets will copy.

The PR is intentionally cross-package: one FE testid addition (testId prop on MetricCard) so the Logs page's trace count is selectable, plus an auth-on-the-fly setup so the suite can run in fresh CI environments without pre-staged credentials.

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-6128

AI-WATERMARK

AI-WATERMARK: yes

  • Tools: Claude Code
  • Model(s): Claude Opus 4.7 (1M context)
  • Scope: full implementation across bridge route, SDK client method, fixture, POMs, tests, FE testid, auth-on-the-fly globalSetup
  • Human verification: code review + manual staging runs + local OSS runs

Testing

Both deployment modes verified end-to-end with the bridge auto-spawning via Playwright's webServer config:

OSS (local Opik):
```
OPIK_DEPLOYMENT=oss OPIK_BASE_URL=http://localhost:5174 \
WORKERS=1 npx playwright test tests/trace-explore/
```
Result: `2 passed (15.5s)`. No auth needed.

Cloud (staging.dev.comet.com):
```
OPIK_DEPLOYMENT=cloud OPIK_BASE_URL=https://staging.dev.comet.com/opik \
OPIK_TEST_USER_EMAIL= OPIK_TEST_USER_PASSWORD= \
OPIK_TEST_USER_NAME= OPIK_WORKSPACE= \
WORKERS=1 npx playwright test tests/trace-explore/
```
Result: `2 passed (25.7s)`. globalSetup minted `.auth/user.json` from the login flow; the minted API key was forwarded to the bridge via `X-Opik-Api-Key` header.

Independence verified: each test passes individually (`-g`), sequentially (`WORKERS=1`), and concurrently (`WORKERS=2`/`3`, each test on its own worker, its own project). Post-run prefix sweep returns `total: 0` (no orphans).

Quality gates: `npx tsc --noEmit` clean on the e2e suite, `npm run typecheck` + `npm run lint` clean on `apps/opik-frontend`.

What each test verifies (UI-first, zero REST in test bodies/POMs)

Test Assertions
`Logs view shows seeded traces with correct count and ordering` Seeds 3 traces via the bridge. Asserts the metrics summary count via UI = 3, and that DOM row order matches the reverse creation order (most-recent first).
`Trace panel renders the seeded trace header, input, output, span count, and project breadcrumb` Seeds 1 trace, deep-links into the panel via `?trace=`. Asserts trace name in panel header, Input/Output sections expanded, the seeded input/output text rendered inside the panel, `Spans (1)` label visible, and the underlying Logs page breadcrumb showing the project name.

REST is used only by the fixtures (project create, trace create via bridge) and by global-teardown (project delete + orphan sweep). Tests and POMs never call REST.

CI contract

A workflow needs only env vars / secrets:

OSS (e.g. docker-compose-driven local stack):
```
OPIK_DEPLOYMENT=oss
OPIK_BASE_URL=http://localhost:5173
```

Cloud (staging or prod):
```
OPIK_DEPLOYMENT=cloud
OPIK_BASE_URL=https:///opik
OPIK_TEST_USER_EMAIL=
OPIK_TEST_USER_PASSWORD=
OPIK_TEST_USER_NAME=
OPIK_WORKSPACE=
```

No `.auth/*.json` files need to be checked in, downloaded from a secrets store, or pre-staged. globalSetup mints them per run.

Documentation

Local-only design docs and discovery reports for this work live at `docs/superpowers/{specs,plans,discovery}/2026-05-21-opik-6128-*` — these are not committed (per standing workflow convention). The user-facing change is one new `testId` prop on `MetricCard`; no public-API or docs surface is altered.

- LogsPage.waitForReady waits for tr[data-row-id] (real rows), not the skeleton
- LogsPage.countTraces falls back to 'Traces N' text when metrics-card-count testid
  is not yet deployed (FE testid was added in this PR; staging needs the FE redeploy)
- LogsPage.openTraceById rebuilds the URL from the projectId tracked in goto() so it
  doesn't accidentally drop default query params captured by the FE
- TracePanelPage.closeButton uses getByRole('button', {name: 'Close'}) because the
  TraceDetailsPanel custom header replaces the default Sheet close button (which
  was the one with data-testid='side-panel-close')
- TracePanelPage.waitForFullyLoaded first waits for the trace= query param to land
- BackendClient.getTraceSpans accepts an optional projectId; falls back to looking
  it up via getTraceById(traceId).projectId. getSpansByProject requires it.
- playwright.config wires storageState from .auth/<deployment>.json or OPIK_STORAGE_STATE
The previous draft used REST to read spans and project metadata, which defeats
the point of a UI smoke test: we want to assert that what the SDK logs actually
renders in the UI. Test 2 was REST-only and didn't verify any UI surface; tests
also coupled to fixture-name conventions that hinted at shared state.

Changes:
- TracePanelPage drops readSpans()/readMetadata() (both REST) in favour of
  DOM-scoped locators: traceNameInHeader, inputValue, outputValue,
  spansCountLabel, inputSection, outputSection. All locators are scoped to
  the panel's data-testid='traces' root so they never collide with row
  previews in the table behind the side-pane.
- LogsPage drops the BackendClient dependency (no REST in POM) and gains
  breadcrumbProjectLink for project-metadata verification via UI.
- BackendClient drops getTrace/getProject/getTraceSpans wrappers (unused now).
- trace.fixture exposes the seeded input/output strings on TraceRef so tests
  can assert exactly what was logged appears in the UI.
- The custom Span[] matchers are removed: they operated on REST data and were
  the wrong abstraction for a UI-first smoke. Plain Playwright assertions on
  panel locators are clearer and stricter.
- Tests rewritten:
  * Test 1: trace appears in Logs table; panel renders trace name + seeded
    input/output + the expected Spans (1) label.
  * Test 2: while the panel is open, the Logs breadcrumb shows the project
    name. Verifies project metadata via UI, not REST.
  * Test 3: no longer depends on opikTrace; seeds three traces itself with
    the testNamespace prefix and asserts UI count + row order.

All three tests now pass when run in isolation, sequentially with WORKERS=1,
and concurrently with WORKERS=3 (each on its own project).
The previous three tests overlapped: Test 1 mixed Logs-table and panel
assertions, Test 2 was a near-empty breadcrumb-only check, and Test 3
duplicated the count assertion from Test 1 to also check row ordering.

Replace with two tests where each has one clear concern:

  1. Logs view shows seeded traces with correct count and ordering
     - Seeds 3 traces, asserts count=3 and row order via DOM (data-row-id).
     - No panel interaction.

  2. Trace panel renders the seeded trace header, input, output, span count,
     and project breadcrumb
     - Seeds 1 trace via opikTrace fixture, opens panel via URL, asserts
       all panel content + breadcrumb on the underlying Logs page.
     - No table-level count/order assertions.

Both tests pass individually, sequentially with WORKERS=1 (35s), and in
parallel with WORKERS=2 (19s). Each test runs in its own project via the
existing fixture cascade.
…cloud

Make the suite CI-ready by removing dependence on pre-captured .auth files.

What changed:
- env.config: cloud deployments now accept OPIK_TEST_USER_EMAIL +
  OPIK_TEST_USER_PASSWORD as an alternative to OPIK_API_KEY. Either path
  satisfies the validation.
- global-setup: for non-oss deployments without a pre-captured auth state,
  POSTs {email, plainTextPassword} to <root>/api/auth/login (auth lives at
  the Comet root domain, not under /opik), saves the resulting browser
  storage state to .auth/user.json, and propagates the minted API key via
  process.env.OPIK_API_KEY. OSS deployments skip this step entirely. Mirrors
  the legacy tests_end_to_end/typescript-tests/ pattern.
- playwright.config: storageState resolves to .auth/user.json for
  cloud/self-hosted and undefined for oss. Removes the deployment-named-file
  lookup and OPIK_STORAGE_STATE workaround.
- python-sdk-client: forwards process.env.OPIK_API_KEY as X-Opik-Api-Key on
  every bridge request (read at request time, not at module load) so the
  bridge picks up the minted key even though it was spawned before
  globalSetup ran.
- bridge routes/projects.py + routes/traces.py: read the X-Opik-Api-Key
  header and pass it to opik.Opik(api_key=...). The trace route additionally
  set_global_client(client, context_wise=True) so @opik.track's global
  client uses the request-scoped credentials. When the header is absent the
  routes fall back to OPIK_API_KEY env, preserving local-dev defaults.
- typescript-sdk: drops the hard require of env.apiKey for oss deployments.
- .env.local.example + .env.cloud.example: document the env-var contract
  for both deployment modes.

Verification:
- OSS (http://localhost:5174, no creds):       2/2 passed in 15.6s
- Cloud (staging, fresh .auth/user.json):       2/2 passed in 36.0s

CI authors set OPIK_DEPLOYMENT=<oss|cloud> + OPIK_BASE_URL (+ creds for
cloud) — no pre-staged .auth files required.
@github-actions github-actions Bot added python Pull requests that update Python code Frontend tests Including test files, or tests related like configuration. typescript *.ts *.tsx labels May 21, 2026
Comment thread tests_end_to_end/e2e/global-setup.ts
Comment thread tests_end_to_end/e2e/pom/logs.page.ts Outdated
- MetricCard: add 'metrics-card-<type>-value' testid on the value <span> so
  LogsPage.countTraces() can target the value-only element instead of parsing
  the whole card text. The card also renders a delta percentage that would
  produce a wrong number once a previous-period baseline exists (the count
  regression baz flagged).
- LogsPage.countTraces: switch primary selector to '-value' and keep the
  'Traces N' body-text fallback for staging deploys that don't yet have the
  new FE testid.
- global-setup auth: tighten the contract. The previous early-return required
  BOTH a pre-set OPIK_API_KEY AND an existing .auth/user.json — meaning a
  fresh cloud run with only OPIK_API_KEY would fall through to the
  email/password requirement and abort with an unclear message. Now:
    - apiKey + storage state on disk => power-user debug path, skip login.
    - email+password => canonical CI path, mint storage state via login.
    - apiKey alone with no storage state => clear, actionable error.
  env.config message updated to explain both paths.
- bridge: extract opik_factory.make_opik_client(workspace, api_key) helper
  so projects.py + traces.py + future routes share the wiring (was duplicated
  per route).
- bridge: register an app-wide ApiError -> JSONResponse handler in main.py so
  each route no longer needs a try/except wrapping ApiError just to re-raise
  the same status/body as HTTPException. routes/projects.py and traces.py
  drop their nested try blocks accordingly.

Verified both modes still pass:
- OSS (http://localhost:5174):       2/2 in 15.4s
- Cloud (staging, fresh user.json):  2/2 in 26.7s
@AndreiCautisanu AndreiCautisanu marked this pull request as ready for review May 21, 2026 18:12
@AndreiCautisanu AndreiCautisanu requested review from a team as code owners May 21, 2026 18:12
Comment on lines +37 to +40
)
finally:
client.end(flush=False)
atexit.unregister(client.end)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

routes/traces.py and routes/projects.py duplicate the same make_opik_client(...) / client.end(flush=False) / atexit.unregister(client.end) cleanup, should we wrap that lifecycle in a shared helper/context manager in opik_factory?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low


Want Baz to fix this for you? Activate Fixer

Copy link
Copy Markdown
Contributor

@andriidudar andriidudar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FE part looks good, approved!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Frontend python Pull requests that update Python code tests Including test files, or tests related like configuration. typescript *.ts *.tsx

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants