Design Robson administrative observability and debug access model #83

@ldamasio

Context

PR #82 was closed without merging after a value-risk review.

The PR mixed useful diagnostic intent with risky implementation choices:

  • unauthenticated /debug/armed-positions;
  • debug-support state coupled into runtime;
  • info-level logs containing sensitive trading/operational fields;
  • no tenant/account scoping;
  • no alignment with future rbx-identity;
  • no clear separation between public API, admin API, metrics, logs and debug surfaces.

The lesson is not "no observability". The lesson is that Robson needs an explicit administrative observability and debug access model before adding endpoints or broad logs that expose operational trading state.

Goals

Define a safe model for:

  • public health/status endpoints;
  • metrics exposure;
  • logs and structured events;
  • admin/operator debug endpoints;
  • local/testnet-only diagnostics;
  • redaction of trading-sensitive fields;
  • tenant/account/trading_account scoping;
  • integration with future rbx-identity;
  • audit logging and break-glass access.
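
To make the redaction goal concrete, here is a minimal sketch of masking trading-sensitive fields before an event reaches a log sink. It is an illustration only, not a proposed implementation: the function name `redact`, the `SENSITIVE_FIELDS` set and the `[REDACTED]` marker are assumptions, and the field names are taken from the unsafe-log list later in this issue. Which fields actually count as sensitive is one of the open design questions.

```python
from typing import Any, Mapping

# Hypothetical deny-list, borrowed from the fields this issue calls unsafe in logs.
SENSITIVE_FIELDS = frozenset({"account", "symbol", "side", "policy", "entry", "stop"})
REDACTED = "[REDACTED]"


def redact(event: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of `event` with sensitive values masked.

    Nested mappings are redacted recursively; the input is not mutated.
    """
    out: dict[str, Any] = {}
    for key, value in event.items():
        if key in SENSITIVE_FIELDS:
            out[key] = REDACTED
        elif isinstance(value, Mapping):
            out[key] = redact(value)
        else:
            out[key] = value
    return out
```

A deny-list like this leaks any sensitive field nobody remembered to list. An allow-list of known-safe keys fails closed instead, which may fit the intent of this issue better; the choice belongs to the redaction policy.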

Required design questions

  1. Which endpoints may be public or unauthenticated?
  2. Which endpoints require admin/operator authentication?
  3. Which fields are never safe in public logs or public metrics?
  4. Which fields may appear only in admin-scoped views?
  5. How should tenant_id, account_id and trading_account_id scope observability?
  6. How should local/testnet diagnostics differ from production?
  7. What is the default behavior when no API token or identity provider is configured?
  8. What should be redacted from logs?
  9. What should be emitted as metrics, logs, traces or audit events?
  10. What should wait for rbx-identity instead of being implemented in robsond directly?
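
Question 5 can be illustrated with a small sketch of scope-based filtering. Everything here is hypothetical: the `Scope` and `Record` types, the `visible` function and the rule that an unset `account_id` or `trading_account_id` means "all within the parent" are assumptions made for illustration, not decisions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical caller scope; tenant_id is always required."""
    tenant_id: str
    account_id: Optional[str] = None          # None: every account in the tenant
    trading_account_id: Optional[str] = None  # None: every trading account in scope


@dataclass(frozen=True)
class Record:
    """Hypothetical observability record carrying its ownership identifiers."""
    tenant_id: str
    account_id: str
    trading_account_id: str
    status: str


def visible(records: Iterable[Record], scope: Scope) -> list[Record]:
    """Keep only the records the given scope is allowed to see."""
    return [
        r for r in records
        if r.tenant_id == scope.tenant_id
        and (scope.account_id is None or r.account_id == scope.account_id)
        and (scope.trading_account_id is None
             or r.trading_account_id == scope.trading_account_id)
    ]
```

Because `tenant_id` has no default, a caller cannot build a scope that spans tenants by omission. Whether operators ever need a cross-tenant view, and how that would be audited, is left to the break-glass part of the design.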

Deferred salvage from PR #82

The only potentially useful small diagnostic idea from PR #82 was detector-level visibility for immediate-entry failures, such as:

  • whether first market data was seen;
  • why OHLCV/technical-stop classification failed;
  • whether immediate-entry arming did not progress.

This may become a future tiny PR only if there is an active incident or clearly documented operational need.

Such a PR must not include:

  • any /debug/* endpoint;
  • ArmedPositionDebug;
  • last_market_data_timestamps cache;
  • broad lifecycle info-level logging;
  • account/symbol/side/policy/entry/stop fields in unsafe logs;
  • docs churn unrelated to the minimal diagnostic.
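
One shape such a minimal diagnostic could take, shown purely as a sketch, is a counter keyed by failure reason that carries no account, symbol, side, policy, entry or stop data. The names `EntryFailure` and `DetectorDiagnostics` are hypothetical, and the three reasons simply mirror the bullets above. Whether even this much in-process state is acceptable is for the design to decide.

```python
from collections import Counter
from enum import Enum


class EntryFailure(Enum):
    """Hypothetical reason codes mirroring the three diagnostic ideas above."""
    NO_MARKET_DATA_SEEN = "no_market_data_seen"
    STOP_CLASSIFICATION_FAILED = "stop_classification_failed"
    ARMING_NOT_PROGRESSED = "arming_not_progressed"


class DetectorDiagnostics:
    """Counts immediate-entry failures by reason only, with no per-position data."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, reason: EntryFailure) -> None:
        self._counts[reason] += 1

    def snapshot(self) -> dict[str, int]:
        # Every reason is reported, including zeros, so the output shape is stable.
        return {reason.value: self._counts[reason] for reason in EntryFailure}
```

Aggregate counts like these could feed a metric rather than an endpoint or a log line, which keeps the diagnostic away from the `/debug/*` and info-level logging patterns ruled out above.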

Acceptance criteria

Before any future debug/admin observability PR lands, we need:

  • clear public/admin/debug API boundary;
  • auth-required behavior that does not silently become public when token config is missing;
  • tenant/account scoping plan;
  • redaction policy;
  • logging safety rules;
  • audit policy for admin/debug access;
  • local/testnet/production gating;
  • tests for unauthorized access and default-off debug surfaces.
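
Two of these criteria, fail-closed auth and default-off debug surfaces, can be sketched in a few lines. This is an illustration under stated assumptions, not the intended design: `authorize_admin`, `debug_surface_enabled` and the environment names `local` and `testnet` are hypothetical, and a static token stands in for whatever rbx-identity eventually provides.

```python
import hmac
from typing import Optional

# Hypothetical environment names; production is deliberately absent.
DEBUG_ALLOWED_ENVIRONMENTS = frozenset({"local", "testnet"})


def authorize_admin(configured_token: Optional[str],
                    presented_token: Optional[str]) -> bool:
    """Fail closed: a missing or empty configured token denies every request."""
    if not configured_token or not presented_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(configured_token.encode(), presented_token.encode())


def debug_surface_enabled(environment: str, explicitly_enabled: bool = False) -> bool:
    """Debug surfaces are off unless explicitly enabled in an allowed environment."""
    return explicitly_enabled and environment in DEBUG_ALLOWED_ENVIRONMENTS
```

The unauthorized-access tests named in the last criterion would then assert the denying cases, for example:

```python
assert authorize_admin(None, "anything") is False   # no token configured
assert authorize_admin("s3cret", None) is False     # no token presented
assert debug_surface_enabled("production", True) is False
assert debug_surface_enabled("testnet") is False    # default-off
```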

Status

Design issue. No implementation authorized yet.
