Grafana Loki

Loki ingests OTLP/HTTP logs natively at /otlp/v1/logs since Loki 3.0 (2024). Tracecore reaches it directly through the upstream otlphttp exporter bundled in the OCB-assembled tracecore distro; no Loki-specific exporter is required, and the deprecated contrib lokiexporter is intentionally not bundled (RFC-0013 §2 adoption matrix). The tenant ID travels in the X-Scope-OrgID header.

Deployment shape:

tracecore (otlphttp exporter) ──▶ Loki distributor (/otlp/v1/logs)

Config

# docs/integrations/examples/loki.yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlphttp/loki:
    endpoint: http://loki-distributor.observability.svc.cluster.local:3100/otlp
    compression: gzip
    headers:
      X-Scope-OrgID: tracecore

service:
  pipelines:
    logs/loki:
      receivers: [otlp]
      exporters: [otlphttp/loki]

Validate with the in-tree binary:

./tracecore validate --config=docs/integrations/examples/loki.yaml

Endpoint and tenant

The endpoint is the Loki distributor's HTTP listener at the path /otlp; the otlphttp exporter appends the OTLP-spec /v1/logs suffix automatically, so the request lands at /otlp/v1/logs. Do not include /v1/logs in the YAML: the exporter appends the suffix regardless, so the request would go to /otlp/v1/logs/v1/logs, a path Loki does not serve.
X-Scope-OrgID identifies the tenant when Loki's distributor runs with auth_enabled: true. Single-tenant clusters (auth_enabled: false) accept requests without the header and route them under the synthetic tenant fake; you can drop the headers: block in that case.
Loki Operator and Grafana Enterprise Logs (GEL) layer additional multi-tenant auth on top (e.g. mTLS gateways, per-tenant rate limits); those are optional, not required for the basic OSS install.
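As an illustration of the path rule above, here is a minimal Python sketch of how the final request URL is formed. It assumes the exporter simply appends the signal suffix to the configured endpoint; `logs_url` is a hypothetical helper for this page, not part of tracecore or the exporter.

```python
def logs_url(endpoint: str) -> str:
    """Append the OTLP logs suffix to a base endpoint (illustrative only)."""
    # Assumption: the suffix is appended verbatim, with at most one slash in between.
    return endpoint.rstrip("/") + "/v1/logs"


base = "http://loki-distributor.observability.svc.cluster.local:3100/otlp"

# Correct: the base ends at /otlp, so the request lands at /otlp/v1/logs.
print(logs_url(base))

# Misconfigured: /v1/logs is already in the YAML, so the suffix is doubled.
print(logs_url(base + "/v1/logs"))
```

The second call shows why the suffix must stay out of the YAML: the resulting path ends in /otlp/v1/logs/v1/logs.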

Labels vs. structured metadata (the cardinality footgun)

Loki indexes logs by stream labels and stores everything else as structured metadata (queryable in LogQL, NOT indexed). Label cardinality directly drives index size and query cost, because every distinct combination of label values creates a separate stream; the canonical Loki guidance is to keep the number of distinct values per label bounded, in the low hundreds at most.

The distributor's OTLP receiver maps OTLP attributes into three buckets:

| Source | Default mapping | Cardinality risk |
| --- | --- | --- |
| OTLP resource attributes | Index labels (only the ones in `default_resource_attributes_as_index_labels`) | Bounded; the default list is curated. |
| OTLP scope attributes | Structured metadata | Low; instrumentation scope is rarely high-cardinality. |
| OTLP log attributes | Structured metadata | Safe by default; high-cardinality keys (e.g. `pattern.verdict_json`) stay out of the label index. |

The Loki-side defaults at the distributor pick up these resource attributes as stream labels (from default_resource_attributes_as_index_labels):

service.name, service.namespace, deployment.environment, deployment.environment.name, cloud.region, cloud.availability_zone, k8s.cluster.name, k8s.namespace.name, k8s.container.name, container.name, k8s.replicaset.name, k8s.deployment.name, k8s.statefulset.name, k8s.daemonset.name, k8s.cronjob.name, k8s.job.name.
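The default mapping can be summarized as a small lookup. The Python sketch below models only the behavior described on this page (default list promoted to labels, everything else kept as structured metadata); `default_bucket` and the source names "resource", "scope", and "log" are hypothetical, and Loki's real logic lives in the distributor and can be changed through `otlp_config`.

```python
# Resource attributes promoted to index labels by default, copied from the list above.
DEFAULT_INDEX_LABELS = frozenset({
    "service.name", "service.namespace",
    "deployment.environment", "deployment.environment.name",
    "cloud.region", "cloud.availability_zone",
    "k8s.cluster.name", "k8s.namespace.name", "k8s.container.name", "container.name",
    "k8s.replicaset.name", "k8s.deployment.name", "k8s.statefulset.name",
    "k8s.daemonset.name", "k8s.cronjob.name", "k8s.job.name",
})


def default_bucket(source: str, key: str) -> str:
    """Return where an attribute lands under the default mapping (sketch only)."""
    if source not in {"resource", "scope", "log"}:
        raise ValueError(f"unknown attribute source: {source!r}")
    # Only resource attributes on the default list become stream labels.
    if source == "resource" and key in DEFAULT_INDEX_LABELS:
        return "index_label"
    # Scope attributes, log attributes, and unlisted resource attributes stay unindexed.
    return "structured_metadata"


print(default_bucket("resource", "service.name"))
print(default_bucket("resource", "k8s.node.name"))
print(default_bucket("log", "pattern.verdict_json"))
```

Under these assumptions, k8s.node.name stays in structured metadata until an operator opts in on the Loki side.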

Operator-side tuning lives in Loki's config, not in tracecore:

# loki.yaml (on the LOKI side, NOT in tracecore)
limits_config:
  allow_structured_metadata: true   # default in Loki 3.0+
  otlp_config:
    resource_attributes:
      attributes_config:
        - action: index_label
          regex: k8s\.node\.name     # opt-in: index by node
    log_attributes:
      - action: structured_metadata
        attributes:
          - pattern.id
          - pattern.headline
          - pattern.remediation
          - pattern.confidence
          - pattern.verdict_json

When OTLP attributes flow into Loki via the native OTLP endpoint (/otlp/v1/logs, this recipe's target), they land as structured metadata with dots translated to underscores at the LogQL surface — no bucket prefix. An attribute pattern.id on a log record is queried as pattern_id; a resource attribute k8s.node.name is queried as k8s_node_name. Verify against Loki upstream's "Format considerations" doc (docs/sources/shared/otel.md); the structured-metadata + dots → underscores normalization is stable since Loki 3.0.

Promtail / Grafana Alloy users see different keys. When tracecore logs are routed through a Promtail / Alloy pipeline with a JSON parser stage (| json), OTLP attributes appear as JSON-body fields with the attributes_ / resources_ bucket prefix (e.g. attributes_pattern_id, resources_k8s_node_name). That is the Promtail-extraction surface, NOT the native OTLP surface. This recipe targets the native endpoint; if you must use Promtail/Alloy, add | json to LogQL queries and switch to the prefixed names.
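The two naming surfaces can be compared side by side. This Python sketch encodes only the rules stated above (dots become underscores; the Promtail/Alloy path adds a bucket prefix); `native_key` and `prefixed_key` are hypothetical helper names for illustration, not Loki or tracecore APIs.

```python
def native_key(attr: str) -> str:
    """LogQL key on the native OTLP surface: dots become underscores, no prefix."""
    return attr.replace(".", "_")


def prefixed_key(bucket: str, attr: str) -> str:
    """LogQL key on the Promtail/Alloy `| json` surface, with the bucket prefix."""
    # Prefixes as described above; "log" and "resource" are names chosen for this sketch.
    prefixes = {"log": "attributes_", "resource": "resources_"}
    return prefixes[bucket] + native_key(attr)


print(native_key("pattern.id"))
print(native_key("k8s.node.name"))
print(prefixed_key("log", "pattern.id"))
print(prefixed_key("resource", "k8s.node.name"))
```

A dashboard written against one surface needs its field names rewritten before it works against the other.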

Tracecore-specific attributes

The patterndetectorprocessor emits verdict records carrying these attributes (defined in module/processor/patterndetectorprocessor/patterndetector.go):

pattern.id, pattern.headline, pattern.remediation, pattern.confidence, pattern.verdict_json
k8s.pod.name, k8s.pod.namespace, k8s.node.name
k8s.event.reason
nccl.fr.pg_id, nccl.fr.collective_seq_id, nccl.fr.hanging_ranks_count

All ship as log attributes, so all land in Loki as structured metadata by default. This is the right shape: pattern.verdict_json in particular is per-incident JSON and would explode the label index if promoted. The dashboards consume them as pattern_id, k8s_node_name, etc. (bare-underscored, no bucket prefix), matching the native-OTLP surface (see the See also section).

Only resource attributes, meaning those set on the resource that carries the verdict log record, are candidates for the label index, and the default list above already covers k8s.namespace.name, k8s.cluster.name, service.name, and the rest of the k8s workload axis.

Retention

Retention is configured on the Loki side via the compactor.retention_* settings and limits_config.retention_period (a global default that can be overridden per tenant, with per-stream overrides in limits_config.retention_stream). Tracecore does not control retention; the recipe assumes the operator has set a global retention compatible with the verdict signal (roughly 14 to 30 days is typical for incident review; longer for compliance). If the cluster has retention disabled, verdicts accumulate indefinitely until storage fills, so set at least a default retention_period before pointing tracecore at the cluster.

Secret handling

Same shape as the other recipes: if the tenant identifier is sensitive, render the literal X-Scope-OrgID value at deploy time through envsubst, Helm, or a CSI secret driver. The example file ships the literal tracecore so tracecore validate succeeds offline. Single-tenant Loki clusters can drop the headers: block entirely.

Failure modes

| Symptom | First check |
| --- | --- |
| HTTP 401 / 403 from Loki | Auth gateway in front of the distributor is rejecting the request. Confirm the deployed `X-Scope-OrgID` value matches the gateway's tenant allow-list. |
| HTTP 400 `the request body is too large` | Tracecore is sending batches above `limits_config.ingestion_rate_mb`. Lower the batchprocessor flush size or raise the Loki limit. |
| HTTP 400 `structured metadata is not allowed` | Loki is below 3.0 OR `limits_config.allow_structured_metadata` is `false`. Upgrade Loki, or flip the limit. The OTLP receiver always emits structured metadata for non-label attributes. |
| HTTP 429 with `Retry-After` | Loki's per-tenant ingestion rate-limit is engaged. Either aggregate at tracecore (`batchprocessor`) before the exporter or raise `ingestion_rate_mb` / `ingestion_burst_size_mb` on the Loki side. |
| Verdicts arrive but `pattern.id` is missing from LogQL | The Loki distributor dropped log attributes per `otlp_config.log_attributes`. Confirm the operator-side config includes `action: structured_metadata` for `pattern.*` (see the labels-vs-metadata section above), and that the query uses the underscored key `pattern_id`. |
| Repeated TLS handshake failures | The default trust store covers most managed Lokis. If a corporate proxy MITMs egress, install the proxy CA in the system trust store; do not enable `insecure_skip_verify` in production. |
| Stream cardinality alerts on the Loki cluster | Confirm no high-cardinality OTLP resource attribute (e.g. `service.instance.id`) was added to `default_resource_attributes_as_index_labels`; that list defaults sanely but is the most common operator footgun. |
