Feature Request: Optional OpenTelemetry Integration for Observability and Performance Tuning #2958

## Summary

This feature request proposes adding an **optional** [OpenTelemetry](https://opentelemetry.io/) integration to the Zarr-Python codebase. OpenTelemetry is a widely adopted, vendor-neutral standard for generating, collecting, and exporting telemetry data (traces, metrics, and logs), and it is supported by many modern observability platforms. The goal is to improve observability, facilitate performance tuning, and enable integration with full-stack monitoring systems, all while preserving a lightweight default behavior.

---

## 📌 Motivation

Zarr is widely used in performance-critical and production environments such as:

- Large-scale data processing
- Scientific computing
- Cloud-native workflows
- Backend data sources for web APIs (e.g., Xpublish)

Currently, Zarr provides limited visibility into internal operations like:

- Chunk reads/writes
- Compression and decompression
- Storage backend access
- Performance bottlenecks

By integrating OpenTelemetry (OTel), Zarr users and developers would benefit from:

- Enhanced **observability** into internal workflows
- Easier **performance tuning** via traces and profiling tools (e.g., Jaeger, Zipkin, Grafana Tempo)
- Seamless **integration** into modern observability pipelines

☝ Each of these is particularly important following Zarr's recent adoption of asyncio, where concurrently executing operations are increasingly hard to track explicitly.

---

## 🧩 Proposal

- Introduce optional support for OpenTelemetry instrumentation in key parts of the Zarr codebase:
  - Data access (inside stores)
  - Compression/decompression
  - Encoding/decoding
- Provide a clean interface or hooks to register and emit OpenTelemetry traces.
- Default behavior should be:
  - **No-op** (i.e. tracing is disabled unless explicitly enabled)
  - Optionally fall back to a standard Python logger for lightweight introspection
- Ensure **zero overhead** when OpenTelemetry is not enabled
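
As a rough illustration of the defaults described above, here is a minimal sketch of a span helper that is a no-op unless tracing is switched on, uses OpenTelemetry when it is installed, and otherwise falls back to a plain logger. The names `span`, `enable_tracing`, and the `zarr.tracing` logger are hypothetical and not part of Zarr today; only `opentelemetry.trace.get_tracer` and `Tracer.start_as_current_span` are existing OpenTelemetry API calls.

```python
import logging
import time
from contextlib import contextmanager

try:
    # Optional dependency: only used if the user has installed opentelemetry-api.
    from opentelemetry import trace as _otel_trace
except ImportError:
    _otel_trace = None

# Hypothetical logger name, used only for the fallback path.
logger = logging.getLogger("zarr.tracing")

# Tracing is off by default, so the default behavior is a no-op.
_enabled = False


def enable_tracing(flag=True):
    """Hypothetical switch that turns instrumentation on or off."""
    global _enabled
    _enabled = flag


@contextmanager
def span(name, **attributes):
    """Wrap a block of work in a span, a log record, or nothing at all."""
    if not _enabled:
        # Disabled path: no span object is created and nothing is logged.
        yield None
        return
    if _otel_trace is not None:
        tracer = _otel_trace.get_tracer("zarr")
        with tracer.start_as_current_span(name, attributes=attributes) as current:
            yield current
        return
    # Fallback path: OpenTelemetry is not installed, so log the duration instead.
    start = time.perf_counter()
    try:
        yield None
    finally:
        elapsed = time.perf_counter() - start
        logger.debug("%s took %.6fs %r", name, elapsed, attributes)
```

A store could then wrap a chunk read as `with span("chunk_read", key="c/0/0"): ...`. With tracing disabled, the remaining per-call cost is one flag check plus the generator-based context manager, which is small but not strictly zero.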

---

## ✅ Benefits

- Opt-in observability with minimal performance impact
- Compatibility with OpenTelemetry-native tools and frameworks
- Aids in debugging and performance analysis
- Foundation for future enhancements (e.g., metrics, structured logging)

---

## 🛠️ Implementation Notes

- Introduce a `tracing.py` module (or similar) to encapsulate OpenTelemetry usage
- Wrap key operations in spans, using either `Tracer.start_as_current_span()` (which works as a context manager or as a decorator) or a small `@contextmanager` helper
- Conditional instrumentation based on config or environment variable(s)
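
One way these notes could fit together is sketched below: a decorator that checks an environment variable once, at decoration time, and returns the original function untouched when tracing is off. This is only a sketch under stated assumptions: the `ZARR_OTEL_ENABLED` variable and the `traced` and `tracing_enabled` helpers are hypothetical names, not existing Zarr or OpenTelemetry API. Coroutine functions get their own wrapper so the span covers the awaited work.

```python
import functools
import inspect
import os


def tracing_enabled(env=None):
    """Read a hypothetical ZARR_OTEL_ENABLED flag from the environment."""
    env = os.environ if env is None else env
    return env.get("ZARR_OTEL_ENABLED", "").lower() in {"1", "true", "yes"}


def traced(name, enabled=None):
    """Decorator factory: wrap a function in a span only if tracing is enabled."""
    if enabled is None:
        enabled = tracing_enabled()

    def decorator(func):
        if not enabled:
            # Disabled: hand back the original function, adding no per-call cost.
            return func

        # Imported lazily so OpenTelemetry stays an optional dependency.
        from opentelemetry import trace

        tracer = trace.get_tracer("zarr")

        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                # The span stays open until the awaited call completes.
                with tracer.start_as_current_span(name):
                    return await func(*args, **kwargs)

            return async_wrapper

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with tracer.start_as_current_span(name):
                return func(*args, **kwargs)

        return wrapper

    return decorator
```

Deciding at decoration time keeps the disabled path free of wrappers, at the cost that the flag has to be set before the instrumented modules are imported. A runtime toggle would need a per-call check instead.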

---

## 🙋‍♂️ Call for Feedback

We would love to hear from maintainers and the community:

- Does OpenTelemetry seem like a good fit for Zarr?
- Are there specific areas of the codebase that would benefit most from tracing?
- Would a structured logger fallback be helpful in low-overhead environments?
