This is a library for common telemetry functionality, especially subscribers for Tokio tracing libraries. Here we simply package many common subscribers, such as writing trace data to Jaeger, distributed tracing, common logs and metrics destinations, etc. into an easy-to-configure common package. There are also some unique layers, such as one to automatically create Prometheus latency histograms for spans.
We also purposely separate logging levels from span creation. Production apps often need this: logging at very verbose levels is usually undesirable, yet it is still useful to gather sampled span data all the way down to TRACE-level spans.
Getting started is easy. In your app:
let config = telemetry::TelemetryConfig::new("my_app");
let guard = telemetry::init(config);
It is important to retain the guard until the end of the program. Assign it in the main fn and keep it there: once the guard is dropped, log output stops.
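To illustrate why the binding matters, here is a minimal std-only sketch. `SketchGuard` and `sketch_init` are hypothetical stand-ins, not this library's types; the only real behavior shown is Rust's drop timing, where `let _ = ...` drops the value immediately while a named binding such as `_guard` lives until the end of its scope.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the guard returned by `telemetry::init`.
struct SketchGuard {
    active: Arc<AtomicBool>,
}

impl Drop for SketchGuard {
    fn drop(&mut self) {
        // Dropping the guard shuts the (pretend) log writer down.
        self.active.store(false, Ordering::SeqCst);
    }
}

/// Hypothetical init: marks output as active and hands back a guard.
fn sketch_init(active: Arc<AtomicBool>) -> SketchGuard {
    active.store(true, Ordering::SeqCst);
    SketchGuard { active }
}

fn main() {
    let active = Arc::new(AtomicBool::new(false));

    // `let _ = ...` does not bind the value, so the guard is dropped at once.
    let _ = sketch_init(active.clone());
    println!("after `let _`: active = {}", active.load(Ordering::SeqCst)); // false

    // A named binding (even one starting with `_`) lives until scope end.
    let _guard = sketch_init(active.clone());
    println!("after `let _guard`: active = {}", active.load(Ordering::SeqCst)); // true
}
```

In a real program the equivalent of `_guard` would sit at the top of `main`, so it outlives everything that logs.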
There is a builder API available: just do TelemetryConfig::new()...
Another convenient initialization method is TelemetryConfig::new().with_env() to populate the config from environment vars.
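The builder-plus-environment pattern can be sketched roughly as follows. `SketchConfig`, `with_log_file`, and `with_env_from` are hypothetical names made up for this illustration; only the `log_file` and `tokio_console` settings and the `TOKIO_CONSOLE` variable are taken from this README, and treating "variable is set" as "enabled" is an assumption.

```rust
/// Hypothetical config type; not the library's `TelemetryConfig`.
#[derive(Debug, Default)]
struct SketchConfig {
    name: String,
    log_file: Option<String>,
    tokio_console: bool,
}

impl SketchConfig {
    fn new(name: &str) -> Self {
        SketchConfig { name: name.to_string(), ..Default::default() }
    }

    /// Builder-style setter: takes `self` by value and returns it.
    fn with_log_file(mut self, path: &str) -> Self {
        self.log_file = Some(path.to_string());
        self
    }

    /// Overlay settings from a lookup function. Passing the lookup as a
    /// closure keeps the logic testable without touching the real environment.
    fn with_env_from(mut self, lookup: impl Fn(&str) -> Option<String>) -> Self {
        // Assumption: the variable being set at all turns the feature on.
        if lookup("TOKIO_CONSOLE").is_some() {
            self.tokio_console = true;
        }
        self
    }

    /// Same overlay, reading the process environment.
    fn with_env(self) -> Self {
        self.with_env_from(|key| std::env::var(key).ok())
    }
}

fn main() {
    let config = SketchConfig::new("my_app")
        .with_log_file("my_app.log")
        .with_env();
    println!(
        "name={} log_file={:?} tokio_console={}",
        config.name, config.log_file, config.tokio_console
    );
}
```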
You can also run the example and see output in ANSI color:
cargo run --example easy-init
- otlp - this feature is enabled by default as it enables otlp tracing
- json - Bunyan formatter - JSON log output, optional
- tokio-console - Tokio-console subscriber, optional
By default, logs (but not spans) are formatted for human readability and output to stdout, with key-value tags at the end of every line.
RUST_LOG can be configured for custom logging output, including filtering.
By setting log_file in the config, one can write log output to a daily-rotated file.
Detailed span start and end logs can be generated by defining the json_log_output config variable. Note that this causes all output to be in JSON format, which is not as human-readable, so it is not enabled by default.
This output can easily be fed to backends such as ElasticSearch for indexing, alerts, aggregation, and analysis.
NOTE: JSON output requires the json crate feature to be enabled.
- In docker/grafana-local run docker compose up to start a local Grafana instance.
- Set TRACE_FILTER=<filter expression> - for local use TRACE_FILTER=sui=trace,info is a good place to start.
- Start the sui-node or other process.
- Go to http://localhost:3000 (or http://localhost:3000/ with traces already filtered to sui-node).
- Select Tempo as the data source.
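A filter expression such as sui=trace,info reads as "TRACE for the sui target, INFO for everything else". The sketch below is a simplified std-only parser for that shape, not the filter implementation this library uses: it only understands target=level pairs and a bare default level, matches targets by plain string prefix, and assumes a default of off when none is given. All names in it are hypothetical.

```rust
/// Verbosity levels, ordered from least to most verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level { Off, Error, Warn, Info, Debug, Trace }

fn parse_level(s: &str) -> Option<Level> {
    match s.to_ascii_lowercase().as_str() {
        "off" => Some(Level::Off),
        "error" => Some(Level::Error),
        "warn" => Some(Level::Warn),
        "info" => Some(Level::Info),
        "debug" => Some(Level::Debug),
        "trace" => Some(Level::Trace),
        _ => None,
    }
}

/// Hypothetical parsed form of a filter expression.
#[derive(Debug, PartialEq)]
struct SketchFilter {
    default: Level,
    targets: Vec<(String, Level)>,
}

fn parse_filter(expr: &str) -> Result<SketchFilter, String> {
    // Assumption: with no bare level in the expression, everything is off.
    let mut filter = SketchFilter { default: Level::Off, targets: Vec::new() };
    for part in expr.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        match part.split_once('=') {
            Some((target, level)) => {
                let level = parse_level(level)
                    .ok_or_else(|| format!("unknown level in '{part}'"))?;
                filter.targets.push((target.to_string(), level));
            }
            None => {
                filter.default = parse_level(part)
                    .ok_or_else(|| format!("unsupported directive '{part}'"))?;
            }
        }
    }
    Ok(filter)
}

impl SketchFilter {
    /// The longest matching target prefix wins; otherwise use the default.
    fn max_level_for(&self, target: &str) -> Level {
        self.targets
            .iter()
            .filter(|(prefix, _)| target.starts_with(prefix.as_str()))
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, level)| *level)
            .unwrap_or(self.default)
    }

    fn enabled(&self, target: &str, level: Level) -> bool {
        level != Level::Off && level <= self.max_level_for(target)
    }
}

fn main() {
    let filter = parse_filter("sui=trace,info").expect("valid filter");
    println!("sui::core at TRACE: {}", filter.enabled("sui::core", Level::Trace));
    println!("hyper at DEBUG: {}", filter.enabled("hyper", Level::Debug));
}
```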
Because tracing is expensive, it is not enabled by default. To enable trace exporting on a production machine:
- Ensure the process was started with TRACE_FILTER=off - this enables the OTLP system but filters out all spans.
- Using the filter expression and duration of your choice, run:
$ curl -X POST 'http://127.0.0.1:1337/enable-tracing?filter=sui-node=trace,info&duration=10s'
Tracing will automatically be disabled after the specified duration has elapsed, in order to avoid leaving tracing on unintentionally.
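The "enable for a while, then fall back" behavior can be pictured with a small state holder like the one below. This is a sketch of the idea only, not how the library implements it: `TimedTraceFilter` and its methods are hypothetical, and the current time is passed in explicitly so the expiry logic is easy to follow and test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical holder for a base filter plus a temporary override.
struct TimedTraceFilter {
    base: String,
    override_until: Option<(String, Instant)>,
}

impl TimedTraceFilter {
    fn new(base: &str) -> Self {
        TimedTraceFilter { base: base.to_string(), override_until: None }
    }

    /// Install `filter` for `duration`, measured from `now`.
    fn enable(&mut self, filter: &str, duration: Duration, now: Instant) {
        self.override_until = Some((filter.to_string(), now + duration));
    }

    /// Return the filter in effect at `now`, clearing an expired override.
    fn active_filter(&mut self, now: Instant) -> &str {
        let expired =
            matches!(&self.override_until, Some((_, deadline)) if now >= *deadline);
        if expired {
            self.override_until = None;
        }
        match &self.override_until {
            Some((filter, _)) => filter.as_str(),
            None => self.base.as_str(),
        }
    }
}

fn main() {
    let start = Instant::now();
    // Mirrors a process started with TRACE_FILTER=off.
    let mut filter = TimedTraceFilter::new("off");
    filter.enable("sui-node=trace,info", Duration::from_secs(10), start);

    let at_5s = start + Duration::from_secs(5);
    let at_11s = start + Duration::from_secs(11);
    println!("t+5s:  {}", filter.active_filter(at_5s));
    println!("t+11s: {}", filter.active_filter(at_11s));
}
```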
Included in this library is a tracing-subscriber layer named PrometheusSpanLatencyLayer. It will create a Prometheus histogram to track latencies for every span in your app, which is super convenient for tracking span performance in production apps.
Enabling this layer can only be done programmatically, by passing in a Prometheus registry to TelemetryConfig.
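To make the idea of a per-span latency histogram concrete, here is a std-only sketch of the bookkeeping involved. It does not use the Prometheus client or tracing-subscriber, and it is not the code behind PrometheusSpanLatencyLayer; `LatencyHistogram`, `SpanLatencies`, and the bucket bounds are all hypothetical. It keeps one histogram per span name and reports cumulative bucket counts, which is the form Prometheus histograms expose.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical histogram with fixed upper bounds, in seconds.
struct LatencyHistogram {
    bounds: Vec<f64>, // ascending upper bounds
    counts: Vec<u64>, // one slot per bound, plus a final overflow slot
    sum: f64,
    count: u64,
}

impl LatencyHistogram {
    fn new(bounds: &[f64]) -> Self {
        LatencyHistogram {
            bounds: bounds.to_vec(),
            counts: vec![0; bounds.len() + 1],
            sum: 0.0,
            count: 0,
        }
    }

    fn observe(&mut self, secs: f64) {
        // First bucket whose upper bound covers the value, else overflow.
        let idx = self
            .bounds
            .iter()
            .position(|bound| secs <= *bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
        self.sum += secs;
        self.count += 1;
    }

    /// Cumulative counts: each entry includes all smaller buckets.
    fn cumulative(&self) -> Vec<u64> {
        let mut running = 0;
        self.counts.iter().map(|c| { running += c; running }).collect()
    }
}

/// Hypothetical registry of histograms keyed by span name.
struct SpanLatencies {
    bounds: Vec<f64>,
    by_span: HashMap<String, LatencyHistogram>,
}

impl SpanLatencies {
    fn new(bounds: &[f64]) -> Self {
        SpanLatencies { bounds: bounds.to_vec(), by_span: HashMap::new() }
    }

    /// Record one finished span; creates the histogram on first use.
    fn record(&mut self, span_name: &str, elapsed: Duration) {
        let bounds = &self.bounds;
        self.by_span
            .entry(span_name.to_string())
            .or_insert_with(|| LatencyHistogram::new(bounds))
            .observe(elapsed.as_secs_f64());
    }
}

fn main() {
    let mut latencies = SpanLatencies::new(&[0.01, 0.1, 1.0]);
    latencies.record("handle_request", Duration::from_millis(5));
    latencies.record("handle_request", Duration::from_millis(50));
    latencies.record("handle_request", Duration::from_secs(2));

    let hist = &latencies.by_span["handle_request"];
    println!("cumulative={:?} count={} sum={:.3}s", hist.cumulative(), hist.count, hist.sum);
}
```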
Tokio-console is an awesome CLI tool designed to analyze and help debug Rust apps using Tokio, in real time! It relies on a special subscriber.
- Build your app using a special flag: RUSTFLAGS="--cfg tokio_unstable" cargo build
- Enable the tokio-console feature for this crate.
- Set the tokio_console config setting when running your app (or set TOKIO_CONSOLE env var if using config with_env() method)
- Clone the console repo and cargo run to launch the console
NOTE: setting tokio TRACE logs is NOT necessary. It says that in the docs but there's no need to change Tokio logging levels at all. The console subscriber has a special filter enabled taking care of that.
By default, Tokio console listens on port 6669. To change this setting, as well as other settings such as the retention policy, please see the configuration guide.
This library installs a custom panic hook which records a log (event) at ERROR level using the tracing crate. This allows span information from the panic to be properly recorded as well.
To exit the process on panic, set the CRASH_ON_PANIC environment variable.
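The general shape of such a hook can be sketched with the standard library alone. This is not the library's hook: `install_sketch_panic_hook` is a hypothetical name, a shared `Vec<String>` stands in for an ERROR-level tracing event, and the exit code is an arbitrary choice. Only `std::panic::set_hook` and the `CRASH_ON_PANIC` variable name are real.

```rust
use std::panic;
use std::sync::{Arc, Mutex};

/// Install a hook that records each panic and optionally exits the process.
/// `sink` stands in for emitting an ERROR-level event.
fn install_sketch_panic_hook(sink: Arc<Mutex<Vec<String>>>, crash_on_panic: bool) {
    panic::set_hook(Box::new(move |info| {
        // The panic info's Display output includes the message and location.
        if let Ok(mut records) = sink.lock() {
            records.push(format!("ERROR panic: {info}"));
        }
        if crash_on_panic {
            // Arbitrary non-zero exit code chosen for this sketch.
            std::process::exit(1);
        }
    }));
}

fn main() {
    let sink = Arc::new(Mutex::new(Vec::new()));
    // Assumption: the variable merely being set enables crash-on-panic.
    let crash_on_panic = std::env::var("CRASH_ON_PANIC").is_ok();
    install_sketch_panic_hook(sink.clone(), crash_on_panic);

    // Without CRASH_ON_PANIC the panic is recorded and then unwinds as usual.
    let result = panic::catch_unwind(|| {
        panic!("boom");
    });
    println!("panicked: {}", result.is_err());
    for line in sink.lock().unwrap().iter() {
        println!("{line}");
    }
}
```

Reading the environment variable once at install time, rather than inside the hook, keeps the hook itself free of extra work while the process is already failing.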