Description
opened on Jun 8, 2021
While this issue aims to address the main concern of #97934 (providing the ability to trace an ES query back to the source in Kibana code that initiated the request), we also want to lay the foundation for e2e tracing across the whole Stack. To make this happen, Kibana will rely on the built-in capabilities of the APM RUM and Node.js APM agents and their integration with the Elasticsearch service.
High-level picture
Kibana Frontend
Context should allow Kibana users to unambiguously identify the source of a query in the Kibana App in the browser, Kibana server, or the task manager.
interface KibanaExecutionContext {
  // kibana entity type
  type: 'visualization' | 'actions' | 'alert' | ..;
  // kibana entity id
  id: string;
  // human readable description, e.g. a vis title or an action name
  description: string;
  // in browser: url to navigate to the current page; on server: endpoint path; for a task: task SO url
  url?: string;
}
The APM RUM agent doesn't support async context propagation in the browser, so Kibana will have to implement manual context passing.
A plugin creates an execution context object with an API provided by Core. The returned value is opaque to the plugin.
const executionContext: KibanaExecutionContext = createExecutionContext({ .. })
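To make the "opaque value" idea concrete, here is a minimal sketch of what such a factory could look like. The class name `ExecutionContextContainer`, the percent-encoded JSON header format, and the sample field values are assumptions for illustration only; the actual Core API may differ.

```typescript
// Shape from the proposal; `type` is widened to string here for brevity.
interface KibanaExecutionContext {
  type: string;
  id: string;
  description: string;
  url?: string;
}

// Hypothetical opaque wrapper handed back to plugins (name is an assumption).
class ExecutionContextContainer {
  private readonly context: Readonly<KibanaExecutionContext>;

  constructor(context: KibanaExecutionContext) {
    // Copy and freeze so later mutations by the caller do not leak in.
    this.context = Object.freeze({ ...context });
  }

  // Header-safe form: JSON, percent-encoded so the value stays ASCII.
  // The encoding is an assumption, not something the proposal specifies.
  toString(): string {
    return encodeURIComponent(JSON.stringify(this.context));
  }

  // Plain-object form for embedding in a request body.
  toJSON(): KibanaExecutionContext {
    return { ...this.context };
  }
}

function createExecutionContext(context: KibanaExecutionContext): ExecutionContextContainer {
  return new ExecutionContextContainer(context);
}

// Sample values are made up for the example.
const executionContext = createExecutionContext({
  type: 'visualization',
  id: '1234-5678',
  description: 'CPU usage gauge',
  url: '/app/dashboards#/view/abcd',
});
```

Keeping the data behind `toString()` and `toJSON()` means plugins only pass the value along, which leaves room to change the wire format later without touching plugin code.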
The obtained execution context should be passed to the Kibana server manually through all the layers of abstraction in Kibana. Kibana sets it as a custom request header before issuing a request to the Kibana server:
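// context passed as a custom request header
await fetch('/api/something', {
  headers: {
    'kbn-context': executionContext.toString(),
  },
});
// context passed in the request body (fetch expects a serialized body)
await fetch('/api/something', {
  method: 'post',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({
    context: executionContext.toJSON(),
  }),
});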
For the first implementation, we start with a context capturing a single context level: visualizations.
In the next iteration, we can add support for nested execution contexts. It can be used to compose execution context relationships across different apps.
Application service context --> Dashboard context --> Visualization context
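One possible way to model nested contexts is a linked chain in which each context points to its enclosing one. The sketch below is only an illustration: the `parent` field and the `nest` and `describeChain` helpers are hypothetical and not part of the proposed interface.

```typescript
interface NestedExecutionContext {
  type: string;
  id: string;
  description: string;
  url?: string;
  // Hypothetical link to the enclosing context; not in the proposal's interface.
  parent?: NestedExecutionContext;
}

// Return a copy of `child` whose outermost ancestor is linked to `parent`.
function nest(parent: NestedExecutionContext, child: NestedExecutionContext): NestedExecutionContext {
  return { ...child, parent: child.parent ? nest(parent, child.parent) : parent };
}

// Render the chain from the outermost context to the innermost one.
function describeChain(context: NestedExecutionContext): string {
  const parts: string[] = [];
  for (let current: NestedExecutionContext | undefined = context; current; current = current.parent) {
    parts.unshift(`${current.type}:${current.id}`);
  }
  return parts.join(' --> ');
}

// Sample values are made up for the example.
const app = { type: 'application', id: 'dashboards', description: 'Dashboards app' };
const dashboard = nest(app, { type: 'dashboard', id: 'd-1', description: 'Web traffic' });
const vis = nest(dashboard, { type: 'visualization', id: '1234-5678', description: 'CPU gauge' });

console.log(describeChain(vis));
// application:dashboards --> dashboard:d-1 --> visualization:1234-5678
```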
Server-side
Depends on: elastic/apm-agent-nodejs#2101 (APM agents can be used without APM server)
- The APM Node.js agent intercepts all the incoming requests and creates an APM transaction.
- The APM Node.js agent instruments all the requests to the Elasticsearch server to pass the current transaction id via the traceparent header.
- The Elasticsearch team is working on adding support for tracing headers: Adds minimal traceparent header support to Elasticsearch elasticsearch#74210. We need to get their commitment to shipping it in v7.15.
- This traceparent header will be used for log correlation across Kibana and the Elasticsearch server. To make it possible, Kibana should add trace.id to the log records.
  TODO: discuss with the Elasticsearch team in what form they are going to include it in the Elasticsearch logs. It will likely be present in ECS-JSON logs by default. Its presence in the text logs is open to discussion.
- Kibana intercepts all the incoming requests and retrieves the execution context from the 'kbn-context' header. The context + trace.id are emitted to Kibana logs. The minimal subset of the execution context data, in the form kibana:type:name:id (kibana:visualization:gauge:1234-5678, for example), is attached to the current APM transaction as the kibanaContext label.
- Kibana server plugins may create an execution context on the server-side as well. The context passing works in the same way as for the client-side counterpart.
- Whenever Kibana sends a request to the Elasticsearch server, Kibana adds the kibanaContext label to the x-opaque-id header. It allows Stack users to identify the source of a query in slowlogs without having to inspect Kibana logs.
  TODO: discuss with the Elasticsearch team whether trace.id is included in the slowlogs as well.
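The header-to-label flow described in the list above could be sketched roughly as follows. Everything here is an assumption made for illustration: the header is taken to carry percent-encoded JSON, the `name` field is hypothetical (the proposed interface does not list it, but the kibana:type:name:id form needs it), and joining an existing request id with `;` is just one possible convention.

```typescript
// Fields needed for the label. `name` (e.g. the visualization kind) is an assumption.
interface LabelledContext {
  type: string;
  name: string;
  id: string;
}

// Decode the 'kbn-context' header, assuming percent-encoded JSON.
// Returns undefined for a missing or malformed value so a bad header never fails the request.
function parseContextHeader(value: string | undefined): LabelledContext | undefined {
  if (!value) return undefined;
  try {
    const parsed = JSON.parse(decodeURIComponent(value));
    const { type, name, id } = parsed ?? {};
    if (typeof type === 'string' && typeof name === 'string' && typeof id === 'string') {
      return { type, name, id };
    }
  } catch {
    // Malformed encoding or JSON: treat as "no context".
  }
  return undefined;
}

// Build the minimal label in the kibana:type:name:id form.
function toKibanaContextLabel(context: LabelledContext): string {
  return `kibana:${context.type}:${context.name}:${context.id}`;
}

// Combine the label with an optional request id for the x-opaque-id header.
// The ';' separator is an assumption, not an agreed format.
function buildOpaqueId(label: string, requestId?: string): string {
  return requestId ? `${requestId};${label}` : label;
}

const header = encodeURIComponent(
  JSON.stringify({ type: 'visualization', name: 'gauge', id: '1234-5678' })
);
const context = parseContextHeader(header);
if (context) {
  const label = toKibanaContextLabel(context);
  console.log(label); // kibana:visualization:gauge:1234-5678
  console.log(buildOpaqueId(label, 'req-42')); // req-42;kibana:visualization:gauge:1234-5678
}
```

Parsing defensively matters here because the header comes from the browser: a missing or corrupted context should degrade tracing, not break the request.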
Instrumentation
The list of instrumentation points should be discussed with every team separately. We are primarily interested in instrumenting plugins that may cause performance problems in Elasticsearch:
- Visualizations
  - vis_type_metric
  - vis_type_table
  - vis_type_tagcloud
  - vis_type_timelion
  - vis_type_timeseries
  - vis_type_vega
  - vis_type_vislib
  - vis_type_xy
  - vis_type_pie
  - input_control_vis
- Lens
- Discover
- Kibana server request handlers
- Tasks
- Actions
- Alerts
- Reporting
- Canvas
- Maps
- Observability
- APM
- Security solutions
- ML
- Logs
- Metrics
- Console
During the initial implementation, the Core team will instrument several plugins and implement integration tests as an example. Later, we will create separate issues for code owners to help us with this work.
List of sub-tasks
Context propagation
- Implement context management service on the client-side (Implement execution context management service #102626)
- Implement manual context propagation for Kibana Entities: [Meta] Implement context propagation for Kibana entities #102629
- Provide recommendations on debugging Kibana with data sent to Elasticsearch slowlogs
Log correlation
- Update the APM Node.js agent (update APM nodejs agent to a version usable without APM-server #102624)
- Refactor logging system to include trace.id in the logs for log correlation purposes (Include tracing information in the log records #102699)
  - align with the Elasticsearch team on the logging format
- Provide settings to run Kibana with APM agent enabled, APM agent disabled, APM agent working in the tracing mode (without sending data to APM server) (Support configuring APM modes #102704)
- Measure the solution overhead and its influence on the Kibana performance (Measure performance overhead of tracing solution for Kibana server #102706)
- Update the APM RUM agent (updated APM RUM agent to version instrumenting tracing headers for custom transactions #102625)
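For the log correlation sub-tasks, the sketch below shows one way trace.id could be derived from a traceparent header value and attached to a log record. The traceparent layout (version, 32-hex trace id, 16-hex parent id, flags) follows the W3C Trace Context format; the `LogRecord` shape and the helper names are hypothetical, and the final logging format is still to be aligned with the Elasticsearch team.

```typescript
// Extract the trace id from a version-00 traceparent value:
//   00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>
// Returns undefined when the value is missing or invalid.
function extractTraceId(traceparent: string | undefined): string | undefined {
  if (!traceparent) return undefined;
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$/.exec(traceparent.trim());
  if (!match) return undefined;
  const [, traceId, parentId] = match;
  // All-zero ids are invalid per the W3C Trace Context format.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return undefined;
  return traceId;
}

// Hypothetical minimal log record; only the trace.id placement matters here.
interface LogRecord {
  '@timestamp': string;
  message: string;
  trace?: { id: string };
}

// Attach trace.id only when a valid trace id is available.
function createLogRecord(message: string, traceparent?: string): LogRecord {
  const traceId = extractTraceId(traceparent);
  return {
    '@timestamp': new Date().toISOString(),
    message,
    ...(traceId ? { trace: { id: traceId } } : {}),
  };
}

const record = createLogRecord(
  'search request completed',
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
);
console.log(record.trace?.id); // 4bf92f3577b34da6a3ce929d0e0e4736
```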