Status | |
---|---|
Stability | alpha: metrics |
development: logs | |
Distributions | contrib, observiq, splunk, sumo |
Issues | |
Code Owners | @dmitryax, @pmcollins |
The SQL Query Receiver uses custom SQL queries to generate metrics from a database connection.
🚧 This receiver is in ALPHA. Behavior, configuration fields, and metric data model are subject to change.
The configuration supports the following top-level fields:
driver
(required): The name of the database driver: one of postgres, mysql, snowflake, sqlserver, hdb (SAP HANA), or oracle (Oracle DB).datasource
(required): The datasource value passed to sql.Open. This is a driver-specific string usually consisting of at least a database name and connection information. This is sometimes referred to as the "connection string" in driver documentation. e.g. host=localhost port=5432 user=me password=s3cr3t sslmode=disablequeries
(required): A list of queries, where a query is a sql statement and one or morelogs
and/ormetrics
sections (details below).collection_interval
(optional): The time interval between query executions. Defaults to 10s.storage
(optional, default""
): The ID of a storage extension to be used to track processed results.
A query consists of a sql statement and one or more logs
and/or metrics
section.
At least one logs
or one metrics
section is required.
Note that technically you can put both logs
and metrics
sections in a single query section,
but it's probably not a real world use case, as the requirements for logs and metrics queries
are quite different.
Additionally, each query
section supports the following properties:
tracking_column
(optional, default""
) Applies only to logs. In case of a parameterized query, defines the column to retrieve the value of the parameter on subsequent query runs. See the below section Tracking processed results.tracking_start_value
(optional, default""
) Applies only to logs. In case of a parameterized query, defines the initial value for the parameter. See the below section Tracking processed results.
Example:
receivers:
sqlquery:
driver: postgres
datasource: "host=localhost port=5432 user=postgres password=s3cr3t sslmode=disable"
queries:
- sql: "select * from my_logs where log_id > $$1"
tracking_start_value: "10000"
tracking_column: log_id
logs:
- body_column: log_body
- sql: "select count(*) as count, genre from movie group by genre"
metrics:
- metric_name: movie.genres
value_column: "count"
The logs
section is in development.
body_column
(required) defines the column to use as the log record's body.
With the default configuration and a non-parameterized logs query like select * from my_logs
,
the receiver will run the same query every collection interval, which can cause reading the same rows
over and over again, unless there's an external actor removing the old rows from the my_logs
table.
To prevent reading the same rows on every collection interval, use a parameterized query like select * from my_logs where id_column > ?
,
together with the tracking_start_value
and tracking_column
configuration properties.
The receiver will use the configured tracking_start_value
as the value for the query parameter when running the query for the first time.
After each query run, the receiver will store the value of the tracking_column
from the last row of the result set and use it as the value for the query parameter on next collection interval. To prevent duplicate log downloads, make sure to sort the query results in ascending order by the tracking_column value.
Note that the notation for the parameter depends on the database backend. For example in MySQL this is ?
, in PostgreSQL this is $1
, in Oracle this is any string identifier starting with a colon :
, for example :my_parameter
.
Use the storage
configuration property of the receiver to persist the tracking value across collector restarts.
Each metrics
section consists of a
metric_name
, a value_column
, and additional optional fields.
Each metric in the configuration will produce one OTel metric per row returned from its sql query.
metric_name
(required): the name assigned to the OTel metric.value_column
(required): the column name in the returned dataset used to set the value of the metric's datapoint. This may be case-sensitive, depending on the driver (e.g. Oracle DB).attribute_columns
(optional): a list of column names in the returned dataset used to set attibutes on the datapoint. These attributes may be case-sensitive, depending on the driver (e.g. Oracle DB).data_type
(optional): can begauge
orsum
; defaults togauge
.value_type
(optional): can beint
ordouble
; defaults toint
.monotonic
(optional): boolean; whether a cumulative sum's value is monotonically increasing (i.e. never rolls over or resets); defaults to false.aggregation
(optional): only applicable fordata_type=sum
; can becumulative
ordelta
; defaults tocumulative
.description
(optional): the description applied to the metric.unit
(optional): the units applied to the metric.static_attributes
(optional): static attributes applied to the metrics.start_ts_column
(optional): the name of the column containing the start timestamp, the value of which is applied to the metric's start timestamp (otherwise the current time is used). Only applies if the metric is of type cumulative sum.ts_column
(optional): the name of the column containing the timestamp, the value of which is applied to the metric's timestamp. This can be current timestamp depending upon the time of last recorded metric's datapoint.
receivers:
sqlquery:
driver: postgres
datasource: "host=localhost port=5432 user=postgres password=s3cr3t sslmode=disable"
storage: file_storage
queries:
- sql: "select * from my_logs where log_id > $$1"
tracking_start_value: "10000"
tracking_column: log_id
logs:
- body_column: log_body
- sql: "select count(*) as count, genre from movie group by genre"
metrics:
- metric_name: movie.genres
value_column: "count"
attribute_columns: ["genre"]
static_attributes:
dbinstance: mydbinstance
Given a movie
table with three rows:
name | genre |
---|---|
E.T. | sci-fi |
Star Wars | sci-fi |
Die Hard | action |
If there are two rows returned from the query select count(*) as count, genre from movie group by genre
:
count | genre |
---|---|
2 | sci-fi |
1 | action |
then the above config will produce two metrics at each collection interval:
Metric #0
Descriptor:
-> Name: movie.genres
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> genre: STRING(sci-fi)
-> dbinstance: STRING(mydbinstance)
Value: 2
Metric #1
Descriptor:
-> Name: movie.genres
-> DataType: Gauge
NumberDataPoints #0
Data point attributes:
-> genre: STRING(action)
-> dbinstance: STRING(mydbinstance)
Value: 1
Avoid queries that produce any NULL values. If a query produces a NULL value, a warning will be logged. Furthermore, if a configuration references the column that produces a NULL value, an additional error will be logged. However, in either case, the receiver will continue to operate.
Refer to the config file provided for an example of using the
Oracle DB driver to connect and query the same table schema and contents as the example above.
The Oracle DB driver documentation can be found here.
Another usage example is the go_ora
example here.
The datasource
format for MySQL works as follows:
user:password@tcp(host:port)/databasename