
[receiver/sqlquery] Log signals reprocessed when tracking_column type is timestamp, and value has milliseconds precision #35194

Closed · Grandys opened this issue on Sep 15, 2024 · Fixed by #35195
Labels: bug, internal/sqlquery, receiver/sqlquery

Comments

Grandys (Contributor) commented on Sep 15, 2024

Component(s)

internal/sqlquery, receiver/sqlquery

What happened?

Description

Milliseconds are trimmed from the timestamp column during processing when it is used as a tracking column. As a result, when the tracking_column is of timestamp type and its value has millisecond precision, rows are reprocessed.
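
As an illustration (a minimal Go sketch, not the receiver's actual code), formatting a time value with a seconds-only layout is enough to reproduce the trimming:

package main

import (
	"fmt"
	"time"
)

func main() {
	// A created_at value with microsecond precision, as stored in the database.
	ts, _ := time.Parse("2006-01-02 15:04:05.999999", "2024-09-15 16:45:16.222464")

	// Formatting with a seconds-only layout drops the fractional part, which is
	// what the tracking value ends up looking like according to this report.
	fmt.Println(ts.Format("2006-01-02 15:04:05")) // 2024-09-15 16:45:16
}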

Steps to Reproduce

Here is the setup:
https://github.com/Grandys/otel-collector-sqlqueryreceiver-setup, branch reporocess-ts-column

git clone git@github.com:Grandys/otel-collector-sqlqueryreceiver-setup.git
cd otel-collector-sqlqueryreceiver-setup/
git checkout reporocess-ts-column
# Build custom otel collector with OCB
make run_rebuild

Verify collector output or check local LGTM (http://localhost:3000 admin/admin).

For example, when the tracking value is 2024-09-15 16:45:16.222464 and the executed SQL is:

select 'Datapoint value for ' || log_type || ' is ' || datapoint as datapoint, log_type, created_at
from logs_data
where created_at > $$1 order by created_at

2024-09-15 16:45:16 is passed as the parameter (without the .222464 fraction). As a result, the same rows are reprocessed.
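
To show why that re-matches rows, here is a minimal Go sketch (an illustration of the comparison, not the receiver's code): the stored value is strictly newer than the truncated tracking value, so the > predicate keeps selecting it.

package main

import (
	"fmt"
	"time"
)

func main() {
	// Stored created_at value and the truncated tracking value from the example above.
	stored, _ := time.Parse("2006-01-02 15:04:05.999999", "2024-09-15 16:45:16.222464")
	tracking, _ := time.Parse("2006-01-02 15:04:05", "2024-09-15 16:45:16")

	// "created_at > $1" therefore returns this row again on every collection interval.
	fmt.Println(stored.After(tracking)) // true
}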

Expected Result

For my setup, I'd expect 4 log entries to be generated in total.

Actual Result

Log signals are generated every 10 seconds for the same records, e.g.:

Timestamp Log body
2024-09-15 19:05:13.244 Datapoint value for db is 1
2024-09-15 19:05:13.244 Datapoint value for app is 4
2024-09-15 19:05:13.244 Datapoint value for web is 2
2024-09-15 19:05:03.244 Datapoint value for web is 2
2024-09-15 19:05:03.244 Datapoint value for app is 4
2024-09-15 19:05:03.244 Datapoint value for db is 1
2024-09-15 19:04:53.244 Datapoint value for web is 2
2024-09-15 19:04:53.244 Datapoint value for db is 1
2024-09-15 19:04:53.244 Datapoint value for app is 4
2024-09-15 19:04:43.244 Datapoint value for web is 2
2024-09-15 19:04:43.244 Datapoint value for app is 4
2024-09-15 19:04:43.244 Datapoint value for db is 1
2024-09-15 19:04:33.245 Datapoint value for web is 2
2024-09-15 19:04:33.245 Datapoint value for db is 1
2024-09-15 19:04:33.245 Datapoint value for app is 4
2024-09-15 19:04:23.245 Datapoint value for web is 2
2024-09-15 19:04:23.245 Datapoint value for app is 4
2024-09-15 19:04:23.245 Datapoint value for db is 1
2024-09-15 19:04:13.245 Datapoint value for db is 1
2024-09-15 19:04:13.245 Datapoint value for app is 4
2024-09-15 19:04:13.245 Datapoint value for web is 2
2024-09-15 19:04:03.246 Datapoint value for web is 2
2024-09-15 19:04:03.246 Datapoint value for app is 4
2024-09-15 19:04:03.246 Datapoint value for db is 1

Collector version

v0.109.0

Environment information

Environment

OS: macOS Monterey
Compiler: go1.22.6 darwin/amd64

OpenTelemetry Collector configuration

receivers:
  sqlquery:
    driver: postgres
    datasource: "host=localhost port=5432 user=otel password=otel database=otel sslmode=disable"
    queries:
      - sql: "select 'Datapoint value for ' || log_type || ' is ' || datapoint as datapoint, log_type, created_at from logs_data where created_at > $$1 order by created_at"
        tracking_column: created_at
        tracking_start_value: '2024-09-15 00:00:00'
        logs:
          - body_column: datapoint
            attribute_columns: [ "log_type" ]

exporters:
  debug:
    verbosity: detailed
  otlp/lgtm:
    endpoint: localhost:4317
    tls:
      insecure: true

processors:
  batch:
  resource:
    attributes:
      - key: service.name
        value: "sql-query-reader"
        action: insert

service:
  pipelines:
    metrics/sqlquery:
      receivers: [ sqlquery ]
      processors: [ resource, batch ]
      exporters: [ otlp/lgtm ]
    logs/sqlquery:
      receivers: [ sqlquery ]
      processors: [ resource, batch ]
      exporters: [ debug, otlp/lgtm ]

Log output

No response

Additional context

Table definition:

create table logs_data
(
    id         serial
        primary key,
    created_at timestamp,
    datapoint  integer,
    log_type   text
);
Grandys added the bug and needs triage labels on Sep 15, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

crobert-1 (Member) commented:

This change makes sense to me, removing needs triage.

crobert-1 removed the needs triage label on Sep 25, 2024
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this issue on Oct 4, 2024:
…pen-telemetry#35195)

**Description:**

Formats retrieved time columns with milliseconds precision, so they are
not reprocessed when used as a tracking_column

**Link to tracking Issue:** open-telemetry#35194

**Testing:** Added integration test, updated test data

**Documentation:** n/a

Closes open-telemetry#35194
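
A minimal Go sketch of the general approach (the exact layout used by the fix may differ from what is shown here): keep the fractional seconds when formatting the retrieved time column, so the next query's lower bound equals the stored value and "created_at > $1" no longer re-matches it.

package main

import (
	"fmt"
	"time"
)

func main() {
	stored, _ := time.Parse("2006-01-02 15:04:05.999999", "2024-09-15 16:45:16.222464")

	// RFC3339Nano keeps sub-second precision (with trailing zeros removed).
	tracking := stored.Format(time.RFC3339Nano) // 2024-09-15T16:45:16.222464Z

	reparsed, _ := time.Parse(time.RFC3339Nano, tracking)

	// With the fractional part retained, the row that produced the tracking
	// value is no longer strictly newer than it, so it is not reprocessed.
	fmt.Println(stored.After(reparsed)) // false
}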