Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is db.connection_string necessary? #724

Closed
lmolkova opened this issue Feb 9, 2024 · 13 comments · Fixed by #769
Closed

Is db.connection_string necessary? #724

lmolkova opened this issue Feb 9, 2024 · 13 comments · Fixed by #769
Assignees

Comments

@lmolkova
Copy link
Contributor

lmolkova commented Feb 9, 2024

Connection strings can have secrets and have very specific format which makes them not that useful for visualization tooling (or user queries).
They can also be long and redundant on every span.

They can (secrets redacted) be useful to log to help investigate configuration issues, but probably should not appear on spans/metrics.

Proposal is to remove the attribute or make it opt-in.

@trask
Copy link
Member

trask commented Feb 9, 2024

@open-telemetry/java-instrumentation-approvers do you know of anything important being done with db.connection_string that we are capturing in Java instrumentation?

@lmolkova
Copy link
Contributor Author

lmolkova commented Feb 13, 2024

Looking into java instrumentations, it seems database instrumentation either don't set connection string or set driver:scheme://host:port.

host and port are covered with server.address|port and we can record scheme with url.scheme (maybe for specific systems?)
We can also introduce attribute for driver (when applicable)

No usages in python-contrib

JS-contrib uses scheme://host:port/database convention

So it seems individual components of connection string are in most cases reported through other attributes and the only part of connection string that is not represented by any existing ones is the driver name.

@joaopgrassi
Copy link
Member

Maybe we should ping other maintainers here to get a final opinion?

@trask
Copy link
Member

trask commented Feb 28, 2024

cc @open-telemetry/go-approvers @open-telemetry/javascript-approvers @open-telemetry/python-approvers @open-telemetry/opentelemetry-python-contrib-approvers @open-telemetry/dotnet-approvers @open-telemetry/dotnet-contrib-approvers

@pellared
Copy link
Member

pellared commented Feb 28, 2024

Connection strings can have passwords.

Example from Go: https://github.com/go-sql-driver/mysql/?tab=readme-ov-file#dsn-data-source-name

Example from .NET: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/connection-string-syntax

Sometimes the libraries have connection string parsing function that could allow redacting. I think it should be at least opt-in. However, I rather support removal and welcome other attributes like server.address|port.

help investigate configuration issues

People should use e.g. logs for that. These attributes should help in relating spans/metrics to concrete databases each to find e.g. slow databases.

@breedx-splk
Copy link
Contributor

We think that changing the default behavior to omit/exclude the db.connection_string attribute will cause immediate degradation of some of our product features. My preference would be to have the default include the attribute but somehow scrub passwords.

@trask
Copy link
Member

trask commented Feb 29, 2024

@breedx-splk can you provide a bit more details about how you are using db.connection_string? thanks

@trask
Copy link
Member

trask commented Feb 29, 2024

also, is this only for Java customers, or includes other languages as well?

@pichlermarc
Copy link
Member

I'd also prefer to drop db.connection_string or make it opt-in.

As mentioned above, in js-contrib we re-construct connection strings from other data that's added to the span already (host, port, db name).

I just reviewed all of our usages, and I think 100% of the data we add via the db.connection_string attribute is redundant.

@lmolkova
Copy link
Contributor Author

lmolkova commented Mar 1, 2024

@breedx-splk is there some specific information that you get from the connection string that's not available via other attributes?
If so, we can try to capture that information directly and/or document event/log that captures (scrubbed) connection string.

@breedx-splk
Copy link
Contributor

@breedx-splk can you provide a bit more details about how you are using db.connection_string? thanks

Sure, yeah. From what I understand, our APM UI leverages the db.connection_string as part of its fallback logic when naming inferred databases (eg. where a client span is talking to a non-instrumented database system). As @lmolkova suggested, we should be able to use other fields (like db.name and db.system) to infer the name/type, but we do use it as a fallback when the others aren't present.

@trask
Copy link
Member

trask commented Mar 6, 2024

@breedx-splk we can add a flag in Java instrumentation to keep around db.connection_string for a while if you need a transition period

@breedx-splk
Copy link
Contributor

@breedx-splk we can add a flag in Java instrumentation to keep around db.connection_string for a while if you need a transition period

Thanks, I appreciate that offer @trask. I'm checking with our internal folks to see what they think.

Another option too is to make it configurable, default it to false (excluded), and allow a distro to override the default (that way we can manage the time we need to keep it around while vanilla can remain spec compliant).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging a pull request may close this issue.

7 participants