Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/datasetexporter] Improved handling of "observed timestamp" and body / message fields, add config option for exporting scope attributes #23826

Conversation

tomaz-s1
Copy link
Contributor

NOTE: This PR diff is slightly larger since it also includes / it's based on top of changes from #23672.

I assume / hope that PR will be merged before this one since it would make it a bit easier for me. Technically, I could also create a new branch based on top of upstream main branch, but this would make it harder for me since some of the changes in this branch depend on changes in # 23672.

Once that PR will be merged, I will sync up this branch with latest main and then the diff should get smaller.


This pull request includes 3 small changes related to the consistency on how we handle various field types.

  1. Use consistent handling for the "observed timestamp" field

Field has been renamed to sca:observedTime.

This way we follow sca: prefix convention for special / internal fields (e.g. we already have sca:ingestTime field added on the server-side in some cases). The field value has also been changed from ISO date time string to nano seconds since epoch. This way it's consistent with other internal timestamp fields on the DataSet side.

  1. Use consistent handling for message / body field

I have simplified and removed body.type and body.value approach. This approach is not used by any other existing DataSet integrations so it could confuse the end users.

For simple value types (string, bool, int, double) we simply populate the message field and don't also store the value redundantly in additional body.{int,bool,double} field.

Only exception right now is a complex map type - we decompose this one into multiple fields in manner of body.map fields.

Since this is also something which doesn't happen in other DataSet integrations I've put it behind config option / feature flag.

For now, it defaults to true, but in the future (based on user feedback, etc.) we may want to switch it to false, aka not perform this decomposition by default.

In other integrations, this decomposition is usually handles elsewhere - e.g. either on the client (part of the processor or similar) or on the DataSet server side using a (JSON) parser.

  1. Allow users to exclude scope information from the DataSet events.

This is similar to the export_resource_info_on_event change we made recently.

I introduced a new config option / feature flag with which user can disable inclusion of scope information with each event.

Scope information can also be quite noisy (depending on how instrumentation is configured) and users may want to exclude it in case they are not utilizing it on the server side - in this case it would just result in increased (billable) DataSet log volume without any additional value for the end user.

This data is less noisy and verbose than resource one so I decided to leave the default value to true, but in case it turns out that most people don't use it and false makes more sense for the default value, we can change it later on.

zdaratom-s1 and others added 30 commits June 7, 2023 14:49
observed timetamp in case LogRecord doesn't contain timestamp.

Even though ObservedTimestamp should always be present, we fall back to
current time in case it's not.

In addition to that, remove duplicate and redundant "timestamp"
attribute from the AddEvents event.
DSET-3998 feat: export Logs resource info based on export_resource_info_on_event configuration
…try-collector-contrib into event_no_timestamp_fix_use_occurence_time
…e_time

[DSET-4071] [DSET-4007] Fix timestamp handling for LogRecord objects without a timestamp
severity to DataSet severity.

Also remove "severity.text" and "severity.number" field from the event
since it's redundant - we already have event severity (sev) field value.
information which is not already available (events which are sampled
will already be ingested and visible in the ui so flags.is_sampled is
redundant).
functions. Also move functionality for making a test record into a
utility function.
[DSET-4037] [DSET-4014] [DSET-4034] [DSET-4034] [DSET-4098] Dataset log exporter improvements
Also add missing "dataset-exporter" suffix to one of the files to
prevent possible conflicts in the future.
note: "Allow include Logs resource info export to DataSet based on new export_resource_info_on_event configuration. Fix timestamp handling."

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [20660, 23250]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that previous changelog entries weren't fully up to date and didn't actually reference related PR number so I went ahead and fixed / updated that as well.

I also added missing dataset-exporter- prefix to one of the changelog files which was missing it (to avoid possible conflicts and similar in the future).

@tomaz-s1
Copy link
Contributor Author

@atoulme As a code owner, could you please review when you get a chance? Thanks.

@tomaz-s1
Copy link
Contributor Author

tomaz-s1 commented Jul 7, 2023

Now that #23672 has been merged, I went ahead and updated / synced up this branch with latest main branch so the diff should be much smaller now (since the PR was based on / relied on changes from #23672).

Copy link
Contributor

@martin-majlis-s1 martin-majlis-s1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. :)

Could you, please, also adjust the README.md to contain the information about these configuration options?

@tomaz-s1
Copy link
Contributor Author

tomaz-s1 commented Jul 7, 2023

@martin-majlis-s1 Thanks. I updated readme in c7e2038.

@tomaz-s1
Copy link
Contributor Author

tomaz-s1 commented Jul 7, 2023

It looks like some integration tests (sqlqueryreceiver) are failing, but those failures don't seem to be related to my changes.

I checked and verified that my branch is already synced up and contains all the latest changes from the main branch. Is this a known issue (race / flaky test / similar)?

Thanks.


EDIT: It looks like another sync up with latest main branch did the trick - not sure though if the issue was fixed in the main branch (diff doesn't seem to indicate so) or simply due to the re-run and flakey nature of the tests.

@tomaz-s1
Copy link
Contributor Author

@atoulme Would appreciate it if you could have a look when you get a chance. Thanks!

@tomaz-s1
Copy link
Contributor Author

@mx-psi It looks like the PR has been approved and I just synced it up with the latest main branch and resolved the conflicts. Can you please have a look when you get a chance and merge it? Thanks!

exporter/datasetexporter/README.md Outdated Show resolved Hide resolved
exporter/datasetexporter/config.go Outdated Show resolved Hide resolved
tomaz-s1 and others added 4 commits July 19, 2023 18:15
@tomaz-s1
Copy link
Contributor Author

@mx-psi Thanks, I committed your typo fixes and synced it up with the latest main branch.

@mx-psi mx-psi merged commit 247e2bb into open-telemetry:main Jul 20, 2023
@github-actions github-actions bot added this to the next release milestone Jul 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants