-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[exporter/datasetexporter] Improved handling of "observed timestamp" and body / message fields, add config option for exporting scope attributes #23826
Conversation
…n_event configuration
observed timetamp in case LogRecord doesn't contain timestamp. Even though ObservedTimestamp should always be present, we fall back to current time in case it's not. In addition to that, remove duplicate and redundant "timestamp" attribute from the AddEvents event.
DSET-3998 feat: export Logs resource info based on export_resource_info_on_event configuration
…try-collector-contrib into event_no_timestamp_fix_use_occurence_time
…e_time [DSET-4071] [DSET-4007] Fix timestamp handling for LogRecord objects without a timestamp
This prefix is nor desired nor wanted.
severity to DataSet severity. Also remove "severity.text" and "severity.number" field from the event since it's redundant - we already have event severity (sev) field value.
information which is not already available (events which are sampled will already be ingested and visible in the ui so flags.is_sampled is redundant).
and update affected code and tests.
functions. Also move functionality for making a test record into a utility function.
[DSET-4037] [DSET-4014] [DSET-4034] [DSET-4034] [DSET-4098] Dataset log exporter improvements
…y-collector-contrib into dataset_log_exporter_improvements_fixes
Also add missing "dataset-exporter" suffix to one of the files to prevent possible conflicts in the future.
note: "Allow include Logs resource info export to DataSet based on new export_resource_info_on_event configuration. Fix timestamp handling." | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
issues: [20660, 23250] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that previous changelog entries weren't fully up to date and didn't actually reference related PR number so I went ahead and fixed / updated that as well.
I also added missing dataset-exporter-
prefix to one of the changelog files which was missing it (to avoid possible conflicts and similar in the future).
@atoulme As a code owner, could you please review when you get a chance? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work. :)
Could you, please, also adjust the README.md to contain the information about these configuration options?
@martin-majlis-s1 Thanks. I updated readme in c7e2038. |
It looks like some integration tests (sqlqueryreceiver) are failing, but those failures don't seem to be related to my changes. I checked and verified that my branch is already synced up and contains all the latest changes from the Thanks. EDIT: It looks like another sync up with latest main branch did the trick - not sure though if the issue was fixed in the main branch (diff doesn't seem to indicate so) or simply due to the re-run and flakey nature of the tests. |
@atoulme Would appreciate it if you could have a look when you get a chance. Thanks! |
@mx-psi It looks like the PR has been approved and I just synced it up with the latest main branch and resolved the conflicts. Can you please have a look when you get a chance and merge it? Thanks! |
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
@mx-psi Thanks, I committed your typo fixes and synced it up with the latest main branch. |
NOTE: This PR diff is slightly larger since it also includes / it's based on top of changes from #23672.
I assume / hope that PR will be merged before this one since it would make it a bit easier for me. Technically, I could also create a new branch based on top of upstream
main
branch, but this would make it harder for me since some of the changes in this branch depend on changes in # 23672.Once that PR will be merged, I will sync up this branch with latest
main
and then the diff should get smaller.This pull request includes 3 small changes related to the consistency on how we handle various field types.
Field has been renamed to
sca:observedTime
.This way we follow
sca:
prefix convention for special / internal fields (e.g. we already havesca:ingestTime
field added on the server-side in some cases). The field value has also been changed from ISO date time string to nano seconds since epoch. This way it's consistent with other internal timestamp fields on the DataSet side.message
/body
fieldI have simplified and removed
body.type
andbody.value
approach. This approach is not used by any other existing DataSet integrations so it could confuse the end users.For simple value types (string, bool, int, double) we simply populate the
message
field and don't also store the value redundantly in additionalbody.{int,bool,double}
field.Only exception right now is a complex map type - we decompose this one into multiple fields in manner of
body.map
fields.Since this is also something which doesn't happen in other DataSet integrations I've put it behind config option / feature flag.
For now, it defaults to true, but in the future (based on user feedback, etc.) we may want to switch it to
false
, aka not perform this decomposition by default.In other integrations, this decomposition is usually handles elsewhere - e.g. either on the client (part of the processor or similar) or on the DataSet server side using a (JSON) parser.
This is similar to the
export_resource_info_on_event
change we made recently.I introduced a new config option / feature flag with which user can disable inclusion of scope information with each event.
Scope information can also be quite noisy (depending on how instrumentation is configured) and users may want to exclude it in case they are not utilizing it on the server side - in this case it would just result in increased (billable) DataSet log volume without any additional value for the end user.
This data is less noisy and verbose than resource one so I decided to leave the default value to
true
, but in case it turns out that most people don't use it andfalse
makes more sense for the default value, we can change it later on.