x-pack/metricbeat/module/gcp: CrossSeriesReducer Support #40614

Open: wants to merge 12 commits into main

Contributor

@Linu-Elias Linu-Elias commented Aug 26, 2024

Proposed commit message

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@botelastic botelastic bot added the needs_team label Aug 26, 2024
Contributor

mergify bot commented Aug 26, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @Linu-Elias? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@gpop63 gpop63 marked this pull request as ready for review August 27, 2024 09:55
@gpop63 gpop63 requested review from a team as code owners August 27, 2024 09:55
@gpop63 gpop63 added the Team:obs-ds-hosted-services label Aug 28, 2024
@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic botelastic bot removed the needs_team label Aug 28, 2024
@gpop63 gpop63 added the Team:Obs-InfraObs label Aug 28, 2024
@Linu-Elias Linu-Elias requested a review from a team as a code owner August 29, 2024 12:18
@gpop63
Contributor

gpop63 commented Aug 30, 2024

I noticed that labels are omitted when using reducers, because the data is aggregated across multiple time series. Using GroupBy with specific fields might fix this. Without labels, I'm not sure how useful a reducer would be.

Contributor

mergify bot commented Sep 30, 2024

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b gcp_reducer_support upstream/gcp_reducer_support
git merge upstream/main
git push upstream gcp_reducer_support

Contributor

mergify bot commented Sep 30, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x label Sep 30, 2024
@gpop63
Contributor

gpop63 commented Oct 2, 2024

I did a bit of testing on this and will leave some notes here.

When we're fetching time series data and we apply a CrossSeriesReducer like REDUCE_SUM, REDUCE_MEAN, etc., it combines multiple time series into fewer ones or even a single one. By default, if we don't specify GroupByFields, the reducer will aggregate all the time series together, and we lose all the labels like zone, instance_id, project_id.

Example without GroupByFields:

We have CPU usage metrics from multiple instances across different zones. We apply a REDUCE_SUM without any GroupByFields. All CPU usage data is summed up into a single time series. We lose visibility into which data came from which zone or instance because the labels are not preserved.

Example with GroupByFields:

Same as above. We apply a REDUCE_SUM and set GroupByFields to ["resource.labels.zone", "resource.labels.instance_id"]. The reducer sums up CPU usage separately for each zone and instance. We retain the zone and instance_id labels in the results, so we can see the aggregated data per zone and per instance.

So I believe we have to expose GroupByFields in the config and allow users to specify by which labels they want to group the data.
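
The two examples above can be sketched with a small, self-contained Go simulation. This is not the Cloud Monitoring client or the Metricbeat code: the Series type and the reduceSum function are hypothetical stand-ins, each series is reduced to a single value instead of a list of timestamped points, and the label values are made up. It only illustrates how a sum reducer keeps the labels named in the group-by fields and drops everything else.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Series is a simplified, hypothetical stand-in for one time series:
// a set of labels plus a single data point.
type Series struct {
	Labels map[string]string
	Value  float64
}

// reduceSum mimics a sum reducer. Series that share the same values for
// groupByFields are summed into one output series, and only those fields
// survive as labels. With no groupByFields, every input collapses into a
// single series without labels.
func reduceSum(in []Series, groupByFields []string) []Series {
	sums := map[string]*Series{}
	var keys []string
	for _, s := range in {
		labels := map[string]string{}
		parts := make([]string, 0, len(groupByFields))
		for _, f := range groupByFields {
			labels[f] = s.Labels[f]
			parts = append(parts, fmt.Sprintf("%s=%s", f, s.Labels[f]))
		}
		key := strings.Join(parts, ",")
		if _, ok := sums[key]; !ok {
			sums[key] = &Series{Labels: labels}
			keys = append(keys, key)
		}
		sums[key].Value += s.Value
	}
	// Sort the group keys so the output order is deterministic.
	sort.Strings(keys)
	out := make([]Series, 0, len(keys))
	for _, k := range keys {
		out = append(out, *sums[k])
	}
	return out
}

func main() {
	// Made-up CPU usage values for three instances in two zones.
	in := []Series{
		{Labels: map[string]string{"resource.labels.zone": "us-east1-b", "resource.labels.instance_id": "1"}, Value: 0.25},
		{Labels: map[string]string{"resource.labels.zone": "us-east1-b", "resource.labels.instance_id": "2"}, Value: 0.5},
		{Labels: map[string]string{"resource.labels.zone": "us-west1-a", "resource.labels.instance_id": "3"}, Value: 0.25},
	}
	fmt.Println("no group-by fields:")
	for _, s := range reduceSum(in, nil) {
		fmt.Println(" ", s.Labels, s.Value)
	}
	fmt.Println("grouped by zone:")
	for _, s := range reduceSum(in, []string{"resource.labels.zone"}) {
		fmt.Println(" ", s.Labels, s.Value)
	}
}
```

With no group-by fields the three inputs collapse into one series with an empty label set. Grouping by resource.labels.zone yields one series per zone, each carrying only the zone label, so instance_id is no longer visible.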

@Linu-Elias
Contributor Author

I agree with @gpop63. Additionally, I have observed that we only receive the labels explicitly specified in GroupByFields, along with project_id.

For example, if GroupByFields contains only a resource label such as resource.labels.instance_id, ListTimeSeries() returns the time series data without other important labels, such as the resource label resource.labels.zone and the metric label metric.labels.instance_name.

Therefore, it's crucial to include all the important labels in GroupByFields so that the result is grouped by them and they are retained. So if we plan to expose GroupByFields in the config and allow users to specify which labels to group the data by, we also have to mention that the labels they specify will be the only ones included in the instance ECS cloud fields.
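
One way to surface this to users would be a check that compares the configured group-by fields against the labels the module wants to keep. The sketch below is only an illustration under assumptions: missingLabelWarnings is a hypothetical helper, not existing Metricbeat code, and the list of "required" labels is simply the set of labels named in this comment.

```go
package main

import "fmt"

// missingLabelWarnings returns one warning for each label in required that
// does not appear in groupByFields, in the order of required. It is a
// hypothetical helper for illustration only.
func missingLabelWarnings(groupByFields, required []string) []string {
	have := make(map[string]bool, len(groupByFields))
	for _, f := range groupByFields {
		have[f] = true
	}
	var warnings []string
	for _, r := range required {
		if !have[r] {
			warnings = append(warnings, fmt.Sprintf("%s is not in the group-by fields and would be missing from the result", r))
		}
	}
	return warnings
}

func main() {
	// Labels mentioned in the discussion; treating them as required is an assumption.
	required := []string{
		"resource.labels.instance_id",
		"resource.labels.zone",
		"metric.labels.instance_name",
	}
	groupBy := []string{"resource.labels.instance_id"}
	for _, w := range missingLabelWarnings(groupBy, required) {
		fmt.Println("warning:", w)
	}
}
```

With only resource.labels.instance_id configured, the helper reports the zone and instance_name labels as missing, which matches the behaviour observed above.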

Labels: backport-8.x, Team:obs-ds-hosted-services, Team:Obs-InfraObs
Development

Successfully merging this pull request may close these issues.

[GCP] Add CrossSeriesReducer support in ListTimeSeries request
4 participants