@azsefi azsefi commented Oct 17, 2025

What changes were proposed in this pull request?

Stage-level metrics, such as shuffle record counts and memory/disk spill bytes, are now exposed by the Prometheus servlet.

Why are the changes needed?

These metrics are listed in the documentation but are not exposed in the Prometheus API. They are critical for building monitoring dashboards.

Does this PR introduce any user-facing change?

Yes. The following metrics are newly exposed by the Prometheus API:

metrics_executor_memoryBytesSpilled_bytes_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_diskBytesSpilled_bytes_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_shuffleWriteRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 28
metrics_executor_shuffleReadRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 28
metrics_executor_inputRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 100100
metrics_executor_outputBytes_bytes_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_outputRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 0

How was this patch tested?

Ran the change in local mode on a laptop and in cluster mode on Kubernetes, and checked the metrics API output against the numbers shown in the Spark UI.
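
The manual comparison described above could be scripted roughly as follows. This is only a sketch, not part of the patch: it assumes the Prometheus text has already been fetched from the endpoint, the helper names `parse_samples` and `find_mismatches` are hypothetical, and the `UI_VALUES` numbers stand in for values read off the Spark UI by hand. The label parsing is deliberately simple and does not handle escaped quotes inside label values.

```python
import re

# One sample line looks like: name{label="value", ...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return {(metric_name, executor_id): value} for each sample line in text."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip lines that do not carry a label set
        labels = dict(LABEL_RE.findall(match.group("labels")))
        key = (match.group("name"), labels.get("executor_id"))
        samples[key] = float(match.group("value"))
    return samples


def find_mismatches(samples, expected, executor_id="driver"):
    """Return {name: (expected, actual)} for every metric that differs or is missing."""
    mismatches = {}
    for name, want in expected.items():
        got = samples.get((name, executor_id))
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


# Two of the lines from the PR description, used as sample input.
SAMPLE_TEXT = '''
metrics_executor_shuffleWriteRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 28
metrics_executor_inputRecords_total{application_id="local-1760714421896", application_name="Spark shell", executor_id="driver"} 100100
'''

# Hypothetical values copied by hand from the Spark UI for the same application.
UI_VALUES = {
    "metrics_executor_shuffleWriteRecords_total": 28,
    "metrics_executor_inputRecords_total": 100100,
}

if __name__ == "__main__":
    # An empty dict means every checked metric matched the UI value.
    print(find_mismatches(parse_samples(SAMPLE_TEXT), UI_VALUES))
```

A missing metric shows up in the result with an actual value of `None`, which makes it easy to spot a metric that is documented but not yet exposed.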

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CORE label Oct 17, 2025
@HyukjinKwon HyukjinKwon changed the title [SPARK-38117][METRICS] Expose stage level metrics in prometheus endpoint [SPARK-38117][CORE] Expose stage level metrics in prometheus endpoint Oct 20, 2025