Skip to content

Conversation

@alexeykudinkin
Copy link
Contributor

Changes

  1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations
  2. Updated task_completion_time_excl_backpressure_s to track both UDF block generation time AND block serialization overhead

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Updated `task_completion_time_excl_backpressure_s` to include block serde time

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…des serde time;

Updated tests;

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin requested review from a team as code owners January 29, 2026 01:07
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces tracking for block serialization time, which is a valuable addition for performance monitoring in Ray Data. The core change involves modifying the generator execution flow in _raylet.pyx to feed back serialization duration to the caller using gen.send(). This new metric is then integrated throughout the data stack, including BlockExecStats, OpRuntimeMetrics, and relevant operators like MapOperator and HashShuffleOperator. The tests have also been updated to cover these new metrics.

The implementation is clean and effective. The use of yield expressions to pass data back into generators is a good pattern for this use case. I have one minor suggestion to improve code clarity.

Comment on lines 277 to 280
def udf_time_s(self, reset: bool) -> float:
cur_time_s = self._udf_time_s
self._udf_time_s = 0
return cur_time_s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The reset parameter in udf_time_s is not used in the function body; the timer self._udf_time_s is always reset to 0. Since the only call site in map_operator.py passes reset=True, the behavior is correct for now. However, to improve clarity and prevent potential misuse in the future, I suggest removing the reset parameter from the method signature and updating the call in map_operator.py:772 to map_transformer.udf_time_s().

Suggested change
def udf_time_s(self, reset: bool) -> float:
cur_time_s = self._udf_time_s
self._udf_time_s = 0
return cur_time_s
def udf_time_s(self) -> float:
cur_time_s = self._udf_time_s
self._udf_time_s = 0
return cur_time_s

@ray-gardener ray-gardener bot added data Ray Data-related issues observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling labels Jan 29, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin added the go add ONLY when ready to merge, run all tests label Jan 29, 2026
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Copy link
Collaborator

@edoakes edoakes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From offline discussion, core change looks ok to me. Let's consider it a private API for now. I will follow up to add a disclaimer & context in code comments.

Defer to others to review the data changes. You may want to consider an alternative name to "serialization time" since it is not purely serialization, but also includes memory allocation & copy time. Perhaps "block_write_time_s" or "block_output_time_s"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data Ray Data-related issues go add ONLY when ready to merge, run all tests observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ray fails to serialize self-reference objects

3 participants