Skip to content

perf: Replace MessageToDict with optimized custom dict builder#6015

Open
abhijeet-dhumal wants to merge 1 commit intofeast-dev:masterfrom
abhijeet-dhumal:fix/custom-json-serialization-6013
Open

perf: Replace MessageToDict with optimized custom dict builder#6015
abhijeet-dhumal wants to merge 1 commit intofeast-dev:masterfrom
abhijeet-dhumal:fix/custom-json-serialization-6013

Conversation

@abhijeet-dhumal
Copy link
Contributor

@abhijeet-dhumal abhijeet-dhumal commented Feb 24, 2026

Summary

Fixes #6013

The feature server uses google.protobuf.json_format.MessageToDict for converting GetOnlineFeaturesResponse to JSON, but this generic converter is slow due to reflection-based field introspection.

Changes

  • Add feature_server_utils.py with response_to_dict_fast() function
  • Custom dict builder optimized for GetOnlineFeaturesResponse structure
  • Direct field access instead of reflection
  • Efficient type-specific value extraction
  • Add comprehensive unit tests for the new serialization

Performance Impact

~4x faster serialization, saving 5-15ms per request for typical feature responses with 50-200 features.

Test Plan

  • Added unit tests in test_feature_server_utils.py
  • Tests verify output matches MessageToDict format
  • Tests cover all Value types, timestamps, metadata, status codes
  • Performance test validates speedup
  • Passes make lint-python

Open with Devin

@abhijeet-dhumal abhijeet-dhumal requested a review from a team as a code owner February 24, 2026 06:27
devin-ai-integration[bot]

This comment was marked as resolved.

@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch 2 times, most recently from 6a38f3c to 2b81774 Compare February 24, 2026 06:36
devin-ai-integration[bot]

This comment was marked as resolved.

@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch from 2b81774 to 8fbda9f Compare February 24, 2026 06:44
devin-ai-integration[bot]

This comment was marked as resolved.

@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch from 8fbda9f to f3ea953 Compare February 24, 2026 06:52
devin-ai-integration[bot]

This comment was marked as resolved.

@ntkathole ntkathole changed the title [perf] Replace MessageToDict with optimized custom dict builder perf: Replace MessageToDict with optimized custom dict builder Feb 25, 2026
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch from 06877fd to 063eeb2 Compare February 25, 2026 14:19
@abhijeet-dhumal abhijeet-dhumal marked this pull request as draft February 25, 2026 14:20
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch from 063eeb2 to d3c512d Compare February 25, 2026 14:20
Signed-off-by: abhijeet-dhumal <abhijeetdhumal652@gmail.com>
@abhijeet-dhumal abhijeet-dhumal force-pushed the fix/custom-json-serialization-6013 branch from d3c512d to c06a75d Compare February 25, 2026 14:22
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 5 additional findings in Devin Review.

Open in Devin Review

Comment on lines +69 to +71
return dt.strftime("%Y-%m-%dT%H:%M:%S") + (
".%06dZ" % (ts.nanos // 1000) if ts.nanos else "Z"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Timestamp nano-second formatting differs from MessageToDict output

The _timestamp_to_str function always emits 6 fractional-second digits (e.g., .500000Z) when nanos are present, but protobuf's MessageToDict uses variable precision: 3 digits if nanos % 1_000_000 == 0, 6 if nanos % 1_000 == 0, and 9 otherwise (stripping trailing zeros).

Root cause and impact

For Timestamp(seconds=1609459200, nanos=500000000):

  • Old (MessageToDict): "2021-01-01T00:00:00.500Z" (3 fractional digits)
  • New (_timestamp_to_str): "2021-01-01T00:00:00.500000Z" (6 fractional digits)

The culprit is line 70 which unconditionally formats with 6 digits:

".%06dZ" % (ts.nanos // 1000)

Protobuf's Timestamp JSON serializer (_TimestampMessageToJsonObject) uses adaptive precision:

  • 0 fractional digits when nanos == 0
  • 3 digits when nanos % 1_000_000 == 0
  • 6 digits when nanos % 1_000 == 0
  • 9 digits otherwise

Since this function is a drop-in replacement for MessageToDict at the feature server API boundary (sdk/python/feast/feature_server.py:344), any downstream client performing exact string comparison or parsing that's sensitive to trailing zeros in timestamps will observe a behavioral change.

Suggested change
return dt.strftime("%Y-%m-%dT%H:%M:%S") + (
".%06dZ" % (ts.nanos // 1000) if ts.nanos else "Z"
)
nanos = ts.nanos
if nanos == 0:
return dt.strftime("%Y-%m-%dT%H:%M:%S") + "Z"
elif nanos % 1_000_000 == 0:
return dt.strftime("%Y-%m-%dT%H:%M:%S") + ".%03dZ" % (nanos // 1_000_000)
elif nanos % 1_000 == 0:
return dt.strftime("%Y-%m-%dT%H:%M:%S") + ".%06dZ" % (nanos // 1_000)
else:
return dt.strftime("%Y-%m-%dT%H:%M:%S") + ".%09dZ" % nanos
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but most timestamps have 0 nanos or millisecond precision right?

@abhijeet-dhumal abhijeet-dhumal marked this pull request as ready for review February 25, 2026 14:31
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 8 additional findings in Devin Review.

Open in Devin Review

Comment on lines +69 to +71
return dt.strftime("%Y-%m-%dT%H:%M:%S") + (
".%06dZ" % (ts.nanos // 1000) if ts.nanos else "Z"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 _timestamp_to_str truncates nanoseconds to microseconds, losing precision vs. MessageToDict

The _timestamp_to_str function formats the fractional seconds using ".%06dZ" % (ts.nanos // 1000), which integer-divides nanos by 1000 and formats as 6 digits (microseconds). This silently truncates sub-microsecond precision and produces a different format than MessageToDict.

Root Cause and Impact

The format ".%06dZ" % (ts.nanos // 1000) at sdk/python/feast/feature_server_utils.py:70 does integer division by 1000 then formats as 6-digit microseconds. This causes two problems:

  1. Sub-microsecond precision is lost: e.g. nanos=123456789 produces ".123456Z" instead of the correct ".123456789Z" (the 789 nanoseconds are silently dropped). nanos=100 produces ".000000Z" — the value is entirely lost.

  2. Trailing zeros differ: e.g. nanos=500000000 produces ".500000Z" whereas MessageToDict produces ".500Z" (protobuf strips trailing zeros and uses 3/6/9-digit groups).

Comparison:

nanos=500000000  → fast: ".500000Z"     MessageToDict: ".500Z"
nanos=123456789  → fast: ".123456Z"     MessageToDict: ".123456789Z"
nanos=100        → fast: ".000000Z"     MessageToDict: ".000000100Z"

This breaks the documented contract that the function "matches the output format of MessageToDict with proto_json.patch() applied" (sdk/python/feast/feature_server_utils.py:3). Downstream clients parsing event_timestamps from the REST API (e.g. sdk/python/feast/infra/online_stores/remote.py:496) would receive timestamps with degraded or incorrect precision.

Suggested change
return dt.strftime("%Y-%m-%dT%H:%M:%S") + (
".%06dZ" % (ts.nanos // 1000) if ts.nanos else "Z"
)
if ts.nanos % 1000000000 == 0:
frac = "Z"
elif ts.nanos % 1000000 == 0:
frac = ".%03dZ" % (ts.nanos // 1000000)
elif ts.nanos % 1000 == 0:
frac = ".%06dZ" % (ts.nanos // 1000)
else:
frac = ".%09dZ" % ts.nanos
return dt.strftime("%Y-%m-%dT%H:%M:%S") + frac
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[perf] Replace MessageToDict with optimized custom dict builder - 4x faster JSON serialization

2 participants