Optimize _convert_rows_to_protobuf: eliminate redundant timestamp conversions #6004

@abhijeet-dhumal

Description

The _convert_rows_to_protobuf function in sdk/python/feast/utils.py performs redundant timestamp conversions: FromDatetime() is called once per feature per entity, when once per entity would suffice. For 50 features × 500 entities, this means 25,000 calls instead of 500.

Benchmark Results

| Profile | Features | Entities | p99 Latency | protobuf_convert Time |
|---------|----------|----------|-------------|-----------------------|
| Small   | 3        | 10       | 0.4ms       | 0.04ms (10%)          |
| Medium  | 5        | 100      | 14.9ms      | 0.5ms (10%)           |
| Large   | 20       | 500      | 49.9ms      | 10.9ms (28%)          |
| Stress  | 50       | 500      | 115.4ms     | 26.8ms (27%)          |

At scale, protobuf_convert consumes ~27% of request time.

Root Cause

sdk/python/feast/utils.py lines 1426-1436:

for feature_name in requested_features:      # 50 iterations
    for idx, read_row in enumerate(read_rows):  # 500 iterations
        row_ts_proto.FromDatetime(row_ts)       # 25,000 calls (should be 500)
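
The call count can be reproduced without Feast or protobuf installed. The sketch below is only an illustration: CountingTimestamp is a hypothetical stand-in for the protobuf Timestamp message, and count_nested_conversions is an invented helper that mimics the shape of the nested loop, not the actual code in utils.py.

```python
from datetime import datetime, timezone


class CountingTimestamp:
    """Hypothetical stand-in for protobuf's Timestamp that counts conversions."""

    calls = 0  # class-level counter shared by all instances

    def __init__(self):
        self.seconds = 0

    def FromDatetime(self, dt):
        CountingTimestamp.calls += 1
        self.seconds = int(dt.timestamp())


def count_nested_conversions(num_features, num_entities):
    """Mimic the nested loop: one conversion per (feature, entity) pair."""
    CountingTimestamp.calls = 0
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Assumed row shape: (event timestamp, feature values), as in the issue.
    read_rows = [(now, {}) for _ in range(num_entities)]
    requested_features = [f"feature_{i}" for i in range(num_features)]

    for _feature_name in requested_features:
        for _idx, (row_ts, _values) in enumerate(read_rows):
            row_ts_proto = CountingTimestamp()
            if row_ts is not None:
                row_ts_proto.FromDatetime(row_ts)
    return CountingTimestamp.calls


nested_calls = count_nested_conversions(50, 500)
print(nested_calls)
```

With the Stress profile (50 features, 500 entities) the counter reaches the 25,000 conversions quoted above, because the conversion sits inside the per-feature loop even though its input depends only on the row.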

Specifications

  • Version: latest v0.60.0 branch
  • Platform: Local

Possible Solution

from google.protobuf.timestamp_pb2 import Timestamp

# Pre-compute once per entity
row_timestamps = [Timestamp() for _ in read_rows]
for idx, (row_ts, _) in enumerate(read_rows):
    if row_ts is not None:
        row_timestamps[idx].FromDatetime(row_ts)

# Reference pre-computed timestamps
for feature_name in requested_features:
    ts_vector = row_timestamps[:]  # Shallow copy: new list, same Timestamp objects
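
To make the effect of hoisting concrete, here is a self-contained sketch of the same idea that runs without Feast or protobuf. FakeTimestamp and build_timestamp_vectors are hypothetical names introduced only for this illustration; the real function builds protobuf messages and may need a different structure.

```python
from datetime import datetime, timezone


class FakeTimestamp:
    """Hypothetical stand-in for the protobuf Timestamp message."""

    conversions = 0  # counts FromDatetime calls across all instances

    def __init__(self):
        self.seconds = 0

    def FromDatetime(self, dt):
        FakeTimestamp.conversions += 1
        self.seconds = int(dt.timestamp())


def build_timestamp_vectors(read_rows, requested_features):
    """Convert each row timestamp once, then hand every feature its own list."""
    row_timestamps = [FakeTimestamp() for _ in read_rows]
    for idx, (row_ts, _values) in enumerate(read_rows):
        if row_ts is not None:
            row_timestamps[idx].FromDatetime(row_ts)
    # Shallow copy per feature: distinct lists that share the same objects.
    return {name: row_timestamps[:] for name in requested_features}


base = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [(base, {}) for _ in range(500)]
features = [f"feature_{i}" for i in range(50)]

vectors = build_timestamp_vectors(rows, features)
hoisted_calls = FakeTimestamp.conversions
print(hoisted_calls)
```

One caveat of the shallow copy: every feature's vector points at the same timestamp objects, so this is only safe if downstream code treats them as read-only. If any consumer mutates a timestamp in place, each feature would need its own copies, which gives back part of the saving.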

Expected Impact

  • Complexity: O(features × entities) → O(entities) for timestamp conversion
  • Estimated improvement: 15-25% latency reduction for large requests
  • Affected stores: All (DynamoDB, Redis, PostgreSQL, SQLite)
