Commit
Upgrade protobuf from v4 to v5: 5x Upsert throughput (#393)
Upgrade the protobuf dependency from v4 (4.25) to v5 (5.28). This appears to have significantly faster protobuf encoding - I see a 4.5x - 5x improvement in Upsert throughput on a given EC2 machine (i3.xlarge) for large batches (~300) of high-dimensionality vectors (1536):

Before:

Performing Populate phase 1675770/138364198 ╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1% 0:35:34 44:17:17 Records/sec: 785.2

After:

Performing Populate phase 1531830/138364198 ╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1% 0:07:07 10:35:48 Records/sec: 3584.4

I haven't dug into the exact details, but the profile is quite different - the Python frames performing type checking are no longer present, so I assume they have been optimised, perhaps pushed to native code?

Before profile:

![protobuf_v4](https://github.com/user-attachments/assets/2c201489-c3f0-489b-8db1-e393c8953747)

After profile:

![protobuf_v5](https://github.com/user-attachments/assets/608ed8a0-9140-469f-88cc-d5546195050c)

## Type of Change

- [x] None of the above: Dependency upgrade.

## Test Plan

Describe specific steps for validating this change.
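As a quick sanity check, the speedup can be recomputed from the two `Records/sec` figures in the description above. This is only a sketch of the arithmetic: the `speedup` helper and the constant names are hypothetical and are not part of this change, and the result covers just the single before/after sample quoted here.

```python
# Records/sec values copied from the Populate phase output quoted above.
BEFORE_RECORDS_PER_SEC = 785.2   # protobuf v4 (4.25)
AFTER_RECORDS_PER_SEC = 3584.4   # protobuf v5 (5.28)


def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is than `before`."""
    if before <= 0:
        raise ValueError("before throughput must be positive")
    return after / before


ratio = speedup(BEFORE_RECORDS_PER_SEC, AFTER_RECORDS_PER_SEC)
print(f"Upsert throughput speedup: {ratio:.2f}x")
```

For this sample the ratio falls just above 4.5x, the low end of the 4.5x - 5x range stated in the description.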