Skip to content

perf: hot-path quick wins — fuse lookups, cache sizes, pack Entry#226

Merged
kacy merged 2 commits intomainfrom
perf/hot-path-quick-wins
Feb 20, 2026
Merged

perf: hot-path quick wins — fuse lookups, cache sizes, pack Entry#226
kacy merged 2 commits intomainfrom
perf/hot-path-quick-wins

Conversation

@kacy
Copy link
Owner

@kacy kacy commented Feb 20, 2026

summary

phase A of the performance audit: four hot-path optimizations shipped together since they touch overlapping code (especially the Entry struct). estimated +15–30% throughput improvement on GET-heavy workloads.

fuse double hash lookup on GET — every GET previously did remove_if_expired(key) (one hash probe) then entries.get_mut(key) (second probe). now uses a single get_mut() that checks expiry inline. at 2M+ ops/sec this eliminates millions of redundant probes per second.

cache value_size in Entry — added cached_value_size: usize to Entry. mutations update it incrementally instead of recomputing via value_size() which walks entire collections. HSET on a 1000-field hash no longer iterates all fields twice for memory accounting.

Entry cache-line packing — shrunk last_access_ms: u64 to last_access_secs: u32 (seconds since process start, wraps at ~136 years). saves 4 bytes per entry and keeps hot fields within the first cache line.

zero-alloc peek_command_name — replaced 3 heap allocations per MULTI/EXEC command (to_vec → String::from_utf8 → to_ascii_uppercase) with inline eq_ignore_ascii_case returning &'static str.

item 5 (SmallVec for parsed arrays) was investigated and intentionally skipped — Frame is ~32 bytes, so SmallVec<[Frame; 6]> would add ~200 bytes inline, bloating the Frame enum for all variants. net negative.

what was tested

  • all 345 emberkv-core unit tests pass
  • all 351 ember-protocol tests pass
  • clippy clean (only pre-existing warnings)
  • manually verified memory tracking consistency: push/pop/set/del paths all maintain cached_value_size correctly

design considerations

  • the fused GET lookup uses early returns from match arms to satisfy the borrow checker — the non-expired path returns immediately (releasing the mutable borrow), and only the expired path falls through to remove_expired_entry
  • cached_value_size is maintained incrementally on all mutation paths (list push/pop, hash set/del/incrby, set add/rem, sorted set add/rem, vector add/rem) rather than lazily recomputed
  • now_secs() uses a process-start epoch via OnceLock<Instant> to avoid syscalls — monotonic and wraps at ~136 years which is acceptable for LRU ordering

kacy added 2 commits February 19, 2026 23:45
add auth, election, and raft_transport to the modules table and features
list. fix emberkv-core typo → ember-core in the related crates table.
hot-path quick wins from performance audit (phase A):

- fuse double hash probe on GET/GET_STRING into single lookup
  that checks expiry inline, eliminating ~2M redundant probes/sec
- cache value_size in Entry struct so memory accounting is O(1)
  instead of walking entire collections on every mutation
- shrink last_access from u64 ms to u32 secs, saving 4 bytes/entry
  and improving cache-line packing for the hot Entry struct
- replace allocating peek_command_name (3 heap allocs per MULTI/EXEC
  command) with zero-allocation eq_ignore_ascii_case comparisons
@kacy kacy merged commit d3d9b7d into main Feb 20, 2026
4 of 7 checks passed
@kacy kacy deleted the perf/hot-path-quick-wins branch February 20, 2026 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant