Skip to content

Conversation

@nabbi
Copy link
Contributor

@nabbi nabbi commented Feb 10, 2026

Three changes to the JPEG streaming hot path:

  1. Reuse Image buffer for JPEG decode: add Image member reuse_image_ to EventStream. On the JPEG path (SaveJPEGs & 1), call ReadJpeg() into the member instead of new/delete Image each frame. ReadJpeg's internal WriteBuffer() reuses the pixel allocation when dimensions match (every frame in a given event). Eliminates ~2 MB malloc+free per streamed frame. The FFmpeg path (MP4-only events) still uses new/delete since its initialization is more complex and it's the uncommon case.

  2. Replace stat() with access() for file existence checks. The stat() filled a struct stat that was never read — send_file() does its own fstat() for Content-Length. access(path, R_OK) is a lighter syscall that skips the 144-byte struct fill.

  3. Replace stringtf() with snprintf() into a stack char[PATH_MAX]. stringtf() does two heap allocations per call (unique_ptr<char[]> for vsnprintf + std::string for return). File paths are well within PATH_MAX. This eliminates 2-6 heap alloc/free cycles per frame depending on the analysis fallback path.

@connortechnology
Copy link
Member

Not a fan of going back to c style strings. The rest looks great.

Three changes to the JPEG streaming hot path:

1. Reuse Image buffer for JPEG decode: add Image member reuse_image_
   to EventStream. On the JPEG path (SaveJPEGs & 1), call ReadJpeg()
   into the member instead of new/delete Image each frame. ReadJpeg's
   internal WriteBuffer() reuses the pixel allocation when dimensions
   match (every frame in a given event). Eliminates ~2 MB malloc+free
   per streamed frame. The FFmpeg path (MP4-only events) still uses
   new/delete since its initialization is more complex and it's the
   uncommon case.

2. Replace stat() with access() for file existence checks. The stat()
   filled a struct stat that was never read — send_file() does its own
   fstat() for Content-Length. access(path, R_OK) is a lighter syscall
   that skips the 144-byte struct fill.

3. Replace stringtf() with snprintf() into a stack char[PATH_MAX],
   wrapped in a std::string_view for downstream consumers.
   stringtf() does two heap allocations per call (unique_ptr<char[]>
   for vsnprintf + std::string for return). File paths are well within
   PATH_MAX. snprintf() into the stack buffer eliminates 2-6 heap
   alloc/free cycles per frame depending on the analysis fallback
   path. The string_view keeps the C-style storage contained to the
   snprintf call sites; everything downstream uses C++ semantics
   (filepath.empty(), filepath.data()) rather than raw char[] idioms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nabbi nabbi force-pushed the perf-eventstream-sendframe-reduce-overhead branch from a4409f3 to c8d5ea3 Compare February 11, 2026 00:55
@nabbi
Copy link
Contributor Author

nabbi commented Feb 11, 2026

Not a fan of going back to c style strings. The rest looks great.

I explored eliminating, found a means to localize them, Claude's perspective about the gains.


Edit 1 (the original commit): replaced std::string + stringtf() with raw char filepath_buf[PATH_MAX] + snprintf(). This eliminated the heap allocations, but the consumer code was all C idioms — filepath_buf[0] == '\0', passing filepath_buf everywhere.

Edit 2 (the amend we just did): added std::string_view filepath as a non-owning view over that same stack buffer. The char[] and snprintf stay — that's where the perf win is. But all the consumer code now goes through the string_view:

  • filepath_buf[0] != '\0' → !filepath.empty()
  • Image(filepath_buf) → Image(filepath.data())
  • bare filepath_buf in format args → filepath.data()

The C-style storage (char[PATH_MAX] + snprintf) is still there because that's the whole point — zero heap allocs. The string_view just keeps it contained to the two snprintf call sites and the filepath = filepath_buf assignments. Everything downstream uses C++ semantics. It's the boundary between "we need a stack buffer for performance" and "the rest of the function shouldn't have to know that."

The honest answer: this change matters as part of the set. Change 1 (reuse_image_) saves ~2 MB of alloc/free per frame — that's the big one. Change 3 is cleanup that removes unnecessary heap traffic in the same hot path. On its own it's noise. Combined with the other two, it means sendFrame does zero heap allocations on the common JPEG path, which is a cleaner baseline.


@connortechnology
Copy link
Member

What about using a static std::string? Or actually just a member of MonitorStream. Or of Stream for that matter.

I'll discuss with Claude. :) The rest is obviously great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants