Skip to content

Conversation

@cbiffle
Copy link

@cbiffle cbiffle commented Feb 8, 2025

I noticed that hot parts of the program (allocation and evaluation) were doing more work than is strictly necessary, so I've removed some of it.

  • Removed the memset from tcl_append_string, since all but one byte of it is immediately overwritten by strncpy. Just zeroed the terminating byte instead.

  • Loosened the return type of tcl_string to char* (not const), since many uses of it mutate the returned buffer.

  • Replaced the initializing copy in tcl_append_string with memcpy, which is slightly faster than strncpy in cases like this where the length is known in advance. (This is about a 5% improvement in my tests.)

  • Altered tcl_append to not free the right-hand argument, after noticing that almost every use of it created an otherwise-unused copy.

  • In TCMD handling, rely on tcl_list_at to detect an empty command by returning NULL. This avoids a second tokenization pass over the string in tcl_list_length.

  • Avoid making a short-lived copy of the value in tcl_var, just take ownership of the pointer instead.

On a small collection of microbenchmarks, this improved performance by about 25-30% on my machine.

@cbiffle cbiffle force-pushed the fewer-allocs branch 2 times, most recently from e8e8b6f to 4ce0dd9 Compare February 8, 2025 18:24
I noticed that hot parts of the program (allocation and evaluation) were
doing more work than is strictly necessary, so I've removed some of it.

- Removed the memset from tcl_append_string, since all but one byte of
  it is immediately overwritten by strncpy. Just zeroed the terminating
  byte instead.

- Loosened the return type of tcl_string to char* (not const), since
  many uses of it mutate the returned buffer.

- Replaced the initializing copy in tcl_append_string with memcpy, which
  is slightly faster than strncpy in cases like this where the length is
  known in advance. (This is about a 5% improvement in my tests.)

- Altered tcl_append to _not_ free the right-hand argument, after
  noticing that almost every use of it created an otherwise-unused copy.

- In TCMD handling, rely on tcl_list_at to detect an empty command by
  returning NULL. This avoids a second tokenization pass over the string
  in tcl_list_length.

- Avoid making a short-lived copy of the value in tcl_var, just take
  ownership of the pointer instead.

On a small collection of microbenchmarks, this improved performance by
about 25-30% on my machine.
@cbiffle
Copy link
Author

cbiffle commented Feb 10, 2025

(Noticed your suggestion to run clang-format on PRs, updated the commit accordingly -- it's done something odd to the TCMD case block, but now it's consistent.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant