-
Notifications
You must be signed in to change notification settings - Fork 744
btf: introduce caching string table to speed up ext info loading #1809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Curious, couldn't package Otherwise, I'd simply put this lookup cache in |
I tried this locally to see how other benchmarks do as well, specifically those that use a lot of BTF strings. The
We do decrease allocations per-op, but also increase CPU time and memory usage per op. I think I have a good alternative. The issue we are trying to solve is the copying that happens when we go from the The patch is simple: diff --git a/btf/strings.go b/btf/strings.go
index f4a3957..264a202 100644
--- a/btf/strings.go
+++ b/btf/strings.go
@@ -7,6 +7,7 @@ import (
"io"
"maps"
"strings"
+ "unsafe"
)
type stringTable struct {
@@ -55,7 +56,7 @@ func (st *stringTable) Lookup(offset uint32) (string, error) {
}
b, err := st.lookupSlow(offset)
- return string(b), err
+ return unsafe.String(&b[0], len(b)), err
}
func (st *stringTable) LookupBytes(offset uint32) ([]byte, error) { Comparing all three results:
Using the unsafe string instead we get more memory savings and no penalty to CPU usage |
I would be fine with the unsafe solution if you are confident it's safe |
Wouldn't that mean that the backing slice for the string table is kept live, preventing garbage collection? Dylan: can you tell why the inspektor gadget benchmark slows down? Seems counterintuitive to me. |
Yes. Though it only matters if these strings are the only reference holding onto that string table. Question is, is it worth it.
The main caller for So it seems in this case we get a lot of strings that are only looked up once, thus we pay for the map accesses and insertions, but the cache hit rate is to low to give a CPU benefit. |
I'm not sure I understand, my PR only touches the code that read ext info/line info. How can this be visible in the |
Ah, I think I may have misrepresented your changes. During my testing I had inline the caching into the string table directly as suggested by Timo. That is not fair, only caching certain access patterns might indeed make a difference here. Let me re-run with you exact patch set. Sorry, that is on me. |
Re-ran the benchmark with your exact branch. With those change the inspektor gadget benchmark is totally unaffected, which makes sense in hind sight. Sorry for the confusion and derailment.
|
No worries at all, happy that you could confirm it. So yeah the goal with caching only ext infos access is to keep the great win from #1772 that significantly speed up vmlinux parsing. Looking at the code:
One thing to note with unique, is that before go1.25 doing This is important because if you were to add something like
this would actually allocates a string for each call anyway today.. |
I'd rather not do the unsafe hackery if it's not necessary. Couple of observations:
(2) seems like an easy win, it'd be nice if you could give that a try. (1) is probably a bit more work which you could tackle separately if you are interested. P.S. Not sure |
Definitely happy to do 2, this was actually something I thought about. I can do 1 in a later PR (at a later time) if it's ok for you |
The whole suggestion of doing |
The unique package hashes the whole string, but since we already have a |
ffe102b
to
fb7ed7c
Compare
when parsing line infos, we end up querying multiple times the same offest (especially the file name), and we currently create a new string each time. This commit deduplicates this allocation by caching the string. Signed-off-by: Paul Cacheux <paul.cacheux@datadoghq.com>
fb7ed7c
to
928be4b
Compare
Alright, so I simplified the PR as suggested to move the caching in the |
stringTable has to be safe for concurrent use since it is shared between multiple decoders. Document this and add a lock to LookupCached to make it safe. Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a small change because stringTable is meant to be concurrency safe.
Ah yes I clearly missed that, thanks ! |
Thanks! |
#1772 introduced lazy loading of strings from string tables. It helps a lot when querying types, but when parsing line infos, we end up querying multiple times the same offest (especially the file name), and we currently create a new string each time.
This PR deduplicates this allocation by caching the string when parsing line info.
Benchmarks:
With the elf file present in this repo (
testdata/loader-el.elf
):On a 18MiB elf file: