Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat go binaries without offsets as generic #1476

Merged
merged 3 commits into from
Dec 21, 2024
Merged

Conversation

rafaelroquetto
Copy link
Contributor

@rafaelroquetto rafaelroquetto commented Dec 20, 2024

In the case in which we have found offsets, but these are related to a go proxy, we must set the error so that the attacher will treat this as a regular binary.

There is a small corner case in which we detect no go offsets in a target executable, but if that executable happens to have a ".gosyms" section, we will end up classifying it as a Go executable, causing issues. We attach an error in this case to have the attacher treat it as a generic executable.

Also, on ELF parse error, we return the list of missing fields, assuming that we would stop the instrumentation. There is an edge case error on which the missing fields set is empty so Beyla keeps with the instrumentation, but the returned list of fields is nil.

Closes #1472

rafaelroquetto and others added 2 commits December 20, 2024 12:33
There is a small corner case in which we detect no go offsets in a
target executable, but if that executable happens to have a ".gosyms"
section, we will end up classifying it as a Go executable, causing
issues. We attach an error in this case to have the attacher treat it as
a generic executable.
On ELF parse error, we return the list of missing fields, assuming that
we would stop the instrumentation. There is an edge case error on which
the missing fields set is empty so Beyla keeps with the instrumentation,
but the returned list of fields is nil.
Copy link

codecov bot commented Dec 20, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 81.37%. Comparing base (1192264) to head (8e38155).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/internal/discover/typer.go 72.72% 3 Missing ⚠️
pkg/internal/discover/attacher.go 33.33% 1 Missing and 1 partial ⚠️
pkg/internal/goexec/structmembers.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1476      +/-   ##
==========================================
- Coverage   81.45%   81.37%   -0.09%     
==========================================
  Files         149      149              
  Lines       15253    15264      +11     
==========================================
- Hits        12425    12421       -4     
- Misses       2228     2240      +12     
- Partials      600      603       +3     
Flag Coverage Δ
integration-test 59.78% <60.00%> (+<0.01%) ⬆️
k8s-integration-test 60.66% <13.33%> (+0.79%) ⬆️
oats-test 33.99% <13.33%> (-0.03%) ⬇️
unittests 52.05% <0.00%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

In the case in which we have found offsets, but these are related to a
go proxy, we must set the error so that the attacher will treat this as
a regular binary.
Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The go offsets have two components, function offsets and field offsets. Where we crash right now is the field offsets being nil, while all the logic of the Go proxy only cares about the function offsets. We should leave the function offsets alone and not change anything related to the Go proxy, we should make the fix related to bad elf parsing, but also keep the checks that verify we'll avoid instrumenting as Go Executable if we found no offsets.

@@ -124,6 +125,11 @@ func (t *typer) asInstrumentable(execElf *exec.FileInfo) ebpf.Instrumentable {
instrumentableCache.Add(execElf.Ino, InstrumentedExecutable{Type: svc.InstrumentableGolang, Offsets: offsets})
return ebpf.Instrumentable{Type: svc.InstrumentableGolang, FileInfo: execElf, Offsets: offsets}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually not enough to fix this issue. The Go proxy checks for functions missing, not offsets for fields.

@@ -146,12 +152,19 @@ func (t *typer) asInstrumentable(execElf *exec.FileInfo) ebpf.Instrumentable {

detectedType := exec.FindProcLanguage(execElf.Pid, execElf.ELF, execElf.CmdExePath)

if detectedType == svc.InstrumentableGolang && err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be here, it will print a warning for our users in their logs for every go executable we discover which doesn't have any functions we care about.

Copy link
Contributor Author

@rafaelroquetto rafaelroquetto Dec 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular bit relates to this:

There is a small corner case in which we detect no go offsets in a target executable, but if that executable happens to have a ".gosyms" section, we will end up classifying it as a Go executable, causing issues. We attach an error in this case to have the attacher treat it as a generic executable.

It's a very corner case, not present in the reported issue but I discovered it while analysing the code. If offsets are null (and thus InstrumentationError is also null), but for some esoteric reason the target executable has a .gosymtab, this will cause detectedType to be InstrumentableGolang, it will trigger this codepath and crash with a null offsets pointer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see now, okay makes sense.

@@ -443,7 +443,7 @@ func structMemberOffsetsFromDwarf(data *dwarf.Data) (FieldOffsets, map[GoOffset]
log.Debug("inspecting fields for struct type", "type", typeName)
if err := readMembers(reader, structMember.fields, expectedReturns, fieldOffsets); err != nil {
log.Debug("error reading DWARF info", "type", typeName, "error", err)
return nil, expectedReturns
return fieldOffsets, expectedReturns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only thing we need to do to fix and include a check that will avoid the null pointer in go tracers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there's more to it. This lies in the case when instrumentable.Offsets is null. The stack posted on the issue points that this call is passing a nill Offsets pointer. This pointer gets initialised here. Here we are initialising goexec.Offsets.FieldOfssets which is not directly involving in the crash (it's the toplevel goexec.Offsets being nil that is causing the crash here .

We opted to leave this in the PR because this fixes the issue you are talking about, but I reckon that happens to be a different issue.

@@ -134,9 +134,11 @@ func (ta *TraceAttacher) getTracer(ie *ebpf.Instrumentable) bool {
case svc.InstrumentableGolang:
// gets all the possible supported tracers for a go program, and filters out
// those whose symbols are not present in the ELF functions list
if ta.Cfg.Discovery.SkipGoSpecificTracers || ta.Cfg.Discovery.SystemWide || ie.InstrumentationError != nil {
if ta.Cfg.Discovery.SkipGoSpecificTracers || ta.Cfg.Discovery.SystemWide || ie.InstrumentationError != nil || ie.Offsets == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just need this check IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that may work indeed, but we end up being less robust by losing the distinction of the corner case (not directly related to the reported issue) in which we end up with a binary having .gosymtabs but no symbols. With the present code, we print a "prettier" message.

We still need to set the error here if you want to print a proper message when we find go proxies though.

@rafaelroquetto
Copy link
Contributor Author

The go offsets have two components, function offsets and field offsets. Where we crash right now is the field offsets being nil, while all the logic of the Go proxy only cares about the function offsets. We should leave the function offsets alone and not change anything related to the Go proxy, we should make the fix related to bad elf parsing, but also keep the checks that verify we'll avoid instrumenting as Go Executable if we found no offsets.

Looking at the stack, we actually crash here on the toplevel goexec.Offsets pointer before we even reach the inner fields, as a result of the outer / toplevel Instrumentable.Offsets being null

@grcevski
Copy link
Contributor

The go offsets have two components, function offsets and field offsets. Where we crash right now is the field offsets being nil, while all the logic of the Go proxy only cares about the function offsets. We should leave the function offsets alone and not change anything related to the Go proxy, we should make the fix related to bad elf parsing, but also keep the checks that verify we'll avoid instrumenting as Go Executable if we found no offsets.

Looking at the stack, we actually crash here on the toplevel goexec.Offsets pointer before we even reach the inner fields, as a result of the outer / toplevel Instrumentable.Offsets being null

Ah I see, I was under the impression it was only the fields that were missing.

@grcevski grcevski merged commit 7664338 into main Dec 21, 2024
13 checks passed
@grcevski grcevski deleted the fix_null_go_offsets branch December 21, 2024 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some beyla pods crashing - signal SIGSEGV - grafana/beyla:1.9.0
3 participants