Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple concurrent probe users, fixing "Device or resource busy" #872

Closed
brendangregg opened this issue Dec 22, 2016 · 8 comments
Closed

Comments

@brendangregg
Copy link
Member

bcc currently has a limitation of one program per probe. Eg, you can run execsnoop & opensnoop at the same time, but not two opensnoops, as the second gets:

# ./opensnoop.py 
write(/sys/kernel/debug/tracing/kprobe_events, "p:kprobes/p_sys_open sys_open") failed: Device or resource busy
Traceback (most recent call last):
  File "./opensnoop.py", line 127, in <module>
    b.attach_kprobe(event="sys_open", fn_name="trace_entry")
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 467, in attach_kprobe
    raise Exception("Failed to attach BPF to kprobe")
Exception: Failed to attach BPF to kprobe

As bcc/BPF gets adopted, I'm seeing the risk of clashes between 24x7 bcc/BPF monitoring and ad hoc analysis, when the latter tries to instrument the same probes.

I suspect it's fixable in just bcc, without kernel changes. For ftrace, anyway, we can create two probes for the same kprobe location, provided they have different aliases, and trace_pipe will see both events:

# cd /sys/kernel/debug/tracing
# echo 'p:myopen1 do_sys_open' >> kprobe_events 
# echo 'p:myopen2 do_sys_open' >> kprobe_events 
# echo 1 > events/kprobes/myopen1/enable 
# echo 1 > events/kprobes/myopen2/enable 
# cat trace_pipe
    redis-server-1651  [001] d... 4455463.122021: myopen1: (do_sys_open+0x0/0x210)
    redis-server-1968  [000] d... 4455463.222274: myopen2: (do_sys_open+0x0/0x210)
    redis-server-1651  [001] d... 4455463.222274: myopen2: (do_sys_open+0x0/0x210)
    redis-server-1651  [001] d... 4455463.222276: myopen1: (do_sys_open+0x0/0x210)
    redis-server-1968  [000] d... 4455463.222278: myopen1: (do_sys_open+0x0/0x210)
       snmp-pass-2218  [002] d... 4455463.243604: myopen2: (do_sys_open+0x0/0x210)
       snmp-pass-2218  [002] d... 4455463.243609: myopen1: (do_sys_open+0x0/0x210)
       snmp-pass-2218  [002] d... 4455463.243842: myopen2: (do_sys_open+0x0/0x210)
       snmp-pass-2218  [002] d... 4455463.243844: myopen1: (do_sys_open+0x0/0x210)
[...]
[...now disable them...]
@derek0883
Copy link

derek0883 commented Jan 21, 2017

@brendangregg
Kernel already support this. for the example you provided, only need a little changes, So each instance will have their own buffer.

cd /sys/kernel/debug/tracing
# echo 'p:myopen1 do_sys_open' >> kprobe_events 
# echo 'p:myopen2 do_sys_open' >> kprobe_events 
**cd instances**
mkdir myopen1
mkdir myopen2
echo 1 > myopen1/events/kprobes/myopen1/enable
echo 1 > myopen2/events/kprobes/myopen2/enable
cat myopen1/trace_pipe
cat myopen2/trace_pipe

Here is my diff file to fix this.

  1. In python script, It will try to detect instances directory, if it is exist, then will add _pid at the end of event name, and also mkdir under instances directory.
  2. In libbpf.c, bpf_attach_probe will try to attached event under instances, if it return failed, then will try with previous way, So if user are running old kernel without instances feature support, it can still run 1 instance.
  3. At then end, python script will remove the directory under instances.
    Do you think is this OK? or Do you want move all changes to libbpf.c, without any change for python script?
    I also tested some other example under tools. It works. Should I sent a Pull Request? Thanks.

multi_probe.diff.txt

@brendangregg
Copy link
Member Author

G'Day @derek0883 , thanks for this, this is important to get done. I'm traveling at the moment so haven't tested it yet.

@drzaeus77 / @4ast : did you have an opinion about solving this in python vs C?

@4ast
Copy link
Member

4ast commented Jan 22, 2017

adopting instances is certainly a good move. I'm not sure adding _pid is enough. May be add _bcc suffix or something as well. I think the major bits should be on C side. If we need to change the C api, let's do it.

@derek0883
Copy link

derek0883 commented Jan 22, 2017

@4ast
If adding _pid is not good/safe enough, then lets add a time stamp! :)

@4ast
Copy link
Member

4ast commented Jan 23, 2017

_pid is probably enough from uniqueness point of view. I was thinking to improve debugging... so doing 'ls' in instances will be clear what tool created them.

@derek0883
Copy link

derek0883 commented Jan 23, 2017

@brendangregg @4ast
Attached my 2nd diff. the changes:

  1. event naming pattern changed to _bcc_pid.
  2. detect instances, mkdir/rmdir under instances in C file. The upside of this way is that other frontend is easier and less changes, e.g. python or CPP frontend. Downside is changed two C APIs. bpf_detach_kprobe and bpf_detach_uprobe need an extra argument.
    Let me know if a PR is needed. Thanks! :)
    multi_probe_v2.diff.txt

@4ast
Copy link
Member

4ast commented Jan 23, 2017

great! yes. please send PR. It's easier to review via PR :)

@brendangregg
Copy link
Member Author

fixed by #918

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants