bcc: Allow specify wakeup_events for perf buffer #3801
Conversation
This commit adds a new option `wakeup_events` to the open_perf_buffer API. This provides an alternative way to solve iovisor#3793. Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>

Is my understanding correct here? Do we need to do anything special to ensure the 'remainder' events are handled?

Does the epoll timeout serve this purpose? If the 10th event never comes, the timeout eventually fires and user space should process the buffered events, right?
The callchain is:

bpf_perf_event_output
  __bpf_perf_event_output
    perf_event_output
      __perf_event_output
        perf_output_sample
          perf_output_end
            perf_output_put_handle

In perf_output_put_handle:

```c
if (!event->attr.watermark) {
	int wakeup_events = event->attr.wakeup_events;

	if (wakeup_events) {
		struct perf_buffer *rb = handle->rb;
		int events = local_inc_return(&rb->events);

		if (events >= wakeup_events) {
			local_sub(wakeup_events, &rb->events);
			local_inc(&rb->wakeup);
		}
	}
}
```

Then:

```c
if (handle->wakeup != local_read(&rb->wakeup))
	perf_output_wakeup(handle);
```

And here is perf_output_wakeup:

```c
static void perf_output_wakeup(struct perf_output_handle *handle)
{
	atomic_set(&handle->rb->poll, EPOLLIN);

	handle->event->pending_wakeup = 1;
	irq_work_queue(&handle->event->pending);
}
```

IIRC, the timeout won't take effect, and users have to handle the 'remainder' events themselves. According to my test, each sample triggers the callback handler once, so we don't have to change the handler.
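To make the accounting in the kernel snippet above concrete, here is a toy user-space model of it (pure Python, illustrative only; `RingAccounting` and its method names are made up, not kernel or bcc API). It shows that with `wakeup_events = 10`, only every 10th write raises EPOLLIN, and trailing samples sit in the buffer with no wakeup:

```python
# Toy model of the rb->events / rb->wakeup accounting in perf_output_put_handle.
# Names here are illustrative; they are not the kernel's actual data structures.

class RingAccounting:
    def __init__(self, wakeup_events):
        self.wakeup_events = wakeup_events
        self.events = 0   # models local_inc_return(&rb->events)
        self.wakeup = 0   # models local_inc(&rb->wakeup)

    def output_one_sample(self):
        """Returns True if writing this sample raises a wakeup (EPOLLIN)."""
        before = self.wakeup
        self.events += 1
        if self.events >= self.wakeup_events:
            self.events -= self.wakeup_events
            self.wakeup += 1
        return self.wakeup != before

rb = RingAccounting(wakeup_events=10)
wakeups = [rb.output_one_sample() for _ in range(25)]
# Only the 10th and 20th samples signal EPOLLIN; samples 21-25 sit in the
# buffer without a wakeup -- these are the 'remainder' events discussed here.
print(wakeups.count(True))  # 2
print(rb.events)            # 5 un-signaled samples remain
```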
I think we're on the same page. I didn't have a very clear idea of the scenarios I was worried about. After some more thought I think there's nothing to be worried about. For posterity, though:
So no problem from my end.
When you say "timeout won't take effect", are you referring to the case where

Yes, you are right. By

LGTM
Context: I filed #3793. IIUC, the updated API needs to be exposed to ... But one thing where I think my understanding does not match reality:

So here I assumed that, by waking up every (for example) 10 events, the 10 events are handled by one invocation of the callback on all events in the perf ring buffer (perf buffer, not the new BPF ringbuf). But it seems that this actually only invokes the callback on the 10th event? If that's the case, then we cannot use it, because we do need to handle all events; missing any of them would not be allowed. We only want to minimize the overhead of repetitively invoking the callback per event. Thus #3793.

Edit: I plan to try this out to better understand the situation. Don't take the above as suggestions, as I probably had a very poor understanding of the details anyway.
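For what it's worth, a toy model of the semantics (pure Python, not bcc's implementation) suggests there is nothing to worry about: the wakeup only gates *when* user space reads, not *which* samples it sees. One wakeup drains every pending sample, invoking the callback once per sample:

```python
# Toy model of wakeup-gated reading (illustrative only; not bcc's code).
# The watermark delays the read; it does not drop or skip samples.

def produce_and_consume(n_samples, wakeup_events):
    pending = []        # samples written by the kernel, not yet read
    delivered = []      # samples seen by the user callback
    events = 0
    for i in range(n_samples):
        pending.append(i)
        events += 1
        if events >= wakeup_events:      # kernel raises EPOLLIN
            events -= wakeup_events
            while pending:               # user space drains the whole buffer,
                delivered.append(pending.pop(0))  # callback runs per sample
    return delivered, pending

delivered, remainder = produce_and_consume(25, 10)
print(len(delivered))   # 20 samples handled via wakeups, none skipped
print(remainder)        # [20, 21, 22, 23, 24] wait for a timeout or consume
```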
I see what you mean with the "handle many events per callback instead of 1" point, but I don't think this improves performance significantly relative to exposing `wakeup_events`. Also, note that the samples in the perf buffer are not of fixed size and may loop around from the 'end' to the 'beginning' of the ring buffer, so when pulling many samples out it's necessary to poke at each one to find its size and possibly do some copying to reassemble wrapped samples.
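To illustrate the variable-size/wrap-around point: below is a deliberately simplified sketch of walking records out of a circular buffer. Real perf records carry a `struct perf_event_header` with a 16-bit size field; here each record is just a 1-byte length plus payload, which is enough to show why each record must be inspected individually and why wrapped records need byte-by-byte reassembly:

```python
# Simplified circular-buffer record walk (assumption: 1-byte length header,
# unlike the real perf_event_header). head/tail are monotonically increasing
# absolute offsets, taken modulo the buffer size on access.

def write_record(buf, head, payload):
    record = bytes([len(payload)]) + payload
    for b in record:
        buf[head % len(buf)] = b
        head += 1
    return head

def read_records(buf, tail, head):
    out = []
    while tail < head:
        size = buf[tail % len(buf)]   # peek this record's header first...
        tail += 1
        payload = bytes(buf[(tail + i) % len(buf)] for i in range(size))
        tail += size                  # ...then copy, handling the wrap
        out.append(payload)
    return out, tail

buf = bytearray(16)
head = tail = 0
head = write_record(buf, head, b"hello")
records, tail = read_records(buf, tail, head)    # -> [b'hello']
head = write_record(buf, head, b"wraps around")  # crosses the buffer's end
records, tail = read_records(buf, tail, head)
print(records)  # [b'wraps around']
```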
Yep. This should be added, as well as a test validating this.

While messing around with this PR's changes today I noticed that we're missing a consume-style API to force reading from the perf buffers.
In order to read 'remainder' events from perf buffers w/ 'wakeup_events > 1', an API to force reading from a program's perf buffers regardless of whether their fds are in a "ready to read" state is necessary. This commit introduces such an API, perf_reader_consume, modeled after libbpf's perf_buffer__consume. Future work will refactor bcc to use libbpf's perf buffer support as much as possible instead of duplicating functionality. For now, since iovisor#3801 was committed, let's add this piece of missing functionality.
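A toy model of the poll-vs-consume distinction (pure Python, illustrative only; see bcc's perf_reader_consume and libbpf's perf_buffer__consume for the real APIs). `poll()` reads only when the fd signaled readiness, so sub-watermark 'remainder' samples are invisible to it; `consume()` reads unconditionally:

```python
# Toy contrast of poll() vs consume() semantics. ToyPerfReader is made up;
# the readiness bookkeeping is a crude stand-in for the kernel's accounting.

class ToyPerfReader:
    def __init__(self, wakeup_events):
        self.wakeup_events = wakeup_events
        self.pending = []
        self.ready = False   # models the fd being "ready to read" (EPOLLIN)

    def kernel_output(self, sample):
        self.pending.append(sample)
        if len(self.pending) % self.wakeup_events == 0:
            self.ready = True

    def poll(self):
        """Reads only if the fd signaled readiness."""
        if not self.ready:
            return []
        self.ready = False
        drained, self.pending = self.pending, []
        return drained

    def consume(self):
        """Force-reads regardless of readiness -- drains the remainder."""
        drained, self.pending = self.pending, []
        return drained

r = ToyPerfReader(wakeup_events=10)
for i in range(13):
    r.kernel_output(i)
print(len(r.poll()))     # 13: the 10th sample raised readiness, poll drains all
print(len(r.poll()))     # 0: no new wakeup, the fd is not ready
for i in range(5):
    r.kernel_output(i)
print(len(r.poll()))     # 0: only 5 samples, below the watermark
print(len(r.consume()))  # 5: consume reads them anyway
```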
Like in iovisor#3801 and iovisor#3805, this patch adds a consume API which allows users to force reading from the perf buffer, and adds a wake_events parameter to the open_all_cpu API. Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>