port txg dtrace script to ebpf #31

Merged: 1 commit into delphix:master on Feb 19, 2020

brad-lewis
Contributor

This is a port of the dtrace script:
http://grok.delphix.com/xref/dlpx-app-5.3-stage/appliance/server/service/performance_playbook/txg.d

This is a little different from the existing estat scripts, and there are a few new things that deserve extra scrutiny:

  1. Event-based profiling using the bcc BPF_PERF_OUTPUT map.
  2. A perf event, the equivalent of a dtrace profiling probe.
  3. Using the drgn Python package to read zfs_dirty_data_max.
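For reviewers unfamiliar with the first two items, here is a minimal sketch of how a BPF_PERF_OUTPUT map and a perf event fit together in bcc. It is not the code in this PR: the probe target (`spa_sync`), the struct layout, and the names `on_sync_entry`, `on_tick`, `Event`, `decode_event`, and `run` are all assumptions made for illustration.

```python
import ctypes

# BPF C program text. The probe target and every struct/function name here
# is an illustrative assumption, not taken from the script in this PR.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>
#include <uapi/linux/bpf_perf_event.h>

struct event_t {
    u64 ts_ns;   /* timestamp from bpf_ktime_get_ns() */
    u32 kind;    /* 0 = kprobe fired, 1 = periodic tick */
};

BPF_PERF_OUTPUT(events);

/* kprobe handler: push one event to user space per function entry */
int on_sync_entry(struct pt_regs *ctx)
{
    struct event_t ev = {};
    ev.ts_ns = bpf_ktime_get_ns();
    ev.kind = 0;
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}

/* perf event handler: fires at a fixed rate, like a dtrace profile probe */
int on_tick(struct bpf_perf_event_data *ctx)
{
    struct event_t ev = {};
    ev.ts_ns = bpf_ktime_get_ns();
    ev.kind = 1;
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


class Event(ctypes.Structure):
    """User-space mirror of struct event_t above."""
    _fields_ = [("ts_ns", ctypes.c_uint64), ("kind", ctypes.c_uint32)]


def decode_event(data):
    """Turn the raw pointer handed to a perf buffer callback into a tuple."""
    ev = ctypes.cast(data, ctypes.POINTER(Event)).contents
    return (ev.ts_ns, ev.kind)


def run(sample_freq_hz=1):
    """Attach both probes and print events. Needs bcc, root, and ZFS loaded."""
    from bcc import BPF, PerfType, PerfSWConfig  # imported lazily on purpose

    b = BPF(text=BPF_TEXT)
    # Assumption: spa_sync is the function of interest for txg syncs.
    b.attach_kprobe(event="spa_sync", fn_name="on_sync_entry")
    # Software CPU clock event on CPU 0 only, so ticks are not duplicated per CPU.
    b.attach_perf_event(ev_type=PerfType.SOFTWARE,
                        ev_config=PerfSWConfig.CPU_CLOCK,
                        fn_name="on_tick",
                        sample_freq=sample_freq_hz,
                        cpu=0)

    def handle(cpu, data, size):
        ts_ns, kind = decode_event(data)
        print("tick" if kind == 1 else "sync", ts_ns)

    b["events"].open_perf_buffer(handle)
    while True:
        try:
            b.perf_buffer_poll()
        except KeyboardInterrupt:
            break
```

Calling `run()` as root would print one line per event until interrupted. The perf buffer keeps the in-kernel handlers short, since all formatting happens in the user-space callback.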

Here is sample output. I still need to get on a loaded machine and verify the output when throttling and delays are an issue.

Thu Jan 23 16:21:41 2020 100482 5025ms 214ms (99 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:21:46 2020 100483 4909ms 114ms (97 pass 1) 1MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:21:51 2020 100484 5226ms 179ms (99 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:21:56 2020 100485 4972ms 75ms (96 pass 1) 1MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:02 2020 100486 5044ms 72ms (98 pass 1) 2MB ( 0) 0us 0ms 0ms
date                     txg    time since last sync
|                        |      |      sync time
|                        |      |      |    (%% pass 1)
|                        |      |      |    |           highest dirty (%%)
|                        |      |      |    |           |        highest throttle delay
|                        |      |      |    |           |        |   |   avg delay
v                        v      v      v    v           v        v   v   v
Thu Jan 23 16:22:07 2020 100487 5047ms 66ms (97 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:12 2020 100488 5053ms 83ms (98 pass 1) 1MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:17 2020 100489 5036ms 49ms (95 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:22 2020 100490 5099ms 71ms (95 pass 1) 3MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:27 2020 100491 5019ms 55ms (97 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:32 2020 100492 5064ms 260ms (99 pass 1) 1MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:37 2020 100493 4859ms 78ms (95 pass 1) 2MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:43 2020 100494 5041ms 778ms (99 pass 1) 3MB ( 0) 0us 0ms 0ms
Thu Jan 23 16:22:48 2020 100495 4341ms 142ms (98 pass 1) 3MB ( 0) 0us 0ms 0ms
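To make the column layout easier to check when comparing runs, here is a small sketch that parses one data line of the output above into named fields. The field names are my own labels based on the header legend; the eighth column has no label in the header, so it is kept under the placeholder name `col8_ms`. The units (ms, MB, us) are assumed to match the sample and may differ on a loaded system.

```python
import re

# One data row, e.g.:
# Thu Jan 23 16:21:41 2020 100482 5025ms 214ms (99 pass 1) 2MB ( 0) 0us 0ms 0ms
LINE_RE = re.compile(
    r"^(?P<date>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})\s+"
    r"(?P<txg>\d+)\s+"
    r"(?P<since_last_sync_ms>\d+)ms\s+"
    r"(?P<sync_ms>\d+)ms\s+"
    r"\(\s*(?P<pass1_pct>\d+) pass 1\)\s+"
    r"(?P<dirty_mb>\d+)MB\s+"
    r"\(\s*(?P<dirty_pct>\d+)\)\s+"
    r"(?P<max_throttle_delay_us>\d+)us\s+"
    r"(?P<col8_ms>\d+)ms\s+"          # unlabeled in the header legend
    r"(?P<avg_delay_ms>\d+)ms\s*$"
)


def parse_line(line):
    """Return a dict of fields for a data row, or None for header/legend rows."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    # Everything except the date is a plain integer.
    return {k: (v if k == "date" else int(v)) for k, v in fields.items()}
```

For example, `parse_line("Thu Jan 23 16:22:43 2020 100494 5041ms 778ms (99 pass 1) 3MB ( 0) 0us 0ms 0ms")["sync_ms"]` gives `778`, while legend rows such as `| | | sync time` return `None`.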

@brad-lewis
Contributor Author

Responding to an external discussion: I looked into using BPF.ksymname() as an alternative to the drgn approach for reading zfs_dirty_data_max. The support in BPF appears to be for functions only.
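For context, the drgn approach amounts to roughly the following. This is a sketch rather than the exact code in the PR: the helper names `open_live_kernel`, `read_dirty_data_max`, and `dirty_percent` are made up here, and the percentage calculation assumes the "highest dirty (%%)" column is dirty bytes relative to zfs_dirty_data_max.

```python
def open_live_kernel():
    """Open the running kernel with drgn. Needs root and kernel debug info."""
    import drgn  # imported lazily so the helpers below work without drgn
    return drgn.program_from_kernel()


def read_dirty_data_max(prog):
    """Read the zfs_dirty_data_max global (bytes) from a drgn Program."""
    # prog["name"] looks up a kernel symbol; value_() converts the drgn
    # Object to a plain Python value.
    return int(prog["zfs_dirty_data_max"].value_())


def dirty_percent(dirty_bytes, dirty_max_bytes):
    """Integer percentage of the dirty data limit currently in use."""
    return 100 * dirty_bytes // dirty_max_bytes
```

Typical use would be `limit = read_dirty_data_max(open_live_kernel())` once at startup, after which the limit can be passed to the reporting code. Reading it once in user space keeps the lookup out of the BPF probes entirely.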

/*
 * Collect data to know how much we're being throttled / delayed. If we're
 * throttled on every tx we could hit these probes a lot (burning CPU), so we
 * avoid using syntactic sugar in this section.
Contributor

What syntactic sugar is this referring to?

Contributor Author

I wish I had some syntactic sugar.
The spirit of the dtrace comment, a warning not to overburden this probe, seemed worth keeping.

@brad-lewis brad-lewis merged commit d6f5f96 into delphix:master Feb 19, 2020