
Implement new ZFS pool io monitoring and the option to show ZFS pools only #385

Merged

merged 8 commits into aristocratos:main from simplepad:zfs-iostat-new on Jul 11, 2022

Conversation

simplepad
Contributor

This PR implements new ZFS pool io monitoring, compatible with newer versions of ZFS, where the /proc/spl/kstat/zfs/*pool_name*/io files are not present anymore. Instead, it uses the /proc/spl/kstat/zfs/*pool_name*/objset-* files, which allow for per-dataset io stats.

Also, this PR adds the zfs_pools_only option which, if enabled, hides all datasets, shows only pools in disk listings, and counts io stats differently.
In more detail:
When zfs_pools_only is set to true:

  • In disk listings, only ZFS pools are shown
  • IO stats are calculated per pool by combining the io of all objects in the pool (all objset-* files in the pool directory, pool root included); see the sketch after these lists

When zfs_pools_only is set to false:

  • In disk listings, both ZFS pools and datasets are shown
  • IO stats are calculated per object, that is separately for ZFS pool root and each dataset (operations on datasets are not calculated towards pool root io)
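
For illustration, here is a minimal C++ sketch of the per-pool aggregation described above; it is not the actual code in this PR. It walks a pool's kstat directory, reads every objset-* file, and sums the nread/nwritten counters. The pool name "tank" and the main() wrapper are placeholders, and error handling (e.g. a missing directory) is omitted.

#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

struct ZfsIo { int64_t bytes_read = 0; int64_t bytes_written = 0; };

// Sum the counters of every objset-* file under a pool's kstat directory.
// Each objset file holds "name type value" rows; nread/nwritten carry the
// cumulative bytes read/written for that dataset.
ZfsIo collect_pool_total(const fs::path& pool_kstat_dir) {
    ZfsIo total;
    for (const auto& entry : fs::directory_iterator(pool_kstat_dir)) {
        const std::string fname = entry.path().filename().string();
        if (fname.rfind("objset-", 0) != 0) continue;        // only objset-* files
        std::ifstream file(entry.path());
        std::string line;
        while (std::getline(file, line)) {
            std::istringstream row(line);
            std::string name; int type; int64_t value;
            if (!(row >> name >> type >> value)) continue;   // skip header / non-numeric rows
            if (name == "nread") total.bytes_read += value;
            else if (name == "nwritten") total.bytes_written += value;
        }
    }
    return total;
}

int main() {
    ZfsIo io = collect_pool_total("/proc/spl/kstat/zfs/tank");   // "tank" is a placeholder pool name
    std::cout << "read: " << io.bytes_read << " B, written: " << io.bytes_written << " B\n";
}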

Closes #383
Partially closes #380; pool health monitoring isn't implemented

simplepad added 3 commits July 1, 2022 19:30
…ions, now it uses files "/proc/spl/kstat/zfs/*pool_name*/objset*". Needs additional work to be compatible with the option "zfs_pools_only".
…pool's stat filepath points to the objset-* file when the option is disabled, otherwise it points to the pool's stats directory. Made ZFS total pool stat collection into a separate function for clean code. Also removed an unnecessary variable in the default ZFS stat collection, and changed io_ticks to track the number of reads/writes, reducing unnecessary calculations.
simplepad added 2 commits July 5, 2022 00:20
… declarations to btop_collect.cpp so they aren't included when compiling for macos and freebsd
@trentbuck

FYI

I tested Debian 11 / Linux 5.16.0-0.bpo.4-amd64 / ZFS 2.1.4-1~bpo11+1.
With Use fstab = False,

  1. simplepad@4969dd8 shows I/O sparklines for ZFS datasets
  2. 6b1b9f8 does not show I/O sparklines for ZFS datasets

With Zfs pools only = True, my datasets disappear, but my pool doesn't appear.
I saw this on three hosts, all set up roughly along the lines of https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/Debian%20Bullseye%20Root%20on%20ZFS.html
They import the pool from inside the initramfs-tools initrd (before zed is running).
They run refind (not grub), and do not have a bpool. Instead /boot is rsynced into /boot/efi/EFI/debian via this crap https://github.com/trentbuck/flash-kernel-efi/tree/ansible

@simplepad
Contributor Author

@trentbuck Does it show your pool when Zfs pools only = False?

@simplepad
Contributor Author

Also, does anything come up in your ~/.config/btop/btop.log?

@trentbuck

@simplepad I don't ever see the pool in btop; both zfs_pools_only = False and zfs_pools_only = True in ~/.config/btop/btop.conf just show/hide the datasets. Earlier, I wrongly thought it meant "show/hide the pool" rather than "show/hide datasets". I did not edit btop.conf directly; I used m > Options > mem from inside btop.

For the avoidance of doubt, I do see the pool in zpool list, and I see the datasets in zfs list.

My btop.log has some entries but they look boring.
btop.log

I tested both as my regular user and as root; I did not see any difference in behaviour. My regular user has normal Debian/GNOME defaults: sudo and polkit privileges, but no zfs-allow(8) privileges.
I can't think of anything else "weird" that would cause a pool to not show up.
I haven't actually straced btop or read the code, yet.

@trentbuck

PS: here is grep -rnH ^ /proc/spl/kstat/zfs run as the same regular user that runs btop:
grep.typescript.txt
The pool's name is hera, and the objset files are clearly readable.

I wonder if the issue is that my pool's root dataset is not mounted?

$ zfs list
NAME                                    USED  AVAIL     REFER  MOUNTPOINT
hera                                    152G  72.5G      192K  none
hera/goll                              9.43G  6.57G     9.43G  /srv/backup/goll
hera/hera                               108G  72.5G     8.31G  /
hera/hera/home                         87.4G  72.5G      320K  /home
hera/hera/home/twb                     87.4G  12.9G     51.1G  /home/twb
hera/hera/var                          2.36G  72.5G      320K  /var
hera/hera/var/cache                    1.99G  72.5G     1.18G  /var/cache
hera/hera/var/tmp                       380M  7.71G      294M  /var/tmp
hera/not-backed-up                     34.7G  72.5G      192K  none
hera/not-backed-up/apt-cacher-ng       9.59G  2.41G     9.59G  /var/cache/apt-cacher-ng
hera/not-backed-up/tweak.prisonpc.com  25.1G  22.9G     4.60G  /var/lib/libvirt/images/twoke.d

@simplepad
Contributor Author

simplepad commented Jul 6, 2022

@trentbuck Can you confirm that your pool appears when using btop 1.2.8? That way we can be sure that the issue is caused by my PR.

I wonder if the issue is that my pool's root dataset is not mounted?

That might be it. Btop is getting disks for listing from /etc/fstab, /etc/mtab, and /proc/self/mounts, so if it's not mounted, it won't be shown.
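
To make that concrete, here is a minimal sketch (not btop's actual code) of how a disk list driven by /proc/self/mounts behaves: only filesystems that appear there become candidates, so a pool whose root dataset has mountpoint=none never shows up.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Collect zfs entries from /proc/self/mounts ("<source> <mountpoint> <fstype> ...").
// A pool or dataset without a mountpoint never appears here, so it never
// enters the disk list in the first place.
std::vector<std::pair<std::string, std::string>> zfs_mounts() {
    std::vector<std::pair<std::string, std::string>> out;
    std::ifstream mounts("/proc/self/mounts");
    std::string line;
    while (std::getline(mounts, line)) {
        std::istringstream row(line);
        std::string source, mountpoint, fstype;
        if (!(row >> source >> mountpoint >> fstype)) continue;
        if (fstype == "zfs") out.emplace_back(source, mountpoint);
    }
    return out;
}

int main() {
    for (const auto& [dataset, mountpoint] : zfs_mounts())
        std::cout << dataset << " -> " << mountpoint << '\n';
}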

Earlier, I wrongly thought it meant "show/hide the pool" rather than "show/hide datasets".

Yeah, I think the option's name and description aren't great currently; I want to make them clearer before merging.

@trentbuck

Ah, sorry - I was unclear again.
I have never seen btop show my pool on any host (I tried 3 different hosts, all Debian 11).
I (incorrectly?) thought that was expected on v1.2.8 because, prior to your PR, btop was looking in the wrong place?

With btop v1.2.8, I don't see "hera" (the pool) listed under disks; I see datasets with only Used/Free (no I/O).
Screenshot from 2022-07-07 06-58-01 - btop v1 2 8
With your commit, I see I/O tickers, but still no pool
Screenshot from 2022-07-07 07-00-08 btop v1 2 8-5-g4969dd8

@trentbuck

That might be it. Btop is getting disks for listing from /etc/fstab, /etc/mtab, and /proc/self/mounts, so if it's not mounted, it won't be shown.

I think I misunderstood this also. I was expecting "zfs_pools_only" to show me the same data as "zpool iostat 2", i.e. aggregate I/O data for the whole pool, regardless of whether any of the datasets are mounted.

For example, if I have four VMs (A B C D), and each one gets a zvol dataset that is handed directly to a VM (qemu -hda /dev/zvol/hera/A), I might not have any datasets mounted on the host OS. In that case I would still like/expect btop to show me I/O and capacity stats for either each zvol dataset (with zfs_pools_only=False, a la zfs list) or the aggregate for the whole pool (with zfs_pools_only=True, a la zpool list and zpool iostat 2).

Does that make sense?
Sorry if I am asking for too much!

@trentbuck

PS: I have use_fstab = False because otherwise I don't see any ZFS datasets at all.
My fstab only has legacy crap in it -- /boot (ext4), /boot/efi (vfat), &c.
All my ZFS mounts are specified as attributes of the ZFS datasets and get mounted automatically via /usr/lib/systemd/system-generators/zfs-mount-generator, which creates systemd .mount units and bypasses /etc/fstab entirely. They do show up in /proc/self/mountinfo and /proc/self/mounts. Per /lib/tmpfiles.d/debian.conf, /etc/mtab is a symlink to ../proc/self/mounts.

$ grep -c zfs /etc/fstab /etc/mtab /proc/self/mountinfo /proc/self/mounts
/etc/fstab:0
/etc/mtab:6
/proc/self/mountinfo:6
/proc/self/mounts:6

@simplepad
Contributor Author

simplepad commented Jul 6, 2022

I get what you want, but currently btop acts like this: no mountpoint == disk won't be detected and won't be shown.

The pool's name is hera, and the objset files are clearly readable.

I looked closely at the output of your grep -rnH ^ /proc/spl/kstat/zfs command, and I couldn't find the hera root objset file in there.

I think your request is getting into the territory of #108, which is outside the scope of my PR.

…zfs_collect_pool_total_stats(), use try-catch to prevent possible crashes from int_64t conversions
@aristocratos
Owner

@simplepad
Nice work!

Do you feel this PR is ready or do you wanna do more testing?

The logging functions should be ok to keep if you change them from warning to debug, that way we can still get some useful output if someone is having issues.
If you want to keep them as warnings I would ask that you set some limiter on them so we don't spam the logfile in case of errors.

@simplepad
Contributor Author

@aristocratos

The logging functions should be ok to keep if you change them from warning to debug, that way we can still get some useful output if someone is having issues.

We could replace them with debug calls, but users won't know something is wrong until they change their log level. On the other hand, it's not like they would randomly want to check the log file, since as far as I know, btop has no error indication in the UI, so I'm not against hiding it behind debug.

Do you feel this PR is ready or do you wanna do more testing?

The only thing that's stopping me from marking this PR ready to review is that the zfs_pools_only option's name and description might be a little confusing and hard to understand, as shown by trentbuck in their comments above. Maybe we should change it to zfs_hide_datasets?
Also, this PR changes the ZFS io collection to be compatible with the new OpenZFS versions, but technically, if the older files we use in btop 1.2.8 are still present and the zfs_pools_only option is enabled, it's much faster to use them, because then we only need to read one file (as opposed to the new version, where we read all the objset files that comprise the pool and calculate the sum of all of their stats). Should I keep the old way to read stats around for the odd occasion that someone with an older version of OpenZFS will be able to use it? I am not sure how much faster it is and how much of a benefit that is.
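
For comparison, a rough sketch of reading the pre-2.1 "old way" file, assuming the kstat_io layout of a header row of column names (nread, nwritten, reads, writes, ...) followed by a single row of values. This is not the btop 1.2.8 code itself, and the pool name "tank" is a placeholder.

#include <cstdint>
#include <fstream>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Read the single per-pool io file: locate the row of column names and pair it
// with the one row of values that follows.
std::map<std::string, int64_t> read_old_pool_io(const std::string& path) {
    std::map<std::string, int64_t> result;
    std::ifstream file(path);
    std::string line;
    std::vector<std::string> columns;
    while (std::getline(file, line)) {
        if (line.find("nread") == std::string::npos) continue;   // not the header row yet
        std::istringstream header(line);
        for (std::string col; header >> col;) columns.push_back(col);
        break;
    }
    if (std::getline(file, line)) {                               // the single value row
        std::istringstream values(line);
        int64_t v;
        for (const auto& col : columns)
            if (values >> v) result[col] = v;
    }
    return result;
}

int main() {
    auto io = read_old_pool_io("/proc/spl/kstat/zfs/tank/io");    // "tank" is a placeholder pool name
    std::cout << io["nread"] << " bytes read, " << io["nwritten"] << " bytes written\n";
}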

@trentbuck

zfs_pools_only option's name and description might be a little confusing and hard to understand, as shown by trentbuck in their comments above. Maybe we should change it to zfs_hide_datasets?

+1 for zfs_hide_datasets, since that reflects the current semantics.
(Or, keep the variable name, but change the semantics to match :-)

Also, this PR changes the ZFS io collection to be compatible with the new OpenZFS versions, but technically, if the older files we use in btop 1.2.8 are still present and the zfs_pools_only option is enabled, it's much faster to use them, because then we only need to read one file (as opposed to the new version, where we read all the objset files that comprise the pool and calculate the sum of all of their stats). Should I keep the old way to read stats around for the odd occasion that someone with an older version of OpenZFS will be able to use it? I am not sure how much faster it is and how much of a benefit that is.

Do you know what version of OpenZFS removed the "old way" files? From the below, I guess it's in 2.1.x.

  • Debian 11 / ZFS 2.1.4 --- only /proc/spl/kstat/zfs/*morpheus*/objset*
  • Debian 10 / ZFS 2.0.3 --- both
  • Debian 9 / ZFS 0.7.12 --- only /proc/spl/kstat/zfs/morpheus/io

@simplepad
Contributor Author

@trentbuck

Do you know what version of OpenZFS removed the "old way" files? From the below, I guess it's in 2.1.x.

Well, here's the commit that removes it: openzfs/zfs@371f88d, so yeah, probably 2.1.0.

Debian 11 / ZFS 2.1.4 --- only /proc/spl/kstat/zfs/morpheus/objset*

Strange, I have a Debian 11 server with ZFS installed and the stable version there is 2.0.3

@aristocratos
Owner

@simplepad

We could replace them with debug calls, but users won't know something is wrong until they change their log level. On the other hand, it's not like they would randomly want to check the log file, since as far as I know, btop has no error indication in the UI, so I'm not against hiding it behind debug.

Well, an indication would be that the zfs io isn't showing up, hopefully leading them to create a bug report. And then if they haven't provided a debug logfile we can always ask for one.

I've been thinking about adding some more functionality to the logger, for example optional parameters "log_identifier" and "max_reported", to make it easy to mute repeated log entries with the same "log_identifier".
That way warnings can still be generated for non-critical issues without filling up the logfile and adding unnecessary io operations and processing for those who choose to ignore them.
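
A rough sketch of that idea (the function name, parameters, and behaviour are hypothetical, not btop's actual Logger interface): warnings sharing a log_identifier are counted and muted once max_reported is reached.

#include <iostream>
#include <string>
#include <unordered_map>

// Warnings that share a log_identifier are counted; once max_reported is
// reached, a final "suppressed" notice is written and further repeats are muted.
void log_warning(const std::string& msg,
                 const std::string& log_identifier = "",
                 int max_reported = -1) {
    static std::unordered_map<std::string, int> reported;
    if (!log_identifier.empty() && max_reported > 0) {
        int& count = ++reported[log_identifier];
        if (count > max_reported) return;                          // already muted
        if (count == max_reported) {
            std::cerr << "WARNING: " << msg << " (further '" << log_identifier
                      << "' warnings suppressed)\n";
            return;
        }
    }
    std::cerr << "WARNING: " << msg << '\n';
}

int main() {
    for (int i = 0; i < 10; ++i)
        log_warning("failed to parse objset file", "zfs_objset_parse", 3);   // prints 3 lines total
}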

The only thing that's stopping me from marking this PR ready to review is that the zfs_pools_only option's name and description might be a little confusing and hard to understand, as shown by trentbuck in their comments above. Maybe we should change it to zfs_hide_datasets?

I haven't really used zfs, so I can't really tell what the best terminology for a regular user would be, but in general I would guess having "hide" in the name rather than "only" is probably better. There are several options already in the program that should probably be changed to better clarify what they actually do...

Also, this PR changes the ZFS io collection to be compatible with the new OpenZFS versions, but technically, if the older files we use in btop 1.2.8 are still present and the zfs_pools_only option is enabled, it's much faster to use them, because then we only need to read one file (as opposed to the new version, where we read all the objset files that comprise the pool and calculate the sum of all of their stats). Should I keep the old way to read stats around for the odd occasion that someone with an older version of OpenZFS will be able to use it? I am not sure how much faster it is and how much of a benefit that is.

Unless we are talking 100+ files being read compared to the "old" way, I don't think it will be noticeable performance-wise. But I would say it's up to you if you wanna keep both ways.
You could do some simple testing with just the timer shown for "mem" in debug mode and see if you notice any discernible difference between the two ways and if it's worth keeping both.

@simplepad
Contributor Author

Although I don't really have that many datasets, I did some testing and there wasn't any noticeable difference between the two ways. After all, the files we're reading are stored in /proc, which is a virtual filesystem kept in memory, so the reads are probably fast enough that we can leave the code as is.

Marking this PR as Ready for review.

@simplepad simplepad marked this pull request as ready for review July 11, 2022 16:52
Owner

@aristocratos aristocratos left a comment


Great work man!

And thank you for the continued contributions!

@aristocratos aristocratos merged commit b485964 into aristocratos:main Jul 11, 2022
@simplepad simplepad deleted the zfs-iostat-new branch July 11, 2022 19:16
@trentbuck

Strange, I have a Debian 11 server with ZFS installed and the stable version there is 2.0.3

It's in bullseye-backports: https://packages.debian.org/search?keywords=zfs-dkms

You can put something like this in /etc/apt/preferences.d/zfs-is-cool:

Explanation: use Linux 5.14+ (not 5.10)
Package: src:linux src:linux-signed-amd64 src:zfs-linux
Pin: release a=bullseye-backports
Pin-Priority: 500

@trentbuck

Unless we are talking 100+ files being read compared to the "old" way, I don't think it will be noticeable performance-wise.

FWIW, I think 10 is typical for a single-user laptop, and 100 to 1000 is typical for a multi-user server.

AFAICT you get one objset file for each mounted dataset. On a server with 20 users (thus at least 40 datasets mounted -- one for each $HOME and one for each $MAIL), I see:

cyber@heavy:~$ zfs list | wc -l
109
cyber@heavy:~$ mount -t zfs | wc -l
95
cyber@heavy:~$ ls -l /proc/spl/kstat/zfs/*/*objset* | wc -l
95

On a single-user laptop, those numbers are all 6.
On a backup server, 9 objsets/mounts, and 66 datasets (because ZFS doesn't need to decrypt or mount the incremental backups).

At the default refresh rate of 0.5Hz, I doubt 100 files is a big deal.
In any case, the "old way" will be gone from all ZFS systems eventually.
If I/you/we are worried about performance, we can talk to the ZFS developers -- maybe they will add an efficient path just for btop :-)

@simplepad
Contributor Author

@trentbuck I see, we should be good then.

@dm17

dm17 commented Mar 26, 2024

I'm wondering if ZFS users would like to see ZFS errors in btop, or if that is useless as they monitor those another way... I figure this would be a decent place to ask.
