
simple reproducer for mmap zfs_flush hang #17303

Open
@markwiederspahn

Description


System information

Type Version/Name
Distribution Name RHEL, Alma
Distribution Version 8
Kernel Version 4.18.0-553.51.1.el8_10
Architecture x86_64
OpenZFS Version 2.1.16, 2.2.7, 2.3.1, 2.3.2, 2.3.99-323_g246e5883b

Describe the problem you're observing

A production RHEL 8 system running Python parallel threads with mmap hung when certain jobs were run. The issue was traced to a "flush" that performs an msync(), which in turn triggers ZFS sync writes. The hang was originally seen with 2.1.16 on a 2x12-disk raidz2 array (36 cores, 644 GB RAM) and also occurred with 2.2.7. Debugging moved to an Alma 8 system with an 8-disk raidz1 (8 cores, 48 GB RAM). The hang is generally reproducible within a few minutes at n=16; sometimes two concurrent copies of n=16 are needed.

Describe how to reproduce the problem

On stock RHEL8/Alma8 with ZFS built from the tarballs with --enable-debug, install the test prerequisites: dnf install python39 python39-numpy.
Running /usr/bin/python3.9 ./memmap_flush_test.py -j 16 --flush will sometimes hang the system; running two copies at once hangs it faster.
Debug logging was enabled with echo 8191 > zfs_flags.
No ASSERTs were ever seen.
This seems related to various similar issues: #7964, #8751, and #17159. I attempted to apply the small changes from the latter to my copy of master and still get hangs; probably I'm doing something wrong and not actually picking up the changes.
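The actual memmap_flush_test.py is not included in the issue, but the pattern it describes — parallel workers dirtying memory-mapped files and forcing msync() through a flush — can be sketched roughly as below. This is a hypothetical reconstruction, not the author's script: the worker layout, file sizes, and iteration counts are assumptions; only the -j/--flush behavior is taken from the report. numpy.memmap.flush() calls msync() on the underlying mapping, which on a ZFS dataset drives the sync writeback path implicated in the hang.

```python
import os
import tempfile
from multiprocessing import Process

import numpy as np

def worker(path, do_flush, iters=4, size=1 << 20):
    """Repeatedly dirty a memory-mapped file; with do_flush, force
    msync() through numpy.memmap.flush() after each pass."""
    arr = np.memmap(path, dtype=np.uint8, mode="w+", shape=(size,))
    for i in range(iters):
        arr[:] = i & 0xFF      # dirty every page of the mapping
        if do_flush:
            arr.flush()        # msync() on the mapping -> ZFS sync writes
    del arr                    # drop the mapping

def run(jobs=16, flush=True, tmpdir=None):
    """Spawn `jobs` workers in parallel, roughly as `-j 16 --flush` would.
    Point tmpdir at a ZFS dataset to exercise the reported path."""
    with tempfile.TemporaryDirectory(dir=tmpdir) as tmp:
        procs = [Process(target=worker,
                         args=(os.path.join(tmp, f"map{n}"), flush))
                 for n in range(jobs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

if __name__ == "__main__":
    run(jobs=4, flush=True)  # small smoke run; scale jobs up on ZFS
```

To reproduce the reported conditions, the working directory (tmpdir) would need to sit on the affected ZFS dataset; on a non-ZFS filesystem this only demonstrates the mmap/msync pattern.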

Include any warning/errors/backtraces from the system logs

...
/proc/spl/kstat/zfs/dbgmsg:
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 metaslab.c:4021:metaslab_flush(): flushing: txg 17473623, spa xx1, vdev_id 0, ms_id 5231, unflushed_allocs 647168, unflushed_frees 507904, appended 280 bytes
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 dbuf.c:3358:dbuf_findbp(): error 2
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75

messages:
May 5 11:57:09 utig1 kernel: INFO: task kworker/u96:1:139110 blocked for more than 120 seconds.
May 5 11:57:09 utig1 kernel: Tainted: P OE -------- - - 4.18.0-553.51.1.el8_10.x86_64 #1
May 5 11:57:09 utig1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 5 11:57:09 utig1 kernel: task:kworker/u96:1 state:D stack:0 pid:139110 ppid:2 flags:0x80004080
May 5 11:57:09 utig1 kernel: Workqueue: writeback wb_workfn (flush-zfs-6)
May 5 11:57:09 utig1 kernel: Call Trace:
May 5 11:57:09 utig1 kernel: __schedule+0x2d1/0x870
May 5 11:57:09 utig1 kernel: schedule+0x55/0xf0
May 5 11:57:09 utig1 kernel: io_schedule+0x12/0x40
May 5 11:57:09 utig1 kernel: __lock_page+0x136/0x270
May 5 11:57:09 utig1 kernel: ? filemap_fdatawait_keep_errors+0x50/0x50
May 5 11:57:09 utig1 kernel: write_cache_pages+0x1f2/0x420
May 5 11:57:09 utig1 kernel: ? zpl_readpages+0x20/0x20 [zfs]
May 5 11:57:09 utig1 kernel: ? __wb_update_bandwidth+0x35/0x1e0
May 5 11:57:09 utig1 kernel: ? _cond_resched+0x15/0x30
May 5 11:57:09 utig1 kernel: zpl_writepages+0xaa/0x200 [zfs]
May 5 11:57:09 utig1 kernel: do_writepages+0xc6/0x1d0
May 5 11:57:09 utig1 kernel: ? ip_finish_output2+0x1a6/0x430
May 5 11:57:09 utig1 kernel: __writeback_single_inode+0x39/0x300
May 5 11:57:09 utig1 kernel: writeback_sb_inodes+0x1ea/0x450
May 5 11:57:09 utig1 kernel: __writeback_inodes_wb+0x5f/0xd0
May 5 11:57:09 utig1 kernel: wb_writeback+0x24c/0x2e0
May 5 11:57:09 utig1 kernel: ? get_nr_inodes+0x35/0x60
May 5 11:57:09 utig1 kernel: wb_workfn+0x392/0x4e0
May 5 11:57:09 utig1 kernel: process_one_work+0x1d3/0x390
May 5 11:57:09 utig1 kernel: worker_thread+0x30/0x390
May 5 11:57:09 utig1 kernel: ? process_one_work+0x390/0x390
May 5 11:57:09 utig1 kernel: kthread+0x134/0x150
May 5 11:57:09 utig1 kernel: ? set_kthread_struct+0x50/0x50
May 5 11:57:09 utig1 kernel: ret_from_fork+0x35/0x40

ps axl | grep D:

5 0 139110 2 20 0 0 0 - D ? 0:16 [kworker/u96:1+flush-zfs-6]
1 0 253108 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253109 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253113 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253118 252676 20 0 6226276 101840 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
...
