Coalescing of sequential read requests #999
Comments
This isn't something that io_uring controls; it is all done by the block layer. io_uring sends off separate requests, and the lower levels may merge them. If your device is nvme0n1, you can turn merging off completely through the nomerges setting in /sys/block/nvme0n1/queue/.
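As a minimal sketch of that setting, assuming the standard block-layer nomerges attribute (0 allows all merges, 1 allows only simple one-hit merges, 2 disables merge lookups entirely): the function names and the SYSFS_BLOCK override below are hypothetical helpers for illustration, not part of liburing or the kernel.

```shell
#!/bin/sh
# Sketch: inspect and change request merging for one block device via sysfs.
# SYSFS_BLOCK is a hypothetical override used only so the functions can be
# tried against a scratch directory; it defaults to the real /sys/block.

# Print the current nomerges value for a device (e.g. nvme0n1).
get_nomerges() {
    cat "${SYSFS_BLOCK:-/sys/block}/$1/queue/nomerges"
}

# Set nomerges for a device.
#   0 = all merges allowed (default)
#   1 = only simple one-hit merges
#   2 = no merging attempted
set_nomerges() {
    dev=$1
    mode=$2
    case "$mode" in
        0|1|2) ;;
        *) echo "nomerges must be 0, 1 or 2" >&2; return 1 ;;
    esac
    printf '%s\n' "$mode" > "${SYSFS_BLOCK:-/sys/block}/$dev/queue/nomerges"
}

# Usage (writing to a real device needs root):
#   set_nomerges nvme0n1 2
```

The setting applies to every user of the device and does not persist across reboots, so it is worth recording the previous value with get_nomerges before changing it.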
In that same directory, there are also settings for the maximum request size. So if you think that, e.g., a 128K request is fine but you don't want anything larger than that, you can cap the request size there.
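A sketch of such a cap, assuming the attribute meant here is max_sectors_kb (the block layer's per-request size limit in KiB, which cannot exceed the read-only max_hw_sectors_kb): the function name and the SYSFS_BLOCK override are again hypothetical.

```shell
#!/bin/sh
# Sketch: cap the maximum request size for one block device via sysfs.
# SYSFS_BLOCK is a hypothetical override for trying this against a scratch
# directory; it defaults to the real /sys/block.

# Limit requests on a device to at most the given number of KiB.
set_max_request_kb() {
    dev=$1
    kb=$2
    queue="${SYSFS_BLOCK:-/sys/block}/$dev/queue"

    # Refuse values above the hardware limit when that limit is readable.
    if [ -r "$queue/max_hw_sectors_kb" ]; then
        hw=$(cat "$queue/max_hw_sectors_kb")
        if [ "$kb" -gt "$hw" ]; then
            echo "requested ${kb}K exceeds hardware limit ${hw}K" >&2
            return 1
        fi
    fi

    printf '%s\n' "$kb" > "$queue/max_sectors_kb"
}

# Usage (writing to a real device needs root):
#   set_max_request_kb nvme0n1 128
```

Unlike disabling merges outright, a size cap still lets small adjacent reads combine, but only up to the chosen limit.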
Thanks, I didn't realize there was already a setting.
What do you think about adding an option to io_uring to control request merging, such as an IOURING_SQE_NOMERGE flag used together with io_uring_prep_read()? When passed to the lower levels, it would override the system-wide /sys/block/nvmeX/queue/nomerges setting for that request. This would be useful for implementing something like buffered reads on O_DIRECT-opened files from user space: consuming a file in small chunks while additional reads are still pending. Currently I do not think this is possible, as block merging would combine everything into a single request and execute it atomically. Changing kernel-wide parameters is not always an option, and it may hurt performance for some applications while benefiting others. Finer-grained control over merging would be preferable.
This is awesome. Would you like me to test this change? Will it work with ext4, or does that filesystem not rely on iomap?
If I add 100 sequential read requests (e.g. for blocks 0-99 of a file), they appear to be coalesced and executed atomically by the NVMe device. That is, io_uring_wait_cqe_nr(1) will block until all 100 reads complete.
I see how this helps with throughput, but it can be a problem for an application that cares about latency and is ready to handle individual block reads as soon as they complete.
Is there a flag to prevent this coalescing? If not, would it make sense to add one?