Coalescing of sequential read requests #999

Open
andyg24 opened this issue Nov 20, 2023 · 5 comments

andyg24 commented Nov 20, 2023

If I add 100 sequential read requests (e.g. for blocks 0-99 of a file), they appear to be coalesced and executed atomically by the NVMe device. That is, io_uring_wait_cqe_nr(1) will block until all 100 reads complete.

I see how this helps with throughput, but this can be a problem for an application that cares about latency and is ready to handle individual block reads as soon as they complete.

Is there a flag to prevent this coalescing and if not, would it make sense to add one?

Owner

axboe commented Nov 20, 2023

This isn't something that io_uring controls; it is all done by the block layer. io_uring sends off separate requests, and the lower levels may merge them. If your device is nvme0n1, you can turn merging off completely with:

# echo 2 > /sys/block/nvme0n1/queue/nomerges
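
As a companion to the command above, here is a minimal C sketch for checking the current merge policy from inside an application before relying on per-request completions. The values 0, 1 and 2 follow the kernel's documented meaning for `nomerges`; the helper names `read_queue_attr` and `merges_fully_disabled` are hypothetical, and the caller is assumed to pass the full sysfs path (for example `/sys/block/nvme0n1/queue/nomerges`).

```c
#include <assert.h>
#include <stdio.h>

/* Values documented for the block-layer "nomerges" queue attribute. */
enum nomerges_mode {
    NOMERGES_ALL_ENABLED = 0, /* default: all merge algorithms are tried */
    NOMERGES_SIMPLE_ONLY = 1, /* only simple one-hit merges are tried */
    NOMERGES_DISABLED    = 2  /* no merge algorithms are tried */
};

/*
 * Hypothetical helper: read the integer stored in a queue attribute file.
 * Returns the value (>= 0) on success, or -1 if the file cannot be opened
 * or does not start with a non-negative integer.
 */
static long read_queue_attr(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    long value = -1;
    if (fscanf(f, "%ld", &value) != 1 || value < 0)
        value = -1;

    fclose(f);
    return value;
}

/* Hypothetical helper: nonzero if the queue is set to never merge requests. */
static int merges_fully_disabled(const char *nomerges_path)
{
    return read_queue_attr(nomerges_path) == NOMERGES_DISABLED;
}
```

Reading the attribute does not require root, while writing it normally does, so a latency-sensitive application could use a check like this to warn at startup rather than change the setting itself.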

In that same directory, there are also settings for the maximum request size. So if you think that, e.g., a 128K request is fine but you don't want anything larger than that, you could do:

# echo 128 > /sys/block/nvme0n1/queue/max_sectors_kb
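
To make the effect of that cap concrete, the following is a small arithmetic sketch in C. It assumes the size cap is the only limit in play and that the block layer merges as aggressively as the cap allows, so the result is a lower bound on the number of requests, not a prediction. The function name `min_requests_for_range` and the 4 KiB block size used in the usage note are illustrative assumptions, not taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Lower bound on the number of block requests needed to cover a
 * contiguous range of total_bytes when no single request may exceed
 * max_sectors_kb kilobytes. Ignores every other limit (segment counts,
 * alignment, timing of submissions), so real request counts can be higher.
 */
static uint64_t min_requests_for_range(uint64_t total_bytes,
                                       uint64_t max_sectors_kb)
{
    assert(max_sectors_kb > 0);

    const uint64_t cap_bytes = max_sectors_kb * 1024;

    /* Ceiling division: a partial tail still needs its own request. */
    return (total_bytes + cap_bytes - 1) / cap_bytes;
}
```

For example, 100 sequential reads of 4 KiB each cover 409600 bytes; with `max_sectors_kb` set to 128, `min_requests_for_range(100 * 4096, 128)` gives 4, so the reads would complete in at least four groups rather than as one large request.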

Author

andyg24 commented Nov 20, 2023

Thanks, didn't realize there already was a setting.

andyg24 closed this as completed Nov 20, 2023
Author

andyg24 commented Dec 19, 2024

What do you think about adding an option to io_uring to control request merging, such as IOURING_SQE_NOMERGE used together with io_uring_prep_read()?

When passed to lower levels, this would override the system-wide /sys/block/nvmeX/queue/nomerges setting.

This would be useful to implement something like buffered reads with O_DIRECT-opened files from user space, to consume a file in small chunks while additional reads are pending. Currently, I do not think this is possible, as block merging would combine everything into a single request and execute it atomically.

Changing kernel-wide parameters is not always an option and may hurt performance for some applications while benefiting others. Having finer-grained control over merging would be preferable.

andyg24 reopened this Dec 19, 2024
Owner

axboe commented Dec 19, 2024

Author

andyg24 commented Dec 19, 2024

This is awesome. Would you like me to test this change? Will it work with ext4 or does that filesystem not rely on iomap?
