Low IOPS during phase 5 due to synchronous reads #1516
Description
Hi!
TG, during the execution phase, seems to do all of its I/O synchronously (i.e. one request at a time), which prevents the OS and the disk from optimizing seek order and reducing overall random I/O latency.
The issue is particularly visible on cloud instances, where disks are generally network-attached block storage, which adds significant per-request latency even when SSDs are requested (high random IOPS, but still high latency because of the network layer).
See for example this documentation: https://cloud.google.com/compute/docs/disks/performance. Note that the problem also applies on bare metal, where per-request latency still exists.
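For a rough sense of the ceiling this imposes (Little's law, illustrative numbers only): sustainable `IOPS ≈ queue depth / per-request latency`. With a queue depth of ~1 and ~10 ms per random read on an HDD, that caps out around 100 IOPS no matter how fast the disk's sequential bandwidth is, while a queue depth of 128 gives the scheduler 128 requests it can sort by position on disk.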
The solution (if the logic permits) would be to have multiple lightweight threads requesting (prefetching) the different pieces of data concurrently, so that the OS I/O queue can reorder and batch the disk accesses; a minimal sketch of the idea follows below.
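As a rough sketch of the pattern (Go used purely for illustration; this is not TG's code, and the file name `data.bin`, chunk size, and worker count are hypothetical placeholders): a small pool of readers issuing independent positional reads keeps many requests in flight, so the kernel's I/O scheduler can reorder them.

```go
// prefetch.go — illustrative sketch only, not TG's actual implementation.
package main

import (
	"fmt"
	"io"
	"log"
	"os"
	"sync"
)

const (
	chunkSize = 64 << 10 // 64 KiB per read request (assumed)
	workers   = 32       // in-flight reads; tune to the device queue depth
)

func main() {
	f, err := os.Open("data.bin") // hypothetical data file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}

	offsets := make(chan int64)
	var wg sync.WaitGroup

	// Each worker issues independent positional reads (pread), so the
	// kernel sees many outstanding requests and can reorder them to
	// minimize seeks, instead of servicing one request at a time.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, chunkSize)
			for off := range offsets {
				if _, err := f.ReadAt(buf, off); err != nil && err != io.EOF {
					log.Printf("read at %d: %v", off, err)
				}
				// ...hand the chunk to the execution logic here...
			}
		}()
	}

	for off := int64(0); off < info.Size(); off += chunkSize {
		offsets <- off
	}
	close(offsets)
	wg.Wait()
	fmt.Println("prefetch complete")
}
```

Even a modest pool (a few dozen outstanding requests) should be enough to refill the queue depth observed during the earlier phases.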
I am running the execution step right now locally on an HDD, and I am seeing a read throughput of ~100 kB/s, ~27 read IOPS, and an I/O queue depth of ~1.9. Previous phases were able to fill the I/O queue to ~128. (Measured with `iostat -xt 10` on a drive used exclusively for TG's data.)
Activity