[FEA] Add a Parquet reader benchmark that uses multiple CUDA streams #12700
Description
Is your feature request related to a problem? Please describe.
Our suite of Parquet reader benchmarks covers a variety of data sources, data types, compression formats, and reader options. However, it does not include a benchmark that uses multiple CUDA streams with multiple host threads to read portions of the same dataset and maximize GPU utilization. The Spark-RAPIDS plugin relies on multi-stream Parquet reads from host buffers (using the per-thread default stream, PTDS) for the data ingest step into libcudf.
Describe the solution you'd like
We should add a libcudf microbenchmark that creates several host threads, each with its own non-default CUDA stream, and then reads a large Parquet dataset from host memory into a libcudf table. The public Parquet reader API does not currently expose a stream parameter, but development of the benchmark can begin with the read_parquet detail API. We could design this benchmark to read either one file per thread or one row group per thread, whichever is more expedient. After the read step, we may want to add a concatenation step to yield a single table (see the sketch below). It would also be useful to reuse the data generated for the other Parquet reader benchmarks, so we have a performance baseline when measuring the benefit of multi-threaded, multi-stream reads.
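A minimal sketch of the multi-threaded read-and-concatenate step is below. It assumes a build with per-thread default streams (PTDS) enabled so each host thread's read runs on its own stream, and for brevity it reads from file paths through the public `cudf::io::read_parquet` API rather than from host buffers through the detail API; the file paths, thread-per-file layout, and the `read_parquet_multithreaded` helper are hypothetical and would be replaced by the nvbench fixture in the real benchmark.

```cpp
#include <cudf/concatenate.hpp>
#include <cudf/io/parquet.hpp>
#include <cudf/table/table.hpp>
#include <cudf/table/table_view.hpp>

#include <memory>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: read one Parquet file per host thread, then
// concatenate the per-thread results into a single table.
std::unique_ptr<cudf::table> read_parquet_multithreaded(std::vector<std::string> const& paths)
{
  std::vector<cudf::io::table_with_metadata> results(paths.size());
  std::vector<std::thread> threads;

  // One host thread per file; with PTDS enabled, each thread's default
  // stream is distinct, so the reads can overlap on the GPU.
  for (std::size_t i = 0; i < paths.size(); ++i) {
    threads.emplace_back([&, i] {
      auto const opts =
        cudf::io::parquet_reader_options::builder(cudf::io::source_info{paths[i]}).build();
      results[i] = cudf::io::read_parquet(opts);
    });
  }
  for (auto& t : threads) { t.join(); }

  // Optional concatenation step to yield a single libcudf table.
  std::vector<cudf::table_view> views;
  views.reserve(results.size());
  for (auto const& r : results) { views.push_back(r.tbl->view()); }
  return cudf::concatenate(views);
}
```

Once a stream parameter is available (via the detail API or a future public overload), the lambda body could instead pass an explicit `rmm::cuda_stream_view` drawn from an `rmm::cuda_stream_pool`, which would remove the dependence on PTDS.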
Describe alternatives you've considered
The alternative is to continue relying on Spark-RAPIDS NDS runs to track the performance of libcudf's Parquet reader in a multi-threaded, multi-stream use case.