Closed
Description
Opened on Feb 22, 2024
While benchmarking some code, we found that about 5% of the run time is being lost to this line of code:
https://github.com/rapidsai/cudf/blob/branch-24.04/cpp/src/io/parquet/reader_impl.cpp#L248
This is on our perf cluster (A100) for NDS @ 3k. It explains part of a dip in perf we have seen since 23.10, which we hadn't gotten around to testing.
If I stop obtaining the value from error_code (i.e., I essentially skip the pageable memcpy), we gain 20 seconds locally. I am filing this because it may be a good idea to remove this check or look into how to improve it (would a pinned copy help?).
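To make the pinned-copy question concrete, here is a minimal sketch of the two ways a single device-side error code could be read back on the host. It is not the cudf implementation: `set_error`, `read_error_pageable`, and `read_error_pinned` are hypothetical names, and the kernel is only a stand-in for whatever writes the code during decoding. Both variants still synchronize the stream before reading the value, so whether the pinned version recovers any of the lost time would need to be measured.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stand-in for the decode kernel: it only records an error code.
__global__ void set_error(int32_t* d_error, int32_t value) { *d_error = value; }

// Pageable path: the destination is an ordinary host variable, so the
// runtime has to stage the transfer through an internal buffer.
int32_t read_error_pageable(int32_t const* d_error, cudaStream_t stream)
{
  int32_t h_error = 0;
  cudaMemcpyAsync(&h_error, d_error, sizeof(h_error), cudaMemcpyDeviceToHost, stream);
  cudaStreamSynchronize(stream);  // the value is only valid after this returns
  return h_error;
}

// Pinned path: the destination is page-locked host memory (cudaMallocHost),
// assumed to be allocated once by the caller and reused across reads.
int32_t read_error_pinned(int32_t const* d_error, int32_t* h_pinned, cudaStream_t stream)
{
  cudaMemcpyAsync(h_pinned, d_error, sizeof(*h_pinned), cudaMemcpyDeviceToHost, stream);
  cudaStreamSynchronize(stream);  // still required before reading *h_pinned
  return *h_pinned;
}
```

A caller would allocate `h_pinned` once with `cudaMallocHost`, pass it to `read_error_pinned` after each decode step, and release it with `cudaFreeHost` at teardown. Timing the two functions on the same stream is one way to check whether the pageable staging or the synchronization itself dominates.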