Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Arrow Interop Benchmarks #17194

Open
wants to merge 24 commits into
base: branch-24.12
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
974bfd6
added arrow interop benchmarks; to_arrow_host and to_arrow_device
lamarrr Oct 28, 2024
62a1794
Merge remote-tracking branch 'upstream/branch-24.12' into arrow-inter…
lamarrr Oct 28, 2024
0568c38
fixed cmake formatting
lamarrr Oct 28, 2024
6b3fe24
updated interop benchmarks
lamarrr Nov 1, 2024
44825ea
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 1, 2024
dd437eb
updated interop benchmarks
lamarrr Nov 4, 2024
81a2046
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 4, 2024
314d548
updated arrow interop benchmarks
lamarrr Nov 5, 2024
56102d5
Merge branch 'arrow-interop-benchmarks' of https://github.com/lamarrr…
lamarrr Nov 5, 2024
0740ae6
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 5, 2024
0d5062c
fixed arrow array source
lamarrr Nov 5, 2024
7f75562
Merge remote-tracking branch 'upstream/branch-24.12' into arrow-inter…
lamarrr Nov 5, 2024
eda0ff8
fixed struct and list arrrow interop benchmark
lamarrr Nov 6, 2024
a015526
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 6, 2024
1e36256
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 6, 2024
1807940
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 7, 2024
1b136dc
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 11, 2024
e476ff6
removed unused arrow detail header
lamarrr Nov 13, 2024
4e405ff
added support for multiple columns and added decimal32 and 64 to arro…
lamarrr Nov 14, 2024
212d914
Update cpp/benchmarks/interop/interop.cpp
lamarrr Nov 15, 2024
2a6edcd
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 15, 2024
dd64271
fixed num_rows type in arrow interop benchmarks
lamarrr Nov 15, 2024
07a0916
Merge branch 'branch-24.12' into arrow-interop-benchmarks
lamarrr Nov 18, 2024
f59f349
Update cpp/benchmarks/interop/interop.cpp
lamarrr Nov 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions cpp/benchmarks/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,12 @@ ConfigureNVBench(
ConfigureBench(HASHING_BENCH hashing/partition.cpp)
ConfigureNVBench(HASHING_NVBENCH hashing/hash.cpp)

# ##################################################################################################
# * interop benchmark ------------------------------------------------------------------------------
ConfigureNVBench(INTEROP_NVBENCH interop/interop.cpp)
target_link_libraries(INTEROP_NVBENCH PRIVATE nanoarrow)
target_include_directories(INTEROP_NVBENCH PRIVATE ${CMAKE_SOURCE_DIR}/tests/interop)

# ##################################################################################################
# * merge benchmark -------------------------------------------------------------------------------
ConfigureBench(MERGE_BENCH merge/merge.cpp)
Expand Down
238 changes: 238 additions & 0 deletions cpp/benchmarks/interop/interop.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,238 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <benchmarks/common/generate_input.hpp>
#include <benchmarks/common/table_utilities.hpp>

#include <cudf/interop.hpp>

#include <nanoarrow/nanoarrow.hpp>
#include <nanoarrow/nanoarrow_device.h>
#include <nanoarrow_utils.hpp>
#include <nvbench/nvbench.cuh>

#include <algorithm>
#include <vector>

template <cudf::type_id data_type>
void BM_to_arrow_device(nvbench::state& state, nvbench::type_list<nvbench::enum_type<data_type>>)
{
auto const num_rows = static_cast<cudf::size_type>(state.get_int64("num_rows"));
int32_t const num_columns = 1;
lamarrr marked this conversation as resolved.
Show resolved Hide resolved
auto const num_elements = static_cast<int64_t>(num_rows) * num_columns;

std::vector<cudf::type_id> types;

std::fill_n(std::back_inserter(types), num_columns, data_type);
Comment on lines +40 to +42
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: similar suggestion at other places too.

Suggested change
std::vector<cudf::type_id> types;
std::fill_n(std::back_inserter(types), num_columns, data_type);
std::vector<cudf::type_id> types(num_columns, data_type);

is this better than using fill_n with back_inserter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't trust std::vector's constructors, they easily become ambiguous and do the wrong thing


auto const table = create_random_table(types, row_count{num_rows});
int64_t const size_bytes = estimate_size(table->view());

state.add_element_count(num_elements, "num_elements");
state.add_global_memory_reads(size_bytes);
state.add_global_memory_writes(size_bytes);

state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
cudf::to_arrow_device(table->view(), rmm::cuda_stream_view{launch.get_stream()});
});
}

template <cudf::type_id data_type>
void BM_to_arrow_host(nvbench::state& state, nvbench::type_list<nvbench::enum_type<data_type>>)
{
auto const num_rows = static_cast<cudf::size_type>(state.get_int64("num_rows"));
int32_t const num_columns = 1;
auto const num_elements = static_cast<int64_t>(num_rows) * num_columns;

std::vector<cudf::type_id> types;

std::fill_n(std::back_inserter(types), num_columns, data_type);

auto const table = create_random_table(types, row_count{num_rows});
int64_t const size_bytes = estimate_size(table->view());

state.add_element_count(num_elements, "num_elements");
state.add_global_memory_reads(size_bytes);
state.add_global_memory_writes(size_bytes);

state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
cudf::to_arrow_host(table->view(), rmm::cuda_stream_view{launch.get_stream()});
});
}

template <cudf::type_id data_type>
void BM_from_arrow_device(nvbench::state& state, nvbench::type_list<nvbench::enum_type<data_type>>)
{
auto const num_rows = static_cast<cudf::size_type>(state.get_int64("num_rows"));
auto const num_columns = 1;
auto const num_elements = static_cast<int64_t>(num_rows) * num_columns;

std::vector<cudf::type_id> types;

std::fill_n(std::back_inserter(types), num_columns, data_type);

data_profile profile;
profile.set_struct_depth(1);
profile.set_list_depth(1);

auto const table = create_random_table(types, row_count{num_rows}, profile);
cudf::table_view table_view = table->view();
int64_t const size_bytes = estimate_size(table_view);

cudf::column_metadata column_metadata{""};
std::vector<cudf::column_metadata> children_metadata;
std::fill_n(std::back_inserter(children_metadata),
table->get_column(0).num_children(),
cudf::column_metadata{""});
column_metadata.children_meta = children_metadata;
std::vector<cudf::column_metadata> table_metadata;
table_metadata.push_back(column_metadata);

cudf::unique_schema_t schema = cudf::to_arrow_schema(table_view, table_metadata);
cudf::unique_device_array_t input = cudf::to_arrow_device(table_view);

state.add_element_count(num_elements, "num_elements");
state.add_global_memory_reads(size_bytes);
state.add_global_memory_writes(size_bytes);

state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
cudf::from_arrow_device_column(
schema.get(), input.get(), rmm::cuda_stream_view{launch.get_stream()});
});
}

template <cudf::type_id data_type>
void BM_from_arrow_host(nvbench::state& state, nvbench::type_list<nvbench::enum_type<data_type>>)
{
auto const num_rows = static_cast<cudf::size_type>(state.get_int64("num_rows"));
auto const num_columns = 1;
auto const num_elements = static_cast<int64_t>(num_rows) * num_columns;

std::vector<cudf::type_id> types;

std::fill_n(std::back_inserter(types), num_columns, data_type);

data_profile profile;
profile.set_struct_depth(1);
profile.set_list_depth(1);

auto const table = create_random_table(types, row_count{num_rows}, profile);
cudf::table_view table_view = table->view();
int64_t const size_bytes = estimate_size(table_view);

cudf::column_metadata column_metadata{""};
std::vector<cudf::column_metadata> children_metadata;
std::fill_n(std::back_inserter(children_metadata),
table->get_column(0).num_children(),
cudf::column_metadata{""});
column_metadata.children_meta = children_metadata;
std::vector<cudf::column_metadata> table_metadata;
table_metadata.push_back(column_metadata);

cudf::unique_schema_t schema = cudf::to_arrow_schema(table_view, table_metadata);
cudf::unique_device_array_t input = cudf::to_arrow_host(table_view);

state.add_element_count(num_elements, "num_elements");
state.add_global_memory_reads(size_bytes);
state.add_global_memory_writes(size_bytes);

state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
cudf::from_arrow_host_column(
schema.get(), input.get(), rmm::cuda_stream_view{launch.get_stream()});
});
}

using data_types = nvbench::enum_type_list<cudf::type_id::INT8,
cudf::type_id::INT16,
cudf::type_id::INT32,
cudf::type_id::INT64,
cudf::type_id::UINT8,
cudf::type_id::UINT16,
cudf::type_id::UINT32,
cudf::type_id::UINT64,
cudf::type_id::FLOAT32,
cudf::type_id::FLOAT64,
cudf::type_id::BOOL8,
cudf::type_id::TIMESTAMP_SECONDS,
cudf::type_id::TIMESTAMP_MILLISECONDS,
cudf::type_id::TIMESTAMP_MICROSECONDS,
cudf::type_id::TIMESTAMP_NANOSECONDS,
cudf::type_id::DURATION_SECONDS,
cudf::type_id::DURATION_MILLISECONDS,
cudf::type_id::DURATION_MICROSECONDS,
cudf::type_id::DURATION_NANOSECONDS,
// cudf::type_id::DICTIONARY32,
lamarrr marked this conversation as resolved.
Show resolved Hide resolved
cudf::type_id::STRING,
cudf::type_id::LIST,
cudf::type_id::DECIMAL128,
cudf::type_id::STRUCT>;

static char const* stringify_type(cudf::type_id value)
{
switch (value) {
case cudf::type_id::INT8: return "INT8";
case cudf::type_id::INT16: return "INT16";
case cudf::type_id::INT32: return "INT32";
case cudf::type_id::INT64: return "INT64";
case cudf::type_id::UINT8: return "UINT8";
case cudf::type_id::UINT16: return "UINT16";
case cudf::type_id::UINT32: return "UINT32";
case cudf::type_id::UINT64: return "UINT64";
case cudf::type_id::FLOAT32: return "FLOAT32";
case cudf::type_id::FLOAT64: return "FLOAT64";
case cudf::type_id::BOOL8: return "BOOL8";
case cudf::type_id::TIMESTAMP_DAYS: return "TIMESTAMP_DAYS";
case cudf::type_id::TIMESTAMP_SECONDS: return "TIMESTAMP_SECONDS";
case cudf::type_id::TIMESTAMP_MILLISECONDS: return "TIMESTAMP_MILLISECONDS";
case cudf::type_id::TIMESTAMP_MICROSECONDS: return "TIMESTAMP_MICROSECONDS";
case cudf::type_id::TIMESTAMP_NANOSECONDS: return "TIMESTAMP_NANOSECONDS";
case cudf::type_id::DURATION_DAYS: return "DURATION_DAYS";
case cudf::type_id::DURATION_SECONDS: return "DURATION_SECONDS";
case cudf::type_id::DURATION_MILLISECONDS: return "DURATION_MILLISECONDS";
case cudf::type_id::DURATION_MICROSECONDS: return "DURATION_MICROSECONDS";
case cudf::type_id::DURATION_NANOSECONDS: return "DURATION_NANOSECONDS";
case cudf::type_id::DICTIONARY32: return "DICTIONARY32";
case cudf::type_id::STRING: return "STRING";
case cudf::type_id::LIST: return "LIST";
case cudf::type_id::DECIMAL32: return "DECIMAL32";
case cudf::type_id::DECIMAL64: return "DECIMAL64";
case cudf::type_id::DECIMAL128: return "DECIMAL128";
case cudf::type_id::STRUCT: return "STRUCT";
default: return "unknown";
}
}

NVBENCH_DECLARE_ENUM_TYPE_STRINGS(cudf::type_id, stringify_type, stringify_type)
Comment on lines +199 to +234
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be used by other benchmarks too.
Please consider moving these to benchmark common header file or common utilities.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. which header would you suggest that be moved to? same header as type_id?

Copy link
Contributor

@ttnghia ttnghia Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is a good idea. Many times, I had to implement this function for my debugging. Let's make a follow up PR, moving this into types.hpp (same header as type_id), or cudf_test/debug_utilities.hpp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a similar file here: https://github.com/rapidsai/cudf/blob/branch-24.12/cpp/benchmarks/io/nvbench_helpers.hpp
Worth considering a new file in https://github.com/rapidsai/cudf/tree/branch-24.12/cpp/benchmarks/common.

Declaring an NVBench-specific utility in a non-cuDF namespace within types.hpp is inappropriate, as it also forces types.hpp to include an NVBench header, which is something we want to avoid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with a new header in the benchmarks/common.
We could refactor some of the elements from io/nvbench_helpers.hpp with the new header as well.
Though the refactor may be too much for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That utility function will be used not just in benchmark but also testing/debugging.

Could you elaborate on this further? How does this feature specifically benefit unit tests?

Benchmarks and unit tests are treated as "external" users of libcudf, which is why they are not included within the cudf:: namespace. The cudf/utilities directory is reserved for use cases within libcudf or those under the cudf:: namespace.

Copy link
Contributor

@ttnghia ttnghia Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this feature specifically benefit unit tests?

I didn't mean it will benefit unit tests, but meant it will benefit developers in debugging/testing.

I usually debug cudf APIs by printing out the type id of the input column. For example:

printf("input type: %s\n", type_to_string(input.type().id()));

However, most of the time, for fast iteration with small issues, I just printed the type id integer value, and look up the enum, trying to identify the enum type from value. For example:

printf("input type: %d\n",(int)input.type().id());

Output: input type: 23 

(Looking at the enum class type_id, and count from the first to the 23rd line, I can know type 23 is STRING)

As such, having such type_to_string utility is very helpful. This debug print statement can be inserted anywhere. Most of the time I insert it into the middle of a cudf API. Sometimes I use it in unit tests too. It would be most beneficial to add it into libcudf's utilities module. We can either use it for debugging libcudf APIs or for external usage in benchmark/unit tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean. It's a different feature request.

You are referring to the fact that stringify_type can be a common utility for both benchmarks and tests. A host-device print utility like print_type_id together with stringify_type can be added to https://github.com/rapidsai/cudf/blob/branch-24.12/cpp/include/cudf_test/print_utilities.cuh. Once it's done, this file can simply include that print utility header and add a single line for nvbench readability enhancement:

NVBENCH_DECLARE_ENUM_TYPE_STRINGS(cudf::type_id, cudf::test::print::stringify_type, stringify_type)

I’m inclined to leave this refactoring for a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #17376 to track this issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for filing the issue 👍


NVBENCH_BENCH_TYPES(BM_to_arrow_host, NVBENCH_TYPE_AXES(data_types))
.set_type_axes_names({"data_type"})
.set_name("to_arrow_host")
.add_int64_axis("num_rows", {10'000, 100'000, 1'000'000, 10'000'000});

NVBENCH_BENCH_TYPES(BM_to_arrow_device, NVBENCH_TYPE_AXES(data_types))
.set_type_axes_names({"data_type"})
.set_name("to_arrow_device")
.add_int64_axis("num_rows", {10'000, 100'000, 1'000'000, 10'000'000});

NVBENCH_BENCH_TYPES(BM_from_arrow_host, NVBENCH_TYPE_AXES(data_types))
.set_type_axes_names({"data_type"})
.set_name("from_arrow_host")
.add_int64_axis("num_rows", {10'000, 100'000, 1'000'000, 10'000'000});

NVBENCH_BENCH_TYPES(BM_from_arrow_device, NVBENCH_TYPE_AXES(data_types))
.set_type_axes_names({"data_type"})
.set_name("from_arrow_device")
.add_int64_axis("num_rows", {10'000, 100'000, 1'000'000, 10'000'000});
Loading