Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check column type equality, handling nested types correctly. #14531

Merged
merged 39 commits into from
May 2, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
39d14c8
Check column types with a utility that handles nesting correctly.
bdice Mar 11, 2024
2e3979f
Merge remote-tracking branch 'upstream/branch-24.04' into nested-colu…
bdice Mar 14, 2024
57b3b50
Merge remote-tracking branch 'upstream/branch-24.06' into nested-colu…
bdice Mar 26, 2024
5f7f1f8
Add column_scalar_types_equal.
bdice Mar 26, 2024
d880838
Use column_scalar_types_equal.
bdice Mar 26, 2024
8b152dc
Fix bugs and update tests.
bdice Mar 26, 2024
5cf4769
Add scalar_types_equal.
bdice Mar 27, 2024
8d938bb
Fix tests.
bdice Mar 27, 2024
f7d8c41
Merge remote-tracking branch 'upstream/branch-24.06' into nested-colu…
bdice Mar 27, 2024
5430397
Add data_type_errors.
bdice Mar 27, 2024
8e02603
Fix tests.
bdice Mar 27, 2024
a1bbecb
Fix remaining type checks that should use new utilities.
bdice Mar 27, 2024
0de9f9b
Add docs section on comparing data types.
bdice Mar 27, 2024
fc7b821
Fix typo.
bdice Mar 27, 2024
55e0c9f
Merge remote-tracking branch 'upstream/branch-24.06' into nested-colu…
bdice Apr 1, 2024
faa92c8
Use cudf::is_fixed_point.
bdice Apr 1, 2024
6556266
Fix missing paren.
bdice Apr 1, 2024
6531a53
list column, not lists column
bdice Apr 4, 2024
454d413
Rename functions to be overloads of cudf::types_equal, update and exp…
bdice Apr 9, 2024
c91df10
Rename function.
bdice Apr 9, 2024
c742253
Rename to have_same_types, apply PR feedback.
bdice Apr 10, 2024
8a44004
Merge remote-tracking branch 'upstream/branch-24.06' into nested-colu…
bdice Apr 10, 2024
fce091f
Add back deprecated methods.
bdice Apr 12, 2024
cafdf6f
Fix verb/subject agreement.
bdice Apr 12, 2024
cda4e0d
Update docs for all_have_same_types.
bdice Apr 12, 2024
ec740c9
Un-deprecate column_types_equivalent.
bdice Apr 12, 2024
ec0eeff
Apply patch from Lawrence.
bdice Apr 12, 2024
5afc5e2
Use std::equal.
bdice Apr 12, 2024
18cdd52
Switch back to column_types_equivalent.
bdice Apr 12, 2024
7c05d95
Fix tests that had incorrect nested types.
bdice Apr 13, 2024
7ba2664
Merge branch 'nested-column-type-checks' of github.com:bdice/cudf int…
bdice Apr 13, 2024
9904431
Merge remote-tracking branch 'upstream/branch-24.06' into nested-colu…
bdice Apr 30, 2024
15d05e0
Fix clang-format.
bdice Apr 30, 2024
57c1f64
Move table_view overload from table_view.hpp to type_checks.hpp.
bdice May 1, 2024
25c5ad9
Mirror scalar/column call.
bdice May 1, 2024
d764f7e
Minor cleanup.
bdice May 1, 2024
e0732b9
Improve dispatching to existing overloads.
bdice May 1, 2024
fc0dada
Merge branch 'nested-column-type-checks' of github.com:bdice/cudf int…
bdice May 1, 2024
32444c5
Merge branch 'branch-24.06' into nested-column-type-checks
bdice May 1, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Fix tests that had incorrect nested types.
  • Loading branch information
bdice committed Apr 13, 2024
commit 7c05d952d0dcac665a294d2e30d8d7029f035d90
11 changes: 3 additions & 8 deletions cpp/tests/copying/get_value_tests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -542,11 +542,6 @@ struct ListGetStructValueTest : public cudf::test::BaseFixture {
return SCW{{field1, field2, field3}, mask};
}

/**
* @brief Create a 0-length structs column
*/
SCW zero_length_struct() { return SCW{}; }

/**
* @brief Concatenate structs columns, allow specifying inputs in `initializer_list`
*/
Expand Down Expand Up @@ -653,7 +648,7 @@ TYPED_TEST(ListGetStructValueTest, NonNestedGetNonNullEmpty)
cudf::size_type index = 2;
// For well-formed list column, an empty list still holds the complete structure of
// a 0-length structs column
auto expected_data = this->zero_length_struct();
auto expected_data = this->make_test_structs_column({}, {}, {}, no_nulls());

auto s = cudf::get_element(list_column->view(), index);
auto typed_s = static_cast<cudf::list_scalar const*>(s.get());
Expand Down Expand Up @@ -757,8 +752,8 @@ TYPED_TEST(ListGetStructValueTest, NestedGetNonNullEmpty)
auto list_column_nested =
this->make_test_lists_column(3, {0, 1, 1, 2}, std::move(list_column), {1, 1, 1});

auto expected_data =
this->make_test_lists_column(0, {0}, this->zero_length_struct().release(), {});
auto expected_data = this->make_test_lists_column(
0, {0}, this->make_test_structs_column({}, {}, {}, no_nulls()).release(), {});

cudf::size_type index = 1;
auto s = cudf::get_element(list_column_nested->view(), index);
Expand Down
4 changes: 1 addition & 3 deletions cpp/tests/io/parquet_writer_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -556,9 +556,7 @@ TEST_F(ParquetWriterTest, EmptyList)
auto result = cudf::io::read_parquet(
cudf::io::parquet_reader_options_builder(cudf::io::source_info(filepath)));

using lcw = cudf::test::lists_column_wrapper<int64_t>;
auto expected = lcw{lcw{}, lcw{}, lcw{}};
CUDF_TEST_EXPECT_COLUMNS_EQUAL(result.tbl->view().column(0), expected);
CUDF_TEST_EXPECT_COLUMNS_EQUAL(result.tbl->view().column(0), L0->view());
}

TEST_F(ParquetWriterTest, DeepEmptyList)
Expand Down
8 changes: 1 addition & 7 deletions cpp/tests/utilities/column_utilities.cu
Original file line number Diff line number Diff line change
Expand Up @@ -239,12 +239,6 @@ std::unique_ptr<column> generate_child_row_indices(lists_column_view const& c,

template <bool check_exact_equality>
struct column_property_comparator {
bool types_equivalent(cudf::column_view const& lhs, cudf::column_view const& rhs)
{
return cudf::is_fixed_point(lhs.type()) ? lhs.type().id() == rhs.type().id()
: cudf::have_same_types(lhs, rhs);
}

bool compare_common(cudf::column_view const& lhs,
cudf::column_view const& rhs,
cudf::column_view const& lhs_row_indices,
Expand All @@ -256,7 +250,7 @@ struct column_property_comparator {
if (check_exact_equality) {
PROP_EXPECT_EQ(cudf::have_same_types(lhs, rhs), true);
} else {
PROP_EXPECT_EQ(types_equivalent(lhs, rhs), true);
PROP_EXPECT_EQ(cudf::column_types_equivalent(lhs, rhs), true);
}

auto const lhs_size = check_exact_equality ? lhs.size() : lhs_row_indices.size();
Expand Down