[C++][Gandiva] Segfault when in_expr of decimal type cached and crash #39784

Open
likun666661 opened this issue Jan 24, 2024 · 0 comments

Describe the bug, including details regarding any error messages, version, and platform.

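When Gandiva builds a filter for an 'in' expression over decimal values and that expression matches object code already in Gandiva's cache (a cache hit), the filter crashes with a segfault as soon as it is evaluated on input data.
The following test reproduces the crash: the first filter is built and evaluated normally, and the second, identical filter is built from the cache and crashes in Evaluate.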

TEST_F(TestIn, TestInDecimal) {
  int32_t precision = 38;
  int32_t scale = 5;
  auto decimal_type = std::make_shared<arrow::Decimal128Type>(precision, scale);

  // schema for input fields
  auto field0 = field("f0", arrow::decimal(precision, scale));
  auto schema = arrow::schema({field0});

  // Build: f0 in (6, 12, 11)
  auto node_f0 = TreeExprBuilder::MakeField(field0);

  gandiva::DecimalScalar128 d0("6", precision, scale);
  gandiva::DecimalScalar128 d1("12", precision, scale);
  gandiva::DecimalScalar128 d2("11", precision, scale);
  std::unordered_set<gandiva::DecimalScalar128> in_constants({d0, d1, d2});
  auto in_expr = TreeExprBuilder::MakeInExpressionDecimal(node_f0, in_constants);
  auto condition = TreeExprBuilder::MakeCondition(in_expr);

  // Create a row-batch with some sample data
  int num_records = 5;
  auto values0 = MakeDecimalVector({"1", "2", "0", "-6", "6"});
  auto array0 =
      MakeArrowArrayDecimal(decimal_type, values0, {true, true, true, false, true});
  // expected output (indices for which condition matches)
  auto exp = MakeArrowArrayUint16({4});

  // prepare input record batch
  auto in_batch = arrow::RecordBatch::Make(schema, num_records, {array0});

  {
    std::shared_ptr<Filter> filter;
    auto status = Filter::Make(schema, condition, TestConfiguration(), &filter);
    EXPECT_TRUE(status.ok());
    std::shared_ptr<SelectionVector> selection_vector;
    status = SelectionVector::MakeInt16(num_records, pool_, &selection_vector);
    EXPECT_TRUE(status.ok());

    // Evaluate expression
    status = filter->Evaluate(*in_batch, selection_vector);
    EXPECT_TRUE(status.ok());

    // Validate results
    EXPECT_ARROW_ARRAY_EQUALS(exp, selection_vector->ToArray());
  }

  std::shared_ptr<Filter> new_filter;
  auto status = Filter::Make(schema, condition, TestConfiguration(), &new_filter);
  EXPECT_TRUE(status.ok());
  EXPECT_TRUE(new_filter->GetBuiltFromCache());

  std::shared_ptr<SelectionVector> selection_vector_new;
  status = SelectionVector::MakeInt16(num_records, pool_, &selection_vector_new);
  EXPECT_TRUE(status.ok());

  status = new_filter->Evaluate(*in_batch, selection_vector_new);
  EXPECT_TRUE(status.ok());
}

Component(s)

C++ - Gandiva
