Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[r] Port the query-condition logic from TileDB-R to TileDB-SOMA-R #3162

Merged
merged 4 commits into from
Oct 15, 2024

Conversation

johnkerl
Copy link
Member

@johnkerl johnkerl commented Oct 10, 2024

Issue and/or context: #3051 [sc-55672]

Changes:

Here I am only introducing a non-exported function parse_query_condition_new, and unit-testing it thoroughly. Replacing the exiting parse_query_condition with this will be a follow-on PR.

Notes for Reviewer:

Timestamp handling will be deferred to a follow-up PR: #3163.

@johnkerl johnkerl changed the title [r] Port query-condition logic from TileDB-R to TileDB-SOMA-R [WIP] [r] Port the query-condition logic from TileDB-R to TileDB-SOMA-R [WIP] Oct 10, 2024
@johnkerl johnkerl force-pushed the kerl/r-qc branch 2 times, most recently from 19a9dcd to c7ab637 Compare October 10, 2024 22:36
@johnkerl johnkerl marked this pull request as ready for review October 10, 2024 22:37
@johnkerl johnkerl force-pushed the kerl/r-qc branch 3 times, most recently from cf8906f to 4855c0e Compare October 10, 2024 23:47
@johnkerl johnkerl changed the title [r] Port the query-condition logic from TileDB-R to TileDB-SOMA-R [WIP] [r] Port the query-condition logic from TileDB-R to TileDB-SOMA-R Oct 10, 2024
@johnkerl johnkerl requested a review from nguyenv October 11, 2024 20:10
apis/r/R/QueryCondition.R Outdated Show resolved Hide resolved
apis/r/R/QueryCondition.R Outdated Show resolved Hide resolved
apis/r/R/QueryCondition.R Outdated Show resolved Hide resolved
apis/r/R/QueryCondition.R Outdated Show resolved Hide resolved
apis/r/R/QueryCondition.R Outdated Show resolved Hide resolved
Comment on lines +34 to +54
tiledb_query_condition_op_t _op_name_to_tdb_op(const std::string& opstr) {
if (opstr == "LT") {
return TILEDB_LT;
} else if (opstr == "LE") {
return TILEDB_LE;
} else if (opstr == "GT") {
return TILEDB_GT;
} else if (opstr == "GE") {
return TILEDB_GE;
} else if (opstr == "EQ") {
return TILEDB_EQ;
} else if (opstr == "NE") {
return TILEDB_NE;
} else if (opstr == "IN") {
return TILEDB_IN;
} else if (opstr == "NOT_IN") {
return TILEDB_NOT_IN;
} else {
Rcpp::stop("Unknown TileDB op string '%s'", opstr.c_str());
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I prefer map than if-else for readability.

Suggested change
tiledb_query_condition_op_t _op_name_to_tdb_op(const std::string& opstr) {
if (opstr == "LT") {
return TILEDB_LT;
} else if (opstr == "LE") {
return TILEDB_LE;
} else if (opstr == "GT") {
return TILEDB_GT;
} else if (opstr == "GE") {
return TILEDB_GE;
} else if (opstr == "EQ") {
return TILEDB_EQ;
} else if (opstr == "NE") {
return TILEDB_NE;
} else if (opstr == "IN") {
return TILEDB_IN;
} else if (opstr == "NOT_IN") {
return TILEDB_NOT_IN;
} else {
Rcpp::stop("Unknown TileDB op string '%s'", opstr.c_str());
}
}
tiledb_query_condition_op_t _op_name_to_tdb_op(const std::string& opstr) {
static const std::unordered_map<std::string, tiledb_query_condition_op_t> op_map = {
{"LT", TILEDB_LT},
{"LE", TILEDB_LE},
{"GT", TILEDB_GT},
{"GE", TILEDB_GE},
{"EQ", TILEDB_EQ},
{"NE", TILEDB_NE},
{"IN", TILEDB_IN},
{"NOT_IN", TILEDB_NOT_IN}
};
auto op_pair = op_map.find(opstr);
if (op_pair != op_map.end()) {
return op_pair->second;
} else {
Rcpp::stop("Unknown TileDB op string '%s'", opstr.c_str());
}
}

@@ -0,0 +1,206 @@
test_that("DataFrame Factory", {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some other cases I can think of:

  • Mismatched types like soma_joinid == "ten"
  • What is the behavior for comparing floats and strings like soma_joinid == 10.0?
  • What is the behavior for overflow/underflow int8== 100000000?
  • Comparing signed and unsigned int8 == -1
  • Make sure parens evaluate correctly
  • Incomplete query conditions: soma_joinid ==, soma_joinid && int8, soma_joinid == 1 &&

Could potentially do this on a follow-up PR.

Copy link
Member

@mojaveazure mojaveazure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; please bump the develop version, update the changelog, and 🚢

@johnkerl
Copy link
Member Author

@nguyenv re #3162 (comment) I'll do these in a follow-up PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants