-
Notifications
You must be signed in to change notification settings - Fork 921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update strings contains benchmarks to nvbench #15495
Conversation
d_target.empty()); | ||
|
||
if (!d_target.empty()) { | ||
if (d_target.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 This didn't occur to me, and I've been staring at this code for a bit now.
This also includes some performance improvement for long strings
From #15536 (comment) |
Adding a link here to the code logic attempts for reference and convenience |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks!
cpp/benchmarks/string/find.cpp
Outdated
state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) { | ||
if (api == "find") { | ||
cudf::strings::find(input, target); | ||
} else if (api == "find_multi") { | ||
cudf::strings::find_multiple(input, cudf::strings_column_view(targets)); | ||
} else if (api == "contains") { | ||
cudf::strings::contains(input, target); | ||
} else if (api == "starts_with") { | ||
cudf::strings::starts_with(input, target); | ||
} else if (api == "ends_with") { | ||
cudf::strings::ends_with(input, target); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will multiple if/else inside one state.exec
lambda affect the benchmark run (as this will be executed in thousands of iterations)?
How about placing separate state.exec
in each if/else branch?
if (((i + j + d_target.size_bytes()) <= d_str.size_bytes()) && | ||
d_target.compare(d_str.data() + i + j, d_target.size_bytes()) == 0) { | ||
found = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will any early termination be helpful here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I benchmarked that and it did not help -- technically a bit slower.
/merge |
Description
Reference #15405
Updates the benchmarks for
cudf::strings::contains()
to use nvbench and also introduce a hit-test axis.The logic has been updated to remove the unneeded
fill()
call for long strings.Also cleaned up code and updated logic to process 4 bytes per warp thread.
Checklist