-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CAGRA API update and allow async host refinement #131
CAGRA API update and allow async host refinement #131
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Malte for the PR, it looks good overall, just two small comments.
Public API is needed to allow fine control of CAGRA IVF-PQ build options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Malte, for the updates! Let's discuss API questions with @cjnolet, otherwise, the refinement improvement part looks good!
cpp/include/cuvs/neighbors/cagra.hpp
Outdated
const cuvs::neighbors::cagra::index_params& params, | ||
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset) | ||
-> cuvs::neighbors::cagra::index<uint8_t, uint32_t>; | ||
auto build( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the documentation from cagra.cuh shall be moved here. @cjnolet to confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct- we should document every function / prototype exposed in include/cuvs/**.hpp
. It seems repetitive but the API docs are fully indexed and searchable so when a user finds one of these functions through search, it's guaranteed to have the docs carried along with it.
cpp/include/cuvs/neighbors/cagra.hpp
Outdated
raft::resources const& handle, | ||
const cuvs::neighbors::cagra::index_params& params, | ||
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset, | ||
std::optional<cuvs::neighbors::nn_descent::index_params> nn_descent_params = std::nullopt, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awkward- a user is going to have to set std::nullopt for at least one of nn-descent_params or ivf_pq params. I think we should make these build_params for cagra. Actually, I think what we should do is take away the graph_build_algo
option from build_params
and add a graph_build_params
option that accepts the cuvs::neighbors::build_params
base class, documents it approprirately with what's allowed, and then dispatches accordingly. Also remember to raise an exception if anything other than ivf_pq
or nn_descent
is used. This allows us to add more graph build algos in the future without too much additional effort but it also makes the public APIs super simple for the user- by default we can just set the build_params
to the default nn_descent::build_params
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second though- since you need refine_ratio on the ivf_pq_build_params
, why not have a new type cagra::graph_build_params
which can extend cuvs::neighbors::build_params
? Then you can have a cagra::graph_build_params::ivf_pq
which extends ivf_pq::build_params
and cagra::graph_build_params
and adds the refine ratio
. Then you can have cagra__graph_build_params::nn_descent
which does the same for nn-descent. This way you can have cagra::build_params.graph_build_params
which accepts anything that extends cagra::graph_build_params
and dispatches the build algo accordingly? This would get all this additional stuff out of cagra::build
and keep the API clean and easy to follow, while minimizing duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjnolet , i am not sure whether I understand correctly what you are proposing. An abstract graph_build_params struct that is inherited from different variants, e.g.
namespace ivf_pq {
struct graph_build_params : cuvs::neighbors::cagra::index_params {
cuvs::neighbors::ivf_pq::index_params pq_build_params;
cuvs::neighbors::ivf_pq::search_params pq_search_params;
float refinement_rate = 2f;
};
} // namespace
for the ivf_pq variant? To my understanding this would require to either implement interfaces individually or use of polymorphism/pointers that can be casted/dispatched. Do I understand your proposal correctly?
As an alternative cleanup step we could also combine additional variant-specific-parameters into (ivf_pq::)graph_build_params
and add them as a single argument to the cagra::index_params
struct index_params : cuvs::neighbors::index_params {
size_t intermediate_graph_degree = 128;
size_t graph_degree = 64;
std::optional<vpq_params> compression = std::nullopt;
bool construct_index_with_dataset = true;
std::variant<ivf_pq::graph_build_params, nn_descent::graph_build_params> graph_build_params;
};
What do you think should we go for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your last idea looks great. It's clean, explicit, and tells the user right away what graph_build_params variants are supported.
cpp/include/cuvs/neighbors/cagra.hpp
Outdated
raft::resources const& handle, | ||
const cuvs::neighbors::cagra::index_params& params, | ||
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset, | ||
std::optional<cuvs::neighbors::nn_descent::index_params> nn_descent_params = std::nullopt, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second though- since you need refine_ratio on the ivf_pq_build_params
, why not have a new type cagra::graph_build_params
which can extend cuvs::neighbors::build_params
? Then you can have a cagra::graph_build_params::ivf_pq
which extends ivf_pq::build_params
and cagra::graph_build_params
and adds the refine ratio
. Then you can have cagra__graph_build_params::nn_descent
which does the same for nn-descent. This way you can have cagra::build_params.graph_build_params
which accepts anything that extends cagra::graph_build_params
and dispatches the build algo accordingly? This would get all this additional stuff out of cagra::build
and keep the API clean and easy to follow, while minimizing duplication.
/** | ||
* Specify compression parameters if compression is desired. | ||
* | ||
* NOTE: this is experimental new API, consider it unsafe. | ||
*/ | ||
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool construct_index_with_dataset = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, will try to put dataset on device. In case of OOM, catch error and return index without added dataset.
@tfeher , thanks for the review. I pushed a new version containing the suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Malte, just one nitpick.
cpp/include/cuvs/neighbors/cagra.hpp
Outdated
/** | ||
* Specify compression parameters if compression is desired. | ||
* | ||
* NOTE: this is experimental new API, consider it unsafe. | ||
*/ | ||
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt; | ||
|
||
bool construct_index_with_dataset = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @divyegala this is the option to enable not having the dataset vectors copied to device automatically after building the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index_params.graph_build_params = cuvs::neighbors::cagra::graph_build_params::ivf_pq_params{}; | ||
break; | ||
case cuvsCagraGraphBuildAlgo::NN_DESCENT: | ||
cuvs::neighbors::cagra::graph_build_params::nn_descent_params nn_descent_params{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we consolidate the C++ and C structs so that we don't need to duplicate all of these values, please? It's going to be painful to manage two complete copies of these structs because changes are going to get made to one without being made to the other.
I did this with the DIstanceType
enum in cuvs::distance
, for example. I think we should ultimately strive to write these once (in C++ or C) and then expose them through the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Highlighting recent changes.
/** | ||
* Specify compression parameters if compression is desired. | ||
* | ||
* NOTE: this is experimental new API, consider it unsafe. | ||
*/ | ||
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed, will try to put dataset on device. In case of OOM, catch error and return index without added dataset.
search_params.n_probes = std::max<uint32_t>(10, build_params.n_lists * 0.01); | ||
search_params.lut_dtype = CUDA_R_16F; | ||
search_params.internal_distance_dtype = CUDA_R_16F; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated defaults.
build_knn_graph(res, dataset, knn_graph->view(), refine_rate, pq_build_params, search_params); | ||
// Set default value in case knn_build_params is not defined. | ||
auto knn_build_params = params.graph_build_params; | ||
if (std::holds_alternative<std::monostate>(params.graph_build_params)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Select default algorithm
// We just add the graph. User is expected to update dataset separately. This branch is used | ||
// if user needs special control of memory allocations for the dataset. | ||
} catch (std::bad_alloc& e) { | ||
RAFT_LOG_DEBUG("Insufficient GPU memory to construct CAGRA index with dataset on GPU"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Automatically construct index without dataset if dataset would not fit GPU memory.
* @return true if enough GPU memory can be allocated | ||
* @return false otherwise | ||
*/ | ||
bool has_enough_device_memory(raft::resources const& res, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nitpick- can we move this to the cuvs::neighbors::nn_descent::helpers
namespace please?
We can do it in a follow-on PR for 24.08 so we can get this merged ASAP.
/merge |
Fixes handling OOM error during CAGRA index creation, that was introduced in #131. Authors: - Tamas Bela Feher (https://github.com/tfeher) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #167
This PR updates the CAGRA public API, changes defaults, and improves refinement during IVF-PQ build step. Updated defaults: - By default CAGRA would select NN descent. We fall back to IVF-PQ build algorithm if there is not enough memory for NN descent. - For the IVF-PQ build algo, the search params were updated to use `n_probe = 0.01*nlist`, and both LUT and internal distance type is set to `fp16` (as opposed to `fp8/fp32` previously) - By default build would create the index that contains both the `graph` and the `dataset` on GPU. If the `dataset` does not fit gpu, then the returned index will only contain the graph (on device). In such case the user is expected to call `index.update_dataset()` (for example with dataset in managed memory) before we can `search` the index. API changes: - We can specify IVF-PQ build algo parameters the following way: ```c++ cagra::index_params params; params.graph_degree = 32; params.intermediate_graph_degree = 48; auto pq_params = cagra::graph_build_algo::ivf_pq_params(dataset.extents()); // This sets reasonable defaults, but can be updated by the user, e.g.: pq_params.pq_dim = 32; // Select IVF-PQ algorithm by passing ivf_pq_params as graph_build_algo params.graph_build_algo = pq_params; ``` - We can specify NN descent algo parameters a similar way ```c++ cagra::index_params params; params.graph_degree = 32; params.intermediate_graph_degree = 48; // Select NN descent algo by passing nn_descent_params params.graph_build_algo = cagra::graph_build_params::nn_descent_params(intermediate_degree) ; ``` Additionally, this PR optimizes the IVF-PQ refinement step within the CAGRA graph creation in case the dataset is in host-memory. * depending on hardware, a certain amount of refinement comes for free * the change also has a positive effect without refinement as the graph creation is done on host either way Authors: - Malte Förster (https://github.com/mfoerste4) - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Tamas Bela Feher (https://github.com/tfeher) URL: rapidsai/cuvs#131
This PR updates the CAGRA public API, changes defaults, and improves refinement during IVF-PQ build step.
Updated defaults:
By default CAGRA would select NN descent. We fall back to IVF-PQ build algorithm if there is not enough memory for NN descent.
For the IVF-PQ build algo, the search params were updated to use
n_probe = 0.01*nlist
, and both LUT and internal distance type is set tofp16
(as opposed tofp8/fp32
previously)By default build would create the index that contains both the
graph
and thedataset
on GPU. If thedataset
does not fit gpu, then the returned index will only contain the graph (on device). In such case the user is expected to callindex.update_dataset()
(for example with dataset in managed memory) before we cansearch
the index.API changes:
Additionally, this PR optimizes the IVF-PQ refinement step within the CAGRA graph creation in case the dataset is in host-memory.