Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAGRA API update and allow async host refinement #131

Merged

Conversation

mfoerste4
Copy link
Contributor

@mfoerste4 mfoerste4 commented May 17, 2024

This PR updates the CAGRA public API, changes defaults, and improves refinement during IVF-PQ build step.

Updated defaults:

  • By default CAGRA would select NN descent. We fall back to IVF-PQ build algorithm if there is not enough memory for NN descent.

  • For the IVF-PQ build algo, the search params were updated to use n_probe = 0.01*nlist, and both LUT and internal distance type is set to fp16 (as opposed to fp8/fp32 previously)

  • By default build would create the index that contains both the graph and the dataset on GPU. If the dataset does not fit gpu, then the returned index will only contain the graph (on device). In such case the user is expected to call index.update_dataset() (for example with dataset in managed memory) before we can search the index.

API changes:

  • We can specify IVF-PQ build algo parameters the following way:
cagra::index_params params;
params.graph_degree = 32;
params.intermediate_graph_degree = 48;
auto pq_params = cagra::graph_build_algo::ivf_pq_params(dataset.extents());
// This sets reasonable defaults, but can be updated by the user, e.g.:
pq_params.pq_dim = 32;
// Select IVF-PQ algorithm by passing ivf_pq_params as graph_build_algo
params.graph_build_algo = pq_params;
  • We can specify NN descent algo parameters a similar way
cagra::index_params params;
params.graph_degree = 32;
params.intermediate_graph_degree = 48;
// Select NN descent algo by passing nn_descent_params
params.graph_build_algo = cagra::graph_build_params::nn_descent_params(intermediate_degree) ;

Additionally, this PR optimizes the IVF-PQ refinement step within the CAGRA graph creation in case the dataset is in host-memory.

  • depending on hardware, a certain amount of refinement comes for free
  • the change also has a positive effect without refinement as the graph creation is done on host either way

@github-actions github-actions bot added the cpp label May 17, 2024
@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels May 17, 2024
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for the PR, it looks good overall, just two small comments.

Public API is needed to allow fine control of CAGRA IVF-PQ build options.

@mfoerste4 mfoerste4 changed the title [WIP] CAGRA/IVF-PQ index allow async host refinement CAGRA/IVF-PQ index allow async host refinement May 22, 2024
@mfoerste4 mfoerste4 requested a review from tfeher May 22, 2024 22:21
@mfoerste4 mfoerste4 marked this pull request as ready for review May 23, 2024 08:40
@mfoerste4 mfoerste4 requested a review from a team as a code owner May 23, 2024 08:40
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte, for the updates! Let's discuss API questions with @cjnolet, otherwise, the refinement improvement part looks good!

const cuvs::neighbors::cagra::index_params& params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset)
-> cuvs::neighbors::cagra::index<uint8_t, uint32_t>;
auto build(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the documentation from cagra.cuh shall be moved here. @cjnolet to confirm.

Copy link
Member

@cjnolet cjnolet May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct- we should document every function / prototype exposed in include/cuvs/**.hpp. It seems repetitive but the API docs are fully indexed and searchable so when a user finds one of these functions through search, it's guaranteed to have the docs carried along with it.

raft::resources const& handle,
const cuvs::neighbors::cagra::index_params& params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
std::optional<cuvs::neighbors::nn_descent::index_params> nn_descent_params = std::nullopt,
Copy link
Member

@cjnolet cjnolet May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awkward- a user is going to have to set std::nullopt for at least one of nn-descent_params or ivf_pq params. I think we should make these build_params for cagra. Actually, I think what we should do is take away the graph_build_algo option from build_params and add a graph_build_params option that accepts the cuvs::neighbors::build_params base class, documents it approprirately with what's allowed, and then dispatches accordingly. Also remember to raise an exception if anything other than ivf_pq or nn_descent is used. This allows us to add more graph build algos in the future without too much additional effort but it also makes the public APIs super simple for the user- by default we can just set the build_params to the default nn_descent::build_params.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second though- since you need refine_ratio on the ivf_pq_build_params, why not have a new type cagra::graph_build_params which can extend cuvs::neighbors::build_params? Then you can have a cagra::graph_build_params::ivf_pq which extends ivf_pq::build_params and cagra::graph_build_params and adds the refine ratio. Then you can have cagra__graph_build_params::nn_descent which does the same for nn-descent. This way you can have cagra::build_params.graph_build_params which accepts anything that extends cagra::graph_build_params and dispatches the build algo accordingly? This would get all this additional stuff out of cagra::build and keep the API clean and easy to follow, while minimizing duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjnolet , i am not sure whether I understand correctly what you are proposing. An abstract graph_build_params struct that is inherited from different variants, e.g.

namespace ivf_pq { 
  struct graph_build_params : cuvs::neighbors::cagra::index_params {
    cuvs::neighbors::ivf_pq::index_params pq_build_params;
    cuvs::neighbors::ivf_pq::search_params pq_search_params;
    float refinement_rate = 2f;
};
} // namespace

for the ivf_pq variant? To my understanding this would require to either implement interfaces individually or use of polymorphism/pointers that can be casted/dispatched. Do I understand your proposal correctly?

As an alternative cleanup step we could also combine additional variant-specific-parameters into (ivf_pq::)graph_build_params and add them as a single argument to the cagra::index_params

struct index_params : cuvs::neighbors::index_params {
  size_t intermediate_graph_degree = 128;
  size_t graph_degree = 64;
  std::optional<vpq_params> compression = std::nullopt;
  bool construct_index_with_dataset = true;
  std::variant<ivf_pq::graph_build_params, nn_descent::graph_build_params> graph_build_params;
};

What do you think should we go for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your last idea looks great. It's clean, explicit, and tells the user right away what graph_build_params variants are supported.

raft::resources const& handle,
const cuvs::neighbors::cagra::index_params& params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
std::optional<cuvs::neighbors::nn_descent::index_params> nn_descent_params = std::nullopt,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second though- since you need refine_ratio on the ivf_pq_build_params, why not have a new type cagra::graph_build_params which can extend cuvs::neighbors::build_params? Then you can have a cagra::graph_build_params::ivf_pq which extends ivf_pq::build_params and cagra::graph_build_params and adds the refine ratio. Then you can have cagra__graph_build_params::nn_descent which does the same for nn-descent. This way you can have cagra::build_params.graph_build_params which accepts anything that extends cagra::graph_build_params and dispatches the build algo accordingly? This would get all this additional stuff out of cagra::build and keep the API clean and easy to follow, while minimizing duplication.

@tfeher tfeher mentioned this pull request May 28, 2024
10 tasks
/**
* Specify compression parameters if compression is desired.
*
* NOTE: this is experimental new API, consider it unsafe.
*/
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool construct_index_with_dataset = true;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, will try to put dataset on device. In case of OOM, catch error and return index without added dataset.

@mfoerste4 mfoerste4 requested a review from tfeher May 29, 2024 12:48
@mfoerste4
Copy link
Contributor Author

@tfeher , thanks for the review. I pushed a new version containing the suggestions.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte, just one nitpick.

@mfoerste4 mfoerste4 requested a review from cjnolet May 29, 2024 17:08
@mfoerste4
Copy link
Contributor Author

@cnolet, @tfeher, I just pushed a fix for the docs which was the last red CI check.

/**
* Specify compression parameters if compression is desired.
*
* NOTE: this is experimental new API, consider it unsafe.
*/
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt;

bool construct_index_with_dataset = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @divyegala this is the option to enable not having the dataset vectors copied to device automatically after building the index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index_params.graph_build_params = cuvs::neighbors::cagra::graph_build_params::ivf_pq_params{};
break;
case cuvsCagraGraphBuildAlgo::NN_DESCENT:
cuvs::neighbors::cagra::graph_build_params::nn_descent_params nn_descent_params{};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consolidate the C++ and C structs so that we don't need to duplicate all of these values, please? It's going to be painful to manage two complete copies of these structs because changes are going to get made to one without being made to the other.

I did this with the DIstanceType enum in cuvs::distance, for example. I think we should ultimately strive to write these once (in C++ or C) and then expose them through the other.

@tfeher tfeher requested a review from a team as a code owner May 30, 2024 12:59
@github-actions github-actions bot added the CMake label May 30, 2024
@tfeher tfeher changed the title CAGRA/IVF-PQ index allow async host refinement CAGRA API update and allow async host refinement May 30, 2024
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Highlighting recent changes.

/**
* Specify compression parameters if compression is desired.
*
* NOTE: this is experimental new API, consider it unsafe.
*/
std::optional<cuvs::neighbors::vpq_params> compression = std::nullopt;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, will try to put dataset on device. In case of OOM, catch error and return index without added dataset.

Comment on lines +29 to +31
search_params.n_probes = std::max<uint32_t>(10, build_params.n_lists * 0.01);
search_params.lut_dtype = CUDA_R_16F;
search_params.internal_distance_dtype = CUDA_R_16F;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated defaults.

build_knn_graph(res, dataset, knn_graph->view(), refine_rate, pq_build_params, search_params);
// Set default value in case knn_build_params is not defined.
auto knn_build_params = params.graph_build_params;
if (std::holds_alternative<std::monostate>(params.graph_build_params)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Select default algorithm

// We just add the graph. User is expected to update dataset separately. This branch is used
// if user needs special control of memory allocations for the dataset.
} catch (std::bad_alloc& e) {
RAFT_LOG_DEBUG("Insufficient GPU memory to construct CAGRA index with dataset on GPU");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automatically construct index without dataset if dataset would not fit GPU memory.

* @return true if enough GPU memory can be allocated
* @return false otherwise
*/
bool has_enough_device_memory(raft::resources const& res,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nitpick- can we move this to the cuvs::neighbors::nn_descent::helpers namespace please?

We can do it in a follow-on PR for 24.08 so we can get this merged ASAP.

@cjnolet cjnolet requested a review from tfeher May 30, 2024 14:39
@cjnolet
Copy link
Member

cjnolet commented May 30, 2024

/merge

@rapids-bot rapids-bot bot merged commit 13e0732 into rapidsai:branch-24.06 May 30, 2024
54 checks passed
@tfeher tfeher mentioned this pull request May 30, 2024
rapids-bot bot pushed a commit that referenced this pull request May 30, 2024
Fixes handling OOM error during CAGRA index creation, that was introduced in #131.

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #167
difyrrwrzd added a commit to difyrrwrzd/cuvs that referenced this pull request Aug 10, 2024
This PR updates the CAGRA public API, changes defaults, and improves refinement during IVF-PQ build step.

Updated defaults:
- By default CAGRA would select NN descent. We fall back to IVF-PQ build algorithm if there is not enough memory for NN descent.
- For the IVF-PQ build algo, the search params were updated to use `n_probe = 0.01*nlist`, and both LUT and internal distance type is set to `fp16` (as opposed to `fp8/fp32` previously)

- By default build would create the index that contains both the `graph` and the `dataset` on GPU. If the `dataset` does not fit gpu, then the returned index will only contain the graph (on device). In such case the user is expected to call `index.update_dataset()`  (for example with dataset in managed memory) before we can `search` the index. 

API changes:
- We can specify IVF-PQ build algo parameters the following way:
```c++
cagra::index_params params;
params.graph_degree = 32;
params.intermediate_graph_degree = 48;
auto pq_params = cagra::graph_build_algo::ivf_pq_params(dataset.extents());
// This sets reasonable defaults, but can be updated by the user, e.g.:
pq_params.pq_dim = 32;
// Select IVF-PQ algorithm by passing ivf_pq_params as graph_build_algo
params.graph_build_algo = pq_params;
```

- We can specify NN descent algo parameters a similar way
```c++
cagra::index_params params;
params.graph_degree = 32;
params.intermediate_graph_degree = 48;
// Select NN descent algo by passing nn_descent_params
params.graph_build_algo = cagra::graph_build_params::nn_descent_params(intermediate_degree) ;
```

Additionally, this PR optimizes the IVF-PQ refinement step within the CAGRA graph creation in case the dataset is in host-memory.

* depending on hardware, a certain amount of refinement comes for free
* the change also has a positive effect without refinement as the graph creation is done on host either way

Authors:
  - Malte Förster (https://github.com/mfoerste4)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai/cuvs#131
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

3 participants