tarang-jain
Contributor

This PR proposes a way to make Faiss GPU indexes work with multi-threading in throughput mode. The GPU resource is embedded in the index's state, and we cannot update an existing index to use a different resource on a different CPU thread. The workaround is to load the CPU index once and, on each thread, create a fresh GPU copy (using copyFrom) backed by that thread's own GPU resource. Each CPU thread then gets its own GPU resource, and hence its own raft handle and stream, in throughput mode. This needs more testing to confirm that it works, and it might also need a new git patch for Faiss.
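The per-thread pattern described above can be sketched roughly as follows. This is a minimal model, not the code in this PR: `CpuIndex`, `GpuResources`, `GpuIndex`, and `run_throughput_mode` are hypothetical stand-ins written only to show the ownership structure (one shared CPU index, one resource object and one deep GPU copy per thread). They are not Faiss types, and the `copyFrom` here only imitates the role of the Faiss call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the Faiss API: they only model ownership.
struct CpuIndex {
    std::vector<float> vectors;  // host-side data shared read-only by all threads
};

struct GpuResources {
    int stream_id;  // stands in for a per-thread stream / raft handle
};

struct GpuIndex {
    explicit GpuIndex(std::shared_ptr<GpuResources> res)
        : resources(std::move(res)) {}

    // Models copyFrom: a deep copy of the CPU index into this GPU index.
    void copyFrom(const CpuIndex& cpu) { device_vectors = cpu.vectors; }

    std::shared_ptr<GpuResources> resources;  // fixed at construction time
    std::vector<float> device_vectors;        // stands in for device memory
};

// Each worker creates its own resources and its own GPU copy of the shared
// CPU index. Returns the stream id that each thread ended up using.
std::vector<int> run_throughput_mode(const CpuIndex& cpu, int num_threads) {
    std::vector<int> stream_ids(num_threads, -1);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&cpu, &stream_ids, t] {
            auto res = std::make_shared<GpuResources>(GpuResources{t});
            GpuIndex gpu(res);   // bound to this thread's resources
            gpu.copyFrom(cpu);   // fresh deep copy for this thread
            // Searches for this thread would run against `gpu` here.
            stream_ids[t] = gpu.resources->stream_id;  // distinct slot per thread
        });
    }
    for (auto& w : workers) w.join();
    return stream_ids;
}
```

In this model every thread ends up with a distinct `stream_id`, which is the property the PR is after. The real change would need to be checked against the Faiss GPU index and resource classes actually in use.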


copy-pr-bot bot commented Aug 12, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@tarang-jain self-assigned this Aug 12, 2025
@tarang-jain added labels Aug 12, 2025: non-breaking (Introduces a non-breaking change), C++, feature request (New feature or request)
@tarang-jain
Contributor Author

[UPDATE]: This approach likely has a problem. Converting the CPU index to a GPU index on every thread increases GPU memory consumption, because each thread ends up holding its own deep copy of the same GPU index. We need to tackle the problem more fundamentally, as outlined in the issue here
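To make the memory concern concrete, here is a back-of-the-envelope estimate. It is only a sketch under simplifying assumptions: the index is treated as flat float32 vectors with no extra index overhead, and the helper names `index_bytes` and `per_thread_copies_bytes` are hypothetical, not part of Faiss or this PR.

```cpp
#include <cassert>
#include <cstddef>

// Rough device-memory footprint of one copy of the index, assuming flat
// float32 vectors and ignoring any index-specific overhead.
constexpr std::size_t index_bytes(std::size_t num_vectors, std::size_t dim) {
    return num_vectors * dim * sizeof(float);
}

// With one deep copy per CPU thread, the footprint scales linearly with
// the number of threads.
constexpr std::size_t per_thread_copies_bytes(std::size_t num_vectors,
                                              std::size_t dim,
                                              std::size_t num_threads) {
    return num_threads * index_bytes(num_vectors, dim);
}
```

As an illustrative case (the numbers are made up, and a 4-byte `float` is assumed), 1,000,000 vectors of dimension 128 take about 512 MB for a single copy, so 8 threads with one copy each would hold roughly 4.1 GB of the same data on the GPU.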
