
[enhancement] Improve efficiency of community detection on GPU #2381

Merged
merged 2 commits into from
Dec 14, 2023

Conversation

@tomaarsen (Collaborator) commented Dec 14, 2023

Supersedes #1857
Closes #1654, Closes #1840, Closes #1703

Hello!

Pull Request overview

  • Improve efficiency of community detection on GPU

Details

I noticed that running community_detection on GPU was barely faster than on CPU. I chased this down: the cause is heavy Python looping instead of vectorized torch operations that play to the GPU's strengths. The new implementation relies on torch operations much more, and is notably faster when the embeddings are on GPU.
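To illustrate the idea (this is a minimal sketch, not the actual PR code — `fast_clusters` and its parameters are hypothetical names), the vectorized approach computes the full cosine-similarity matrix in one tensor operation and thresholds it with boolean masks, rather than looping over embeddings in Python:

```python
import torch

def fast_clusters(embeddings, threshold=0.75, min_community_size=3):
    # Hypothetical sketch of vectorized community detection.
    # Normalize rows so a single matrix product yields cosine similarities.
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    cos_scores = embeddings @ embeddings.T  # (N, N) similarity matrix

    # Count, per embedding, how many neighbors clear the threshold --
    # one tensor op instead of a Python loop over all pairs.
    neighbor_counts = (cos_scores >= threshold).sum(dim=1)

    # Candidate community centers are rows with enough close neighbors.
    candidate_idx = torch.where(neighbor_counts >= min_community_size)[0]
    communities = []
    for i in candidate_idx.tolist():
        members = torch.where(cos_scores[i] >= threshold)[0].tolist()
        communities.append(members)
    return communities
```

On GPU, the `(N, N)` matrix product and the mask reductions run as a handful of kernels, which is where the speedup over per-embedding Python loops comes from.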

However, the implementation performs slightly worse than the master implementation for CPU. As a result, (a slight variation of) the original implementation is still used for CPU, with one exception:

for idx, val in zip(top_idx_large.tolist(), top_val_large):
    if val < threshold:
        break
    new_cluster.append(idx)
extracted_communities.append(new_cluster)

This loop has been replaced with:

extracted_communities.append(top_idx_large[top_val_large >= threshold].tolist())

This is slightly more performant, as it replaces the Python loop with a single vectorized mask.

Benchmarks

[Benchmark plot: sentence_transformers_clustering]

Note

The computation time still grows quadratically with the number of embeddings, simply because all N embeddings must be compared against all N other embeddings. In short, this PR does not make clustering feasible for arbitrarily many embeddings, but it does raise the practical limit considerably.

  • Tom Aarsen
