
MMR Multiprocessing #268

@vjp23

Hi, I'm loving KeyBERT and am using it for a project now. However, performance is very slow at scale when using MMR. Running the embedding model on a GPU speeds things up, but the bottleneck then seems to be the MMR computation on the CPU. Does KeyBERT natively support multiprocessing for that step?

My plan was to break this all out: start by computing my own n-grams, then embed the n-grams and documents directly, and pass the embeddings to KeyBERT in a multiprocessing setup (i.e., map a huge list of embeddings across multiple KeyBERT processes to perform the MMR). But before I go down that road, I just want to double-check: is this already supported natively in KeyBERT?
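
To make the plan concrete, here is a rough sketch of the MMR fan-out I have in mind. It assumes the document and n-gram embeddings have already been computed elsewhere (e.g. on the GPU). The `mmr` function is my own standalone NumPy version of greedy Maximal Marginal Relevance, not KeyBERT's internal code, and the names `mmr_many`, `_mmr_task`, and `_normalize` are hypothetical helpers made up for illustration.

```python
from multiprocessing import Pool

import numpy as np


def _normalize(x):
    """Scale rows to unit length so dot products equal cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def mmr(doc_embedding, cand_embeddings, candidates, top_n=5, diversity=0.5):
    """Greedy MMR for one document; returns [(candidate, similarity_to_doc), ...]."""
    if len(candidates) == 0 or top_n < 1:
        return []
    doc = _normalize(np.asarray(doc_embedding, dtype=float).reshape(1, -1))
    cands = _normalize(np.asarray(cand_embeddings, dtype=float))
    doc_sim = (cands @ doc.T).ravel()  # relevance of each candidate to the document
    cand_sim = cands @ cands.T         # pairwise similarity between candidates

    # Start with the most relevant candidate, then add one candidate per step.
    selected = [int(np.argmax(doc_sim))]
    remaining = [i for i in range(len(candidates)) if i != selected[0]]
    while remaining and len(selected) < top_n:
        # Redundancy: highest similarity to anything already selected.
        redundancy = cand_sim[np.ix_(remaining, selected)].max(axis=1)
        scores = (1 - diversity) * doc_sim[remaining] - diversity * redundancy
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return [(candidates[i], float(doc_sim[i])) for i in selected]


def _mmr_task(args):
    # Module-level wrapper so the task can be pickled and sent to worker processes.
    return mmr(*args)


def mmr_many(tasks, processes=4, chunksize=64):
    """Run MMR over many documents.

    tasks: list of (doc_embedding, cand_embeddings, candidates, top_n, diversity).
    processes <= 1 runs serially in the current process (handy for debugging).
    """
    if processes <= 1:
        return [mmr(*t) for t in tasks]
    with Pool(processes) as pool:
        return pool.map(_mmr_task, tasks, chunksize=chunksize)
```

Calling `mmr_many(tasks, processes=8)` from under an `if __name__ == "__main__":` guard would spread the per-document MMR across worker processes. Each task carries its own embeddings, so pickling cost grows with the number of candidates per document, and a larger `chunksize` may help amortize that overhead.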
