Question regarding Punica integration #107

psych0v0yager · 2023-12-06T03:16:51Z

The acknowledgements of this project mention the SGMV kernels created by the Punica project. Is there a way to run multiple adapters simultaneously using LoRAX, similar to what is shown in the Punica example? Can this be done via the AsyncClient?

tgaddair · 2023-12-06T05:08:54Z

Hi @psych0v0yager, yes, there are a few ways you can achieve running multiple adapters in a single batch:

1. Multiple clients making requests at the same time (this is the most common situation we see in production)
2. Making multiple requests using AsyncClient and then awaiting at the end (batch request submission)
3. Using another concurrency system like threading and making an HTTP request directly to the endpoint

We have a very simple example of (3) here, but I'll make a note to add more examples of how to do this using AsyncClient.
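As a rough illustration of the "submit everything, then await at the end" pattern, here is a minimal sketch. It does not call LoRAX at all: `generate` is a hypothetical stand-in coroutine that only simulates a request, and the adapter names are placeholders. With a real deployment, the body of `generate` would be replaced by the actual async client call.

```python
import asyncio


async def generate(prompt: str, adapter_id: str) -> str:
    # Hypothetical stand-in for an async client request (assumption, not
    # the LoRAX API): it only simulates latency and echoes its inputs.
    await asyncio.sleep(0.01)
    return f"[{adapter_id}] {prompt}"


async def generate_all(prompt: str, adapter_ids: list[str]) -> list[str]:
    # Create all coroutines first, then await them together so every
    # request is in flight at the same time.
    pending = [generate(prompt, adapter_id) for adapter_id in adapter_ids]
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*pending)


ADAPTER_IDS = ["adapter-a", "adapter-b", "adapter-c"]  # placeholder names
results = asyncio.run(generate_all("Hello", ADAPTER_IDS))
```

Because the requests overlap in time, a server that batches across adapters has the chance to place them in the same batch; whether it actually does depends on server-side scheduling.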

Hope that answers your question!

psych0v0yager · 2023-12-06T06:34:08Z

Thanks for the fast reply and multiple solutions! I'll be sure to check out your example for number 3, and I look forward to seeing more documentation on the AsyncClient. IMO the AsyncClient seems like the most convenient option for an MoE-type situation.

tgaddair · 2023-12-06T17:32:28Z

Awesome, @psych0v0yager. To help me understand your use case a little better: for the MoE situation you're describing, are you interested in generating a different sequence for each adapter and then combining them, or in mixing multiple adapters for the same request and generating a single sequence? It sounds like the first one (generating a different sequence for each adapter), but I wanted to confirm, as both are use cases we want to support.

psych0v0yager · 2023-12-06T22:53:34Z

@tgaddair thanks for the reply! I was interested in the first one (generating a different sequence for each adapter).

Specifically I was imagining running 5 adapters concurrently, each of them generating a different sequence. Once the batch of 5 is done, I want to feed all 5 sequences to a 6th adapter that is finetuned to select the best sequence.
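The flow described above (fan out to five adapters, then hand all five outputs to a sixth) could be sketched roughly as follows. Everything here is an assumption for illustration: `call_adapter` is a hypothetical stub that simulates a request rather than a real client call, and the adapter IDs and the selection prompt format are made up.

```python
import asyncio


async def call_adapter(adapter_id: str, prompt: str) -> str:
    # Hypothetical stub (assumption): simulates one request to one adapter.
    await asyncio.sleep(0.01)
    return f"{adapter_id} response ({len(prompt)} prompt chars)"


async def best_of(prompt: str, generator_ids: list[str], selector_id: str):
    # Stage 1: run all generator adapters concurrently and wait for the batch.
    candidates = await asyncio.gather(
        *(call_adapter(adapter_id, prompt) for adapter_id in generator_ids)
    )
    # Stage 2: pack the candidates into one prompt for the selector adapter.
    numbered = "\n".join(f"{i}: {text}" for i, text in enumerate(candidates))
    verdict = await call_adapter(selector_id, f"Pick the best:\n{numbered}")
    return list(candidates), verdict


GENERATORS = [f"generator-{i}" for i in range(1, 6)]  # placeholder IDs
candidates, verdict = asyncio.run(best_of("Explain LoRA.", GENERATORS, "selector"))
```

The second stage only starts once all five candidates have returned, so end-to-end latency is roughly the slowest generator plus the selector call.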

tgaddair added the "question" (Further information is requested) label on Dec 6, 2023
