[Community Contributions] examples on distributed inference using 🤗 Accelerate #3078

Open
sayakpaul opened this issue Sep 4, 2024 · 3 comments
Labels
contributions-welcome, good first issue (Good for newcomers), wip (Work in progress)

Comments

@sayakpaul
Member

sayakpaul commented Sep 4, 2024

The inference/distributed directory houses examples on running distributed inference with accelerate:

  • Phi2 for language generation
  • Stable Diffusion for image generation

The strategy followed there is to load the entire model onto each GPU and send chunks of a batch through each GPU’s model copy at a time. Synthetic data generation has become an essential tool for every ML engineer, so it'd be beneficial to extend these examples to include some more use cases:

  • Image captioning
  • Speech data generation
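
The batch-chunking strategy described above (one full model copy per GPU, each handling its own slice of the batch) can be sketched in plain Python. This is only an illustration under stated assumptions: `split_for_rank` is a hypothetical helper, not part of accelerate, and the two "GPUs" are simulated by looping over rank indices. In accelerate itself, `PartialState.split_between_processes` plays this role.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_for_rank(items: Sequence[T], num_ranks: int, rank: int) -> List[T]:
    """Return the contiguous slice of `items` that process `rank` should handle.

    Hypothetical helper: when the batch does not divide evenly, the
    lowest-numbered ranks each receive one extra item.
    """
    if not 0 <= rank < num_ranks:
        raise ValueError("rank must be in [0, num_ranks)")
    base, extra = divmod(len(items), num_ranks)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return list(items[start:end])


prompts = ["a cat", "a dog", "a bird", "a fish", "a horse"]
# Simulate two GPUs; in a real run each rank would feed its chunk
# through its own full copy of the model.
chunks = [split_for_rank(prompts, num_ranks=2, rank=r) for r in range(2)]
```

Each rank works on a disjoint chunk, so no inter-GPU communication is needed during generation; results only have to be gathered (or written to disk per rank) at the end.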

Some nice to haves:

  • Include artifact serialization as done in this
  • Keep the artifact serialization code under a thread to not block GPU execution
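
One way to keep serialization off the generation loop is a background writer thread fed by a queue. The sketch below uses only the standard library; the names `writer_loop` and `run_generation`, the JSON record layout, and the placeholder standing in for the model call are all assumptions made for illustration, not code from the existing examples.

```python
import json
import queue
import tempfile
import threading
from pathlib import Path

_SENTINEL = None  # signals the writer thread to stop


def writer_loop(jobs: "queue.Queue", out_dir: Path) -> None:
    """Drain (name, record) jobs and write each one to disk as JSON."""
    while True:
        job = jobs.get()
        if job is _SENTINEL:
            break
        name, record = job
        (out_dir / f"{name}.json").write_text(json.dumps(record))


def run_generation(out_dir: Path, num_batches: int = 3) -> None:
    """Produce records in the main thread and hand them off for writing."""
    jobs: "queue.Queue" = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(jobs, out_dir))
    writer.start()
    for i in range(num_batches):
        # Placeholder for the real model call on the GPU.
        record = {"batch": i, "caption": f"caption {i}"}
        jobs.put((f"batch_{i}", record))  # returns immediately
    jobs.put(_SENTINEL)
    writer.join()  # make sure everything is flushed before exiting


out_dir = Path(tempfile.mkdtemp())
run_generation(out_dir)
written = sorted(p.name for p in out_dir.iterdir())
```

Because file writes are I/O-bound, a plain thread is usually enough here; the main loop only pays the cost of a `put` call per batch and can move on to the next forward pass.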

How can you help?

You could contribute an example for any of the above-mentioned use cases, or come up with your own 🤗. Help us make the art of synthetic data generation scalable, easy, and accessible.

@muellerzr added the good first issue and wip labels on Sep 4, 2024
@sayakpaul
Member Author

sayakpaul commented Sep 5, 2024

For image recaptioning, I have a quick and dirty PoC here that assumes streaming from webdataset and uses multiprocessing from torch: https://gist.github.com/sayakpaul/dfb26d6562b62ba8c2f613dc912a28a8#file-generate_captions_multigpu-py. It would be nice to make it more efficient and to leverage accelerate to make things cleaner and simpler.

@VladOS95-cyber

Hey @sayakpaul @muellerzr! If this is still relevant, I am going to add one more example for video captioning using LLaVA-NeXT-Video-7B-hf.

@sayakpaul
Member Author

Of course, it's still open. That would be so cool! Cc: @a-r-r-o-w
