feat: Add a pattern for llama3 on inferentia2 #528

Merged 8 commits on May 14, 2024

askulkarni2
Collaborator

What does this PR do?

🛑 Please open an issue first to discuss any significant work and flesh out details/direction - we would hate for your time to be wasted.
Consult the CONTRIBUTING guide for submitting pull-requests.

  1. Adds a new pattern for Llama 3 on Inferentia2 (inf2), based on the trainium-inferentia blueprint.
  2. Miscellaneous cleanup fixes and a more standardized look across all the gen-ai patterns.
  3. Removes the Gradio deployment and switches to running the Gradio Docker container on localhost (the deployment doesn't add any value and just complicates things).

Motivation

Fixes #517

More

  • Yes, I have tested the PR using my local account setup (Provide any test evidence report under Additional Notes)
  • Mandatory for new blueprints. Yes, I have added an example to support my blueprint PR
  • Mandatory for new blueprints. Yes, I have updated the website/docs or website/blog section for this feature
  • Yes, I ran pre-commit run -a with this PR. Link for installing pre-commit locally

For Moderators

  • E2E Test successfully complete before merge?

Additional Notes

Collaborator

@vara-bonthu vara-bonthu left a comment

Nice work on this, @askulkarni2! The PR looks good to me. Since it introduces changes to multiple blueprints, let's keep an eye on any issues that may arise.

@askulkarni2 askulkarni2 merged commit 561ff94 into main May 14, 2024
39 checks passed
@askulkarni2 askulkarni2 deleted the feat-add-llama3 branch May 14, 2024 00:10
sudo echo "deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main" > /etc/apt/sources.list.d/neuron.list && \
sudo wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add - && \
sudo apt-get update -y && \
sudo apt-get install aws-neuronx-dkms aws-neuronx-collectives=2.* aws-neuronx-runtime-lib=2.* aws-neuronx-tools=2.* -y && \
Contributor

FYI - we should strip aws-neuronx-dkms from all of the Docker images in this repo. That package is the Neuron driver, and it's already provided on the AMI/host.
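
One possible way to act on the suggestion above is sketched below. It is an illustration only, not something from this PR: the helper names `find_dkms_dockerfiles` and `strip_dkms` are hypothetical, and the sketch assumes the images are built from files named `Dockerfile*` and that the package appears as a plain token (optionally version-pinned) on an `apt-get install` line.

```shell
# Hypothetical helper: list Dockerfiles under a directory (default: current
# directory) that still mention the Neuron driver package.
find_dkms_dockerfiles() {
  local root="${1:-.}"
  # -r recurse, -l print file names only; "|| true" keeps "no match" from
  # being treated as a failure by callers running with "set -e".
  grep -rl --include='Dockerfile*' 'aws-neuronx-dkms' "$root" || true
}

# Hypothetical filter: read lines on stdin and drop the aws-neuronx-dkms token
# (with an optional "=version" pin), leaving the other packages untouched.
strip_dkms() {
  sed -E 's/[[:space:]]+aws-neuronx-dkms(=[^[:space:]]*)?([[:space:]]|$)/\2/g'
}
```

For example, `find_dkms_dockerfiles .` run from the repository root would list the files to review, and piping an install line through `strip_dkms` shows what it would look like without the driver package. Any actual change should still be checked by rebuilding the affected images.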

Development

Successfully merging this pull request may close these issues.

[Inference]: Llama3 on Inf2 with Trainium-inferentia blueprint wtih RayServe
3 participants