feat: Enhance JupyterHub Performance: GPU Acceleration and Time-Slicing Support #277
…r jupyter helm values to provision GPU notebook instances without Cognito
Please modify the PR title. Available types are:
feat: A new feature
fix: A bug fix
docs: Documentation only changes
style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
refactor: A code change that neither fixes a bug nor adds a feature
perf: A code change that improves performance
test: Adding missing tests or correcting existing tests
build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
chore: Other changes that don't modify src or test files
revert: Reverts a previous commit
LGTM 🚀
I think you need to add provider “random” for resource “random_string”.
@lusoal The versions.tf file needs updating with the random provider. Checks are failing with this error
@lusoal I have added a few minor comments and questions on the PR for when you get a chance.
ai-ml/jupyterhub/karpenter-provisioners/01-karpenter-provisioner-gpu-ts.yaml
Adjusted PR based on feedback
@lusoal left a few minor comments
LGTM🔥
LGTM 🚀
What does this PR do?
🛑 Please open an issue first to discuss any significant work and flesh out details/direction - we would hate for your time to be wasted.
Consult the CONTRIBUTING guide for submitting pull-requests.
Included GPU support, an additional authentication mechanism, and Karpenter integration. With the addition of GPU support, users can now take advantage of GPU resources to accelerate their AI/ML workloads within JupyterHub. To facilitate testing and demonstrations, I've included the dummy authentication mechanism, allowing users without their own domain and certificate to easily try out the Blueprint. This PR installs Karpenter for dynamic provisioning and configures its provisioners for the examples that I'll be adding soon. For GPU instances, I have configured support for Time-Slicing, enabling scheduled workloads on oversubscribed GPUs to interleave with each other, maximizing GPU utilization. Also changed website/docs/blueprints/ai-ml/jupyterhub.md to reflect the Terraform change.
Motivation
This PR brings GPU-based instances, configured through the NVIDIA gpu-operator. With the addition of time-slicing support, users can efficiently share GPUs, optimizing resource utilization even in cases where MIG support is limited. These enhancements will be crucial for upcoming blogs and demos, empowering users with accelerated AI/ML workloads within JupyterHub. 🚀
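For readers unfamiliar with time-slicing, the following is a minimal sketch of the kind of device-plugin configuration the NVIDIA gpu-operator consumes. It is not taken from this PR: the ConfigMap name, the namespace, the `any` key, and the replica count of 4 are illustrative assumptions, and the actual values in the blueprint may differ.

```yaml
# Hypothetical ConfigMap; name and namespace are assumptions, not from this PR.
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
  namespace: gpu-operator
data:
  # The key ("any") is an arbitrary config name chosen for illustration.
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          # Each physical GPU is advertised as 4 schedulable nvidia.com/gpu
          # resources; the replica count here is an assumed example value.
          - name: nvidia.com/gpu
            replicas: 4
```

With a configuration along these lines, a node with one physical GPU would advertise four `nvidia.com/gpu` resources, so up to four notebook pods that each request one GPU could be scheduled onto it. Time-slicing interleaves the workloads but does not isolate memory between them, which is the trade-off compared with MIG.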
More
website/docs or website/blog section for this feature
pre-commit run -a with this PR. Link for installing pre-commit locally
For Moderators
Additional Notes