
Use CPU instances with GPU inference accelerator #618

Open
vishalbollu opened this issue Nov 28, 2019 · 7 comments
Labels
enhancement New feature or request research Determine technical constraints

Comments

@vishalbollu
Contributor

vishalbollu commented Nov 28, 2019

Description

Instead of spinning up a GPU nodegroup, spin up a CPU nodegroup with Elastic Inference (GPU accelerated inference).

Additional Context

@vishalbollu vishalbollu added the enhancement New feature or request label Nov 28, 2019
@vishalbollu vishalbollu added the research Determine technical constraints label Nov 28, 2019
@scribu

scribu commented Dec 7, 2019

+1. This is critical for a cost-effective deployment.

@deliahu deliahu added the good first issue Good for newcomers label May 5, 2020
@lezwon

lezwon commented Jul 22, 2020

Hi, I'd like to look into this issue if anyone can help me get started.

@deliahu
Member

deliahu commented Jul 22, 2020

@lezwon thanks for your interest!

I think the first step is to figure out how to create an EKS cluster with instances that have Elastic Inference attached. Currently, Cortex uses eksctl to create the cluster, and based on eksctl-io/eksctl#643, it looks like eksctl might not support Elastic Inference yet. But I am not sure if that's the case, or if there is a workaround; it could be worth reaching out to the eksctl team to inquire.

@RobertLucian or @vishalbollu, do you have any additional context on this?
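For context on what the cluster tooling would ultimately need to do: an Elastic Inference accelerator is attached to an EC2 instance at launch time, via the `ElasticInferenceAccelerators` parameter of the EC2 `RunInstances` API (which eksctl does not currently expose, hence this issue). A minimal sketch of the request shape, using boto3-style kwargs; the AMI ID is a placeholder and the `build_run_instances_request` helper is illustrative, not part of Cortex:

```python
# Sketch only: shows how an EI accelerator is paired with a plain CPU
# instance in an EC2 RunInstances request. In practice EI also requires a
# VPC endpoint for the elastic-inference service and appropriate IAM
# permissions on the instance role.

def build_run_instances_request(instance_type="c5.large",
                                accelerator_type="eia2.medium"):
    """Build kwargs accepted by boto3's ec2_client.run_instances that
    attach a GPU inference accelerator to a CPU instance."""
    return {
        "ImageId": "ami-PLACEHOLDER",      # e.g. an EKS-optimized AMI for your region
        "InstanceType": instance_type,     # plain CPU instance
        "MinCount": 1,
        "MaxCount": 1,
        "ElasticInferenceAccelerators": [  # GPU-accelerated inference attachment
            {"Type": accelerator_type, "Count": 1}
        ],
    }

request = build_run_instances_request()
print(request["ElasticInferenceAccelerators"])
# → [{'Type': 'eia2.medium', 'Count': 1}]
# To actually launch: boto3.client("ec2").run_instances(**request)
```

The launch-time nature of the attachment is why this can't be bolted on after the nodegroup exists; whatever creates the nodes (eksctl, or a custom launch template) has to carry this parameter.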

@lezwon

lezwon commented Jul 23, 2020

@deliahu Thank you for the help. I'll look into the issue you mentioned with eksctl. :)

@deliahu
Member

deliahu commented Jul 23, 2020

@lezwon sounds good, thank you, keep us posted!

@deliahu deliahu removed the good first issue Good for newcomers label Jan 20, 2021
@H4dr1en

H4dr1en commented Apr 6, 2021

This issue has been deprioritized and the relevant eksctl issue was closed for inactivity, but using EI would save costs for most Cortex users. Is there any plan to address this in upcoming releases?

@miguelvr
Collaborator

miguelvr commented Apr 6, 2021

@H4dr1en we recently added multi-instance-type clusters as a feature. This can already mitigate costs by allowing CPU, GPU, and Spot instances to run in the same cluster.

I know it is not remotely the same as Elastic Inference, but it is an improvement :)

We will look into Elastic Inference again soon since we are re-focusing the team's efforts on improving the Cortex UX on AWS.
