Support ARM-based AWS instances #1528
Comments
What is the timeline on these enhancements?
@imagine3D-ai we don't currently have a timeline for ARM instance support. Which instance type are you hoping to use, and is cost reduction your only motivation for using it (and if so, how much would it save you)?
Cost is not my only motivation (although
@imagine3D-ai each model behaves a bit differently, so some lend themselves to machines with more memory relative to CPU, and others to more/faster CPU relative to memory. The latest "Compute Optimized" non-ARM instances would be the c5 or c5a series. "large" is the smallest size for those (as opposed to "medium"), but since you can serve multiple APIs, or multiple replicas of the same API, on a single instance, using a larger instance type will not be more expensive if you have multiple APIs or multiple replicas of a single API.
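For context, the instance type is selected in the cluster configuration file. Here is a minimal sketch; the field names below are my recollection of the pre-1.0 (roughly v0.25-era) cluster config and should be verified against the docs for your Cortex version:

```yaml
# cluster.yaml (sketch; field names assumed from v0.25-era docs)
cluster_name: cortex
region: us-east-1
instance_type: c5.large   # smallest size in the compute-optimized c5 series
min_instances: 1
max_instances: 5
```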
Are the required enhancements listed the same for, say, running the realtime API locally on a Jetson? I am considering taking a swing at this rather than using another model server; I would much rather use Cortex. The CLI fails to run at all, so I guess that would need to be fixed too:

```
datenstrom@ant:~$ cortex
Traceback (most recent call last):
  File "/home/datenstrom/.local/bin/cortex", line 8, in <module>
    sys.exit(run())
  File "/home/datenstrom/.local/lib/python3.6/site-packages/cortex/binary/__init__.py", line 32, in run
    process = subprocess.run([get_cli_path()] + sys.argv[1:], cwd=os.getcwd())
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 8] Exec format error: '/home/datenstrom/.local/lib/python3.6/site-packages/cortex/binary/cli'
```
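The `Exec format error` (ENOEXEC) above suggests the pip wheel bundles a `cli` binary built for a different architecture than the aarch64 Jetson. As a quick diagnostic, you can read the `e_machine` field from the file's ELF header to see which architecture it was compiled for; this helper is a minimal sketch, not part of Cortex:

```python
import struct

# Map a few common ELF e_machine values to readable names.
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86-64", 0xB7: "aarch64"}

def elf_machine(header: bytes) -> str:
    """Return the target architecture encoded in an ELF header.

    Only the first 20 bytes of the file are needed: the 16-byte
    e_ident, e_type (u16), and e_machine (u16 at offset 18).
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(endian + "H", header, 18)  # e_machine
    return ELF_MACHINES.get(machine, hex(machine))
```

Usage against the path from the traceback (reading just the header bytes): `elf_machine(open(".../cortex/binary/cli", "rb").read(20))`. If it reports `"x86-64"` on the Jetson, that confirms the architecture mismatch behind the error.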
The features of Cortex that used to manage docker container deployments (also referred to as Cortex local) have been deprecated and are no longer supported. We happened to build a model server along our journey to building a distributed model inference cluster; creating a model server isn't our primary focus. That said, if you would like to adopt Cortex local for a different architecture, you can take a look at Cortex v0.25, which is the last version of Cortex with local support. The requirements listed in this ticket pertain to making the different components of the Cortex cluster compatible with ARM before ARM instances can be supported. Off the top of my head, Cortex local relies on Docker, and you may have to recompile the cortex Go binary for your architecture as well.
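Recompiling the Go binary for ARM is typically just a matter of setting `GOOS`/`GOARCH` when building. A hedged sketch follows; the `./cli` package path and output name are assumptions, so check the repository's Makefile for the real build entry point:

```shell
# Cross-compile the cortex CLI for 64-bit ARM (e.g. a Jetson running a
# 64-bit OS). Paths and targets here are illustrative, not documented steps.
git clone https://github.com/cortexlabs/cortex
cd cortex
GOOS=linux GOARCH=arm64 go build -o cortex-arm64 ./cli
```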
Notes
Just the containers that run on worker nodes need to be compiled for ARM, e.g. with docker buildx.
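For reference, cross-building one of the worker images could look like the following sketch; the builder name, image tag, and Dockerfile location are placeholders, not names from the Cortex repo:

```shell
# Create a buildx builder and cross-build an image for 64-bit ARM.
docker buildx create --use --name arm-builder
docker buildx build --platform linux/arm64 -t example/cortex-serve:arm64 --load .
```

`--platform linux/arm64,linux/amd64` with `--push` would instead publish a multi-arch manifest, which is what a mixed x86/ARM cluster would need.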