
Add Nutanix AI Endpoint #346

Open · wants to merge 6 commits into main

Conversation

@jinan-zhou commented Oct 29, 2024

Add Nutanix AI Endpoint

This PR adds Nutanix AI Endpoint as a provider.
The distribution container is at https://hub.docker.com/repository/docker/jinanz/distribution-nutanix/general.

Setup instructions

Please refer to llama_stack/templates/nutanix/doc_template.md for details

Feature/Issue validation/testing/test plan

  • Test non-streaming inference

Query

curl -X POST http://localhost:1740/inference/chat_completion -H "Content-Type: application/json" -d '{"model":"Llama3.1-8B-Instruct","messages":[{"content":"How far is the sun? Answer in one sentence.", "role": "user"}],"stream":false}'

Response

{"completion_message":{"role":"assistant","content":"The average distance from the Earth to the sun is approximately 93 million miles (149.6 million kilometers), which is about 8 minutes and 20 seconds away from Earth at the speed of light.","stop_reason":"end_of_turn","tool_calls":[]},"logprobs":null}
  • Test streaming inference

Query

curl -X POST http://localhost:1740/inference/chat_completion -H "Content-Type: application/json" -d '{"model":"Llama3.1-8B-Instruct","messages":[{"content":"How far is the moon? Answer in one sentence.", "role": "user"}],"stream":true}'

Response

data: {"event":{"event_type":"start","delta":"","logprobs":null,"stop_reason":null}}

data: {"event":{"event_type":"progress","delta":"The","logprobs":null,"stop_reason":null}}

data: {"event":{"event_type":"progress","delta":" average","logprobs":null,"stop_reason":null}}

data: {"event":{"event_type":"progress","delta":" distance","logprobs":null,"stop_reason":null}}

...

data: {"event":{"event_type":"progress","delta":" distance","logprobs":null,"stop_reason":null}}

data: {"event":{"event_type":"progress","delta":".\"","logprobs":null,"stop_reason":null}}

data: {"event":{"event_type":"complete","delta":"","logprobs":null,"stop_reason":"end_of_turn"}}

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Oct 29, 2024
@jinan-zhou (Author):

Hi Llama Stack team, your reviews are much appreciated!
@ashwinb @yanxi0830 @hardikjshah @dltn @raghotham

@jinan-zhou force-pushed the nai branch 3 times, most recently from ad23f19 to 8878396 on November 22, 2024 00:22
Review thread on `class NutanixInferenceAdapter(ModelRegistryHelper, Inference):`
Contributor:

@ashwinb this is almost the same code as fireworks and databricks. what do you think of having a common base class?
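
(For concreteness, a rough sketch of the kind of shared base class being suggested. Every name below is hypothetical and is not an existing Llama Stack class; today each adapter carries this plumbing itself.)

```python
# Illustrative only: one possible shape for a shared OpenAI-compatible adapter
# base. Class and method names are made up for this sketch.
from abc import ABC, abstractmethod
from typing import Any, Dict


class OpenAICompatInferenceAdapter(ABC):
    """Common plumbing for providers exposing an OpenAI-compatible endpoint."""

    def __init__(self, base_url: str, api_key: str) -> None:
        self.base_url = base_url
        self.api_key = api_key

    @abstractmethod
    def convert_request(self, request: Any) -> Dict[str, Any]:
        """Map a Llama Stack chat_completion request to the provider payload."""

    @abstractmethod
    def convert_response(self, raw: Dict[str, Any]) -> Any:
        """Map the provider response back to a Llama Stack completion message."""
```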

Contributor:

@mattf Yes, I think we need to start consolidating more on the code side.

we have some tests now but we also need to put down some more requirements for when a new inference provider comes in. here are some things we are thinking about:

  • support for structured decoding -- kind of table stakes now
  • proper support for tool calling (either directly, or by allowing the legacy completions API so Llama Stack can format the prompt)
  • support for vision models

otherwise we cannot claim to the user that "you can just use Llama Stack, pick-and-choose any provider, and get a consistent experience"

Author:

Thank you for the feedback. While the code does share similarities with Fireworks and Databricks, there are important differences, and we anticipate adding new features that will further differentiate our implementation from those of other vendors.

I believe it may be more efficient for each vendor to maintain their own Llama Stack adapter. The duplication of code within each adapter, in this context, is manageable and can even be beneficial. Adopting a "Do Repeat Yourself" approach for these adapters aligns with maintaining clarity and flexibility, especially given the unique requirements and evolution of individual providers.

That said, I’m open to further discussions if there’s a strong case for a shared base class or alternative approach. Let me know your thoughts!

Contributor:

@jinan-zhou thank you for the thoughtful argument. i think you're right that abstracting the providers now is too early. i raised the topic only to start a discussion, not to block or slow your valuable contribution.

@jinan-zhou requested a review from mattf on December 3, 2024 20:37
@mattf (Contributor) left a comment:

lgtm

@jinan-zhou (Author) commented Dec 4, 2024

@mattf Thank you so much!
Could you approve the PR?
Also @ashwinb @raghotham

@mattf (Contributor) commented Dec 5, 2024

> @mattf Thank you so much! Could you approve the PR? Also @ashwinb @raghotham

i'm not authorized. @ashwinb or @raghotham certainly can
