Add Nutanix AI Endpoint #346
base: main
Conversation
Hi Llama Stack team, your reviews are much appreciated!
Force-pushed from ad23f19 to 8878396
class NutanixInferenceAdapter(ModelRegistryHelper, Inference):
@ashwinb this is almost the same code as fireworks and databricks. what do you think of having a common base class?
@mattf Yes, I think we need to start consolidating more on the code side.
We have some tests now, but we also need to lay down more requirements for when a new inference provider comes in. Here are some things we are thinking about:
- support for structured decoding -- kind of table stakes now
- proper support for tool calling (either directly, or by allowing the legacy completions API so Llama Stack can format the prompt)
- support for vision models
Otherwise we cannot claim to the user that "you can just use Llama Stack, pick-and-choose any provider, and you will get a consistent experience."
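The consolidation being discussed could be sketched as a shared base class for OpenAI-compatible providers, where each vendor adapter shrinks to configuration (a base URL plus a model-name mapping). Everything below is a hypothetical sketch, not actual llama-stack code; the class names, URLs, and model IDs are invented for illustration.

```python
# Hypothetical sketch of the consolidation idea: providers that all speak an
# OpenAI-compatible API share one base class, so Fireworks/Databricks/Nutanix
# adapters differ only in configuration. All names here are invented.

class OpenAICompatInferenceAdapter:
    model_map: dict = {}   # stack model name -> provider model id
    base_url: str = ""

    def resolve_model(self, model: str) -> str:
        return self.model_map[model]

    def build_request(self, model: str, messages: list) -> dict:
        # Shared request shaping, identical across all compatible providers.
        return {
            "url": f"{self.base_url}/chat/completions",
            "json": {"model": self.resolve_model(model), "messages": messages},
        }

class FireworksAdapter(OpenAICompatInferenceAdapter):
    base_url = "https://fireworks.example/v1"   # placeholder
    model_map = {"Llama3.1-8B-Instruct": "llama-v3p1-8b-instruct"}

class NutanixAdapter(OpenAICompatInferenceAdapter):
    base_url = "https://nutanix.example/v1"     # placeholder
    model_map = {"Llama3.1-8B-Instruct": "vllm-llama-3-1-8b"}
```

A design like this keeps per-vendor differences (auth, extra parameters) in small overrides while deduplicating the request/response plumbing; the trade-off, as argued later in this thread, is less flexibility for providers whose APIs diverge.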
Thank you for the feedback. While the code does share similarities with Fireworks and Databricks, there are important differences, and we anticipate adding new features that will further differentiate our implementation from those of other vendors.
I believe it may be more efficient for each vendor to maintain their own Llama Stack adapter. The duplication of code within each adapter, in this context, is manageable and can even be beneficial. Adopting a "Do Repeat Yourself" approach for these adapters aligns with maintaining clarity and flexibility, especially given the unique requirements and evolution of individual providers.
That said, I’m open to further discussions if there’s a strong case for a shared base class or alternative approach. Let me know your thoughts!
@jinan-zhou thank you for the thoughtful argument. i think you're right that abstracting the providers now is too early. i raise the topic only to start a discussion, not to block or slow your valuable contribution.
lgtm
@mattf Thank you so much!
i'm not authorized. @ashwinb or @raghotham certainly can
Add Nutanix AI Endpoint
This PR adds Nutanix AI Endpoint as a provider.
The distribution container is at https://hub.docker.com/repository/docker/jinanz/distribution-nutanix/general.
Setup instructions
Please refer to llama_stack/templates/nutanix/doc_template.md for details.
Feature/Issue validation/testing/test plan
Query
Response
Query
Response