[8.19] Add Hugging Face Chat Completion support to Inference Plugin (#127254) #128152

Merged

jonathan-buttner
Contributor

Backport

This will backport the following commit from main to 8.19:

* Add Hugging Face Chat Completion support to Inference Plugin (#127254)

Questions?

Please refer to the Backport tool documentation

…#127254)

* Add Hugging Face Chat Completion support to Inference Plugin

* Add support for streaming chat completion task for HuggingFace

* [CI] Auto commit changes from spotless

* Add support for non-streaming completion task for HuggingFace

* Remove RequestManager for HF Chat Completion Task

* Refactored Hugging Face Completion Service Settings, removed Request Manager, added Unit Tests

* Refactored Hugging Face Action Creator, added Unit Tests

* Add Hugging Face Server Test

* [CI] Auto commit changes from spotless

* Removed parameters from media type for Chat Completion Request and unit tests

* Removed OpenAI default URL in HuggingFaceService's configuration, fixed formatting in InferenceGetServicesIT

* Refactor error message handling in HuggingFaceActionCreator and HuggingFaceService

* Update minimal supported version and add Hugging Face transport version constants

* Made modelId field optional in HuggingFaceChatCompletionModel, updated unit tests

* Removed max input tokens field from HuggingFaceChatCompletionServiceSettings, fixed unit tests

* Removed if statement checking TransportVersion for HuggingFaceChatCompletionServiceSettings constructor with StreamInput param

* Removed getFirst() method calls for backport compatibility

* Made HuggingFaceChatCompletionServiceSettingsTests extend AbstractBWCWireSerializationTestCase for future serialization testing

* Refactored tests to use stripWhitespace method for readability

* Refactored javadoc for HuggingFaceService

* Renamed HF chat completion TransportVersion constant names

* Added random string generation in unit test

* Refactored javadocs for HuggingFace requests

* Refactored tests to reduce duplication

* Added changelog file

* Add HuggingFaceChatCompletionResponseHandler and associated tests

* Refactor error handling in HuggingFaceServiceTests to standardize error response codes and types

* Refactor HuggingFace error handling to improve response structure and add streaming support

* Allowing null function name for hugging face models

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Jonathan Buttner <jonathan.buttner@elastic.co>
(cherry picked from commit d1ad917)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
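
The "Removed getFirst() method calls for backport compatibility" item can be illustrated with a minimal sketch. `List.getFirst()` was added in Java 21, so code that must also compile at an older language level typically falls back to `get(0)`. That the 8.19 branch builds at such a level is an assumption here, and the class and helper names below are hypothetical and not taken from the PR.

```java
import java.util.List;

public class FirstElementCompat {

    // Hypothetical helper, not code from the PR: returns the first element
    // without calling List.getFirst(), which only exists from Java 21 onward.
    // Returns null for an empty list instead of throwing.
    static <T> T firstOrNull(List<T> items) {
        return items.isEmpty() ? null : items.get(0);
    }

    public static void main(String[] args) {
        List<String> choices = List.of("first choice", "second choice");
        // Uses get(0) under the hood, so this compiles on pre-21 JDKs too.
        System.out.println(firstOrNull(choices));
    }
}
```

Note that `getFirst()` throws `NoSuchElementException` on an empty list, while this sketch returns `null`; a direct `get(0)` replacement would throw `IndexOutOfBoundsException` instead.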
jonathan-buttner added the auto-merge-without-approval label on May 19, 2025
elasticsearchmachine merged commit 4f048d8 into elastic:8.19 on May 19, 2025
15 checks passed
jonathan-buttner deleted the backport/8.19/pr-127254 branch on May 19, 2025 18:48
Labels: auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!), backport, v8.19.0