@Copilot Copilot AI commented Oct 6, 2025

Description

Fixes the 404 error when using HuggingFace embedding generation by updating from the obsolete api-inference.huggingface.co endpoint to the current router.huggingface.co/hf-inference endpoint.

Problem

The HuggingFace connector was failing with 404 errors when generating embeddings:

var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.AddHuggingFaceEmbeddingGenerator(
    "sentence-transformers/all-MiniLM-L6-v2",
    apiKey: "api-key");
var k = kernelBuilder.Build();
var g = k.GetRequiredService<IEmbeddingGenerator<string, Embedding<float>>>();
var r = await g.GenerateAsync(["John: Hello, how are you?\nRoger: Hey, I'm Roger!"]);

// Microsoft.SemanticKernel.HttpOperationException: Response status code does not indicate success: 404 (Not Found).

Root Cause

The code was using the deprecated HuggingFace Inference API endpoint structure:

  • Old base URL: https://api-inference.huggingface.co
  • Old embedding path: /pipeline/feature-extraction/{modelId}
  • Resulting URL: https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-MiniLM-L6-v2

Solution

Updated to the current HuggingFace Inference Router API:

  • New base URL: https://router.huggingface.co/hf-inference
  • New embedding path: /models/{modelId}/pipeline/feature-extraction
  • Resulting URL: https://router.huggingface.co/hf-inference/models/sentence-transformers/all-MiniLM-L6-v2/pipeline/feature-extraction
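
To make the path change concrete, here is a minimal sketch of the two URL layouts. The helper names (`BuildOldEmbeddingUri`, `BuildNewEmbeddingUri`) and constants are hypothetical and are not taken from HuggingFaceClient.cs; only the base URLs and path templates come from the description above.

```csharp
using System;

// Base URLs as quoted in the PR description.
const string OldBaseUrl = "https://api-inference.huggingface.co";
const string NewBaseUrl = "https://router.huggingface.co/hf-inference";

// Old layout: the pipeline segment comes before the model ID.
// (Hypothetical helper, for illustration only.)
Uri BuildOldEmbeddingUri(string modelId) =>
    new($"{OldBaseUrl}/pipeline/feature-extraction/{modelId}");

// New layout: the model ID sits under /models and the pipeline segment follows it.
// (Hypothetical helper, for illustration only.)
Uri BuildNewEmbeddingUri(string modelId) =>
    new($"{NewBaseUrl}/models/{modelId}/pipeline/feature-extraction");

const string ModelId = "sentence-transformers/all-MiniLM-L6-v2";
Console.WriteLine(BuildOldEmbeddingUri(ModelId));
Console.WriteLine(BuildNewEmbeddingUri(ModelId));
```

Note that the change is not only a host swap: the model ID and the pipeline segment also trade places, so replacing the base URL alone would still produce a wrong path.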

Changes

Core Changes

  • HuggingFaceClient.cs: Updated default endpoint and embedding path construction

Test Updates

  • Updated test assertions across 5 test files to match new URL structure
  • All 82 unit tests passing

Sample Updates

  • Updated TelemetryWithAppInsights demo to use new endpoint

Impact

This fix affects all HuggingFace connectors (embeddings, chat completion, text generation, image-to-text) because they share the same default base URL. The new default endpoints are:

  • Chat completion: https://router.huggingface.co/hf-inference/v1/chat/completions
  • Text generation: https://router.huggingface.co/hf-inference/models/{modelId}
  • Embeddings: https://router.huggingface.co/hf-inference/models/{modelId}/pipeline/feature-extraction
  • Image-to-text: https://router.huggingface.co/hf-inference/models/{modelId}

Users who explicitly provide custom endpoints are not affected by this change.
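
The mapping above, together with the custom-endpoint rule, can be sketched roughly as follows. `DefaultEndpoint` and `ResolveEndpoint` are hypothetical helpers written for illustration, and the string keys for the connector kinds are assumptions; the connector's real code may be organized quite differently.

```csharp
using System;

// Default base URL as quoted in the PR description.
const string BaseUrl = "https://router.huggingface.co/hf-inference";

// Hypothetical helper: maps a connector kind to its default URL.
string DefaultEndpoint(string kind, string modelId) => kind switch
{
    "chat" => $"{BaseUrl}/v1/chat/completions",
    // Text generation and image-to-text share the same model-scoped URL.
    "text-generation" or "image-to-text" => $"{BaseUrl}/models/{modelId}",
    "embeddings" => $"{BaseUrl}/models/{modelId}/pipeline/feature-extraction",
    _ => throw new ArgumentOutOfRangeException(nameof(kind), kind, "Unknown connector kind"),
};

// Hypothetical helper: a caller-supplied endpoint (which may be null) takes
// precedence over the default, so it is unaffected by the base URL change.
string ResolveEndpoint(string customEndpoint, string kind, string modelId) =>
    customEndpoint ?? DefaultEndpoint(kind, modelId);

Console.WriteLine(ResolveEndpoint(null, "embeddings", "sentence-transformers/all-MiniLM-L6-v2"));
Console.WriteLine(ResolveEndpoint("https://example.test/custom", "embeddings", "any-model"));
```

The `https://example.test/custom` URL is a placeholder; it only shows that an explicit endpoint is passed through untouched.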

Testing

  • ✅ All 82 HuggingFace connector unit tests pass
  • ✅ URL construction verified to match the working workaround
  • ✅ Minimal surgical changes (7 files, 10 lines modified)

Closes #<issue_number>

Original prompt

This section details the original issue you should resolve

<issue_title>.Net: Bug: .NET AddHuggingFaceEmbeddingGenerator return 404</issue_title>
<issue_description>Describe the bug
Embeddings generation using HuggingFace returns 404.
As far as I can tell, this comes from using the obsolete api-inference.huggingface.co endpoint instead of router.huggingface.co/hf-inference.

To Reproduce
Try using the embedding generation with:

        var kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.AddHuggingFaceEmbeddingGenerator(
            "sentence-transformers/all-MiniLM-L6-v2",
            apiKey: "api-key");
        var k = kernelBuilder.Build();
        var g = k.GetRequiredService<IEmbeddingGenerator<string, Embedding<float>>>();
        var r = g.GenerateAsync(["John: Hello, how are you?\nRoger: Hey, I'm Roger!"]).GetAwaiter().GetResult();

Microsoft.SemanticKernel.HttpOperationException: Response status code does not indicate success: 404 (Not Found).
 ---> System.Net.Http.HttpRequestException: Response status code does not indicate success: 404 (Not Found).
   at System.Net.Http.HttpResponseMessage.EnsureSuccessStatusCode()
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, HttpCompletionOption completionOption, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Http.HttpClientExtensions.SendWithSuccessCheckAsync(HttpClient client, HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Connectors.HuggingFace.Core.HuggingFaceClient.SendRequestAndGetStringBodyAsync(HttpRequestMessage httpRequestMessage, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Connectors.HuggingFace.Core.HuggingFaceClient.GenerateEmbeddingsAsync(IList`1 data, Kernel kernel, CancellationToken cancellationToken)
   at Microsoft.SemanticKernel.Connectors.HuggingFace.HuggingFaceEmbeddingGenerator.GenerateAsync(IEnumerable`1 values, EmbeddingGenerationOptions options, CancellationToken cancellationToken)

Expected behavior
Should return the embeddings

Additional context
This is the workaround I use for now

 
var endpoint = $"https://router.huggingface.co/hf-inference/models/{model}/pipeline/feature-extraction";

kernelBuilder.Services.AddKeyedSingleton<IEmbeddingGenerator<string, Embedding<float>>>(serviceKey: null, (serviceProvider, _) =>
    new HuggingFaceEmbeddingGenerator(
        new Uri(endpoint),
        apiKey: "api-key"));
</issue_description>

<agent_instructions>Update the references to "https://api-inference.huggingface.co" to use the correct URL</agent_instructions>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>

Fixes #13148

…ence

- Changed default endpoint from https://api-inference.huggingface.co to https://router.huggingface.co/hf-inference
- Updated embedding endpoint path from /pipeline/feature-extraction/{modelId} to /models/{modelId}/pipeline/feature-extraction
- Updated all tests to reflect new endpoint structure
- Updated sample code to use new endpoint

Co-authored-by: markwallace-microsoft <127216156+markwallace-microsoft@users.noreply.github.com>
@Copilot Copilot AI changed the title from "[WIP] .Net: Bug: .NET AddHuggingFaceEmbeddingGenerator return 404" to ".NET: Fix HuggingFace embedding generation 404 error by updating to router.huggingface.co endpoint" on Oct 6, 2025
Copilot finished work on behalf of markwallace-microsoft October 6, 2025 16:08