Manually specify openai-compat format and parse it #4463
Conversation
Hey @dflatline, I reviewed and tested your PR; it does solve the issue, but I am not sure why. I tried passing the dimensions directly in the request, and that did not solve the issue either.
While there's nothing wrong with your PR, do you think you could try to find out why exactly the current implementation is failing?
Let me know what you think!
```typescript
console.log(`[OpenAI-Compatible Embedder] After mapping - embedding length: ${embeddings[0]?.length}`)
if (embeddings[0]) {
	console.log(
		`[OpenAI-Compatible Embedder] First 5 values after mapping:`,
		embeddings[0].slice(0, 5),
	)
}
```
I think we can probably remove these.
```diff
@@ -111,10 +111,37 @@ export class OpenAICompatibleEmbedder implements IEmbedder {
 const response = await this.embeddingsClient.embeddings.create({
 	input: batchTexts,
 	model: model,
+	encoding_format: "base64", // Use base64 to protect embedding dimensions from openai sabotage
```
I am not sure whether this is actually sabotage, but we probably want to remove the comment.
I too tried quite a number of different ways of providing the dimension to the API, to no effect. Everything was fine on the wire, with and without a dimension param; I monitored the local interface via Wireshark. The bug definitely seems to be happening in the OpenAI library, which has a number of abstractions around parsing and marshaling that I could not easily get to the bottom of... That's why I just declared it sabotage, put on the Beastie Boys, and worked around it. 😅

It does feel like a bug should be filed with OpenAI. The best thing to do there would be either a unit test or a minimal standalone Node app that triggers the behavior. I am not sure of the best way to extract a minimal test case, as I am not a TypeScript or Node developer. Do you have any suggestions for a minimal way to exercise this for the purpose of reporting to OpenAI?
Hrmm... It looks like the OpenAI parsing code has had bugs before. Roo is pinned to 4.78.1 of the OpenAI package, but there was a fix for this embedding API in 4.92: openai/openai-node#1448. I don't yet know if that is the same bug, though. Frankly, I am new to npm and TypeScript... I got to this side-quest from trying to improve codebase search for a Python project of mine. So I am still open to suggestions for the best way to make a test case for this bug, to determine if it is indeed fixed in later OpenAI package versions.
Ok, I built with OpenAI v4.103 and confirmed it has the fix from openai/openai-node#1448, but it does not fix the issue for the ollama and LM Studio openai endpoints. Interestingly, I can no longer get llama.cpp endpoints to exhibit the issue with any version, but I am seeing other strange behavior there. I have a test script for the OpenAI package at #4462 (comment).
Hey @dflatline, thank you for looking into this. I think we can use this workaround for the time being.
I left a suggestion regarding the testing, it might be a good idea to test the base64 path in case something changes in the future.
Let me know what you think!
To ensure the new base64 decoding logic is fully validated, could we update the mock response in this test file? If the mock returned a base64 string instead of a float array, we could confirm the decoding path is exercised correctly.
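For example, the mock's `data` field could carry a base64 string built from a known vector. This is only a sketch: the field names follow the general OpenAI embeddings response shape, and the tolerance check accounts for float32 rounding.

```typescript
// Sketch: build a mock embeddings response whose payload is a base64 string,
// so the decode path (base64 -> Float32Array) is actually exercised in tests.
const knownVector = [0.1, 0.2, 0.3]
const base64Data = Buffer.from(new Float32Array(knownVector).buffer).toString("base64")

const mockResponse = {
	data: [{ object: "embedding", index: 0, embedding: base64Data }],
	model: "test-model",
	usage: { prompt_tokens: 1, total_tokens: 1 },
}

// A test can then decode the mocked embedding and compare it against
// knownVector with a small tolerance, since float32 cannot hold 0.1 exactly.
const buf = Buffer.from(mockResponse.data[0].embedding, "base64")
const decoded = Array.from(new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4))
const ok = decoded.every((v, i) => Math.abs(v - knownVector[i]) < 1e-6)
console.log(ok) // true
```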
Ok I added three tests for this.
I also added a test to verify that the OpenAI package does not parse the embeddings when you request base64 format, and that they then arrive as expected. That way, we will know if someone "fixes" this behavior so that the openai package parses whatever format you request for you.
All of these tests were generated with RooCode and Claude 4. That OpenAI one was a real pain to get right: it kept wanting to mock at the RooCode API level rather than the post inside the OpenAI package. In fact, Claude 4 would frequently revert this internal mocking of OpenAI back to the Roo package level while fixing other, unrelated issues with the tests... So while handy for the anti-fragility of this workaround, the OpenAI test might be "vibe-fragile" itself, in that some future vibe-coded tests might "fix" this OpenAI mocking again and move it back to the wrong, higher level. We live in interesting times!
Force-pushed "Improve comment" from 3917811 to 17401b2.
Hey @dflatline, Thank you for your contribution!
I made some minor improvements to the typing.
I tested it and it works perfectly!
Is the test failure here legit?
@dflatline do you have discord?
* Manually specify openai-compat format and parse it
* fixup! Manually specify openai-compat format and parse it
* Expect base64 in embedding test arguments
* fixup! Manually specify openai-compat format and parse it (Remove debug logs)
* fixup! Manually specify openai-compat format and parse it (Improve comment)
* fixup! Manually specify openai-compat format and parse it
* Add tests to exercise base64 decode of embeddings
* Add tests to verify openai base64 and brokenness behavior
* feat: improve typing
* refactor: switch from jest to vitest for mocking in tests

Co-authored-by: Dixie Flatline <dflatline>
Co-authored-by: Daniel Riccio <ricciodaniel98@gmail.com>
Related GitHub Issue
Closes: #4462
Description
This PR explicitly specifies base64 as the encoding format to the OpenAI library. This causes it to return the response directly, without subjecting it to dimensionality reduction or integer truncation (it does both).
We then parse it directly ourselves, preserving the correct embedding dimension without accuracy loss.
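The manual decode can be sketched roughly as follows. This is an illustrative helper, not the PR's actual function name, and it assumes the API returns the raw little-endian float32 bytes of the embedding encoded as base64:

```typescript
// Hypothetical helper: decode a base64-encoded embedding (as returned when
// encoding_format is "base64") into a plain number array.
function decodeBase64Embedding(data: string): number[] {
	const buf = Buffer.from(data, "base64")
	// Reinterpret the bytes as float32 values; byteOffset matters because
	// Buffer.from may return a view into a larger pooled ArrayBuffer.
	const floats = new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4)
	return Array.from(floats)
}

// Round-trip example: encode a known vector to base64, then decode it back.
const original = new Float32Array([0.5, -1.25, 3])
const b64 = Buffer.from(original.buffer).toString("base64")
console.log(decodeBase64Embedding(b64)) // [ 0.5, -1.25, 3 ]
```

Because every value passes through `Float32Array`, the full embedding dimension is preserved with float32 precision rather than being truncated.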
Test Procedure
In order to work around ollama/ollama#6262, you also need to kill your ollama daemon and launch it manually from the shell like this:

```shell
OLLAMA_NUM_PARALLEL=1 OLLAMA_MAX_QUEUE=1000000 ollama serve
```
Type of Change

- Changes are limited to `src` or test files.

Pre-Submission Checklist

- Linting passes (`npm run lint`).
- Debug code (`console.log`) has been removed.
- Tests pass (`npm test`).
- Branch is up to date with `main`.
- `npm run changeset` was run if this PR includes user-facing changes or dependency updates.

Screenshots / Videos
Not needed.
Documentation Updates
Not really needed? Though maybe we should document ollama/ollama#6262 in the main docs for ollama embedding generally (it is the source of the `OLLAMA_NUM_PARALLEL=1 OLLAMA_MAX_QUEUE=1000000 ollama serve` workaround).
Get in Touch
Important
Specifies base64 encoding in `OpenAICompatibleEmbedder` to prevent dimensionality reduction, and parses embeddings to preserve dimensions.

- Specifies `base64` as `encoding_format` in `OpenAICompatibleEmbedder` to prevent dimensionality reduction and integer truncation.
- Parses the base64 response into `Float32Array` to preserve embedding dimensions.
- Changes are in the `_embedBatchWithRetries()` function.

This description was created automatically for 4b04411.