
Manually specify openai-compat format and parse it #4463


Merged

merged 10 commits into RooCodeInc:main on Jun 12, 2025

Conversation

@dflatline dflatline commented Jun 8, 2025

Related GitHub Issue

#4462

Closes: #4462

Description

This MR explicitly specifies base64 as the encoding format to the OpenAI library. This causes it to return the response directly, without subjecting it to dimensionality-reduction sabotage or integer truncation (it does both).

We then parse it directly ourselves, preserving the correct embedding dimension without accuracy loss.
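
A rough sketch of the request side is below. This is not the PR's exact code: the base URL, API key, and model are the test values from the procedure further down, and the variable and function names are illustrative.

	import OpenAI from "openai"

	// Sketch only: endpoint, key, and model come from the test procedure, not from the PR itself.
	const client = new OpenAI({ baseURL: "http://127.0.0.1:11434/v1", apiKey: "whatever" })

	async function demo() {
		const response = await client.embeddings.create({
			input: ["some chunk of source code"],
			model: "snowflake-arctic-embed2",
			// Ask for base64 explicitly so the library hands back the raw payload
			// instead of post-processing the float array it would otherwise return.
			encoding_format: "base64",
		})

		// With encoding_format: "base64", each data item's `embedding` field is a
		// base64 string at runtime; the decode step is sketched further down.
		console.log(typeof response.data[0].embedding)
	}

	demo().catch(console.error)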

Test Procedure

  1. Run ollama pull snowflake-arctic-embed2
  2. Run docker run -p 6333:6333 qdrant/qdrant
  3. Enable the Codebase Indexing checkbox in Roo Experimental Settings (flask icon)
  4. Set OpenAI-Compatible as embeddings provider
  5. Set Base URL to http://127.0.0.1:11434/v1
  6. Set API Key to whatever
  7. Set embedding model to snowflake-arctic-embed2
  8. Set embeddings dimension to 1024
  9. Leave the Qdrant URL as http://localhost:6333.

In order to work around ollama/ollama#6262, you also need to kill your ollama daemon and launch it manually from the shell like this:

OLLAMA_NUM_PARALLEL=1 OLLAMA_MAX_QUEUE=1000000 ollama serve

Type of Change

  • 🐛 Bug Fix: Non-breaking change that fixes an issue.
  • New Feature: Non-breaking change that adds functionality.
  • 💥 Breaking Change: Fix or feature that would cause existing functionality to not work as expected.
  • ♻️ Refactor: Code change that neither fixes a bug nor adds a feature.
  • 💅 Style: Changes that do not affect the meaning of the code (white-space, formatting, etc.).
  • 📚 Documentation: Updates to documentation files.
  • ⚙️ Build/CI: Changes to the build process or CI configuration.
  • 🧹 Chore: Other changes that don't modify src or test files.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Code Quality:
    • My code adheres to the project's style guidelines.
    • There are no new linting errors or warnings (npm run lint).
    • All debug code (e.g., console.log) has been removed.
  • Testing:
    • New and/or updated tests have been added to cover my changes.
    • All tests pass locally (npm test).
    • The application builds successfully with my changes.
  • Branch Hygiene: My branch is up-to-date (rebased) with the main branch.
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Changeset: A changeset has been created using npm run changeset if this PR includes user-facing changes or dependency updates.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

Not needed.

Documentation Updates

Not really needed? Though maybe we should document ollama/ollama#6262 in the main docs for Ollama embedding in general (that issue is the source of the OLLAMA_NUM_PARALLEL=1 OLLAMA_MAX_QUEUE=1000000 ollama serve workaround).

Additional Notes

Get in Touch


Important

Specifies base64 encoding in OpenAICompatibleEmbedder to prevent dimensionality reduction and parses embeddings to preserve dimensions.

  • Behavior:
    • Specifies base64 as encoding_format in OpenAICompatibleEmbedder to prevent dimensionality reduction and integer truncation.
    • Parses base64 embeddings into Float32Array to preserve embedding dimensions (see the decode sketch after this list).
  • Logging:
    • Adds logging for embedding length and first 10 values in _embedBatchWithRetries() function.
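
A minimal sketch of that decode step, assuming each response item carries a base64-encoded buffer of little-endian float32 values (the helper name is illustrative, not the PR's exact code):

	// Illustrative helper, not the PR's implementation: base64 payload -> Float32Array.
	function base64ToFloat32(b64: string): Float32Array {
		const bytes = Buffer.from(b64, "base64") // raw little-endian float32 bytes
		return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4)
	}

	// e.g. a snowflake-arctic-embed2 vector should come back with length 1024.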

This description was created by Ellipsis for 4b04411.

@dflatline dflatline requested review from mrubens, cte and jr as code owners June 8, 2025 20:08
@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. bug Something isn't working size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Jun 8, 2025
@daniel-lxs daniel-lxs added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jun 9, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Jun 9, 2025
Collaborator

@daniel-lxs daniel-lxs left a comment

Hey @dflatline, I reviewed and tested your PR; it indeed solves the issue, but I am not sure why. I tried passing the dimensions directly in the request and that also didn't solve the issue.

While there's nothing wrong with your PR, do you think you could try to find out why exactly the current implementation is failing?

Let me know what you think!

Comment on lines 135 to 141
	console.log(`[OpenAI-Compatible Embedder] After mapping - embedding length: ${embeddings[0]?.length}`)
	if (embeddings[0]) {
		console.log(
			`[OpenAI-Compatible Embedder] First 10 values after mapping:`,
			embeddings[0].slice(0, 5),
		)
	}
Collaborator

I think we can probably remove these.

@@ -111,10 +111,37 @@ export class OpenAICompatibleEmbedder implements IEmbedder {
	const response = await this.embeddingsClient.embeddings.create({
		input: batchTexts,
		model: model,
		encoding_format: "base64", // Use base64 to protect embedding dimensions from openai sabotage
Collaborator

I am not sure this is actually sabotage or not but, we probably want to remove the comment.

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Jun 9, 2025
@daniel-lxs daniel-lxs added PR - Changes Requested and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Jun 9, 2025
@dflatline
Author

Hey @dflatline, I reviewed and tested your PR; it indeed solves the issue, but I am not sure why. I tried passing the dimensions directly in the request and that also didn't solve the issue.

While there's nothing wrong with your PR, do you think you could try to find out why exactly the current implementation is failing?

I too tried quite a number of different ways of providing the dimension to the API, to no effect. Everything was happening fine on the wire with and without a dimension param; I monitored the local interface via Wireshark.

The bug definitely seems to be happening in the OpenAI library, which has a number of abstractions around parsing and marshaling that I could not easily get to the bottom of... That's why I just declared it sabotage, put on the Beastie Boys, and worked around it. 😅

It does feel like a bug should be filed with OpenAI. It seems like the best thing to do there would be either a unit test or a minimal standalone Node app that triggers the behavior. I am not sure of the best way to extract a minimal test case, as I am not a TypeScript or Node developer. Do you have any suggestions for a minimal way to exercise this for the purpose of reporting to OpenAI?
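
For reference, a standalone repro of the sort being discussed might look roughly like this. It is a sketch only (not the script later linked from #4462); the endpoint and model are the Ollama test values from this PR, and 1024 is the expected dimension for snowflake-arctic-embed2.

	import OpenAI from "openai"

	// Compare the library's default (float) parsing against the explicit base64 path.
	const client = new OpenAI({ baseURL: "http://127.0.0.1:11434/v1", apiKey: "whatever" })
	const model = "snowflake-arctic-embed2"
	const input = ["hello world"]

	async function main() {
		// Default path: the library parses the response into number[] itself.
		const asFloats = await client.embeddings.create({ input, model })
		console.log("default path dimension:", asFloats.data[0].embedding.length)

		// Base64 path: decode the raw payload ourselves.
		const asBase64 = await client.embeddings.create({ input, model, encoding_format: "base64" })
		const raw = asBase64.data[0].embedding as unknown as string
		const bytes = Buffer.from(raw, "base64")
		const vector = new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4)
		console.log("base64 path dimension:", vector.length) // expected 1024 for this model
	}

	main().catch(console.error)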

@dflatline
Author

Hrmm.. It looks like the OpenAI parsing code has had bugs before.

It looks like Roo is pinned to 4.78.1 of the OpenAI package, but there was a fix for this embedding API in 4.92: openai/openai-node#1448

I don't yet know if that is the same bug, though. Frankly, I am new to npm and TypeScript... I got to this side quest from trying to improve codebase search for a Python project of mine. So I am still open to suggestions for the best way to make a test case for this bug, to determine whether it is indeed fixed in later OpenAI package versions.

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jun 11, 2025
@dflatline
Author

Ok, I built with OpenAI v4.103 and confirmed it had the fix from openai/openai-node#1448, but it does not fix the issue for the Ollama and LM Studio OpenAI-compatible endpoints.

Interestingly, I can no longer get llama.cpp endpoints to exhibit the issue with any version, but I am getting other strange behavior.

I have a test script for the OpenAI package at #4462 (comment)

Collaborator

@daniel-lxs daniel-lxs left a comment

Hey @dflatline, thank you for looking into this. I think we can use this workaround for the time being.

I left a suggestion regarding the testing; it might be a good idea to test the base64 path in case something changes in the future.

Let me know what you think!

Collaborator

To ensure the new base64 decoding logic is fully validated, could we update the mock response in this test file? If the mock returned a base64 string instead of a float array, we could confirm the decoding path is exercised correctly.

Author

Ok I added three tests for this.

I also added a test to verify that the OpenAI package does not parse the embeddings when you request base64 format, and that they come back as expected. That way, we will know if they (or some rando) "fixes" this behavior so that the openai package parses whatever format you request for you.

All of these tests were generated with RooCode and Claude 4. That OpenAI one was a real pain to get right: it kept wanting to mock at the RooCode API level, not the POST inside of the OpenAI package. In fact, Claude 4 would frequently revert this internal mocking of OpenAI back to mocking at the Roo package level while fixing other unrelated issues with the tests... So while handy for the anti-fragility of this workaround, the OpenAI test might be "vibe-fragile" itself, in that some future vibe-coded tests might "fix" this OpenAI mocking again and move it back to the wrong, higher level. We live in interesting times!
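
As an illustration of what a decode test of this kind can look like, here is a sketch only, in vitest style; the real tests in this PR mock the POST inside the OpenAI package, as described above.

	import { describe, it, expect } from "vitest"

	// Helper to build the base64 payload an OpenAI-compatible server would return for a known vector.
	function encodeFloat32(values: number[]): string {
		return Buffer.from(new Float32Array(values).buffer).toString("base64")
	}

	describe("base64 embedding decode (sketch)", () => {
		it("recovers the original float32 values from a base64 payload", () => {
			const original = [0.25, -0.5, 1.0, 2.5] // exactly representable in float32
			const b64 = encodeFloat32(original)

			// Decode the way the embedder is expected to: base64 -> bytes -> Float32Array.
			const bytes = Buffer.from(b64, "base64")
			const decoded = new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4)

			expect(Array.from(decoded)).toEqual(original)
		})
	})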

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Jun 12, 2025
Collaborator

@daniel-lxs daniel-lxs left a comment

Hey @dflatline, thank you for your contribution!
I made some minor improvements to the typing.

I tested it and it works perfectly!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 12, 2025
@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Review] in Roo Code Roadmap Jun 12, 2025
@mrubens
Collaborator

mrubens commented Jun 12, 2025

Is the test failure here legit?

@mrubens mrubens merged commit c481827 into RooCodeInc:main Jun 12, 2025
10 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 12, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Jun 12, 2025
@hannesrudolph
Collaborator

@dflatline do you have Discord?

cte pushed a commit that referenced this pull request Jun 24, 2025
* Manually specify openai-compat format and parse it

* fixup! Manually specify openai-compat format and parse it

* Expect base64 in embedding test arguments

* fixup! Manually specify openai-compat format and parse it

Remove debug logs

* fixup! Manually specify openai-compat format and parse it

Improve comment

* fixup! Manually specify openai-compat format and parse it

* Add tests to exercise base64 decode of embeddings

* Add tests to verify openai base64 and brokenness behavior

* feat: improve typing

* refactor: switch from jest to vitest for mocking in tests

---------

Co-authored-by: Dixie Flatline <dflatline>
Co-authored-by: Daniel Riccio <ricciodaniel98@gmail.com>
Labels
  • bug - Something isn't working
  • lgtm - This PR has been approved by a maintainer
  • PR - Needs Review
  • size:L - This PR changes 100-499 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

OpenAI-Compatible embedding dimension fail for codebase indexing
4 participants