Description
Confirm this is a Node library issue and not an underlying OpenAI API issue
- This is an issue with the Node library
Describe the bug
With some API endpoints, this package truncates the embedding dimension to 256. For models whose embedding dimension is larger than 256, this is a massive accuracy loss.
So far, the ollama and LM Studio OpenAI-compatible endpoints exhibit this issue; llama-server does not.
This issue impacts RooCode (see RooCodeInc/Roo-Code#4462).
To Reproduce
Download an embedding model with more than 256 dimensions in ollama or LM Studio. For example, snowflake-arctic-embed-large-v2 has 1024 dimensions.
ollama pull Definity/snowflake-arctic-embed-l-v2.0-q8_0
Then run the attached Node script and notice that the package truncates the expected 1024 dimensions to 256. If the same model file from LM Studio is instead served with llama-server, there is no problem:
llama-server -m ~/.cache/lm-studio/models/Casual-Autopsy/snowflake-arctic-embed-l-v2.0-gguf/snowflake-arctic-embed-l-v2.0-q6_k.gguf --embeddings -c 4096 -ngl 99
Code snippets
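The original attached script is not included here, so the following is a minimal reconstruction of what a repro might look like: it requests an embedding from ollama's OpenAI-compatible endpoint through this library and compares the returned vector's length against the model's documented 1024 dimensions. The server URL, placeholder API key, and the `checkDimensions` helper are assumptions, not part of the original report.

```javascript
// Hypothetical repro sketch (the issue's attached script is not shown).
// Assumes a running ollama server on localhost:11434 with the model pulled:
//   ollama pull Definity/snowflake-arctic-embed-l-v2.0-q8_0

// Pure helper (illustrative name): compare an embedding's length
// against the model's documented dimensionality.
function checkDimensions(embedding, expected) {
  return {
    actual: embedding.length,
    expected,
    truncated: embedding.length < expected,
  };
}

async function main() {
  // Load the library dynamically so the sketch degrades gracefully
  // when the `openai` package is not installed.
  let OpenAI;
  try {
    ({ default: OpenAI } = await import("openai"));
  } catch {
    console.error("`openai` package not installed; skipping API call");
    return;
  }

  const client = new OpenAI({
    baseURL: "http://localhost:11434/v1", // ollama's OpenAI-compatible endpoint
    apiKey: "ollama", // ollama ignores the key, but the client requires one
  });

  try {
    const res = await client.embeddings.create({
      model: "Definity/snowflake-arctic-embed-l-v2.0-q8_0",
      input: "hello world",
    });
    // Expected 1024 dimensions; the reported bug yields only 256 here.
    console.log(checkDimensions(res.data[0].embedding, 1024));
  } catch (err) {
    console.error("request failed (is ollama running?):", err.message);
  }
}

main();
```

Running the same request against a llama-server instance (swapping `baseURL` for its address) reportedly returns the full 1024 dimensions.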
OS
macOS
Node version
Node v23.11.0
Library version
openai 4.104.0