
feat: introducing Ollama embeddings properties. #2690

Closed
mkhludnev wants to merge 2 commits

Conversation

mkhludnev

Description

Ollama embeddings should be properly configurable via these props.
Currently only base_url is passed to OllamaEmbeddings, which causes a number of issues.

This change lets users configure the embedding model hosted in Ollama.
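
The gist of the change, as a rough sketch (the settings attribute names below are assumptions for illustration, not the exact diff):

# Before: only the base URL was configurable, so langchain's default
# model ('llama2') and instruction prefixes applied.
embeddings = OllamaEmbeddings(base_url=settings.ollama_api_base_url)

# After: the model and instruction prefixes are configurable too.
embeddings = OllamaEmbeddings(
    base_url=settings.ollama_api_base_url,
    model=settings.ollama_embeddings_model,
    embed_instruction=settings.ollama_embeddings_doc_instruct,
    query_instruction=settings.ollama_embeddings_query_instruct,
)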

Checklist before requesting a review

Please delete options that are not relevant.

  • [v] My code follows the style guidelines of this project
  • [v] I have performed a self-review of my code
  • [-] I have commented hard-to-understand areas
  • [-] I have ideally added tests that prove my fix is effective or that my feature works
  • [-] New and existing unit tests pass locally with my changes
  • [-] Any dependent changes have been merged

Screenshots (if appropriate):


vercel bot commented Jun 18, 2024

Someone is attempting to deploy a commit to the Quivr-app Team on Vercel.

A member of the Team first needs to authorize it.

@StanGirard StanGirard requested a review from AmineDiro June 18, 2024 20:31
@StanGirard
Collaborator

Thanks a lot for this PR! I'm on holiday (though still looking at PRs), so @AmineDiro will review it ;)

@mkhludnev mkhludnev changed the title introducing Ollama embeddings properties. feat: introducing Ollama embeddings properties. Jun 19, 2024
@filipe-omnix

Hi, could you please provide an example of the changes to the env file (the OLLAMA_EMBEDDINGS_* part)? Thanks in advance!

@mkhludnev
Author

example of changes to the env ? (OLLAMA_EMBEDDINGS_* part please).

Fair! Here are my props:

OLLAMA_EMBEDDINGS_MODEL=chatfire/bge-m3:q8_0 # just because we deployed this embeddings model, choose yours
OLLAMA_EMBEDDINGS_DOC_INSTRUCT=              # just because there are certain values by default https://github.com/langchain-ai/langchain/blob/c314222796798545f168f6ff7e750eb24e8edd51/libs/community/langchain_community/embeddings/ollama.py#L40
OLLAMA_EMBEDDINGS_QUERY_INSTRUCT=            # but instructions are not necessary for bge-m3 see faq#2 https://huggingface.co/BAAI/bge-m3#faq
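
For completeness, a minimal sketch of how these variables could be wired up in the backend settings (the function name and default values are illustrative, not necessarily the exact code in this PR):

import os

from langchain_community.embeddings import OllamaEmbeddings

def get_embeddings() -> OllamaEmbeddings:
    # Empty strings deliberately override langchain's built-in
    # "passage: " / "query: " instruction defaults (not needed for bge-m3).
    return OllamaEmbeddings(
        base_url=os.getenv("OLLAMA_API_BASE_URL", "http://localhost:11434"),
        model=os.getenv("OLLAMA_EMBEDDINGS_MODEL", "llama2"),
        embed_instruction=os.getenv("OLLAMA_EMBEDDINGS_DOC_INSTRUCT", ""),
        query_instruction=os.getenv("OLLAMA_EMBEDDINGS_QUERY_INSTRUCT", ""),
    )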

@mkhludnev
Author

@filipe-omnix can you confirm whether this patch is useful for you?

@andyzhangwp

andyzhangwp commented Jul 3, 2024

@mkhludnev

I applied the above update, but still encountered an error during local testing: {"error": "model 'llama2' not found, try pulling it first"}

The following are debugging logs:

1. get_embeddings in models/settings.py, where the model is llama3:
DEBUG:httpcore.http11:response_closed.complete
backend-core | ======get_embeddings=====embeddings=[base_url='http://33a45d4e.r11.cpolar.top' model='llamafamily/llama3-chinese-8b-instruct' embed_instruction='passage:' query_instruction='query:' mirostat=None mirostat_eta=None mirostat_tau=None num_ctx=None num_gpu=None num_thread=None repeat_last_n=None repeat_penalty=None temperature=None stop=None tfs_z=None top_k=None top_p=None show_progress=False headers=None model_kwargs=None]

Here you can see that the model is llama3, indicating that the configuration is valid.

2. similarity_search in vectorstore/supabase.py:

====111======similarity_search=====self._embedding=[base_url='http://33a45d4e.r11.cpolar.top' model='llama2' embed_instruction='passage: ' query_instruction='query: ' mirostat=None mirostat_eta=None mirostat_tau=None num_ctx=None num_gpu=None num_thread=None repeat_last_n=None repeat_penalty=None temperature=None stop=None tfs_z=None top_k=None top_p=None show_progress=False headers=None model_kwargs=None]

Here the model has been changed back to llama2, so the previously configured embeddings are not being used.

3. Error log:

backend-core | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-core | | File "/code/vectorstore/supabase.py", line 76, in similarity_search
backend-core | | vectors = self._embedding.embed_documents([query])
backend-core | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-core | | File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 211, in embed_documents
backend-core | | embeddings = self._embed(instruction_pairs)
backend-core | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-core | | File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 199, in _embed
backend-core | | return [self._process_emb_response(prompt) for prompt in iter_]
backend-core | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-core | | File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 199, in <listcomp>
backend-core | | return [self._process_emb_response(prompt) for prompt in iter_]
backend-core | | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
backend-core | | File "/usr/local/lib/python3.11/site-packages/langchain_community/embeddings/ollama.py", line 173, in _process_emb_response
backend-core | | raise ValueError(
backend-core | | ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'llama2' not found, try pulling it first"}

@mkhludnev
Author

code: 404, {"error":"model 'llama2' not found, try pulling it first"}

@andyzhangwp I'm afraid the patch hasn't been applied fully. First, check that OllamaEmbeddings occurs only once across the backend sources, in settings.py (see the sketch below). Then check that there is no cached or compiled code left over.
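
A quick way to check, plus a sketch of the intended wiring (the class and parameter names are assumptions based on the traceback, not the exact Quivr code):

# grep -rn "OllamaEmbeddings(" backend/   -- should match only settings.py

from models.settings import get_embeddings
from vectorstore.supabase import CustomSupabaseVectorStore  # assumed class name

embeddings = get_embeddings()          # built once, from OLLAMA_EMBEDDINGS_*
vector_store = CustomSupabaseVectorStore(
    supabase_client,                   # assumed: an initialized Supabase client
    embeddings,                        # reused here, so similarity_search embeds
    table_name="vectors",              # with the configured model, not 'llama2'
)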

@yukha-dw

yukha-dw commented Sep 4, 2024

What if we want to use multiple llama models?

Contributor

github-actions bot commented Dec 3, 2024

Thanks for your contributions, we'll be closing this PR as it has gone stale. Feel free to reopen if you'd like to continue the discussion.

@github-actions github-actions bot added the Stale label Dec 3, 2024
@github-actions github-actions bot closed this Dec 8, 2024