Feedback about local LLM usage with paper-qa (share your experience about different LLMs and their parametrization). #753
Replies: 9 comments 6 replies
-
As an example:
-
Thanks for starting this discussion, @Snikch63200. It is very needed. PaperQA is highly customizable, and I noticed some users had trouble using local models with it. So, I wrote this tutorial. I hope it will help users running local LLMs.
-
I am happy to see you got it running. If you could also set an ollama embedding model in your script, you would be running completely on ollama. I will later share console outputs from my desktop so you can get a better idea. For now, please allow me to explain.

First, when you say "The configuration required to run ollama is the same as that required to use any provider other than OpenAI", that is only true when you run it in a script as you have demonstrated. However, one cannot do the same so easily from the `pqa` command line.

Second, about the bundled settings in contracrow.json: one cannot run ollama with the contracrow.json settings file directly without editing it. And even after setting ollama/local models in the config file, it may not work, according to my limited previous tests. Can you also test it?

As you see, I am raising questions about using the command line and the bundled settings. I am sorry I can't provide outputs and command details right now, but I promise I will post the details, because I like paper-qa. @Snikch63200, please continue exploring ollama and paper-qa; I learn a lot from your questions and posts. I appreciate it. I will also update with the statistics of my usage.
-
Please be as extensive as possible so that all users can learn. Allow me to reply with three points.

First, the (ideal) expected usage of `pqa` with ollama would let one start using pqa as easily and as rapidly as with the corporate APIs. Unfortunately, users cannot do that, and then they go to the backup plan.

Second, I tried the exact settings you posted above, but unfortunately it does not work: it cannot index and read the paper. This is what I did:

```bash
pqa --llm "ollama/llama3.2" \
--summary_llm "ollama/llama3.2" \
--agent.agent_llm "ollama/llama3.2" \
--embedding "ollama/mxbai-embed-large" \
--llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
--summary_llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
--agent.agent_llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
save ollama
```

```text
[19:36:26] Settings saved to: /home/x/.pqa/settings/ollama.json
```

Then I enter the paper directory and run:

```bash
pqa -s ollama ask "main idea of the paper"
```

The output:

```text
Could not find cost for model ollama/llama3.2.
Encountered exception during tool call for tool gather_evidence: EmptyDocsError('Not gathering evidence due to having no papers.')
Could not find cost for model ollama/llama3.2.
[19:47:09] Completing 'main idea of the paper' as 'certain'.
Generating answer for 'main idea of the paper'.
Could not find cost for model ollama/llama3.2.
Status: Paper Count=0 | Relevant Papers=0 | Current Evidence=0 | Current Cost=$0.0000
[19:47:10] Answer: I cannot provide an answer without the context. Please provide the context for the question "main idea of the paper", and I'll be happy to
assist you in writing a concise and accurate response in the style of a Wikipedia article.
/home/x/Micromamba/envs/paperQA517/lib/python3.12/site-packages/pydantic/main.py:426: UserWarning: Pydantic serializer warnings:
Expected `bool` but got `str` with value `'False'` - serialized value may not be as expected
return self.__pydantic_serializer__.to_python(
```

As you see, it does not index the paper, so it cannot answer. I wonder what the output is on your side when you run the commands above. Then I tried to tackle the problem by adding the paper path manually, creating a new settings entry with one extra line to designate the directory path:

```bash
pqa --llm "ollama/llama3.2" \
--summary_llm "ollama/llama3.2" \
--agent.agent_llm "ollama/llama3.2" \
--embedding "ollama/mxbai-embed-large" \
--llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
--summary_llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
--agent.agent_llm_config '{"model_list": [{"model_name": "ollama/llama3.2", "litellm_params": {"model": "ollama/llama3.2", "api_base": "http://localhost:11434"}}]}' \
--agent.index.paper_directory "/home/x/pe" \
save ollama2
```

This time, it did see the paper, but it also failed to continue:
So... llama3.2 does not work. I am going to use llama3.3 anyway, so I changed the settings again to this:

```bash
pqa --llm "ollama/llama3.3" \
--summary_llm "ollama/llama3.3" \
--agent.agent_llm "ollama/llama3.3" \
--embedding "ollama/mxbai-embed-large" \
--llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
--summary_llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
--agent.agent_llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
--agent.index.paper_directory "/home/x/pe" \
save ollama3
```

and then:
To sum up: it works with llama3.3, with the paper directory set, and with one paper (not multiple papers, see below).

Third, further configuring bundled settings such as contracrow.json is too daunting for me right now, but these bundled settings are very useful. Could you test the bundled settings, particularly contracrow.json, to see how to make them work with ollama? I appreciate the project and the work of the team.
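As a starting point for that test, one untested possibility would be to load the bundled settings, override the model fields on the command line, and save the result under a new name. Whether a bundled name like contracrow can be combined with overrides and `save` this way is an assumption, and the model choices below are only examples:

```bash
# Untested sketch: start from the bundled contracrow settings, point every
# model at the local Ollama server, and save the combination under a new name.
pqa -s contracrow \
    --llm "ollama/llama3.3" \
    --summary_llm "ollama/llama3.3" \
    --agent.agent_llm "ollama/llama3.3" \
    --embedding "ollama/mxbai-embed-large" \
    --llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
    --summary_llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
    --agent.agent_llm_config '{"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]}' \
    save contracrow-ollama
```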
-
Further testing with only 8 papers: there is a problem with timing out; it took an extremely long time and failed.

I am attaching the settings json file for your thorough review, and here is the failure message:
-
Hi @rosegarden-coder,

Half-successfully tested.

The first problem you noticed is typically a parameter problem (that's why I created this discussion about parameters, considering that information about LLMs' 'ideal' parameters is very hard to find). I guess it can be fixed by increasing the context size.

The second problem you noticed: just increase 'timeout' to 600 in the parameters (see above). For me, this combination works.

Best regards.
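PS: to make that concrete, here is a rough, untested sketch. The Modelfile route is one way to raise the context window on the Ollama side; the dotted `--agent.timeout` path is an assumption about the settings layout, and all names and values are examples only.

```bash
# 1) One way to raise the context size on the Ollama side: derive a new model
#    with a larger num_ctx via a Modelfile, then point the "ollama/llama3.3"
#    entries in your settings at "llama3.3-32k" instead.
cat > Modelfile <<'EOF'
FROM llama3.3
PARAMETER num_ctx 32768
EOF
ollama create llama3.3-32k -f Modelfile

# 2) Raise the agent timeout to 600 s in the saved paper-qa settings
#    (assumes the CLI accepts the dotted --agent.timeout path).
pqa -s ollama3 --agent.timeout 600 save ollama3-tuned
```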
-
@Snikch63200 I appreciate your guidance. I have been very busy during the week, and I will test your recommendations later. Can you look at the attached json file (ollama3.json)? Can I put the settings you suggested into this file and make it work, running it with `pqa -s ollama3 ask ...`? Something like the sketch below, perhaps?
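This is entirely untested: the top-level field names mirror the CLI flags used earlier in this thread, but the nested `answer`/`agent` keys, the `timeout` field, and the idea that missing fields fall back to defaults are guesses on my part, and the values are examples only.

```bash
# Hypothetical hand-written settings file; field names mirror the CLI flags
# used above. Unlisted fields are assumed to fall back to their defaults.
cat > ~/.pqa/settings/ollama3-tuned.json <<'EOF'
{
  "llm": "ollama/llama3.3",
  "summary_llm": "ollama/llama3.3",
  "embedding": "ollama/mxbai-embed-large",
  "llm_config": {"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]},
  "summary_llm_config": {"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]},
  "answer": {"max_concurrent_requests": 1},
  "agent": {
    "agent_llm": "ollama/llama3.3",
    "agent_llm_config": {"model_list": [{"model_name": "ollama/llama3.3", "litellm_params": {"model": "ollama/llama3.3", "api_base": "http://localhost:11434"}}]},
    "timeout": 600.0,
    "index": {"paper_directory": "/home/x/pe"}
  }
}
EOF
pqa -s ollama3-tuned ask "main idea of the paper"
```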
-
Hello @rosegarden-coder and @Snikch63200. Thanks for looking into this further. Let me quickly answer a few of your questions and I will get back to the other ones later. Is that ok?

Notice that using this kind of interface, we would allow configuration through environment variables. PaperQA does not do that: all the configuration of the agents/LLMs is done through the `Settings` object.

This is an actual bug. What happens is: when you save the settings, PaperQA is saving the whole `Settings` object.

As greatly noticed by @Snikch63200, this is a misbehavior of the LLM itself. It seems that Llama failed to complete a valid json or follow the correct schema. I could reproduce this error when I tried to run the exact same command multiple times; Llama3.2 indeed failed sometimes. This is not a PaperQA error. Finally, I'll need some more time to look into the details of this timeout error, as I couldn't reproduce it. I will also get back to you on the remaining questions. Thanks for your patience on that matter; I'll get back to you asap. In the meantime, thank you so much, @Snikch63200, for the support here.
-
Hello @Snikch63200. In the spirit of libre software, can I shamelessly ask you to share your script and configuration? I see that you have dug deeper into this topic.
-
After a lot of trials with different local LLMs and different parameters (also called 'options'), I decided to open this discussion to share experiences about this.
My aim is to compare the performance of different local models with different configurations.
This is not a discussion about configuration issues but configuration optimization.
This is not, stricto sensu, a discussion about paper-qa settings. However, for each trial, please specify settings such as `answer_max_sources`, `evidence_k` and `max_concurrent_requests`, which have a major impact on speed and answer relevance.

New ideas for standardization of tests are welcome.
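For example, a trial could be reported as the exact command used. This is a sketch only: the settings name is a placeholder, and the dotted `--answer.*` paths are assumed to match the Settings layout of the installed paper-qa version.

```bash
# Hypothetical trial record: the three settings above are set explicitly so a
# run can be reproduced and compared across models. Names and values are
# examples only.
pqa -s my-local-ollama \
    --answer.answer_max_sources 5 \
    --answer.evidence_k 10 \
    --answer.max_concurrent_requests 1 \
    ask "What is the main conclusion of the paper?"
```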