Open
Labels: question (Further information is requested)
Description
Hello,
Thank you for the fantastic work on PaperQA. I've been able to use it to ask questions over more than 100 papers as input, using only local models via Ollama. Everything is working well, but I'd like to know how I can avoid reloading the same files and re-embedding them each time I ask a new question.
Is there a way to save the vector store and load it later, so it can be reused by the LLM to generate answers? I couldn't find documentation about that; I found something about caching, but it's unclear to me how to use it properly. Could you provide some help?
Best wishes
My code so far:
from paperqa import Settings, ask
from paperqa.settings import AgentSettings
question = "Can you list all climate-sensitive pathways that affect rotavirus incidence?"
model = "ollama/llama3.1" # "ollama/llama3.2"
embedding = "ollama/nomic-embed-text" # "nomic-embed-text"
local_llm_config = {
    "model_list": [
        {
            "model_name": model,
            "litellm_params": {
                "model": model,
                "api_base": "http://localhost:11434",
            },
            "answer": {
                "evidence_k": 40,
                "evidence_detailed_citations": True,
                "evidence_summary_length": "about 100 words",
                "answer_max_sources": 5,
                "answer_length": "about 300 words, but can be longer",
                "max_concurrent_requests": 4,
                "answer_filter_extra_background": False,
            },
        }
    ]
}
answer = ask(
    question,
    settings=Settings(
        llm=model,
        llm_config=local_llm_config,
        summary_llm=model,
        summary_llm_config=local_llm_config,
        agent=AgentSettings(
            agent_llm=model, agent_llm_config=local_llm_config
        ),
        embedding=embedding,
        paper_directory="papers/",
    ),
)
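To make the goal concrete, below is the rough shape of what I am hoping is possible, reusing the same question, model, embedding, and local_llm_config variables from above. This is only a sketch based on my guess that the Docs object can be pickled and queried directly; the index file name is a placeholder and I have not verified that this is the intended approach:

import pickle
from pathlib import Path

from paperqa import Docs, Settings

settings = Settings(
    llm=model,
    llm_config=local_llm_config,
    summary_llm=model,
    summary_llm_config=local_llm_config,
    embedding=embedding,
)

index_path = Path("rotavirus_index.pkl")  # placeholder file name

if index_path.exists():
    # Reload the previously built index instead of re-embedding every paper
    docs = pickle.loads(index_path.read_bytes())
else:
    # Build the index once: each paper is parsed and embedded here
    docs = Docs()
    for pdf in Path("papers/").glob("*.pdf"):
        docs.add(pdf, settings=settings)
    index_path.write_bytes(pickle.dumps(docs))

# Ask new questions against the saved index with the local models
answer = docs.query(question, settings=settings)
print(answer.formatted_answer)

Is something along these lines supported, or is there a recommended way to persist and reuse the index?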