-
I'm facing the exact same issue where it finds all the right references but doesn't put any of them into the generated text.
-
I am facing the same issue as well.
-
Same issue here.
-
This is somewhat model dependent, but the langchain tool uses a default prompt for 'query from included data' that is less than ideal for many models. The key point is that the default prompt does not tell the model to ignore its trained knowledge and extract the answer from the excerpts of your library supplied in the prompt buffer. For the model I am using at the moment, a prompt that explicitly instructs the model to answer only from the supplied excerpts works much better. I did not think up that strategy; I have seen it used in a number of videos dealing with overriding what the model thinks it knows from training with the data you provide inside the prompt. Overriding the langchain default prompt for this case is not terribly difficult, just not obvious; a rough sketch follows below. There is hope for the future.
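For anyone looking for the mechanics, something along these lines should work when building the langchain RetrievalQA chain. This is only a sketch: the template wording is an illustration of the "use only the provided context" idea rather than the exact prompt referred to above, and llm and retriever stand for the objects privateGPT already constructs.

from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Illustrative template: force the model to answer only from the retrieved excerpts.
template = """Use ONLY the context below to answer the question. Ignore anything you
think you know from training. If the answer is not in the context, say "I don't know".

Context:
{context}

Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])

# llm and retriever are whatever privateGPT already builds (GPT4All/LlamaCpp model + Chroma retriever).
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)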
-
Is anyone familiar with these proposed changes to speed things up?
-
All of the above are part of the GPU adoption Pull Requests that you will find at the top of the page. Or go here: The Reddit post does seem to make a good attempt at explaining the 'getting the GPU used by privateGPT' part of the problem, but I have not tried that specific sequence.
-
A bit late to the party, but in my playing with this I've found the biggest factor is your prompting. If I ask the model to interact directly with the files, it doesn't like that (although the sources are usually okay), but if I tell it that it is a librarian with access to a database of literature, and to use that literature to answer the question given to it, it performs way better. The model may play a part in it too; I only really nailed it with WizardLM-7B.
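For what it's worth, the "librarian" framing described above looks roughly like this as a prompt template. The wording is only an illustration of the idea, not the exact prompt used, and it can be plugged into the same chain_type_kwargs override sketched earlier in the thread.

# Hypothetical "librarian" prompt; {context} and {question} are the standard
# variables the langchain stuff chain fills in.
librarian_template = """You are a librarian with access to a database of literature.
Use only the excerpts from that database shown below to answer the question.
If the excerpts do not contain the answer, say that you do not know.

Excerpts:
{context}

Question: {question}
Answer:"""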
-
That makes me super late to this party. The example document, the "state of the union" text, does not work for a simple entity question asking for cities mentioned or referenced. I have not yet looked into adjusting the prompt; I will look into that next. But I thought this is a good test case for the provided demo. Query: "Were any cities referenced in the state of the union?" Expected answer: "Columbus, Ohio"
-
@thekit |
-
Hi there, I ran into a different problem with privateGPT. I ingested a pretty large PDF file (more than 1000 pages) and saw that the right references are not found. This should not be an issue with the prompt but rather with the embedding, right? How can I tackle this problem?
-
There is another issue I was facing: I typed the query "Were any cities referenced in the state of the union?" like petegordon did above, yet I got a different result. When printing four references, only two distinct ones come back (for every query I tried). I did not alter any code. What could be the cause?
-
I suppose I'm really late for the party, but I'll ask anyway.
-
After installing privateGPT as in discussion #233, I found it took forever to ingest the state of the union .txt on my i7 with 16 GB of RAM, so I got rid of that input file and made my own: a text file that has only one line:
Jin thinks that xargebarge is pretty cool.
This time, ingesting source_documents was very quick with only one line to index, so I asked it "what does Jin think of xargebarge?"
The response was "I don't know", yet the cited document was my input file and it pointed to the line that says "Jin thinks that xargebarge is pretty cool."
Will adding more text improve this performance? It has managed to match the query to the appropriate document, but it is not able to generate a meaningful answer.
I am using Vicuna as a model because I couldn't get the default model to work.
PS C:\ai_experiments\privateGPT> cat .env
PERSIST_DIRECTORY=db
LLAMA_EMBEDDINGS_MODEL=models/gpt4-x-vicuna-13B.ggml.q5_1.bin
MODEL_TYPE=GPT4All
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
PS C:\ai_experiments\privateGPT>
Here is the interaction:
PS C:\ai_experiments\privateGPT> python .\privateGPT.py
llama.cpp: loading model from models/gpt4-x-vicuna-13B.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 1000
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 90.75 KB
llama_model_load_internal: mem required = 11359.05 MB (+ 3216.00 MB per state)
llama_init_from_file: kv self size = 1562.50 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
gptj_model_load: loading model from 'models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx = 2048
gptj_model_load: n_embd = 4096
gptj_model_load: n_head = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot = 64
gptj_model_load: f16 = 2
gptj_model_load: ggml ctx size = 4505.45 MB
gptj_model_load: memory_size = 896.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 3609.38 MB / num tensors = 285
Enter a query: what does Jin think of xargebarge?
llama_print_timings: load time = 10055.44 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token)
llama_print_timings: prompt eval time = 18268.64 ms / 12 tokens ( 1522.39 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token)
llama_print_timings: total time = 18298.43 ms
[2023-05-17 19:05:56,510] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 4 elements.
[2023-05-17 19:05:56,515] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 3 elements.
[2023-05-17 19:05:56,516] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 2 elements.
I don't know.
Enter a query: