-
I'm facing the exact same issue where it finds all the right references but doesn't put any of them into the generated text.
-
I am facing the same issue as well.
-
Same issue here.
-
This is somewhat model dependent, but the langchain tool uses a default prompt for 'query from included data' that is less than ideal for many models. The key point is that the default prompt does not tell the model to ignore its trained knowledge and extract the answer from the excerpts of your library supplied in the prompt buffer. For the model I am using at the moment, a prompt that explicitly instructs the model to answer only from the supplied excerpts works much better. I did not think up that strategy; I have seen it used in a number of videos dealing with overriding what the model thinks it knows from training with the data you provide inside the prompt. Overriding the langchain default prompt for this case is not terribly difficult, just not obvious; a rough sketch follows below. There is hope for the future.
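For anyone looking for the mechanics, something along these lines should work when building the langchain RetrievalQA chain. This is only a sketch: the template wording is an illustration of the "use only the provided context" idea rather than the exact prompt referred to above, and llm and retriever stand for the objects privateGPT already constructs.

from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Illustrative template: force the model to answer only from the retrieved excerpts.
template = """Use ONLY the context below to answer the question. Ignore anything you
think you know from training. If the answer is not in the context, say "I don't know".

Context:
{context}

Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])

# llm and retriever are whatever privateGPT already builds (GPT4All/LlamaCpp model + Chroma retriever).
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},
)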
-
Is anyone familiar with these proposed changes to speed things up?
-
All of the above are part of the GPU adoption Pull Requests that you will find at the top of the page. Or go here: The Reddit post does seem to make a good attempt at explaining the 'getting the GPU used by privateGPT' part of the problem, but I have not tried that specific sequence.
-
A bit late to the party, but in my playing with this I've found the biggest factor is your prompting. If I ask the model to interact directly with the files, it doesn't like that (although the sources are usually okay), but if I tell it that it is a librarian with access to a database of literature, and to use that literature to answer the question given to it, it performs way better. The model may play a part in it too; I only really nailed it with WizardLM-7B.
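For what it's worth, the "librarian" framing described above looks roughly like this as a prompt template. The wording is only an illustration of the idea, not the exact prompt used, and it can be plugged into the same chain_type_kwargs override sketched earlier in the thread.

# Hypothetical "librarian" prompt; {context} and {question} are the standard
# variables the langchain stuff chain fills in.
librarian_template = """You are a librarian with access to a database of literature.
Use only the excerpts from that database shown below to answer the question.
If the excerpts do not contain the answer, say that you do not know.

Excerpts:
{context}

Question: {question}
Answer:"""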
-
That makes me super late to this party. The example document, the "state of the union" text, does not work for a simple entity question asking for cities mentioned or referenced. I have not yet looked into adjusting the prompt; I will look into that next. But I thought this is a good test case for the provided demo. Query: "Were any cities referenced in the state of the union?" Expected answer: "Columbus, Ohio"
-
@thekit |
-
Hi there, I ran into a different problem with privateGPT. I ingested a pretty large PDF file (more than 1000 pages) and saw that the right references are not found. This should not be an issue with the prompt but rather with the embedding, right? How can I tackle this problem?
-
There is another issue I was facing: I typed the query "Were any cities referenced in the state of the union?" like petegordon did above, yet I got a different result. When printing four references, only two distinct ones come back (for every query I tried). I did not alter any code. What could be the cause?
-
I suppose I'm really late for the party, but I'll ask anyway.
-
After installing privateGPT as in discussion #233, I found it took forever to ingest the state of the union .txt on my i7 with 16 GB of RAM, so I got rid of that input file and made my own: a text file that has only one line:
Jin thinks that xargebarge is pretty cool.
This time, ingesting source_documents was very quick with only one line to index, so I asked it "what does Jin think of xargebarge?"
The response was "I don't know", yet the cited document was my input file and it pointed to the line that says "Jin thinks that xargebarge is pretty cool."
Will adding more text improve this performance? It has managed to match the query to the appropriate document, but it is not able to generate a meaningful answer.
I am using Vicuna as a model because I couldn't get the default model to work.
PS C:\ai_experiments\privateGPT> cat .env
PERSIST_DIRECTORY=db
LLAMA_EMBEDDINGS_MODEL=models/gpt4-x-vicuna-13B.ggml.q5_1.bin
MODEL_TYPE=GPT4All
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
PS C:\ai_experiments\privateGPT>
Here is the interaction:
PS C:\ai_experiments\privateGPT> python .\privateGPT.py
llama.cpp: loading model from models/gpt4-x-vicuna-13B.ggml.q5_1.bin
llama_model_load_internal: format = ggjt v2 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 1000
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 90.75 KB
llama_model_load_internal: mem required = 11359.05 MB (+ 3216.00 MB per state)
llama_init_from_file: kv self size = 1562.50 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
gptj_model_load: loading model from 'models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx = 2048
gptj_model_load: n_embd = 4096
gptj_model_load: n_head = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot = 64
gptj_model_load: f16 = 2
gptj_model_load: ggml ctx size = 4505.45 MB
gptj_model_load: memory_size = 896.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 3609.38 MB / num tensors = 285
Enter a query: what does Jin think of xargebarge?
llama_print_timings: load time = 10055.44 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token)
llama_print_timings: prompt eval time = 18268.64 ms / 12 tokens ( 1522.39 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token)
llama_print_timings: total time = 18298.43 ms
[2023-05-17 19:05:56,510] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 4 elements.
[2023-05-17 19:05:56,515] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 3 elements.
[2023-05-17 19:05:56,516] {chroma.py:128} ERROR - Chroma collection langchain contains fewer than 2 elements.
I don't know.
Enter a query: