Please add your ideas - infrastructure planning thread #38

nilsherzig · 2024-04-02T10:20:43Z

nilsherzig
Apr 2, 2024
Maintainer

Please comment if you have any thoughts on this:

We have a "chat layer" which keeps a history of the user's prompts and messages (green and purple). If needed, this chat chain can call other chains, which run autonomously, without user interaction, by using a self-critique loop.

It would be very easy to add more chains to this, like a "programming chain" using deepseek-coder as the LLM.

I would have to come up with a config format for these chains, but essentially they are just a couple of conditions and strings in / strings out.
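A rough sketch of what "a couple of conditions and strings in / strings out" could look like. Everything here (`Chain`, `dispatch`, the keyword-based condition) is hypothetical and only illustrates the idea; it is not the project's actual config format or code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chain:
    """Hypothetical chain description: a condition plus a string-in/string-out step."""
    name: str
    condition: Callable[[str], bool]  # decides whether this chain should handle the prompt
    run: Callable[[str], str]         # string in, string out


def dispatch(prompt: str, chains: list[Chain], fallback: Callable[[str], str]) -> str:
    """Hand the prompt to the first chain whose condition matches, else to the fallback."""
    for chain in chains:
        if chain.condition(prompt):
            return chain.run(prompt)
    return fallback(prompt)


# Stand-in for a "programming chain"; a real one would call an LLM such as deepseek-coder.
programming_chain = Chain(
    name="programming",
    condition=lambda p: "code" in p.lower(),
    run=lambda p: f"[programming chain] {p}",
)


def chat_layer(prompt: str) -> str:
    """Stand-in for the plain chat layer."""
    return f"[chat] {prompt}"
```

Calling `dispatch("Write code to sort a list", [programming_chain], chat_layer)` routes the prompt to the programming chain, while a prompt without the keyword falls through to the chat layer.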

romanr · 2024-04-04T13:24:03Z

romanr
Apr 4, 2024

Aren’t GitHub Discussions better suited for this?

1 reply

nilsherzig Apr 4, 2024
Maintainer Author

good point, there you go

priyanmuthu · 2024-04-04T14:41:53Z

priyanmuthu
Apr 4, 2024

I like the idea of tools. I'm very interested in the programming chain, but the challenge is to run it in a sandboxed environment I suppose?

1 reply

nilsherzig Apr 4, 2024
Maintainer Author

Yes, we would need a solution for this, but there are sandboxing solutions out there, and code in interpreted languages like Python can be "filtered".

andrejguran · 2024-04-04T16:42:36Z

andrejguran
Apr 4, 2024

Love the project and the idea of an open source Perplexity 👍
Can you describe a little how the vector DB gets populated, with what data, and how the results are processed?

2 replies

nilsherzig Apr 4, 2024
Maintainer Author

I think I will write a more detailed post on this. But essentially, it:

downloads the top n websites
removes HTML tags
removes some other unwanted things
splits the resulting text into small chunks (it tries to split at newlines or paragraph breaks), with some overlap between chunks
saves the chunks into the vector DB
queries the vector DB, which returns the top n matches
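The chunking step described above can be sketched roughly as follows. This is a minimal illustration under assumptions: the function name, the character-based sizes, and the default values are made up here and are not taken from the project.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks, preferring paragraph or line breaks as cut points.

    Sizes are counted in characters; both defaults are arbitrary example values.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer a paragraph break, then a single newline, inside the current window.
            # The search starts past the overlap so every chunk makes forward progress.
            for separator in ("\n\n", "\n"):
                cut = text.rfind(separator, start + overlap + 1, end)
                if cut != -1:
                    end = cut
                    break
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        if end >= len(text):
            break
        # Step back by `overlap` characters so neighbouring chunks share some context.
        start = end - overlap
    return chunks
```

The overlap means a sentence that straddles a cut point still appears whole in at least one chunk, at the cost of storing some text twice.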

arronKler Apr 17, 2024

If the search process is a one-round task, why not remove the vector DB component and just do the text embedding and similarity search in working memory?
Or maybe using a ColBERT model here is enough to get the most relevant text.
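A minimal sketch of the in-memory variant suggested here: embed the chunks, score them against the query with cosine similarity, and keep the top n, with no database involved. The `embed` function below is a toy bag-of-words stand-in chosen so the example is self-contained; a real setup would call an embedding model instead.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(query: str, chunks: list[str], n: int = 3) -> list[str]:
    """Return the n chunks most similar to the query, all held in memory."""
    query_vector = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:n]
```

For a single search round over a few hundred chunks, a linear scan like this is usually fast enough; an index only starts to pay off once the data is kept around and queried repeatedly.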

tslmy · 2024-04-04T17:28:09Z

tslmy
Apr 4, 2024

Is this "retrieval augmented generation" (RAG), and -- with tools -- "LLM agent"? I didn't see these terms mentioned in the document, so I was wondering if I'm missing something.

2 replies

nilsherzig Apr 8, 2024
Maintainer Author

Assuming my understanding of these terms is right, yes.

I have no real education on the whole LLM topic so I avoided the use of specific terminology :)

Viibrant Apr 8, 2024

Yes it sounds like it, I thought I missed something as well

aagha · 2024-04-04T18:19:34Z

aagha
Apr 4, 2024

I'd love the ability to add local folders (Obsidian notes), Google Drive/Dropbox locations to search.

4 replies

nilsherzig Apr 4, 2024
Maintainer Author

Yes, I think that's a great idea and will probably be the next tool :)

I got it working with just Markdown (since I'm also using Obsidian). But I'm still working on adding CSV and PDF support.

twilwa Apr 5, 2024

seconding this -- additionally, llava1.6 works in ollama, so you can vectorize pdfs, screenshots, screen recordings, etc.

nilsherzig Apr 7, 2024
Maintainer Author

Oh, that's very good to know, thanks :)

Zackaryia Apr 9, 2024

Also, being able to scan my entire disk (terabytes+), so that I can find files I've forgotten or search my data in vector space, would be cool.

I would like a more traditional, non-LLM, "10 blue links" type of search through the vector DB though. I find that particularly useful for finding a photo, video, or document that is on my disk but that I just can't find, or when I want to aggregate all files of a certain type.

Some data, like my Obsidian vault, would be more important than random files on disk, so a weighting system may be needed, though in practice that might not turn out to be true.

zaggynl · 2024-04-04T21:58:16Z

zaggynl
Apr 4, 2024

I combined this with distrobox to set up an Ubuntu Linux container for ROCm support with a 7900 XTX in ollama:

Used the ini from this ROCm/ROCm#2990 (comment)
distrobox assemble create --file distrobox.ini
distrobox enter rocm-rl
get the ollama install.sh file
modify the ollama install file to include
Environment="OLLAMA_HOST=0.0.0.0"
install it

It does need some limits, this froze up my machine :D
"Action: Calculator. Action Input: list(range(999999999))"

1 reply

nilsherzig Apr 8, 2024
Maintainer Author

Hahaha thanks for reporting. The calculator tool isn't mine (just to shift some blame hehe) but I'm going to add some sort of limiter.

In the meantime, you can limit how many resources a Docker container can use (it's just a few extra lines inside the compose file).
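One possible shape for such a limiter, sketched in Python purely for illustration (the actual calculator tool is a third-party component and may work quite differently): parse the expression and evaluate only a whitelist of arithmetic nodes, so calls such as `list(range(999999999))` are rejected before anything runs. The function name and both limits are assumptions made for this example.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes, ...) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPRESSION_LENGTH = 200  # arbitrary example limit
MAX_EXPONENT = 1000          # arbitrary example limit


def safe_calculate(expression: str):
    """Evaluate a plain arithmetic expression without executing arbitrary code."""
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression too long")
    tree = ast.parse(expression, mode="eval")
    return _evaluate(tree.body)


def _evaluate(node):
    # Plain numbers only (bool is excluded even though it is a subclass of int).
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _evaluate(node.left)
        right = _evaluate(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError("exponent too large")
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")
```

This only narrows what can be evaluated; it is not a full sandbox, so container-level resource limits remain a sensible second line of defence.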

xx77yy · 2024-04-08T22:34:45Z

xx77yy
Apr 8, 2024

What does it use to search the internet?

1 reply

Technetium1 Apr 11, 2024

https://github.com/searxng/searxng. You can see some random instances available for testing it here: https://searx.space/

Zackaryia · 2024-04-09T04:00:29Z

Zackaryia
Apr 9, 2024

Non-web data sources:

Wikidata
Wikipedia
Wolfram Alpha
Khan Academy
Specific library documentation
YouTube video captions
Wolfram Data Sources https://datarepository.wolframcloud.com
Local data

There are just a lot of places that you can pull data from. Creating a data template / format that can be ingested into the vector DB, and then writing downloaders that take the data and reformat it for the vector DB, would probably be the easiest way to implement this.
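A sketch of what such a template plus downloaders could look like. All names here (`IngestDocument`, `Downloader`, `LocalNotesDownloader`, `ingest`) are hypothetical, and a plain list stands in for the vector DB.

```python
from dataclasses import dataclass, field
from typing import Iterable, Protocol


@dataclass
class IngestDocument:
    """Hypothetical common format every data source is converted into before ingestion."""
    source: str  # e.g. "wikipedia", "youtube-captions", "local"
    title: str
    text: str
    metadata: dict[str, str] = field(default_factory=dict)


class Downloader(Protocol):
    """Anything that can produce documents in the common format."""

    def fetch(self) -> Iterable[IngestDocument]: ...


class LocalNotesDownloader:
    """Example downloader; reads from an in-memory dict instead of the file system."""

    def __init__(self, notes: dict[str, str]) -> None:
        self.notes = notes

    def fetch(self) -> Iterable[IngestDocument]:
        for title, text in self.notes.items():
            yield IngestDocument(source="local", title=title, text=text)


def ingest(downloaders: Iterable[Downloader], store: list[IngestDocument]) -> int:
    """Pull documents from every downloader into the store; returns the store size."""
    for downloader in downloaders:
        for document in downloader.fetch():
            store.append(document)
    return len(store)
```

With this split, supporting a new source only means writing another class with a `fetch` method; the ingestion side does not change.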

Also, I would like to see an open source replacement for Wolfram Alpha, specifically the math-solving part of it, but that seems like a different and much harder task unto itself.

0 replies

MakkiLoyola · 2024-04-09T22:40:23Z

MakkiLoyola
Apr 9, 2024

Hey, I'm going to recreate the entire front end of Perplexity. I'll post it here once it's done.

It should be done by the 26th of April.

0 replies

chymian · 2024-04-24T04:03:39Z

chymian
Apr 24, 2024

I'd like to open a discussion on partitioning the memory/vector DB:

to keep an overview, make topic-specific partitions, like workspaces
long-term memory (curated stuff, e.g. Obsidian)
clearable short-term memory, e.g. for failed research attempts, which would otherwise spoil further queries
chat history
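The partitioning idea could be sketched like this. The class and the partition names are made up for illustration, and a simple keyword match stands in for a real vector search.

```python
from collections import defaultdict


class PartitionedMemory:
    """Hypothetical store that keeps chunks in named partitions."""

    def __init__(self) -> None:
        self._partitions: dict[str, list[str]] = defaultdict(list)

    def add(self, partition: str, chunk: str) -> None:
        self._partitions[partition].append(chunk)

    def search(self, partitions: list[str], keyword: str) -> list[str]:
        """Search only the selected partitions (keyword match as a stand-in for similarity)."""
        return [
            chunk
            for name in partitions
            for chunk in self._partitions.get(name, [])
            if keyword.lower() in chunk.lower()
        ]

    def clear(self, partition: str) -> None:
        """Drop a whole partition, e.g. short-term memory after a failed research attempt."""
        self._partitions.pop(partition, None)
```

Because every query names the partitions it wants, clearing the short-term partition removes failed attempts without touching curated long-term content.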

And thank you @nilsherzig, this is working very, very well, even in this early state.

0 replies

chymian · 2024-04-24T12:57:38Z

chymian
Apr 24, 2024

Add a URL for embedding models.

To avoid having to constantly load, unload, and swap models, introduce an embedding URL.

For example, we could run one of these on that endpoint:

a second ollama instance
Hugging Face TEI (Text Embeddings Inference)
...

Text embeddings are needed by many AI apps, so it's useful to have a dedicated endpoint on your machine.
And since the models are small, one can keep them loaded (on GPU), or running them on CPU is OK. (here ;)
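A minimal sketch of how a separate embedding URL could be used, assuming the endpoint is a second ollama instance exposing ollama's `/api/embeddings` route. The port `11435` and the model name `nomic-embed-text` are example values chosen here, not project settings.

```python
import json
import urllib.request


def build_embedding_request(embedding_url: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request for ollama's /api/embeddings route on a dedicated endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        embedding_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(embedding_url: str, model: str, text: str) -> list[float]:
    """Send the request and return the embedding vector from the JSON response."""
    request = build_embedding_request(embedding_url, model, text)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["embedding"]


# Example values only: a second ollama instance assumed to listen on port 11435.
EXAMPLE_EMBEDDING_URL = "http://localhost:11435"
EXAMPLE_MODEL = "nomic-embed-text"
```

Keeping the embedding endpoint separate from the chat endpoint means the chat model never has to be unloaded just to compute embeddings.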

0 replies

PriNova · 2024-04-28T20:41:55Z

PriNova
Apr 28, 2024

For those who are interested but didn't know that it exists:
For web scraping into an LLM-friendly format, use Jina Reader.
For any open public website, simply prepend http://r.jina.ai to the URL.

It is totally free.
As an example, the original website: HNSW Wikipedia
And here with the Jina Reader: Jina HNSW Wikipedia Format

So all the BeautifulSoup-ing is obsolete; even images are converted to their ALT text.

0 replies

Replies: 12 comments · 12 replies
