Pinecone Vectorized Memory #122
Conversation
This is totally necessary imo.
Yeah, thank you very much.
@Torantulino could you create a Pinecone account for AutoGPT? That way you can put the API key in the repo settings as a secret (I think only you have access to it right now) and we can use it in our integration tests in GitHub Actions.
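For what it's worth, a minimal sketch of how such a test could pick the key up, assuming the workflow maps the repo secret to an environment variable, and that PineconeMemory exposes add/get_relevant (the import path and method names are assumptions, not from this PR):
```
import os

import pytest

# Hypothetical test guard: in GitHub Actions the repository secret would be
# mapped to this environment variable; locally (or on forks without the
# secret) the test skips instead of failing.
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")


@pytest.mark.skipif(not PINECONE_API_KEY, reason="PINECONE_API_KEY not set")
def test_pinecone_memory_roundtrip():
    from memory import PineconeMemory  # class from this PR; import path assumed

    mem = PineconeMemory("auto-gpt-test", 1)  # (memory_name, num_relevant)
    mem.add("the capital of France is Paris")
    assert mem.get_relevant("France")  # expect the stored fact back
```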
This is really nice and clean code. And it's an exciting update! Three thoughts:
Force-pushed from c258b57 to f60bd62.
I was debating whether or not to leave in delete/overwrite. I wanted to keep the interface the same across the different memory classes. As for memory retrievals: I really like that idea.
Force-pushed from f99d5d1 to 1e64e68.
Ok, this should be working, and it neatly pulls data out of memory. It might be interesting to remove the explicit memory-add tool and just save all history to memory every couple of messages, but I'm going to leave that for a future PR, as I think having working, queryable Pinecone is enough value for one.
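A rough sketch of what that future change might look like, assuming a memory object with an add method; the interval and all names here are illustrative, not from this PR:
```
SAVE_EVERY = 2  # hypothetical: flush history every couple of messages


def maybe_save_history(memory, full_message_history, message_count):
    """Append the most recent messages to long-term memory periodically,
    instead of relying on an explicit memory-add command."""
    if message_count % SAVE_EVERY == 0:
        for msg in full_message_history[-SAVE_EVERY:]:
            memory.add(f"{msg['role']}: {msg['content']}")
```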
.env.template (Outdated)
```
@@ -1,3 +1,5 @@
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_REGION=your-pinecone-region
```
This should be PINECONE_ENV, or we should update the docs/code to reflect the new name.
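Until the naming is settled, the code could accept either variable; a purely illustrative sketch:
```
import os

# Accept both names while the docs and code disagree, so neither set of
# users breaks; drop the fallback once one name is settled.
pinecone_region = os.getenv("PINECONE_ENV") or os.getenv("PINECONE_REGION")
if pinecone_region is None:
    raise RuntimeError("Set PINECONE_ENV (or PINECONE_REGION) in your .env")
```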
Exciting!
The bummer part: overwrite calls are failing. It says it's missing the "data" parameter, but the real problem looks like the function signature being inconsistent with SimpleMemory's.
My two cents: I personally think the memory mechanism is too complicated and shouldn't be subject to "memory management" by the model. Just save everything and retrieve what's relevant. If getting rid of SimpleMemory makes life easier on this front, I think that would be a good thing. Just stick with the much better memory implementation (Pinecone) and save yourself maintenance and reasoning headaches.
@Torantulino I know the feeling of having a flood of pull requests and endless asks coming your way with a new viral open-source project. So I feel for you. But if you have the time, this might be something worth thinking about.
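One way to keep the two backends' signatures from drifting is a shared abstract base class; a minimal sketch, where the class and method names are assumptions rather than anything from this PR:
```
from abc import ABC, abstractmethod


class MemoryBase(ABC):
    """Contract every memory backend must satisfy, so callers never hit
    mismatched signatures like the overwrite/data issue above."""

    @abstractmethod
    def add(self, data: str) -> str:
        ...

    @abstractmethod
    def get_relevant(self, data: str, num_relevant: int = 5) -> list[str]:
        ...

    @abstractmethod
    def clear(self) -> str:
        ...
```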
Sounds good. I was a bit afraid of that.
Hope this gets merged soon, would be a huge upgrade.
Force-pushed from 1e64e68 to eeed66e.
So I've been working on this. I deleted a bunch of code :) On another note: you would really like the agent to be able to traverse its memories when there isn't enough context, so adding a tool to think about something, letting it traverse its own memories, could be useful. I am trying to keep these changes relatively simple so we can merge today.
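A hedged sketch of what such a "think about X" tool might look like, assuming a get_relevant-style query method (all names illustrative):
```
def think_about(memory, topic, num_relevant=5):
    """Pull memories related to `topic` back into the prompt context so the
    agent can traverse what it already knows."""
    relevant = memory.get_relevant(topic, num_relevant)
    if not relevant:
        return f"No stored memories related to: {topic}"
    return "Relevant memories:\n" + "\n".join(f"- {m}" for m in relevant)
```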
Force-pushed from eeed66e to 4ca37d2.
Yeah, even with the overlap you cite, I'm pretty sure this will be a big improvement over what was in place previously, based on what I've observed. On my end at least (using GPT-3.5), it was really underutilizing the memory when the model was managing it; it'd only save very occasionally. Granted, I'd bet memory management works better with GPT-4, but I'd really like not to unnecessarily handicap 3.5, and I'm pretty sure it'll be better to just unburden the model from this altogether.
That sounds potentially very cool. It could be expensive and slow without limits, but it's definitely worth thinking about, especially if there's going to be the option of using local models. There's also the option of squeezing more into memory. Just recently there was a bunch of buzz on Twitter about an apparent internal compressed "language" of sorts that the GPT-4 model has going on. It's lossy, but perhaps good enough? What I'm wondering is whether a bulk of processing could be done with this: https://twitter.com/gfodor/status/1643303237578293249
I'm extremely skeptical of the compression stuff. That is the kind of thing that seems likely to be lossy in subtle ways and have many non-obvious failure modes.
Force-pushed from d1a09ce to d91a81b.
Force-pushed from d91a81b to 475671d.
This still formally needs a review.
Excellent work!
Looks like a cool PR, but why not use an open-source, self-hostable vector database like Milvus or another, so as to depend as little as possible on external services?
Milvus looks cool: https://milvus.io/docs/example_code.md I honestly picked something that a few other people were looking at, inspired by the babyAGI library. It seemed like the simplest option to get started with. Milvus might be nearly as easy, but requires additional infra. For now, let's get this in and working, and if we find people aren't able to operate in the free tier, or we want to deploy our own vector DB infra, we can switch over. It looks like it should be pretty close to a drop-in replacement.
Great job on this. Thank you for doing this!
Some notes:
get_ada_embedding could be @cached, I think (see the sketch after these notes).
~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
This originally just said to save code to files. Now it reads like "save all important memories to files" even though we have the Pinecone memory. Seems like this may need rewording... not sure...
If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
I like this!
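On the @cached note above: a minimal sketch using functools.lru_cache, assuming get_ada_embedding takes a plain string (hashable, so it caches cleanly); the tuple conversion keeps the cached value immutable:
```
import functools

import openai


@functools.lru_cache(maxsize=1024)
def get_ada_embedding(text):
    """Memoize embeddings so repeated strings don't re-hit the API."""
    text = text.replace("\n", " ")
    result = openai.Embedding.create(input=[text], model="text-embedding-ada-002")
    return tuple(result["data"][0]["embedding"])
```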
Yeah, I am skeptical too, to be clear. It seems like it might be an interesting experimental mode, perhaps.
scripts/memory.py (Outdated)
```
try:
    pinecone_mem = PineconeMemory(memory_name, num_relevant)
    # Test the OpenAI API, since we need it to make embeddings.
    get_ada_embedding("test")
```
Is this intentional?
thanks @dschonholtz
Hey, sorry. Yes, we want to make sure we can make embeddings. It will throw an error otherwise.
No, I was talking about the hard-coded "test"; it's been fixed.
### Setting up environment variables
For Windows Users:
```
setx PINECONE_API_KEY "YOUR_GOOGLE_API_KEY"
```
Google?
Suggested change:
```diff
- setx PINECONE_API_KEY "YOUR_GOOGLE_API_KEY"
+ setx PINECONE_API_KEY "YOUR_PINECONE_API_KEY"
```
For macOS and Linux users:
```
export PINECONE_API_KEY="YOUR_GOOGLE_API_KEY"
```
Suggested change:
```diff
- export PINECONE_API_KEY="YOUR_GOOGLE_API_KEY"
+ export PINECONE_API_KEY="YOUR_PINECONE_API_KEY"
```
```
2. No user assistance
3. Exclusively use the commands listed in double quotes e.g. "command name"
1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
```
🔥
Might want to consider https://github.com/featureform/featureform
I realize I'm about 10 hours late to the game, but why do the memory in Pinecone? If you need a vector database, use something like FAISS. If you need unstructured storage, pickle or Elasticsearch; for SQL, sqlite3. If you need ada embeddings, use OpenAIEmbeddings(). This introduces additional calls and makes a Pinecone API key a requirement. Can someone explain the advantages?
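For comparison, the core of an in-process FAISS store is only a few lines; a sketch assuming 1536-dimensional ada-002 embeddings and a get_relevant-style interface like the one discussed above:
```
import faiss
import numpy as np

DIM = 1536  # ada-002 embedding dimension

index = faiss.IndexFlatL2(DIM)  # exact L2 search, no external service
texts = []  # parallel list mapping FAISS row ids back to raw text


def add(text, embedding):
    index.add(np.asarray(embedding, dtype="float32").reshape(1, DIM))
    texts.append(text)


def get_relevant(query_embedding, k=5):
    query = np.asarray(query_embedding, dtype="float32").reshape(1, DIM)
    _, ids = index.search(query, k)
    return [texts[i] for i in ids[0] if i != -1]
```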
I would prefer an open-source option for vector embeddings, like Weaviate (https://github.com/weaviate/weaviate) or Milvus (https://github.com/milvus-io/milvus). A commercial cloud option like Pinecone is fine for many businesses, but for indie hackers it can be a significant drawback to be constrained to the free tier.
FAISS and Chroma are also open source and on GitHub and PyPI. I do not know of a specific feature comparison between them.
Notice that #372 has been merged, implementing memory based on the open-source Redis (it's not known as a vector database, but apparently supports vector similarity search to some extent; not claiming it's the best). BTW, Pinecone seems to use FAISS.
Did you consider storing all intermediate steps in memory as well? I am particularly thinking about the chunks generated during web browsing. It would be great to store and embed the chunks so GPT can refer to them later.
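A rough sketch of that idea, assuming a memory.add interface; the fixed-size splitter and chunk size are arbitrary placeholders:
```
def split_text(text, max_len=3000):
    """Naive fixed-size chunking; a real splitter would respect sentences."""
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]


def memorize_page(memory, url, page_text):
    """Embed and store each browsed chunk so later queries can recall it."""
    for n, chunk in enumerate(split_text(page_text)):
        memory.add(f"Source: {url} (chunk {n})\n{chunk}")
```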
What this does: