Hi everyone!
We've been using Flowise with great success for almost a year now in a project that has grown pretty large. Without the flexibility and power of Agentflows v2 we wouldn't be able to do that and scale so easily, so thanks for that awesome improvement!
Our challenge is that we have an increasing number of document stores (currently 50, one document loader per document store), with 3 million to 21 million characters per document store and about 1,500 total chunks per document store. We currently use Postgres as our vector DB.
With each new document loader, the total time to process and upsert documents increases. I estimate that processing and upserting every document loader, one by one, takes about 5-6 hours. Because of the issue with pgvector and the record manager not cleaning up everything properly, we currently have to re-upsert everything every time. And we need to do that on a regular basis, as our data changes regularly as well.
We use Cypress to automate this, which also lets us track in detail whether a specific document loader has an issue and what the reason might be. Sometimes the Flowise UI stops responding or the processing takes too long. In general it works fine, but we expect the project to grow further, to maybe a few hundred document stores. Our current approach is not ideal and we want to improve it.
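To make the question more concrete, here is a minimal sketch of the kind of script-based driver that could replace the UI automation: it refreshes several document stores in parallel with a concurrency cap, a per-store timeout and retries, and records the outcome per store. This is an assumption, not something we run today. `refresh_one` is a hypothetical placeholder for whatever actually triggers the process-and-upsert step for one document store (for example an HTTP call to Flowise), and `max_parallel`, `max_attempts` and `timeout_s` are illustrative names and values.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Sequence


@dataclass
class RefreshResult:
    """Outcome of refreshing one document store."""
    store_id: str
    ok: bool
    attempts: int
    error: Optional[str] = None


async def refresh_all(
    store_ids: Sequence[str],
    refresh_one: Callable[[str], Awaitable[None]],  # hypothetical: triggers one upsert
    max_parallel: int = 4,       # assumed cap so the server is not overloaded
    max_attempts: int = 3,       # assumed retry budget per store
    timeout_s: float = 1800.0,   # assumed per-attempt timeout in seconds
) -> List[RefreshResult]:
    """Refresh all stores with bounded concurrency; never raises for a single store."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def run(store_id: str) -> RefreshResult:
        last_error: Optional[str] = None
        async with semaphore:  # at most max_parallel stores are processed at once
            for attempt in range(1, max_attempts + 1):
                try:
                    await asyncio.wait_for(refresh_one(store_id), timeout=timeout_s)
                    return RefreshResult(store_id, True, attempt)
                except Exception as exc:  # includes timeouts; keep the reason for the report
                    last_error = f"{type(exc).__name__}: {exc}"
        return RefreshResult(store_id, False, max_attempts, last_error)

    # gather preserves input order, so results line up with store_ids
    return await asyncio.gather(*(run(s) for s in store_ids))
```

Calling `asyncio.run(refresh_all(ids, refresh_one))` returns one `RefreshResult` per store, so a failed loader shows up with its last error instead of stopping the whole run. Whether running stores in parallel actually helps depends on where the time goes (embedding calls, Postgres writes, or Flowise itself), which is part of what I'd like to learn from others.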
I am curious whether others have best practices for automating lots of document stores/document loaders with regular data updates.
Or maybe there are features planned that would make this easier and/or faster. I saw that there is already an item on the roadmap: "Cron Job For Upsert/Refresh Doc Store, AgentflowV2". I assume this could make things a bit easier.
So I am looking forward to any input and ideas, and would like to hear how you do it in your projects.