@Kannav02
Collaborator

This PR addresses a small part of issue #75.

The objectives of this PR are:

  • Updating the requirements to include PyMongo
  • Integrating MongoDB as the database in which the feedback is stored
  • Referencing contexts by their IDs from the main table, which is the feedback table

To view or test these changes, follow these steps:

  • Run the frontend with the mock server
  • Set MONGO_DB_URI to the MongoDB instance where the feedback and the context should be stored
  • Enter a prompt
  • Submit feedback
  • Access the MongoDB instance; the data will be there, including the submission timestamps
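
For reference, the flow in the steps above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `build_feedback_document`, `save_feedback`, and `get_feedback_collection`, the field names, and the database and collection names (`feedback_db`, `feedback`) are all assumptions. Only the `MONGO_DB_URI` variable, the use of PyMongo, and the submission timestamp come from the description.

```python
import os
from datetime import datetime, timezone


def build_feedback_document(question, answer, feedback, context_ids):
    """Assemble one feedback record (field names are illustrative)."""
    return {
        "question": question,
        "answer": answer,
        "feedback": feedback,
        "context_ids": list(context_ids),
        # Timestamp of the submission, stored in UTC.
        "timestamp": datetime.now(timezone.utc),
    }


def save_feedback(collection, document):
    """Insert the record and return its id.

    `collection` is expected to behave like a pymongo Collection,
    i.e. expose insert_one() returning an object with `inserted_id`.
    """
    result = collection.insert_one(document)
    return result.inserted_id


def get_feedback_collection():
    """Connect using MONGO_DB_URI (database/collection names are assumed)."""
    # Imported lazily so the helpers above work without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(os.environ["MONGO_DB_URI"])
    return client["feedback_db"]["feedback"]
```

With a running instance, `save_feedback(get_feedback_collection(), build_feedback_document(...))` would write one record per submission.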

This is what I got when I ran this twice

Screenshot 2024-11-25 at 9 49 49 PM
Screenshot 2024-11-25 at 9 50 04 PM

Follow-up question: is there a limited number of contexts in this application right now? If so, I could optimize the database insertions so that pre-existing contexts are not re-inserted but referenced, with only their IDs inserted into the main table.

Thank you!

@Kannav02
Collaborator Author

@luarss, I believe there is still a problem with the CI pipeline. What should we do about it?

@Kannav02 Kannav02 requested a review from luarss November 27, 2024 01:55
@luarss
Collaborator

luarss commented Nov 27, 2024

Follow-up question: is there a limited number of contexts in this application right now? If so, I could optimize the database insertions so that pre-existing contexts are not re-inserted but referenced, with only their IDs inserted into the main table.

Do you mean to reference them as context IDs? Where would we see the mapping between a context ID and its contents?

@Kannav02
Collaborator Author

Follow-up question: is there a limited number of contexts in this application right now? If so, I could optimize the database insertions so that pre-existing contexts are not re-inserted but referenced, with only their IDs inserted into the main table.

Do you mean to reference them as context IDs? Where would we see the mapping between a context ID and its contents?

My idea is to have a main table for all the information and a separate table for contexts. The context table holds all the context details, so they can be looked up using the context ID assigned to each one. My assumption was that there is a limited number of contexts right now, so this shouldn't be a problem, but I wanted to clarify this with you.
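
To make the idea concrete, here is a rough sketch of the two-table layout, using plain Python containers in place of MongoDB collections. Deriving the context ID from a SHA-256 hash of the text is an assumption made for this example, as are all the names (`context_id_for`, `store_feedback`, and the field names). The point is only that each distinct context is stored once and the main feedback record holds its ID.

```python
import hashlib


def context_id_for(text):
    """Derive a stable ID from the context text (hashing is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def store_feedback(feedback_table, context_table, feedback, contexts):
    """Store each distinct context once and reference it by ID."""
    context_ids = []
    for text in contexts:
        cid = context_id_for(text)
        # Insert only if this context has not been stored before.
        context_table.setdefault(cid, {"_id": cid, "content": text})
        context_ids.append(cid)
    record = {"feedback": feedback, "context_ids": context_ids}
    feedback_table.append(record)
    return record


feedback_table, context_table = [], {}
store_feedback(feedback_table, context_table, "helpful", ["ctx A", "ctx B"])
store_feedback(feedback_table, context_table, "unclear", ["ctx A"])
# Two feedback records exist, but "ctx A" is stored only once.
```

The mapping from context ID to content lives in the context table, so a lookup by ID recovers the original text. With PyMongo, a similar insert-if-absent step could be done with `update_one(..., upsert=True)` keyed on `_id`.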

@luarss
Collaborator

luarss commented Dec 18, 2024

Hello, sorry for the delayed response. Could you please e-mail me at jluar@precisioninno.com so we can discuss this further offline?

@Kannav02
Collaborator Author

Sure, I'll send an email about this.

Thank you!

@Kannav02
Collaborator Author

Hey @luarss, I was looking at different options for hosting MongoDB on GCP and found two approaches:

  1. Installing MongoDB directly on a VM. This isn't great in terms of flexibility and portability, so I wouldn't go with it.

  2. Running MongoDB in a Docker container, deployed on a small instance for now to test its functionality and scaled to a larger instance later. This is probably the one we should look into.

Which one would you prefer?

@luarss
Collaborator

luarss commented Jan 18, 2025

The second one is preferred. Can you let me know what the minimum requirements are?

@Kannav02
Collaborator Author

For the container deployment we can use GKE. The instance type can be really small for now since we're just getting started, so any of the e2 family instances should be fine. If we want separate storage for faster access, we could add a supplementary 5 GB SSD, but that is optional for now.

Screenshot 2025-01-18 at 8 39 22 PM

On another note, I told you about the custom deployment options but forgot to mention that MongoDB Atlas can be deployed directly on Google Cloud. It is a good option, but we would need a separate discussion if we want to look into it as well.

Here is the link for your reference: https://www.mongodb.com/resources/products/platform/mongodb-on-google-cloud

Thank you

@luarss
Collaborator

luarss commented Jan 19, 2025

Thanks for doing the research! I am more inclined towards the first solution of GKE. That is also what I chose for the prototype deployment of the RAG webapp.

Let us go with the e2-micro for prototyping for now. You can assume this DB will be on the same internal network as the other nodes, so there is no need to expose ports externally.

@Kannav02
Collaborator Author

Got it. Just to be on the same page, should I open a separate issue for the deployment of the database, as this is kind of related to the development part?

I also had a question: should we proceed with the deployment after we're done with the UI and the MongoDB feedback functionality, or should it be done in parallel?

Thank you Jack!

@luarss
Collaborator

luarss commented Jan 20, 2025

Yup, that is fine. Let's leave the deployment details to a later PR.

@Kannav02
Collaborator Author

Perfect. I will finish the MongoDB implementation this week, with documentation, and then we can work on the UI dashboard.

Once again, thank you, Jack, for your help and for letting me be a part of this project.

@Kannav02
Collaborator Author

Kannav02 commented Jan 21, 2025

Hey @luarss, hope you're doing well!

There's something I wanted to discuss with you. In the previous meeting we talked about the database structure and how we can link the context to the main schemas. It turns out that when the response is returned from the backend, it has the following format:

    if user_input.list_sources and user_input.list_context:
        response = {
            "response": result["answer"],
            "sources": (links),
            "context": (context),
        }

You can see that it doesn't say which source each context comes from.

If we want to know which source a particular context comes from, we might have to change how the data is sent back to the frontend so that we can derive that relationship.

What's your opinion on this?

Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
- schema corrected for the database
- parameters included in the main submit_feedback function
- insertion corrected to utilise the correct datetime.now() function

Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
- added the function to now submit feedback back to mongoDB
- corrected the sys path to now include common as a package, workaround , kind of like a pseudopackage

Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
@luarss
Collaborator

luarss commented Jan 21, 2025

Hi @Kannav02, we might have to do some modification of the backend code. The links variable is currently deduplicated, but it should have a one-to-one correspondence with context (i.e. same length)
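
As a rough illustration of the one-to-one correspondence described above: if `links` and `context` are kept at the same length, each context can be paired with its source before being returned. The `ContextSource` name echoes the migration mentioned later in this thread, but the fields shown here, the `pair_context_sources` helper, the `context_sources` key, and the sample values are assumptions, not the backend's actual definitions.

```python
from dataclasses import asdict, dataclass


@dataclass
class ContextSource:
    """One retrieved context and the source it came from (fields assumed)."""

    source: str
    context: str


def pair_context_sources(links, context):
    """Pair each context with its source; the lengths must match."""
    if len(links) != len(context):
        raise ValueError("links and context must have the same length")
    return [ContextSource(source=s, context=c) for s, c in zip(links, context)]


# The same source may appear more than once, so links is not deduplicated.
links = ["docs/a.md", "docs/a.md", "docs/b.md"]
context = ["first chunk", "second chunk", "third chunk"]
pairs = pair_context_sources(links, context)
response = {
    "response": "example answer",
    "context_sources": [asdict(p) for p in pairs],
}
```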

@Kannav02
Collaborator Author

Sure, we can work on this. I was wondering if we could have another meeting about it; I also need to show you what I've been thinking.

Thank you

@luarss
Collaborator

luarss commented Jan 23, 2025

Sent you an e-mail. For meeting requests, feel free to drop me an e-mail :)

@luarss
Collaborator

luarss commented Feb 4, 2025

@Kannav02 Made some changes for formatting, lint. Also created an issue: #121 to track the ContextSource migration.

Merge criteria: we should ensure that backend can still work (i.e. curl for /graphs/agent-retriever endpoint). Let me work on this.

Thanks!

@Kannav02
Collaborator Author

Kannav02 commented Feb 4, 2025

Thank you for the review, @luarss. I made changes to the backend, specifically to graph.py and chains.py. I believe they now return the new format, which is ContextSources.

Please let me know if I can help in any way.

Also, should I start working on the documentation for this?

Thank you once again for your guidance :)

Signed-off-by: Jack Luar <jluar@precisioninno.com>
…rontend

Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Jack Luar <jluar@precisioninno.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Song Luar <espsluar@gmail.com>
Signed-off-by: luarss <39641663+luarss@users.noreply.github.com>
Kannav02 and others added 4 commits February 24, 2025 22:30
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Song Luar <jluar@precisioninno.com>
Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Kannav Sethi <90309433+Kannav02@users.noreply.github.com>
@Kannav02
Collaborator Author

@luarss @error9098x, this should be ready for review now; the merge conflict is resolved.

Thank you!

@luarss
Collaborator

luarss commented Jun 1, 2025

Hi @Kannav02, while running Streamlit with our backend I encountered this screen. The Streamlit logs say that the feedback was submitted successfully, but I think there might be some problems with error handling in utils/mongoClient.py. Can you take a look?

By the way, I started the backend with FAST_MODE=true to prototype quickly.

image

Signed-off-by: Jack Luar <jluar@precisioninno.com>
@Kannav02
Collaborator Author

Kannav02 commented Jun 1, 2025

That's pretty strange. I'll look into this as well, thank you @luarss.

@Kannav02
Collaborator Author

Kannav02 commented Jun 3, 2025

@luarss, I found the issue. The function submit_feedback() wasn't returning anything even when the feedback was submitted successfully, so the caller always fell back to the error path because of the None value. It should be working now; let me know if any other fixes are needed.
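
The failure mode was roughly the following pattern. This is a simplified reconstruction, not the actual code in utils/mongoClient.py: the function bodies, the `collection` parameter, the `handle_submit` caller, and the messages are assumptions made for illustration.

```python
def submit_feedback_buggy(collection, document):
    """Inserts the document but falls off the end, so it returns None."""
    collection.insert_one(document)


def submit_feedback(collection, document):
    """Inserts the document and reports success explicitly."""
    try:
        collection.insert_one(document)
    except Exception:
        # Any database error is reported to the caller as a failure.
        return False
    return True


def handle_submit(submit, collection, document):
    """Caller that treats any falsy result (including None) as an error."""
    if submit(collection, document):
        return "Feedback submitted successfully"
    return "Failed to submit feedback"
```

With the first version the insert succeeds, yet `handle_submit` still reports a failure because `None` is falsy. Returning `True` explicitly removes the mismatch.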

Thank you!

Signed-off-by: Kannav02 <kannavsethi02@gmail.com>
Signed-off-by: Song Luar <jluar@precisioninno.com>
Collaborator

@luarss luarss left a comment

LGTM, will merge after CI is fixed. Thanks!

@luarss luarss merged commit c1024a4 into master Jun 5, 2025
2 checks passed
@luarss luarss deleted the issue-75-2-new branch June 5, 2025 04:53