[Bug]: MongoDB does not get all the data from a heavy query #19887

Open
1 task done
felix-appsmith opened this issue Jan 18, 2023 · 8 comments
Labels
Bug Something isn't working Community Reported issues reported by community members High This issue blocks a user from building or impacts a lot of users Integrations Product Issues related to a specific integration Mongo Issues related to Mongo DB plugin Needs Triaging Needs attention from maintainers to triage Production Query & JS Pod Issues related to the query & JS Pod Query performance Issues that have to do with lack in performance of query execution

Comments

@felix-appsmith

felix-appsmith commented Jan 18, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Description

A user has a Mongo collection of 3 documents, each containing approximately 30 images. Executing the find command returns only 1 document, but executing the count command returns 3.
I have attached images that show the behavior.

image
image

Steps To Reproduce

1. Create a new database.
2. Create a new collection within the database.
3. Add 3 new documents to the collection using this Python script:

    from pymongo import MongoClient
    from bson.objectid import ObjectId
    from bson.binary import Binary
    
    client = MongoClient("mongodb://localhost:27017")
    db = client["Git_19887_Test"]
    col = db["Test_Upload_Images"]
    
    def read_image(file_path):
        # Read the same file 20 times and pair each copy with a numbered file name.
        images = []
        for i in range(1, 21):
            with open(file_path, 'rb') as f:
                binary_image = Binary(f.read())
            new_file_path = file_path.replace('.jpg', f'_{i}.jpg')
            images.append((new_file_path, binary_image))
        return images
    
    document1 = {"Inputs": [1, 2, 3], "works": {"work1": "yes", "work2": "no"}, "Features": {"feature1": True, "feature2": False}, "images": [], "images2": [], "_id": ObjectId()}
    document2 = {"Inputs": [4, 5, 6], "works": {"work1": "no", "work2": "yes"}, "Features": {"feature1": False, "feature2": True}, "images": [], "images2": [], "_id": ObjectId()}
    document3 = {"Inputs": [7, 8, 9], "works": {"work1": "yes", "work2": "yes"}, "Features": {"feature1": True, "feature2": True}, "images": [], "images2": [], "_id": ObjectId()}
    
    for document in [document1, document2, document3]:
        images = []
        images2 = []
        # 20 iterations x 20 copies per read_image call = 400 entries per list.
        for i in range(1, 21):
            image_file_path = "../1.jpg"
            image_list = read_image(image_file_path)
            for new_file_path, binary_image in image_list:
                images.append({new_file_path: binary_image})
            image_file_path = "../1.jpg"
            image_list = read_image(image_file_path)
            for new_file_path, binary_image in image_list:
                images2.append({new_file_path: binary_image})
        # Note: both assignments target the key "1", so the second overwrites
        # the first and only images2 is stored under it.
        document["1"] = images
        document["1"] = images2
    
    col.insert_many([document1, document2, document3])

4. Use a sample image with a size of 33.3 KB, for example, this image.

1

5. Now in Appsmith, if we execute the "find" command, we obtain a single result, but if we execute the "count" command, it reports all 3 documents.

Public Sample App

No response

Issue video log

No response

Version

Appsmith Community v1.8.9

@felix-appsmith felix-appsmith added Bug Something isn't working Needs Triaging Needs attention from maintainers to triage labels Jan 18, 2023
@Nikhil-Nandagopal Nikhil-Nandagopal added High This issue blocks a user from building or impacts a lot of users Mongo Issues related to Mongo DB plugin and removed Needs Triaging Needs attention from maintainers to triage labels Jan 19, 2023
@github-actions github-actions bot added the Integrations Product Issues related to a specific integration label Jan 19, 2023
@sribalajig sribalajig added Medium Issues that frustrate users due to poor UX and removed High This issue blocks a user from building or impacts a lot of users labels Jan 19, 2023
@github-actions github-actions bot added the Query & JS Pod Issues related to the query & JS Pod label Jan 19, 2023
@barshag

barshag commented Jan 20, 2023

@sribalajig Why did you lower the severity to medium?
This is absolutely not a UX problem, as the fetch from the DB is not working correctly.
It may even be "critical", in the sense that it can lead to misleading information in situations where not all the data is presented (and users are not even aware of it).

@sribalajig

@barshag can you please help me understand whether the images array is a list of URLs or actual binary data representing the images? Also, is this issue happening only for this collection or for other collections as well?
In the meantime, I have bumped this back to high priority. Thanks.

@sribalajig sribalajig added High This issue blocks a user from building or impacts a lot of users and removed Medium Issues that frustrate users due to poor UX labels Jan 20, 2023
@barshag

barshag commented Jan 20, 2023 via email

@sribalajig sribalajig added Query performance Issues that have to do with lack in performance of query execution Data Platform Pod Issues related to the underlying data platform and removed Mongo Issues related to Mongo DB plugin Integrations Product Issues related to a specific integration labels Jan 23, 2023
@sribalajig

sribalajig commented Jan 25, 2023

@barshag we have prioritised this issue for implementation, and it is going to be part of a larger epic that covers Appsmith's ability to handle large data volumes - #18245. (Note: the epic deals with large file uploads, and your problem is with queries. Although not exactly the same, they are related.)

Would you be interested in having a discussion with me on your problem? I think it would give a good perspective for dealing with the larger epic as well. You can find my calendar here - https://calendly.com/balaji-gopinath/appsmith-user-interview

In the meantime, if possible, I would suggest storing images in a separate place (like Amazon S3) and referencing the URLs in your Mongo documents. Mongo is not really optimised to store large binary objects.

@nidhi-nair
Contributor

@barshag From the reproduction steps given on this issue, it seems the result Appsmith comes back with is limited because the total array of documents exceeds 16 MB, which is greater than the allowed batch size for Appsmith. Our Mongo plugin today only allows the first batch of data to be loaded. We understand that this information is not well represented on the screen, but do you think paginating with this limitation in mind would help your use case?
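
To make the arithmetic behind that explanation concrete, here is a rough sketch in plain Python. It assumes the sizes from the reproduction steps (a 33.3 KB image stored 400 times per document) and a simplified packing rule in which documents are added to a batch until the next one would exceed 16 MB. The function name `split_into_batches` and the constants are illustrative only, not part of Appsmith or the MongoDB driver, and field-name and BSON overhead are ignored.

```python
# Simplified model: documents are packed into a batch until the next one
# would push the batch past the size limit. This is an illustration, not
# the server's actual implementation.
MAX_BATCH_BYTES = 16 * 1024 * 1024  # the 16 MB limit mentioned in the comment

IMAGE_BYTES = 33_300            # 33.3 KB sample image from the reproduction steps
IMAGES_PER_DOCUMENT = 20 * 20   # outer loop (20) x copies returned by read_image (20)


def split_into_batches(doc_sizes, max_batch_bytes=MAX_BATCH_BYTES):
    """Greedily group document sizes into batches that stay under the limit."""
    batches = []
    current, current_bytes = [], 0
    for size in doc_sizes:
        # Start a new batch when adding this document would exceed the limit.
        if current and current_bytes + size > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Approximate size of one document; ignores field names and BSON overhead.
doc_bytes = IMAGE_BYTES * IMAGES_PER_DOCUMENT
batches = split_into_batches([doc_bytes] * 3)

print(f"approximate document size: {doc_bytes / 1_000_000:.1f} MB")
print(f"documents in first batch: {len(batches[0])} of 3")
```

Under these assumptions each document is roughly 13.3 MB, so two of them no longer fit in one batch and the first batch holds a single document. That matches the reported symptom of "find" returning 1 result while "count" reports 3.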

@barshag
Copy link

barshag commented Apr 17, 2023 via email

@nidhi-nair
Contributor

Do you mind sharing the size of the documents you are working with?

I am also looping in @rohan-arthur and @sribalajig to figure out a more comprehensive solution for this.

@nidhi-nair nidhi-nair added the Needs Triaging Needs attention from maintainers to triage label Apr 2, 2024
@riteshkew riteshkew added Mongo Issues related to Mongo DB plugin and removed Data Platform Pod Issues related to the underlying data platform labels Apr 12, 2024
@github-actions github-actions bot added Integrations Product Issues related to a specific integration and removed Query & JS Pod Issues related to the query & JS Pod labels Apr 12, 2024
@Nikhil-Nandagopal Nikhil-Nandagopal added the Community Reported issues reported by community members label Apr 18, 2024
@github-actions github-actions bot added the Query & JS Pod Issues related to the query & JS Pod label Apr 18, 2024
@akshayvijayjain

Based on the comments, a few points to take care of when picking up this issue:

  1. Store 5 images/media files in documents, each around 5 MB, and try to fetch them all.
  2. Check whether we can fetch all 25 MB; probably not, so determine how much we can fetch.
  3. Decide what solution we can add to fetch all 5 images, or any number of images.
  4. Explore how to notify the user when the fetched data is less than the data that actually exists.
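
For the last point, one possible approach is sketched below in Python. It relies on the reply shape of MongoDB's `find` database command, where `cursor.firstBatch` holds the returned documents and a non-zero `cursor.id` means more results remain on the server. The helper names `first_batch_is_partial` and `truncation_warning`, the message wording, and the sample replies are hypothetical; how the Appsmith plugin would actually surface this is an open question.

```python
def first_batch_is_partial(find_reply):
    """Return True when a `find` command reply left results on the server.

    A cursor id of 0 means the cursor is exhausted; any other value means
    further batches would have to be fetched with getMore.
    """
    return find_reply["cursor"]["id"] != 0


def truncation_warning(find_reply, total_count):
    """Build a user-facing warning, or return None if nothing was cut off."""
    fetched = len(find_reply["cursor"]["firstBatch"])
    if first_batch_is_partial(find_reply) or fetched < total_count:
        return f"Showing {fetched} of {total_count} documents; results were truncated."
    return None


# Hand-written replies shaped like the server's response (values are made up).
partial_reply = {
    "cursor": {"firstBatch": [{"_id": 1}], "id": 7234987234, "ns": "db.col"},
    "ok": 1,
}
complete_reply = {
    "cursor": {"firstBatch": [{"_id": 1}, {"_id": 2}, {"_id": 3}], "id": 0, "ns": "db.col"},
    "ok": 1,
}

print(truncation_warning(partial_reply, total_count=3))
print(truncation_warning(complete_reply, total_count=3))
```

Checking the cursor id avoids a separate count query in the common case, while comparing against a known count (as in the issue, where "count" reported 3) gives a concrete number to show the user.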
