Mongoengine is very slow on large documents compared to native pymongo usage #1230

Open
@baruchoxman

(See also this StackOverflow question)

I have the following mongoengine model:

from mongoengine import Document, DateTimeField, DictField

class MyModel(Document):
    date = DateTimeField(required=True)
    data_dict_1 = DictField(required=False)
    data_dict_2 = DictField(required=True)

In some cases a document in the DB can be very large (around 5-10 MB), and the data_dict fields contain complex nested structures (dicts of lists of dicts, etc.).

I have encountered two (possibly related) issues:

  1. When I run a native pymongo find_one() query, it returns within a second. When I run MyModel.objects.first(), it takes 5-10 seconds.
  2. When I fetch a single large document from the DB and then access one of its fields, it takes 10-20 seconds just to do the following:
    m = MyModel.objects.first()
    val = m.data_dict_1.get(some_key)

The data in the object does not contain references to any other objects, so this is not an object dereferencing issue.
I suspect it is related to an inefficiency in mongoengine's internal data representation, which affects both document object construction and field access.
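
For anyone trying to reproduce the gap, the comparison can be sketched with a small timing helper. This is only a rough harness, not a rigorous benchmark: best_of and compare are hypothetical names introduced here, and the commented-out usage assumes a running MongoDB, a prior mongoengine.connect(...) call, and the MyModel class above. Timing the as_pymongo() variant alongside the other two may help separate query overhead from document construction overhead.

```python
import time


def best_of(fn, repeat=3):
    """Return the fastest wall-clock time (seconds) of `repeat` calls to fn."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(candidates, repeat=3):
    """Time each named callable and return {name: best_seconds}."""
    return {name: best_of(fn, repeat) for name, fn in candidates.items()}


# Hypothetical usage against the model above. It needs a live database,
# so it is left commented out:
#
# results = compare({
#     "pymongo find_one": MyModel._get_collection().find_one,
#     "mongoengine as_pymongo": lambda: MyModel.objects.as_pymongo().first(),
#     "mongoengine first": MyModel.objects.first,
# })
# for name, seconds in results.items():
#     print(f"{name}: {seconds:.3f}s")
```

Taking the minimum over a few runs reduces noise from caching and network jitter; it says nothing about where inside mongoengine the time is spent, for which a profiler such as cProfile would be the next step.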
