
[IMP] jinja_to_qweb: avoid MemoryError #288


Draft · wants to merge 1 commit into master
Conversation

cawo-odoo
Contributor

upg-2994884

Avoid a `MemoryError` and/or a killed process caused by `malloc()` failing
within `lxml` / `libxml2`.

Debugging this by measuring the size of the involved data structures with
`sys.getsizeof()` (see the measurement sketch after the list) showed that:
1. The global variable `templates_to_check` grows to roughly 1.4GiB after the
   various calls to `upgrade_jinja_fields()` by upgrade scripts. The process
   uses ~1.5GiB at that point.
2. At the start of `verify_upgraded_jinja_fields()`, the process is still at
   ~1.5GiB. While iterating over all the templates in `templates_to_check`,
   no significant amount of memory is allocated on top of this *to Python
   data structures*. But with each call to `is_converted_template_valid()`,
   the size of the process grows until it hits the RLIMIT. This function
   calls into `lxml` multiple times, suggesting that the memory is allocated
   by `malloc()` calls in the C library, evading Python's accounting.
   Internet research shows that `lxml` has a long history of memory leaks in
   its C code, plus some caching mechanism *across documents* that could be
   responsible[^1]. More recent versions of the module seem to have
   improved, but we are stuck with old versions.
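
For reference, a minimal sketch of the kind of measurement mentioned above.
`sys.getsizeof()` alone only counts the outermost container, so a recursive
helper is needed to approximate the total; this is illustrative, not the
exact debugging script used here:

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Approximate the total size in bytes of a nested Python structure.

    sys.getsizeof() reports only the outermost object, so we recurse into
    containers and de-duplicate shared references via their id().
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

# e.g. deep_getsizeof(templates_to_check) -> ~1.4GiB in the failing upgrade
```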

This patch solves / works around (2) by running the function body of
`is_converted_template_valid()` in a forked subprocess that is killed
immediately afterwards (instead of using a process pool). This way `lxml`
never allocates additional memory in the main process, whose memory use
stays at ~1.5GiB during the runtime of `verify_upgraded_jinja_fields()`. We
also add a final line to that function that clears the global data once it
has been processed, at which point the process's memory use drops to
~100MiB.
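
A minimal sketch of the fork-and-discard pattern (the actual patch may
differ; `run_in_child` is a hypothetical helper, error handling is omitted,
and the call site arguments are illustrative):

```python
import multiprocessing as mp

def run_in_child(func, *args):
    """Call func(*args) in a forked child process and return its result.

    Memory malloc()'d by C extensions (lxml / libxml2) during the call is
    reclaimed by the OS when the child exits, so the parent's footprint
    does not grow.
    """
    ctx = mp.get_context("fork")                 # cheap copy-on-write start
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=lambda: send_end.send(func(*args)))
    child.start()
    result = recv_end.recv()                     # block until the child answers
    child.join()                                 # child exits; its heap is freed
    return result

# Hypothetical call site; the real signature of is_converted_template_valid()
# may differ:
# ok = run_in_child(is_converted_template_valid, old_template, converted_template)
```

Receiving the result before `join()` avoids a deadlock when the result is
larger than the pipe buffer.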

This patch does *not* solve (1). Solving (2) is enough to fully process
upg-2994884, but a future upgrade request may require a fix for (1) too.
That will probably be a bigger patch, because a viable solution would likely
mean removing the global `templates_to_check` and storing the data in the
database instead.
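
For the record, one possible shape of such a fix, sketched with a
hypothetical table name and schema (the real patch would have to match what
`upgrade_jinja_fields()` actually records per template; `cr` and `conn` are
assumed to be a psycopg2 cursor and connection, as in upgrade scripts):

```python
# Hypothetical schema; names are illustrative only.
def stash_template(cr, model, field, res_id, converted_src):
    cr.execute("""
        CREATE UNLOGGED TABLE IF NOT EXISTS _upg_templates_to_check (
            model varchar, field varchar, res_id integer, converted_src text
        )
    """)
    cr.execute(
        "INSERT INTO _upg_templates_to_check VALUES (%s, %s, %s, %s)",
        (model, field, res_id, converted_src),
    )

def iter_stashed_templates(conn):
    # A server-side ("named") cursor streams rows in batches instead of
    # materializing ~1.4GiB of template data in Python memory at once.
    with conn.cursor(name="templates_to_check_stream") as cur:
        cur.itersize = 1000
        cur.execute("SELECT model, field, res_id, converted_src"
                    " FROM _upg_templates_to_check")
        yield from cur
```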

[^1]: https://benbernardblog.com/tracking-down-a-freaky-python-memory-leak-part-2/
@robodoo
Contributor

robodoo commented Jun 27, 2025

Pull request status dashboard

@aj-fuentes
Contributor

From discussion in DM: I agree we should move the data to the DB. One strategy is to do so only if it is too big.

Another thing: you could reuse the process for multiple calls to is_converted_template_valid. In other words, let it process something like (randomly chosen) 100 items before recreating it. You could also cache info in is_converted_template_valid. To decide better, it would be great if you posted a summary of the value of template_data during the upgrade: how many entries are there? For how many tables? How many fields? And how large are the templates?
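
A minimal sketch of that recycling idea, using `multiprocessing.Pool`'s
`maxtasksperchild` (the worker function `check_one_template` is a
placeholder, not code from this patch):

```python
import multiprocessing as mp

def check_one_template(template):
    # Placeholder: the real worker would run the lxml-based checks from
    # is_converted_template_valid() here.
    return True

def validate_all(templates, batch_size=100):
    # maxtasksperchild replaces the worker after batch_size tasks, bounding
    # how much memory lxml can leak in any single process.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=1, maxtasksperchild=batch_size) as pool:
        # chunksize=1 so each template is its own task and the recycling
        # threshold counts individual templates.
        return list(pool.imap(check_one_template, templates, chunksize=1))
```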
