Skip to content

Conversation

@wjddn279
Copy link
Contributor

Motivation

Following the findings discussed in #56641 where Memray proved to be highly effective for diagnosing memory usage patterns, this PR introduces a lightweight utility to make such analysis easier and configurable within Airflow.

Body

The new decorator, enable_memray_trace, enables selective memory profiling of key Airflow components such as the scheduler, dag-processor, and api-server.
When the corresponding configuration flag (for example, scheduler.enable_memray_trace = True in airflow.cfg) is enabled, the target function runs under memray.Tracker, automatically generating a memory profile file (e.g.,$AIRFLOW_HOME/scheduler_memory.bin) for later analysis.

License note

Memray is licensed under the Apache License 2.0, and its bundled dependencies are distributed under permissive licenses (MIT, BSD, etc.), as stated on its official license page.
Therefore, there are no licensing conflicts with Airflow’s Apache 2.0 license.

Question

Would it be acceptable to open this feature directly as a PR, or should it first go through a formal discussion process (such as a mailing list thread or AIP proposal)?


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

@potiuk
Copy link
Member

potiuk commented Oct 18, 2025

I love the idea. It's OK (and best) to open PR directly. I have a few comments to add though - in a moment.

@potiuk
Copy link
Member

potiuk commented Oct 18, 2025

This is really cool idea :)

I think what we also need, is to add a simple how-to guide on how to memory-profile Airlfow as a subpage of https://airflow.apache.org/docs/apache-airflow/stable/howto/index.html. It should not be too complex - most of the things there should defer to memray documentation like you did in configuration description, but it would be great to have a kind of walk-through - maybe with a few screenshots, and maybe a memray flamegraph to download based on the analysis done in #56641 on how you can figure out what is wrong. Kind of "this is how you can do it step by step". Enable config, restart the component, do the stuff, retrieve the file etc......

This would be a fantastic guideline for our users to be able to trace things on their own.

@potiuk
Copy link
Member

potiuk commented Oct 18, 2025

Also ... I am not sure if that's possible, but it would be likely a good idea to add another way of enabling memray in this case - we could proably have a signal handler (SIGUSR2 for example) that could possibly toggle memory tracing / profiling without changing the configuration - but not sure if that's possible or easy in general case. It would work nicely likely in api-server - we can easily restart the workers, and dag processor (to memory profile dag parsers), but for scheduler we would likely have to restart the scheduler somehow. And that might be a future follow maybe :)

@wjddn279
Copy link
Contributor Author

Thank you for the review! I’ll work on the following items and get back to you once they’re done:

  • Apply the review feedback
  • Add a new subpage: How to memory-profile Airflow
  • Fix the failing tests

I’ll also give some more thought to how we could implement the runtime toggle for memory tracing that you mentioned at the end.

@jason810496 jason810496 self-requested a review October 20, 2025 08:35
Copy link
Member

@jason810496 jason810496 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!
I used memray to resolve OOM issue in Airflow before. It's nice to see memray be integrated in Airflow natively for profiling!

@wjddn279
Copy link
Contributor Author

@jason810496
I was about to finish writing the documentation and request a review, but thank you so much for checking it out first! Let me go through it one by one!

@wjddn279
Copy link
Contributor Author

@potiuk @jason810496

I've completed the requested changes and reflected them in the document! I would appreciate it if you could review them.

Copy link
Member

@jason810496 jason810496 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation LGTM, thanks!

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it !

@potiuk potiuk force-pushed the add-memray-tracker branch from 85e4e4e to 7c9c1f9 Compare October 25, 2025 17:27
@potiuk
Copy link
Member

potiuk commented Oct 25, 2025

Rebased.. And I am happy to merge if it's green/

@potiuk potiuk merged commit f8c6e81 into apache:main Oct 26, 2025
8 checks passed
@zachliu
Copy link
Contributor

zachliu commented Nov 6, 2025

thank you so much for this!
is it going to be in 3.1.3? or 3.2.x?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants