-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Independent child process monitoring #118 #134
Conversation
raise NotImplementedError('The psutil module is required when to' | ||
' monitor memory usage of children' | ||
' processes') | ||
raise NotImplementedError(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about this, I think this was just a carry over from the historical merge; I can reset back to the original if needed.
mpmprof
Outdated
lines += 1 | ||
|
||
# Collect memory usage of spawned children and write to profile | ||
for idx, mem in enumerate(mp._get_child_memory(proc.pid)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here is the line where I independently collect the memory of each child and add them to the .dat file with the "CHLD#" key. If include_children=True
then "main" will be the total memory consumed, and if not, it will be only the main process, but all processes will be plotted with independent_children=True
or whatever flag we come up with.
mpmprof
Outdated
mpoint = (0, 0) | ||
|
||
# Plot all of the series, the main process and the child. | ||
for proc, data in series.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then obviously, plot needs to be adapted to plot all the children.
mpmprof uses argparse e.g. #128 |
Thanks @bbengfort! . Do you think it is possible to have only one executable file, i.e., merge mpmprof into mprof? |
@fabianp yes, it's possible; just not very straightforward. I'll take a crack at it. |
Great!
… On 19 Mar 2017, at 11:24, Benjamin Bengfort ***@***.***> wrote:
@fabianp <https://github.com/fabianp> yes, it's possible; just not very straightforward. I'll take a crack at it.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <https://github.com/fabianp/memory_profiler/pull/134#issuecomment-287606977>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAQ8h8D_Fjoqwm0monZTeaGq2RMoQTR2ks5rnQJpgaJpZM4MhiXU>.
|
@fabianp ok, I've pushed a version that does it - would you mind taking a look and running your tests/examples? |
Great work @bbengfort ! The code looks great, let me try it on some of my own problems. In the meantime, you can add your name to the Authors part of the README :-) |
@fabianp thanks! I've updated the README with the example and usage and also added max value markers to the plot for the largest child. |
If I run the example with |
README.rst
Outdated
the total memory of the program as well as each child individually. | ||
|
||
.. warning:: currently the child tracking only works if a ``stream`` is provided to the ``profile`` (e.g. from the command line or in the decorator). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard do you think it would be to remove this limitation?, i.e., to track children even if the function is not @Profile'd
README.rst
Outdated
the total memory of the program as well as each child individually. | ||
|
||
.. warning:: Currently the child tracking only works if a ``stream`` is provided to the ``profile`` (e.g. from the command line or in the decorator). If you are using the API to retrieve values then the flag will not do anything. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused, as the example works fine even without decorated functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant that the stream argument has to be passed in, if stream = None
then I can't return multiple values. I could if you wanted me to return a list instead of a single float. I didn't necessarily mean it wouldn't work with the decorator.
Perhaps I could rephrase as follows:
"Currently tracking individual children requires a reporting output stream; if you'd like direct access to the memory usage of individual children, see the _get_child_memory
function."
Though of course, that then points the user to an internal function. I'm open to suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put placeholder comments in the code (line 363 and 403) where this issue occurs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the explanation. I think the API would be simpler if we allow memory_usage to return a nested list instead embedding this into the stream. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've updated memory_usage
to return a nested list and updated the README accordingly.
Unfortunately, I think that the pickle error is a product of multiprocessing library: functions can only be pickled if they're defined at the top level and when you run it by creating a subprocess in I can see if I can change the example (though that may be difficult), but I also wanted to note that the problem exists even with the original version of the http://stackoverflow.com/questions/8804830/python-multiprocessing-pickling-error |
memory_profiler.py
Outdated
else: | ||
ret.append(mem_usage) | ||
if multiprocess: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the first place where I don't return multiple values and instead warn
memory_profiler.py
Outdated
else: | ||
ret.append(mem_usage) | ||
|
||
if multiprocess: | ||
warnings.warn("use include_children not multiprocess without a stream") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the second place where I don't return multiple values and instead warn
For the pickling, what I find strange is that both commands (with and without the "python" word in mprof) should be identical, as per lines 217-223 of mprof. Not sure why multiprocessing fails only on one of these cases since they should be executed the same way. |
So I see what you mean, however there is a subtle difference. The args to Popen if
Whereas the args to Popen if passed as an executable are
In the first case, So that took me a while to figure out, but I think it makes sense. What do you think? |
Thanks for the explanation and your time. I've merged the PR and added a small patch (52c5788) so that the example can be run without the python keyword. Thanks! |
Sure thing, let me know if there is anything else I can help with; sorry that this feature took so long! |
except psutil.NoSuchProcess: | ||
# https://github.com/fabianp/memory_profiler/issues/71 | ||
pass | ||
mem += sum(_get_child_memory(process, meminfo_attr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, quick question, as processes in modern linux share common memory after forking from their parent and only copy it on writes, wouldn't this sum
report the wrong number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I'm not sure of the behavior of psutil with regards to linux forks; I simply used the same meminfo_attr as the summation function. If that's true then this is indeed a problem, but I'd guess that we'd have to dig into psutil to figure it out.
It would indeed be great to know (but I'm clueless here)
On Jun 12, 2017 12:09 PM, "Benjamin Bengfort" <notifications@github.com> wrote:
*@bbengfort* commented on this pull request.
------------------------------
In memory_profiler.py
<https://github.com/fabianp/memory_profiler/pull/134#discussion_r121500525>:
@@ -111,12 +143,7 @@ def ps_util_tool():
else 'get_memory_info'
mem = getattr(process, meminfo_attr)()[0] / _TWO_20
if include_children:
- try:
- for p in process.children(recursive=True):
- mem += getattr(p, meminfo_attr)()[0] / _TWO_20
- except psutil.NoSuchProcess:
- # https://github.com/fabianp/memory_profiler/issues/71
- pass
+ mem += sum(_get_child_memory(process, meminfo_attr))
Unfortunately I'm not sure of the behavior of psutil with regards to linux
forks; I simply used the same meminfo_attr as the summation function. If
that's true then this is indeed a problem, but I'd guess that we'd have to
dig into psutil to figure it out.
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
<https://github.com/fabianp/memory_profiler/pull/134#discussion_r121500525>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAQ8h-Xtff8eOlmZVoMv3dePOSis4Wziks5sDYzxgaJpZM4MhiXU>
.
|
This pull request is a start at resolving #118 -- as mentioned in that issue, I've created a function
_get_child_memory
that iterates through all child processes yielding their memory. This is summed if theinclude_children
flag is true.I've also updated the reference from #71 to wrap the entire yield in try/except and yield 0.0 if the race condition occurs; but this needs to be checked.
Including this into the main
mprof
library is the next step. I've created a demompmprof
which logs child and main process memory separately in th same.dat
file, then plots them correctly. Here is an example:The next steps are to merge
mpmprof
withmprof
with some flag, e.g.independent_children
or something like that (suggestions welcome) to make it behave likempmprof
and to have theplot
command correctly handle both cases.