CPU-Memory keeps accumulating during trainer.predict #19398

Open
@surajpaib

Bug description

This is very similar to the closed issue #15656.

I am running prediction with the PL Trainer on 3D images, which are huge, and my process keeps getting killed when a large number of samples are to be predicted. I found #15656 and expected it to be the solution, but setting return_predictions=False does not fix the memory accumulation.

What seems to work instead is adding a gc.collect() call in the predict_loop. This keeps CPU memory usage constant, as would be expected.

It seems like setting return_predictions=False should stop the memory accumulation on its own, so I'm confused as to why the gc.collect() is needed.
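
One possible explanation, offered as an assumption rather than a confirmed diagnosis of Lightning's predict loop: if each batch's outputs end up inside a reference cycle, CPython's reference counting cannot free them, and they stay alive until the cyclic garbage collector runs. That collector is triggered by counts of container-object allocations, not by bytes held, so a handful of very large objects can pile up before an automatic pass happens. The sketch below uses only the standard library; `Prediction`, `make_plain`, and `make_cycle` are hypothetical stand-ins, not Lightning code.

```python
import gc
import weakref


class Prediction:
    """Hypothetical stand-in for a large per-batch prediction."""

    def __init__(self, n_bytes):
        self.payload = bytearray(n_bytes)


def make_plain(n_bytes):
    # No cycle: the object is freed by reference counting on return.
    pred = Prediction(n_bytes)
    return weakref.ref(pred)


def make_cycle(n_bytes):
    # Cycle: pred -> holder -> pred. Reference counts never reach zero.
    pred = Prediction(n_bytes)
    holder = {"pred": pred}
    pred.holder = holder
    return weakref.ref(pred)


# Disable automatic collection so the demonstration is deterministic.
gc.disable()
try:
    plain_ref = make_plain(1024)
    cyclic_ref = make_cycle(1024)

    plain_alive_before = plain_ref() is not None    # freed immediately
    cyclic_alive_before = cyclic_ref() is not None  # still in memory

    gc.collect()  # explicit pass of the cyclic collector

    cyclic_alive_after = cyclic_ref() is not None   # now freed
finally:
    gc.enable()

print(plain_alive_before, cyclic_alive_before, cyclic_alive_after)
```

If something like this is happening, return_predictions=False would stop the trainer from keeping a list of outputs but would not break cycles created elsewhere, which would be consistent with an explicit gc.collect() flattening the memory curve.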

This is where the gc.collect() is applied: https://github.com/project-lighter/lighter/blob/07018bb2c66c0c8848bab748299e2c2d21c7d185/lighter/callbacks/writer/base.py#L120
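
For readers who do not want to follow the link, the workaround can be sketched as a small callback. This is a minimal illustration, not the code at the linked line: the class name `GcCollectCallback` and the `every_n_batches` parameter are hypothetical, and the import fallback exists only so the sketch can be loaded without Lightning installed.

```python
import gc

try:
    from lightning.pytorch import Callback
except ImportError:
    # Fallback so this sketch can be imported without Lightning installed.
    Callback = object


class GcCollectCallback(Callback):
    """Hypothetical callback: run the garbage collector after predict batches."""

    def __init__(self, every_n_batches=1):
        super().__init__()
        self.every_n_batches = every_n_batches
        self.num_collections = 0  # bookkeeping for inspection only

    def on_predict_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        # Collect every N batches to trade memory headroom against GC overhead.
        if (batch_idx + 1) % self.every_n_batches == 0:
            gc.collect()
            self.num_collections += 1
```

It would be passed to the trainer through the callbacks argument, alongside return_predictions=False in the trainer.predict call. Collecting on every batch adds some overhead, so a larger `every_n_batches` may be a reasonable compromise when batches are small.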

I've also attached memory logs from scalene comparing the return_predictions=False run with the gc.collect() run. As you can see, there is no memory growth with gc.collect().

Would you be able to provide any intuition on this? It would be much appreciated!

What version are you seeing the problem on?

v2.1

How to reproduce the bug

No response

Error messages and logs

gc_collect.pdf
return_predictions_false.pdf

Environment

Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):

More info

No response

cc @lantiga @Borda
