Conversation

@DNA386 (Contributor) commented Sep 16, 2025

Detach the tensors returned in the training loops to ensure the computation graph can be cleared from memory properly after each batch.

@y-richie-y (Collaborator) commented

As far as I know, .item() already converts the loss to a Python scalar, which detaches the value from the computation graph, so .detach() is redundant. Once loss goes out of scope, it can be garbage collected. Can you provide more information or explain why this resolves the memory leak?
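
For context, a minimal sketch of the pattern in question (the model, optimizer, and data here are hypothetical, not this repo's actual trainer):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

epoch_loss = 0.0
for _ in range(10):
    x = torch.randn(8, 4)
    target = torch.randn(8, 1)

    optimizer.zero_grad()
    loss = loss_fn(model(x), target)
    loss.backward()
    optimizer.step()

    # .item() returns a plain Python float, so the running total holds no
    # reference to the graph; once `loss` goes out of scope the graph can
    # be garbage collected without an explicit .detach().
    epoch_loss += loss.item()
```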

@DNA386 (Contributor, Author) left a comment

detach not required with item()

@DNA386 (Contributor, Author) commented Jan 6, 2026

You're right, it's not needed on the loss.
We do still need to detach y, though, which was the actual source of the leak.
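
For reference, a minimal sketch of the problematic pattern (assuming the loop collects the model outputs `y` for later use; names are illustrative, not the exact ones in the training loop):

```python
import torch

model = torch.nn.Linear(4, 1)
predictions = []

for _ in range(10):
    x = torch.randn(8, 4)
    y = model(x)

    # Appending `y` as-is would keep each batch's computation graph alive
    # for as long as the list exists; detaching first lets the graph be
    # freed at the end of the batch.
    predictions.append(y.detach())
```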
