GPU Memory Leak on Loading Pre-Trained Checkpoint #6515

@bilzard

Description

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Training

Bug

Training YOLOv5 from a saved checkpoint (*.pt) consumes more GPU memory than training from one of the official pre-trained weights (e.g. yolov5l).

Environment

  • YOLOv5: latest (how do I check the YOLOv5 version?)
  • CUDA: 11.6 (Tesla T4, 15360MiB)
  • OS: Ubuntu 18.04.6 LTS (Bionic Beaver)
  • Python: 3.8.12

Minimal Reproducible Example

In the training commands below, case 2 requires more GPU memory than case 1.

# 1. train from pre-trained weights
python train.py ... --weights yolov5l

# 2. train from a saved checkpoint
python train.py ... --weights pre_trained_checkpoint.pt
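
To quantify the difference, one can print the allocated and reserved CUDA memory right after the weights are loaded in both cases. A minimal sketch (the helper name and where it is called are my own, not part of train.py):

import torch

def log_cuda_memory(tag: str, device: int = 0) -> None:
    """Print allocated/reserved CUDA memory in MiB for a before/after comparison."""
    allocated = torch.cuda.memory_allocated(device) / 2**20
    reserved = torch.cuda.memory_reserved(device) / 2**20
    print(f"[{tag}] allocated={allocated:.0f} MiB, reserved={reserved:.0f} MiB")

# e.g. call log_cuda_memory("after checkpoint load") right after the torch.load() call in train.py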

Additional

As reported on the PyTorch forums [1], loading a state dict directly onto the CUDA device can cause a GPU memory leak. We should load it into CPU memory instead:

# map_location here keeps every tensor on the CPU instead of restoring it to its saved CUDA device
state_dict = torch.load(directory, map_location=lambda storage, loc: storage)
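
Applied to a YOLOv5-style checkpoint (a dict whose 'model' key holds the saved model, per the repo's saving convention), the fix would look roughly like this; the file path and variable names are illustrative, not the actual train.py code:

import torch

device = torch.device("cuda:0")

# Deserialize the checkpoint into CPU memory so no extra CUDA allocations happen during loading
ckpt = torch.load("pre_trained_checkpoint.pt", map_location="cpu")

# Move only the model onto the GPU, explicitly and exactly once
model = ckpt["model"].float().to(device)

Either map_location="cpu" or the lambda storage, loc: storage form above pins the deserialized tensors to CPU; the weights then occupy GPU memory only when moved with .to(device).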

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!

Metadata

Labels: bug (Something isn't working)