Description & Motivation
Saving last.ckpt as a symlink on local file systems makes sense for most workflows. However, in several cases, users back up their checkpoints to cloud storage (AWS, GCP, etc.). In these scenarios, symlinks are difficult to manage because they are often an all-or-nothing upload, i.e. we cannot choose which symlinks to upload without being highly prescriptive on upload.
Checkpoints, especially last.ckpt, are critical for resuming runs, fine-tuning, etc., so we often want to back them up. However, when last.ckpt is a symlink, backing it up to the cloud becomes much more involved.
Pitch
Add an option `save_last='copy'` that saves a real copy of the last checkpoint as last.ckpt instead of a symlink.
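
To illustrate the difference, here is a minimal standalone sketch of what the two behaviors could look like. It does not use Lightning's internals: `write_last_checkpoint` and its `mode` argument are hypothetical names chosen for this example, and the `"link"`/`"copy"` values only mirror the proposal above.

```python
import os
import shutil
import tempfile
from pathlib import Path


def write_last_checkpoint(latest: Path, mode: str = "copy") -> Path:
    """Hypothetical helper: publish `latest` as last.ckpt in the same directory."""
    last = latest.parent / "last.ckpt"
    if mode == "link":
        # Symlink behavior: last.ckpt points at the newest checkpoint file.
        if last.is_symlink() or last.exists():
            last.unlink()
        last.symlink_to(latest.name)
    elif mode == "copy":
        # Proposed behavior: last.ckpt is a regular file with the same bytes.
        # Copy to a temp file first, then swap it in so that a backup job
        # never sees a half-written last.ckpt.
        fd, tmp = tempfile.mkstemp(dir=latest.parent, suffix=".tmp")
        os.close(fd)
        shutil.copyfile(latest, tmp)
        os.replace(tmp, last)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return last
```

With `mode="copy"`, last.ckpt is an ordinary file, so a cloud sync tool can upload it like any other checkpoint. The cost is extra disk space and write time for one additional checkpoint.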
Alternatives
No response
Additional context
No response