
finetuning on v4-32 dies suddenly #1050

@shankerabhigyan


Fine-tuning Gemma dies suddenly after about 12 hours. There are no warnings or messages in the output logs; the process is simply killed.
The script from https://ai.google.dev/gemma/docs/distributed_tuning was being run with nohup.

What are some possible debugging steps, or is this a server-side problem?
I experienced the same behaviour on v3-8 devices.
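
One possible debugging step, sketched here as an assumption rather than a diagnosis: a process that is killed with no message in its own logs could have been stopped by the Linux out-of-memory killer, although nothing in this report confirms that. A small side process that records host memory over time would leave evidence either way. The function names, the `mem_usage.log` path, and the 60-second interval below are all hypothetical; the sketch uses only the Python standard library and reads `/proc/meminfo`, so it assumes a Linux host.

```python
import time
from datetime import datetime, timezone


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB or counts)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def format_sample(meminfo, now=None):
    """Build one log line: UTC timestamp plus available and total host memory."""
    now = now or datetime.now(timezone.utc)
    return "{} MemAvailable={}kB MemTotal={}kB".format(
        now.isoformat(timespec="seconds"),
        meminfo.get("MemAvailable", -1),  # -1 marks a field missing from the input
        meminfo.get("MemTotal", -1),
    )


def log_memory(path="mem_usage.log", interval_s=60, max_samples=None):
    """Append one memory sample to `path` every `interval_s` seconds (Linux only).

    Runs until interrupted unless `max_samples` is given.
    """
    count = 0
    while max_samples is None or count < max_samples:
        with open("/proc/meminfo") as f:
            sample = format_sample(parse_meminfo(f.read()))
        # The file is reopened and closed for every sample so each line is
        # flushed even if the machine or process goes away shortly afterwards.
        with open(path, "a") as out:
            out.write(sample + "\n")
        count += 1
        if max_samples is None or count < max_samples:
            time.sleep(interval_s)
```

To use it, call `log_memory()` from a separate script started alongside the training run (also under nohup). If `MemAvailable` trends toward zero before the process disappears, host memory is a likely suspect, and the kernel log (`dmesg`) may contain a matching out-of-memory entry. If memory stays flat, the cause is more likely elsewhere.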
