Skip to content

Conversation

alex000kim
Copy link
Contributor

  • Update docs/source/generate_examples.py to scan subdirs two levels deep
  • Add more focus in examples/training/torchtitan/README.md on distributed training, auto-recovery and scalability

Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alex000kim! Left some comments.

alex000kim and others added 10 commits September 27, 2025 08:46
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alex000kim! Some additional comments.

alex000kim and others added 4 commits September 29, 2025 13:22
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alex000kim! Let's merge it first : )

@Michaelvll Michaelvll merged commit 96e3061 into skypilot-org:master Sep 29, 2025
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants