
Speeding up training with Triton and FP8

This repository contains materials for the lecture on FP8 & Triton, part of a short course on scaling LLM training (https://llmscaling.yandex.com/en), organized in collaboration with Yandex and the Yandex School of Data Analysis.

Local setup

For materials in Russian, use the ru/ directory. For materials in English, use the en/ directory.

To open the notebook locally, run the following commands from the root of this repo:

cd trace-viewer
npm install
npm run dev

Then navigate to either

  • http://localhost:5173?trace=var/traces/ru.lecture_triton_fp8.json
  • or http://localhost:5173?trace=var/traces/en.lecture_triton_fp8.json,

depending on your preferred language.

Re-running the code (an H100 or newer GPU is required)

To regenerate the traces, run:

python execute.py -m ru.lecture_triton_fp8
python execute.py -m en.lecture_triton_fp8
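
The FP8 paths exercised by these notebooks rely on Hopper-generation tensor cores, which is why an H100 or newer is required. Before kicking off a run, you can sanity-check the GPU with a minimal sketch like the one below (assuming PyTorch is installed; this snippet is illustrative and not part of execute.py):

import torch

# H100 (Hopper) reports CUDA compute capability (9, 0); the FP8
# kernels in the lecture need at least this generation.
major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (9, 0), (
    f"Compute capability {major}.{minor} found; an H100-class GPU is required."
)
print(f"OK: {torch.cuda.get_device_name(0)} (sm_{major}{minor})")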

Citation

If you find this content useful, consider citing it as follows:

@misc{LLMScalingWeekFP8Triton,
  author = {Savinov, Vladislav},
  title = {Speeding up training with Triton and FP8},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/acforvs/ysda-llm-scaling-week}},
  year = {2025}
}

References

  1. DeepSeek-AI. (2024). DeepSeek-V3 Technical Report
  2. DeepSeek-AI. DeepGEMM
  3. Team Cohere. (2025). Command A: An Enterprise-Ready Large Language Model
  4. Micikevicius, P. et al. (2022). FP8 Formats for Deep Learning
  5. OpenAI. Triton
  6. NVIDIA. TransformerEngine
  7. Austin, J. et al. (2025). How to Scale Your Model. Google DeepMind
  8. Modal Labs. GPU Glossary
  9. Meta AI. (2025). The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation

Acknowledgments

I'd like to thank the team behind CS336, which was a big inspiration for how the materials are structured and for some parts of the talk.
