Estimated branch cut date: Tuesday, Dec 3
Estimated release date: Tuesday, Dec 10
Release owner: @felipemello1
New features
- feat(cli): allow users to download models from Kaggle #2002
- Add support for QAT + LoRA #1931
- Integrate INT8 mixed-precision from torchao 0.7 #1552
- feat: add gemma2 variants #1835
- Support Early Exit Loss and/or Layer Dropout #1076 (see the layer-dropout sketch after this list)
- Add Ascend NPU as a backend #1826
- Early fusion multimodal models #1904
- CLIP Text Encoder #1969
- 2D RoPE + CLIP updates #1973
- PPO Performance Improvements #2066
- Support finetuning base model weights in QAT + LoRA flow #2089
- Add ability to shard custom layers for DPO and LoRA distributed #2072
- Update QAT: add grad clipping, torch.compile, collate fn #1854
- Add LR Scheduler to full finetune distributed #2017
- Activation Offloading for Lora DPO #2083
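Of the items above, layer dropout (#1076) is the easiest to illustrate in isolation. Below is a minimal, generic stochastic-depth sketch; the helper name and signature are illustrative, not torchtune's actual implementation.

```python
import torch
from torch import nn

def maybe_skip_layer(layer: nn.Module, x: torch.Tensor, p_drop: float) -> torch.Tensor:
    # Stochastic depth: during training, skip this layer with probability
    # p_drop and let the input pass through unchanged; at eval time the
    # layer always runs.
    if layer.training and p_drop > 0 and torch.rand(()).item() < p_drop:
        return x
    return layer(x)

# Usage with a stand-in for a transformer layer:
block = nn.Linear(16, 16)
out = maybe_skip_layer(block, torch.randn(2, 16), p_drop=0.2)
```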
QoL improvements
- Use hf_transfer as default #2046
- Switch to PyTorch's built-in RMSNorm #2054 (see the sketch after this list)
- torchrun defaults for concurrent distributed training jobs #2015
- Remove default to ignore safetensors #2042
- Update configs #1954
- Refactor Recipe State Dict Code #1964
- Update KV Cache to use num_kv_heads instead of num_heads #1961
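On the RMSNorm switch (#2054): PyTorch has shipped `torch.nn.RMSNorm` since 2.4, so a hand-rolled module can be dropped in favor of the built-in. A quick sketch of the equivalence, assuming PyTorch >= 2.4:

```python
import torch
from torch import nn

dim = 4096
x = torch.randn(2, 8, dim)
builtin = nn.RMSNorm(dim, eps=1e-6)

def manual_rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Scale by the reciprocal root-mean-square over the last dimension,
    # then apply the learned elementwise gain.
    return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + eps) * weight

torch.testing.assert_close(builtin(x), manual_rms_norm(x, builtin.weight))
```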
Bug fixes
- Fix QLoRA/LoRA for Llama 3.2 Vision #2028
- A more encompassing fix for offloading + activation checkpointing #1936
- Fix grad accum + FSDP CPU offload, pass None via CLI #1941
- Make sure CLIP resized pos_embed is contiguous #1986
- Add `**quantization_kwargs` to `FrozenNF4Linear`, `LoRALinear`, and `DoRALinear` #1987
- [Bug] `model_type` argument as str for checkpoint classes #1946
- Convert all non-RGB images to RGB #1976 (see the sketch after this list)
- gemma2 had wrong path to scheduler #2013
- Adding MM eval tests / attention bugfixes #1989
- Error message on `packed=True` for stack exchange dataset #2079
- Fail early with `packed=True` on MM datasets #2080
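The non-RGB fix (#1976) amounts to coercing every decoded image to RGB before the vision transforms run. A minimal PIL sketch; the helper name is illustrative, not torchtune's exact code:

```python
from PIL import Image

def ensure_rgb(image: Image.Image) -> Image.Image:
    # Palette ("P"), grayscale ("L"), and RGBA inputs break transforms
    # that assume exactly three channels, so normalize the mode up front.
    return image if image.mode == "RGB" else image.convert("RGB")
```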
Deprecations -> Removal
- Remove unused FSDP components #2016
- Remove deprecated `TiedEmbeddingTransformerDecoder` #2041
- Remove deprecated `TiedEmbeddingTransformerDecoder` #2047
- Deprecate `SimPOLoss` #2062 (see the illustrative shim after this list)
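For the `SimPOLoss` deprecation (#2062), the usual pattern is a shim that keeps the class working but warns at construction until a later release deletes it. An illustrative sketch, not torchtune's exact code or wording:

```python
import warnings
from torch import nn

class SimPOLoss(nn.Module):
    def __init__(self, *args, **kwargs):
        # Warn once per construction; behavior is otherwise unchanged
        # until the class is removed in a subsequent release.
        warnings.warn(
            "SimPOLoss is deprecated and will be removed in a future release.",
            FutureWarning,
        )
        super().__init__()
```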
Documentation