GitHub - ShenZhang-Shin/LEDiT: [NeurIPS 2025] PyTorch Implementation of "LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding"

LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Official PyTorch Implementation

Paper | Project Page

This repo contains PyTorch model definitions, pre-trained weights, and training/sampling code for our paper on the Length-Extrapolatable Diffusion Transformer (LEDiT). You can find more visualizations on our project page.

Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang¹, Siyuan Liang¹, Yaning Tan², Zhaowei Chen¹, Linze Li¹, Ge Wu³, Yuhao Chen¹, Shuheng Li¹, Zhenyu Zhao¹, Caihua Chen², Jiajun Liang¹†, Yao Tang¹†
¹JIIOV Technology, ²Nanjing University, ³Nankai University

Diffusion transformers (DiTs) struggle to generate images at resolutions higher than their training resolution. The primary obstacle is that explicit positional encodings (PEs), such as RoPE, must be extrapolated to unseen positions, which degrades performance when the inference resolution differs from the training resolution.

In this paper, we propose the Length-Extrapolatable Diffusion Transformer (LEDiT), a simple yet powerful architecture that overcomes this limitation. LEDiT needs no explicit PEs, thereby avoiding extrapolation. Its key innovations are introducing causal attention, which implicitly imparts global positional information to tokens, and enhancing locality so that adjacent tokens can be precisely distinguished. Experiments on 256x256 and 512x512 ImageNet show that LEDiT can scale the inference resolution to 512x512 and 1024x1024, respectively, while achieving better image quality than current state-of-the-art length extrapolation methods (NTK-aware, YaRN). Moreover, LEDiT achieves strong extrapolation performance with just 100k steps of fine-tuning on a pretrained DiT, demonstrating its potential for integration into existing text-to-image DiTs.
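As a rough intuition for why causal attention can carry positional information without any explicit PE, the toy sketch below compares bidirectional and causal self-attention under a reordering of the tokens. It is not the LEDiT implementation from this repo: it is a dependency-free illustration that assumes a single head with identity query/key/value projections, and the names `attention` and `max_abs_diff` are hypothetical helpers introduced only for this example. Without a PE, bidirectional attention gives each token the same output no matter where it sits in the sequence, whereas causal attention does not, because each token only sees the tokens before it.

```python
import math
import random


def attention(tokens, causal):
    """Toy single-head self-attention: identity Q/K/V projections, no positional encoding."""
    dim = len(tokens[0])
    outputs = []
    for i, query in enumerate(tokens):
        # Causal attention lets token i see only tokens 0..i.
        visible = tokens[: i + 1] if causal else tokens
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in visible]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * value[d] for w, value in zip(weights, visible)) / total
            for d in range(dim)
        ])
    return outputs


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two token sequences."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


rng = random.Random(0)
tokens = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]
reversed_tokens = tokens[::-1]

# Compare "attend, then reverse" against "reverse, then attend".
full_gap = max_abs_diff(attention(tokens, causal=False)[::-1],
                        attention(reversed_tokens, causal=False))
causal_gap = max_abs_diff(attention(tokens, causal=True)[::-1],
                          attention(reversed_tokens, causal=True))
print(f"bidirectional gap: {full_gap:.2e}, causal gap: {causal_gap:.2e}")
```

The bidirectional gap is zero up to floating-point rounding, so that variant cannot tell token order apart on its own. The causal gap is clearly non-zero, which is the sense in which a causal mask makes the output depend on position. How LEDiT combines this with its locality enhancement is described in the paper, not in this sketch.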
