We are thrilled to introduce LLaDA2.0, a milestone series of discrete diffusion Large Language Models (dLLMs) from Ant Group. The LLaDA2.0 family features LLaDA2.0-mini (16B) and LLaDA2.0-flash (100B), both built on a Mixture-of-Experts (MoE) architecture, and marks the first time diffusion language models have been scaled to the 100-billion-parameter level.
- 🚀 Scaled to 100B Parameters: LLaDA2.0-flash is the largest diffusion language model to date, demonstrating exceptional performance on code generation and complex instruction-following tasks.
- ⚡ 2.1x Inference Acceleration: Leveraging a confidence-aware parallel decoding mechanism (sketched after this list), LLaDA2.0-flash-CAP achieves an inference speed of up to 535 tokens/s, significantly outpacing comparable autoregressive (AR) models.
- 🔍 Fully Open Source: The model weights for both the 16B and 100B versions, along with associated training code, are fully open-sourced on Hugging Face.
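As a rough intuition for how CAP-style decoding speeds things up: a masked diffusion LM scores every masked position in a single forward pass, so all tokens the model is already confident about can be committed in parallel instead of one at a time. The Python sketch below illustrates this loop; `model`, `MASK_ID`, and `threshold` are illustrative placeholders rather than the released LLaDA2.0 API, and the actual CAP schedule may differ.

```python
import torch

MASK_ID = 0  # hypothetical mask-token id; the real vocabulary differs

@torch.no_grad()
def parallel_decode(model, x, threshold=0.9, max_steps=64):
    """Confidence-aware parallel decoding sketch for a masked diffusion LM.

    `x` is a (seq_len,) LongTensor in which positions to generate hold
    MASK_ID; `model` is any callable mapping (1, seq_len) token ids to
    (1, seq_len, vocab) logits.
    """
    for _ in range(max_steps):
        masked = x == MASK_ID
        if not masked.any():
            break  # every position has been committed
        logits = model(x.unsqueeze(0)).squeeze(0)     # (seq_len, vocab)
        conf, pred = logits.softmax(dim=-1).max(dim=-1)
        # Commit, in parallel, every masked token above the confidence bar...
        accept = masked & (conf >= threshold)
        # ...but always commit at least the single most confident masked
        # token, so each step is guaranteed to make progress.
        if not accept.any():
            best = torch.where(masked, conf, torch.full_like(conf, -1.0)).argmax()
            accept[best] = True
        x[accept] = pred[accept]
    return x
```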
| Model ID | Description | Hugging Face Link |
|---|---|---|
| inclusionAI/LLaDA2.0-mini | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-flash | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-mini-CAP | Enhanced with Confidence-Aware Parallel decoding for efficient inference. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-flash-CAP | Enhanced with Confidence-Aware Parallel decoding for efficient inference. | 🤗 Model Card |
To make our 100B model practical, we performed deep engineering optimizations: a custom inference engine built on dInfer and SGLang that supports KV-cache reuse and block-level parallel decoding. This makes LLaDA2.0 not just an academic achievement but a high-performance generation model ready for real-world deployment.
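For a quick start, a minimal loading sketch is shown below, assuming the repositories expose a standard transformers-style interface via `trust_remote_code`; the exact generation entry point and sampling arguments may differ, so please consult the model cards on Hugging Face.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/LLaDA2.0-mini"  # or inclusionAI/LLaDA2.0-flash

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,  # loads the custom diffusion-decoding code shipped with the repo
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# generate() usage is assumed from the standard transformers interface;
# the model card documents any diffusion-specific sampling options.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```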
This project is licensed under the terms of the Apache License 2.0.