inclusionAI/LLaDA2.0

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Model Introduction

We are thrilled to introduce LLaDA2.0, a milestone series of discrete diffusion Large Language Models (dLLMs) from Ant Group. The LLaDA2.0 family comprises LLaDA2.0-mini (16B) and LLaDA2.0-flash (100B), both built on a Mixture-of-Experts (MoE) architecture, and marks the first time diffusion language models have been scaled to the 100-billion-parameter level.

Key Features

  • 🚀 Scaled to 100B Parameters: LLaDA2.0-flash is the largest diffusion language model to date, demonstrating exceptional performance on code generation and complex instruction-following tasks.
  • ⚡ 2.1x Inference Acceleration: Leveraging a parallel decoding mechanism, LLaDA2.0-flash-CAP achieves an inference speed of up to 535 tokens/s, significantly outpacing comparable autoregressive (AR) models.
  • 🔍 Fully Open Source: The model weights for both the 16B and 100B versions, along with associated training code, are fully open-sourced on Hugging Face.

[Figure: LLaDA2.0 decoding trajectory]

Model Variants

| Model ID | Description | Hugging Face Link |
| --- | --- | --- |
| inclusionAI/LLaDA2.0-mini | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-flash | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-mini-CAP | Enhanced with Confidence-Aware Parallel (CAP) for efficient inference. | 🤗 Model Card |
| inclusionAI/LLaDA2.0-flash-CAP | Enhanced with Confidence-Aware Parallel (CAP) for efficient inference. | 🤗 Model Card |

Evaluation Results

[Figure: evaluation results]

Deployment and Usage

To make our 100B model practical, we have performed deep engineering optimizations. We built a custom inference engine based on dInfer and SGLang, which supports KV-Cache reuse and block-level parallel decoding. This makes LLaDA2.0 not just an academic achievement but a high-performance generation model ready for real-world deployment.
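
The idea of block-level parallel decoding can be illustrated with a small toy. The sketch below is not the dInfer/SGLang engine and does not call the real model: `toy_scorer`, `decode_block`, `generate`, and the `threshold` parameter are hypothetical names chosen for illustration. It assumes a common confidence-threshold scheme, in which every masked position in the current block whose predicted confidence clears the threshold is committed in the same step, and finished blocks become fixed context for later blocks (the point at which a real engine could reuse its KV cache).

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # placeholder for a position that has not been decoded yet

# A scorer proposes one (token, confidence) pair per position of the current block.
Scorer = Callable[[List[str], List[Optional[str]]], List[Tuple[str, float]]]


def decode_block(prefix: List[str], block_len: int, scorer: Scorer,
                 threshold: float = 0.9) -> Tuple[List[str], int]:
    """Fill one block of masked positions, committing several tokens per step."""
    block: List[Optional[str]] = [MASK] * block_len
    steps = 0
    while any(tok is MASK for tok in block):
        proposals = scorer(prefix, block)
        masked = [i for i, tok in enumerate(block) if tok is MASK]
        # Commit every masked position whose confidence clears the threshold.
        accept = [i for i in masked if proposals[i][1] >= threshold]
        if not accept:
            # Commit at least the most confident position so the loop terminates.
            accept = [max(masked, key=lambda i: proposals[i][1])]
        for i in accept:
            block[i] = proposals[i][0]
        steps += 1
    return [tok for tok in block if tok is not MASK], steps


def generate(prompt: List[str], num_blocks: int, block_len: int, scorer: Scorer,
             threshold: float = 0.9) -> Tuple[List[str], int]:
    """Decode block by block; finished blocks become fixed context."""
    context = list(prompt)
    total_steps = 0
    for _ in range(num_blocks):
        block, steps = decode_block(context, block_len, scorer, threshold)
        # A real engine could reuse cached keys/values for this fixed prefix.
        context.extend(block)
        total_steps += steps
    return context[len(prompt):], total_steps


def toy_scorer(prefix: List[str], block: List[Optional[str]]) -> List[Tuple[str, float]]:
    """Hypothetical stand-in for the model: deterministic tokens and confidences."""
    base = len(prefix)
    n_filled = sum(tok is not MASK for tok in block)
    proposals = []
    for i in range(len(block)):
        # Even positions start out confident; confidence rises as the block fills.
        start = 0.95 if i % 2 == 0 else 0.6
        proposals.append((f"t{base + i}", min(1.0, start + 0.2 * n_filled)))
    return proposals


tokens, steps = generate(["<prompt>"], num_blocks=2, block_len=4, scorer=toy_scorer)
print(tokens, steps)
```

With this toy scorer, eight tokens are produced in four denoising steps, whereas a threshold that is never met falls back to one token per step (eight steps). The speedup of a real system depends on how often the model's confidence actually clears the threshold.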

License

This project is licensed under the terms of the Apache License 2.0.

About

LLaDA2.0 is the diffusion language model series developed by the InclusionAI team at Ant Group.
