Tags from world_engine

1.5.6

2026-05-20T15:59:31Z

1.5.5

2026-04-30T20:48:32Z

1.5.4: MoE Support (#47)

2026-04-23T17:53:44Z

* implement state loading / saving

* moe + fbgemm optimization

* wp-1.5 staging

* clean up and fix ae

* fix temporal compression rope bugs

* vae reset in world_engine.reset

* reduce peak memory

* Implements the orthorope angles computation instead of precomputing (#25)

* fix: uv sync issue with python version 3.9

* fix: VRAM explosion

* refactor: init on gpu device directly

* fix: don't use fbgemm on windows for now

* feat: orthoropeangles

* fix: NoCastModule OrthoRoPEAngles

* fix: remove pos_ids from args

* fix: remove old src rope replacement patch

* fix: remove out of scope ae changes

* fix: remove out of scope text encoder changes

* fix: patch_model pos_ids

---------

Co-authored-by: Philpax <me@philpax.me>

* test: revert direct device init (#28)

* feat: use built triton-windows fork to fix long-path issue

* update gen_sample

* better quant

* avoid warning when creating mouse / scroll tensors

* disable unimportant compile options

* clean up model loading

* remove unnecessary push_to_hub

* remove unnecessary save_pretrained

* reduce cpu memory

* pass device

* moe wip

* default quant=None for benchmarking

* moe support

* remove deprecated benchmark_moe

* remove blocking path benchmark

* remove flashinfer path

* fix moe loading, expert weighting, refactor MoE class

---------

Co-authored-by: Clydingus <40514241+Clydingus@users.noreply.github.com>
Co-authored-by: Philpax <me@philpax.me>
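
For readers unfamiliar with the "expert weighting" mentioned in the MoE entries above, here is a minimal, generic sketch of top-k routing in plain Python. It is not taken from world_engine: `top_k_weights`, `moe_forward`, and the toy experts are hypothetical names, and the repository's MoE class may weight or normalize experts differently.

```python
import math
from typing import Callable, List, Sequence, Tuple


def top_k_weights(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their logits."""
    # Indices of the k largest logits (ties keep their original order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x: float,
                router_logits: Sequence[float],
                experts: Sequence[Callable[[float], float]],
                k: int = 2) -> float:
    """Weighted sum of the selected experts' outputs (assumed combination rule)."""
    return sum(w * experts[i](x) for i, w in top_k_weights(router_logits, k))


# Hypothetical toy experts, standing in for per-expert MLPs.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
weights = top_k_weights([0.0, 0.0, -5.0], k=2)
y = moe_forward(3.0, [0.0, 0.0, -5.0], experts, k=2)
```

Only the selected experts are evaluated, which is where the compute saving of a sparse MoE layer comes from.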

1.5.3

2026-04-09T01:04:54Z

1.5.2

2026-04-09T00:58:37Z

1.5.1: Excluding layers from quantization (#36)

2026-04-08T19:35:35Z

* exclude layers from quant using fqn

* strict checks
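
The two entries above can be illustrated with a small, hedged sketch of excluding layers by fully qualified name (FQN). The function name `select_quantizable`, the `strict` flag, and the layer names are assumptions for illustration, not the repository's API; in a PyTorch model the FQNs would typically come from `model.named_modules()`.

```python
from typing import Iterable, List


def select_quantizable(fqns: Iterable[str],
                       exclude: Iterable[str],
                       strict: bool = True) -> List[str]:
    """Return the layer FQNs that remain eligible for quantization."""
    names = list(fqns)
    excluded = set(exclude)
    if strict:
        # Assumed meaning of "strict checks": an exclude entry that matches
        # no layer is treated as a configuration error, not silently ignored.
        unknown = sorted(excluded - set(names))
        if unknown:
            raise ValueError(f"exclude entries match no layer: {unknown}")
    return [name for name in names if name not in excluded]


# Hypothetical layer names.
layers = ["blocks.0.attn.qkv", "blocks.0.mlp.fc1", "head.proj"]
kept = select_quantizable(layers, exclude=["head.proj"])
```

Failing fast on unknown names guards against typos that would otherwise leave a sensitive layer quantized.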

1.5.0: Adds support for int8 w8a8_gemlite quantization (#34)

2026-04-08T14:37:03Z

* add torchao quantize_

* testing

* testing yes

* use taehv override

* yuh

* add apply qat

* yuh

* uh

* enable int4 benchmarking and inference

* apply quantize_model w8a8

* add int8 ptq

* quant none

* int8 gemlite implementation

* clean up, remove torchao quantization

* add gemlite to requirements

* remove unused quant kernels and imports

* restore gen_sample.py, more cleanup

* update readme with Quantization docs

* fixed requirements gemlite

* Clean up pyproject.toml and add config defaults to base_model.py

* Add gemlite warmup+cache, and update gemlite version

* cleanup pyproject.toml and resize in examples
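
As background for the "w8a8" naming in this release, the sketch below shows the general idea of int8 weights combined with int8 activations, in plain Python. It is a generic illustration of symmetric quantization and is not the gemlite kernel or the repository's code; `quantize_row` and `w8a8_linear` are hypothetical helpers, and the per-row weight scale with a single activation scale is an assumed layout.

```python
from typing import List, Tuple


def quantize_row(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric int8 quantization of one vector using a single scale."""
    max_abs = max(abs(v) for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale


def w8a8_linear(x: List[float], weight: List[List[float]]) -> List[float]:
    """Compute y = W @ x with int8 weights and int8 activations."""
    xq, x_scale = quantize_row(x)  # activations quantized at run time
    out = []
    for w_row in weight:
        # Weights would normally be quantized once, offline; done inline here
        # to keep the sketch short.
        wq, w_scale = quantize_row(w_row)
        acc = sum(a * b for a, b in zip(wq, xq))  # integer accumulation
        out.append(acc * w_scale * x_scale)       # dequantize the result
    return out


x = [0.5, -1.0, 0.25]
W = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
y = w8a8_linear(x, W)
reference = [sum(w * v for w, v in zip(row, x)) for row in W]
```

Comparing `y` against `reference` gives a quick feel for the rounding error introduced by the int8 path.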