Ever wanted fast diffusion on device? Struggled with compatibility and libraries? Worry no more: favicon diffusor is here! Built on WebGPU, it's supported on almost any device that can run Chrome, and it can diffuse hippos anywhere (even as a favicon)!
A quick weekend project where I wrote a bunch of WebGPU kernels from scratch and tried to optimize them. Building on my last "from scratch DiT" post, this one starts at the kernel level and rewrites the diffusion transformer in WGSL. Sub-second 32-step inference makes for an awesome demo: actually diffusing a website's favicon in real time, in ~0.7 s, with an ~11M-parameter model.
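To make the "32-step inference" concrete, here is a minimal sketch of what a sampling loop like this looks like. Everything here is illustrative: `denoise` stands in for the real WGSL-backed transformer, and the schedule is a toy one, not what the repo actually uses.

```javascript
// Hypothetical sketch of a 32-step denoising loop (NOT the repo's real code).
// `denoise(x, t)` is a placeholder for the WGSL-backed model's noise prediction.
function sampleFavicon(denoise, { steps = 32, size = 16 * 16 * 4 } = {}) {
  // Start from random noise (uniform here for brevity, not Gaussian).
  let x = Float32Array.from({ length: size }, () => Math.random() * 2 - 1);
  for (let t = steps - 1; t >= 0; t--) {
    const eps = denoise(x, t);        // predicted noise at timestep t
    const alpha = 1 - t / steps;      // toy linear schedule, for illustration only
    x = x.map((v, i) => v - (1 - alpha) * eps[i]); // step toward the clean image
  }
  return x; // flat RGBA pixel buffer for the favicon canvas
}
```

At 32 steps, the whole loop is just 32 model calls, which is why per-kernel latency dominates the total inference time.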
https://notebook.neelr.dev/stories/in-browser-favicon-diffusion-scratch-dit-pt-2
Of course, benchmarks! Here are the approximate numbers on an M1 Pro. It's currently faster than TensorFlow.js and (of course) baseline JS; transformers.js doesn't support custom layer building, so I didn't include it.
| Implementation | Time (s) | vs Baseline | vs TensorFlow.js |
|---|---|---|---|
| Favicon Diffusor | 0.86 | 88.6% faster | 45.2% faster |
| TensorFlow.js | 1.57 | 79.3% faster | baseline |
| Baseline JS | 7.57 | baseline | 382% slower |
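For context, numbers like these are typically collected with a small timing harness. The sketch below is illustrative only (the repo's actual benchmark code lives in its testing files); the warmup/run counts are arbitrary choices, not the ones used for the table.

```javascript
// Minimal timing harness sketch for async inference calls (illustrative).
// Warmup runs let JIT compilation and GPU pipeline creation settle before measuring.
async function timeInference(runOnce, { warmup = 2, runs = 5 } = {}) {
  for (let i = 0; i < warmup; i++) await runOnce();
  const times = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    await runOnce();
    times.push(performance.now() - t0);
  }
  return times.reduce((a, b) => a + b, 0) / runs; // mean ms per run
}
```

Averaging over several runs matters especially for WebGPU, where the first dispatch pays one-time shader compilation costs.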
The implementation includes several key optimizations:
- Custom WGSL shaders for core operations
- Efficient memory management and tensor operations
- Optimized attention mechanisms
- Streamlined data pipelining
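To give a feel for what "custom WGSL shaders for core operations" means, here is a naive matmul kernel as it would be embedded in JavaScript. This is a simplified sketch, not one of the repo's actual shaders (those are optimized, e.g. tiled); only the overall shape is representative.

```javascript
// Illustrative WGSL compute kernel: naive matmul C = A × B with A (M×K), B (K×N).
// The real shaders in shaders/ are optimized; this shows the basic structure only.
const matmulWGSL = /* wgsl */ `
@group(0) @binding(0) var<storage, read> A : array<f32>;
@group(0) @binding(1) var<storage, read> B : array<f32>;
@group(0) @binding(2) var<storage, read_write> C : array<f32>;
@group(0) @binding(3) var<uniform> dims : vec3<u32>; // M, K, N

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) gid : vec3<u32>) {
  let M = dims.x; let K = dims.y; let N = dims.z;
  if (gid.x >= N || gid.y >= M) { return; } // guard against over-dispatch
  var acc = 0.0;
  for (var k = 0u; k < K; k = k + 1u) {
    acc = acc + A[gid.y * K + k] * B[k * N + gid.x];
  }
  C[gid.y * N + gid.x] = acc;
}`;
```

Each invocation computes one output element; a tiled version would stage blocks of A and B in workgroup memory to cut global-memory traffic, which is where most of the speedup over naive code comes from.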
- You need a browser with WebGPU support (Chrome Canary, or another modern browser with WebGPU flags enabled)
- Clone the repository:

```bash
git clone https://github.com/neelr/favicon-diffusor.git
cd favicon-diffusor
```

- Run a development server:

```bash
npx http-server
```
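If you're unsure whether your browser qualifies, a quick check in the devtools console tells you. This is a generic WebGPU feature probe, not code from the repo:

```javascript
// Probe for usable WebGPU: the API must be exposed AND an adapter must be available.
// `navigator.gpu` is the standard WebGPU entry point in browsers.
async function webgpuAvailable(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return false;              // API not exposed at all
  const adapter = await nav.gpu.requestAdapter();  // resolves to null if no usable GPU
  return adapter !== null;
}
```

Note that `navigator.gpu` can exist while `requestAdapter()` still returns `null` (e.g. blocklisted drivers), so checking both is worthwhile.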
The project structure includes:

- `dit.py` - PyTorch reference implementation
- `dit.js` - JavaScript implementation
- `shaders/` - WebGPU shader implementations
- `train.py` - Training scripts
- `compile.sh` - Compiles the shaders into a single file
- Various utility and testing files
Open to contributions! `dit.js` and `shaders/shaders.js` are the only files you really need for the demo; the rest are just for training and testing. Those two combined are only ~2k lines of code.
- implement patchify and unpatchify as shaders
- modularize all shaders into separate files
- create benchmarks against other relevant implementations
- add transpose matmul optimization
- implement FlashAttention from scratch
- implement multi-head attention
- try implementing a "next scale prediction" VAR https://arxiv.org/abs/2404.02905
- port over a full stable diffusion checkpoint
- add text latents + possibly conditioning?
- create an easy porting script