handwriter.ttf

✍️ A Handwriting Synthesizer by abusing Harfbuzz WASM Shaper.

🔗 Check more stupid stuff at Harfbuzz-WASM-Fantasy.

Introduction

During the llama.ttf hype a few months ago, I wondered whether the WASM shaper could serve an even crazier purpose, one that fits a font shaper's duty better: synthesizing a font at runtime. As a proof of concept, this project implements a synthesizer that generates and rasterizes a handwriting-style font, backed by a super-lightweight RNN model (~14 MiB).

The project must be run in an application linked against libharfbuzz with the experimental WASM shaper enabled, a condition that no product currently meets. Since building such a library from scratch is not easy, I prebuilt a Docker image, hsfzxjy/harfbuzz-wasm-handwriting-synthesis, which contains both the TTF file and a modified version of gedit.

Usage

You may try out this project with the following steps:

  1. On a Linux system with X11 (WSL is fine), run GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/hsfzxjy/handwriter.ttf;
  2. In directory handwriter.ttf, run make run, which fetches Docker image hsfzxjy/harfbuzz-wasm-handwriting-synthesis and starts the gedit application inside;
  3. Start typing in the pop-up gedit window. Each line should be prefixed by # to trigger the shaper, e.g., type #hello world.

Some strokes might look cursed due to the limitations of the model; appending a space should make them look better.

Watch on Youtube


Technical Details

Algorithm

The project follows Alex Graves's paper Generating Sequences With Recurrent Neural Networks and adopts an RNN model for handwriting synthesis. In short, generation proceeds step by step to produce a series of strokes from the input text: at each step, the model predicts the next pen position given the current one. Afterwards, Bresenham's line algorithm rasterizes the strokes into pixel locations, which are set as the offsets for an array of "black-box" glyphs.
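The rasterization step can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names `bresenham` and `rasterize_strokes` are hypothetical, and the `(dx, dy, end_of_stroke)` input format is an assumption modeled on the pen-offset representation in Graves's paper. Real model outputs would be floating-point values that need rounding first; integers are used here to keep the example small.

```python
from typing import Iterable, List, Tuple

Point = Tuple[int, int]


def bresenham(x0: int, y0: int, x1: int, y1: int) -> List[Point]:
    """Return every integer pixel on the segment (x0, y0) -> (x1, y1), endpoints included."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy  # combined error term, valid for all octants
    pixels: List[Point] = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step along x
            err += dy
            x0 += sx
        if e2 <= dx:  # step along y
            err += dx
            y0 += sy
    return pixels


def rasterize_strokes(
    offsets: Iterable[Tuple[int, int, bool]], start: Point = (0, 0)
) -> List[Point]:
    """Turn relative pen moves into a de-duplicated list of pixel locations.

    Assumed input format: each item is (dx, dy, end_of_stroke). When
    end_of_stroke is True, the pen lifts after reaching that point, so the
    move to the following point is not drawn.
    """
    x, y = start
    drawing = True  # assumption: the pen starts on the paper
    seen = set()
    pixels: List[Point] = []
    for dx, dy, end_of_stroke in offsets:
        nx, ny = x + dx, y + dy
        if drawing:
            for p in bresenham(x, y, nx, ny):
                if p not in seen:
                    seen.add(p)
                    pixels.append(p)
        x, y = nx, ny
        drawing = not end_of_stroke
    return pixels


if __name__ == "__main__":
    moves = [(3, 1, False), (0, 2, True), (5, 0, False), (1, 0, False)]
    print(rasterize_strokes(moves))
```

Each resulting `(x, y)` pair would then correspond to the offset of one "black-box" glyph, in the sense described above.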

I also tried some more recent models, but their runtime latency was too high to be practical.

Performance

The final TTF file is highly optimized, reaching a speed of 0.08 sec/character on an Intel Ultra 125H. The generation time of each text run is proportional to its length, so a 25-character run takes roughly 2 seconds.

The journey to this level of optimization was interesting, and I plan to cover it in later blog posts. Some important notes:

  • Use rten as the inference backend so that neural ops are executed with SIMD instructions.
  • Pre-transpose the RHS of each MatMul to make it column-major, improving performance by ~15%.
  • To run modules containing SIMD instructions, wasm-micro-runtime must be compiled with -DWAMR_BUILD_SIMD=1, and the WASM file must be AOT-compiled by wamrc.
  • Enable specific optimizations in wamrc (--opt-level=3, --enable-segue=i32.load,f32.load,i32.store,f32.store and --enable-tail-call), improving performance by ~55%.
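To make the pre-transposition note concrete, here is a small pure-Python sketch of the idea. It does not use rten's API, and the helper names (`matmul_naive`, `transpose`, `matmul_pretransposed`) are hypothetical. Since model weights are constant, the right-hand matrix can be transposed once ahead of time, so that every output element becomes a dot product of two contiguous rows. Pure Python will not reproduce the ~15% gain; the sketch only shows that the two layouts compute the same result.

```python
from typing import List

Matrix = List[List[float]]


def matmul_naive(a: Matrix, b: Matrix) -> Matrix:
    """A @ B with B stored row-major: the inner loop strides down a column of B."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def transpose(m: Matrix) -> Matrix:
    """One-off layout change; for constant weights this can happen offline."""
    return [list(col) for col in zip(*m)]


def matmul_pretransposed(a: Matrix, b_t: Matrix) -> Matrix:
    """A @ B, where b_t is B already transposed (row j of b_t is column j of B).

    Both operands of every dot product are now contiguous rows, which is the
    access pattern that SIMD loads favor.
    """
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    b_t = transpose(b)  # done once, reused for every inference step
    print(matmul_pretransposed(a, b_t))
```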

License

This project is licensed under the Apache License 2.0. Copyright (c) 2024 hsfzxjy.