
Commit e78f616 (1 parent: 4ec76f3)

chore: Prepare 1.0.0~alpha2

1 file changed: CHANGES.md (31 additions, 4 deletions)

@@ -5,12 +5,35 @@ All notable changes to this project will be documented in this file.
 - Only document user-facing changes (features, bug fixes, performance improvements, API changes, etc.)
 - Add new entries at the top of the appropriate section (most recent first)
 
-## [1.0.0~alpha2] - TBD
+## [1.0.0~alpha2] - 2025-11-03
+
+We're excited to announce the release of Raven 1.0.0~alpha2! Less than a month after alpha1, this release notably includes contributions from Outreachy applicants in preparation for the _two_ upcoming internships.
+
+Some highlights from this release include:
+
+- NumPy-compatible text I/O with `Nx_io.{save,load}_txt`.
+- Lots of new functions in Nx/Rune, including neural-net operations such as `dropout`, `log_softmax`, `batch_norm`, and `layer_norm`, activation functions such as `celu`, and generic ones such as `conjugate`, `index_put`, and more.
+- Addition of `.top` libraries for `nx`, `rune`, and `hugin` that auto-install pretty-printers in the OCaml toplevel. You can run e.g. `#require "nx.top"`.
+- Addition of a visualization API in Fehu via the new `fehu.visualize` library, supporting video recording.
+- Redesign of Kaun's core data structures and checkpointing subsystem for complete snapshotting.
+- Many, many bug fixes and correctness improvements.
+
+We've also made numerous performance improvements across the board:
+
+- Nx elementwise ops: 5–50× faster (e.g., Add 50×50 f32 88.81 µs → 1.83 µs, **48×**; Mul 100×100 f32 78.51 µs → 2.41 µs, **33×**).
+- Nx conv2d: **4–5×** faster on common shapes; up to **115×** on heavy f64 batched cases (e.g., B16 C64→128 16×16 K3 f64 1.61 s → 13.96 ms).
+- Rune autodiff: **1.2–3.7×** faster on core grads (e.g., MatMulGrad Medium 34.04 ms → 11.91 ms, **2.86×**; Large 190.19 ms → 50.97 ms, **3.73×**).
+- Talon dataframes: big wins in joins and group-bys (Join 805.35 ms → 26.10 ms, **31×**; Group-by 170.80 ms → 19.03 ms, **9×**; Filter 9.93 ms → 3.39 ms, **3×**).
+- Saga tokenizers: realistic workloads **4–17%** faster (e.g., WordPiece encode single 136.05 µs → 115.92 µs, **1.17×**; BPE batch_32 24.52 ms → 22.27 ms, **1.10×**).
+
+We're closing 8 user-reported issues or feature requests, with a total of 30 contributions from 15 unique contributors.
 
 ### Nx
 
-- Add `Nx_core.Cache_dir` module with consolidated cache directory utilities respecting `RAVEN_CACHE_ROOT`, `XDG_CACHE_HOME`, and `HOME` fallback, replacing project-specific cache logic across the whole raven ecosystem (#133, @Arsalaan-Alam)
+- Add `Nx_io.Cache_dir` module with consolidated cache directory utilities respecting `RAVEN_CACHE_ROOT`, `XDG_CACHE_HOME`, and `HOME` fallback, replacing project-specific cache logic across the whole raven ecosystem (#134, @Arsalaan-Alam)
 - Add `Nx_io.save_txt` / `Nx_io.load_txt` with NumPy-compatible formatting, comments, and dtype support (#120, @six-shot)
+- Optimize `multi_dot` for matrix chains, reducing intermediate allocations and improving performance (@tmattio)
+- Add public `index_put` function for indexed updates (@tmattio)
 - Clarify `reshape` documentation to match its view-only semantics (@tmattio)
 - Provide `nx.top`, `rune.top`, and `hugin.top` libraries that auto-install pretty printers in the OCaml toplevel and update Quill to load them (@tmattio)
 - Add `ifill` for explicit in-place fills and make `fill` return a copied tensor (@tmattio)
@@ -19,7 +42,7 @@ All notable changes to this project will be documented in this file.
 - Speed up float reductions with contiguous multi-axis fast paths (@tmattio)
 - Fast-path padding-free `unfold` to lower conv2d overhead (@tmattio)
 - Move neural-network operations (softmax, log_softmax, relu, gelu, silu, sigmoid, tanh) from Kaun to Nx (@tmattio)
-- Add public `conjugate` function for complex number conjugation (#123, @Arsalaan-Alam)
+- Add public `conjugate` function for complex number conjugation (#125, @Arsalaan-Alam)
 - Fix complex vdot to conjugate first tensor before multiplication, ensuring correct mathematical behavior (#123, @Arsalaan-Alam)
 - Update comparison and conditional operations to use boolean tensors (#115, @nirnayroy)
 - Add support for rcond parameter and underdetermined systems to `lstsq` (#102, @Shocker444)
@@ -52,18 +75,22 @@ All notable changes to this project will be documented in this file.
 ### Kaun
 
 - Added Similarity and Polysemy analysis to the BERT example (#137, @nirnayroy)
+- Support attention masks via the new `Kaun.Attention` module (@tmattio)
+- Support loading sharded Hugging Face safetensors (@tmattio)
+- Fix BERT and GPT-2 model loading (@tmattio)
 - API simplification: removed type parameters from public types; `Ptree` now supports mixed-dtype trees via packed tensors with typed getters. (@tmattio)
 - Checkpointing overhaul: versioned `Train_state` with schema tagging, explicit `Checkpoint.{Snapshot,Artifact,Manifest,Repository}` (retention, tags, metadata), and simple save/load helpers for snapshots and params. (@tmattio)
 - Overhaul dataset combinators: derive tensor specs from Rune dtype, fix sampling/window bugs, validate weighted sampling, and respect `drop_remainder` (@tmattio)
 - Make dataset `prefetch` truly asynchronous with background domains and allow reusing an external Domainslib pool via `parallel_map ~pool` (@tmattio)
+- Use `Dataset.iter` for epoch batches to reduce overhead (@tmattio)
 - Update BERT and GPT-2 tokenizer cache to use `Nx.Cache` for consistent cache directory resolution (#133, @Arsalaan-Alam)
 - Honor text dataset encodings via incremental Uutf decoding (#122, @Satarupa22-SD).
 - Preserve empty sequential modules when unflattening so indices stay aligned for checkpoint round-tripping (@tmattio)
 - Prevent `Training.fit`/`evaluate` from consuming entire datasets eagerly and fail fast when a dataset yields no batches, avoiding hangs and division-by-zero crashes (@tmattio)
 - Allow metric history to tolerate metrics that appear or disappear between epochs so dynamic metric sets no longer raise during training (@tmattio)
 - Make `Optimizer.clip_by_global_norm` robust to zero gradients and empty parameter trees to avoid NaNs during training (@tmattio)
 - Split CSV loader into `from_csv` and `from_csv_with_labels` to retain labels when requested (#114, @Satarupa22-SD)
-- Implement AUC-ROC and AUC-PR in Kaun metrics and simplify their signatures (#109, #131, @Shocker444)
+- Implement AUC-ROC and AUC-PR in Kaun metrics and simplify their signatures (#124, #131, @Shocker444)
 - Add mean absolute percentage error, explained variance, R² (with optional adjustment), KL-divergence, and top-k accuracy to Kaun metrics (@tmattio)
 - Add NDCG, MAP, and MRR ranking metrics to Kaun metrics (@tmattio)
 - Add BLEU, ROUGE, and METEOR metrics to Kaun for pre-tokenized sequences, removing tokenizer dependencies (@tmattio)
