CHANGES.md
31 additions & 4 deletions
@@ -5,12 +5,35 @@ All notable changes to this project will be documented in this file.
5
5
- Only document user-facing changes (features, bug fixes, performance improvements, API changes, etc.)
6
6
- Add new entries at the top of the appropriate section (most recent first)
7
7
8
-
## [1.0.0~alpha2] - TBD
8
+
## [1.0.0~alpha2] - 2025-11-03
9
+
10
+
We're excited to announce the release of Raven 1.0.0~alpha2! Coming less than a month after alpha1, this release notably includes contributions from Outreachy applicants in preparation for the _two_ upcoming internships.
11
+
12
+
Some highlights from this release include:
13
+
14
+
- NumPy-compatible text I/O with `Nx_io.{save,load}_txt`
15
+
- Many new functions in Nx/Rune: neural-network operations such as `dropout`, `log_softmax`, `batch_norm`, and `layer_norm`; activation functions such as `celu`; and generic functions such as `conjugate`, `index_put`, and more.
16
+
- Addition of `.top` libraries for `nx`, `rune`, and `hugin` that auto-install pretty-printers in the OCaml toplevel. For example, run `#require "nx.top"`.
17
+
- Addition of a visualization API in Fehu via the new `fehu.visualize` library, supporting video recording.
18
+
- Redesign of Kaun's core data structure and checkpointing subsystem for complete snapshotting.
19
+
- Many, many bug fixes and correctness improvements.
20
+
21
+
We've also made numerous performance improvements across the board:
- Nx conv2d: **4–5×** faster on common shapes; up to **115×** on heavy f64 batched cases (e.g., B16 C64→128 16×16 K3 f64 1.61 s → 13.96 ms).
25
+
- Rune autodiff: **1.2–3.7×** faster on core grads (e.g., MatMulGrad Medium 34.04 ms → 11.91 ms, **2.86×**; Large 190.19 ms → 50.97 ms, **3.73×**).
26
+
- Talon dataframes: big wins in joins and group-bys (Join 805.35 ms → 26.10 ms, **31×**; Group-by 170.80 ms → 19.03 ms, **9×**; Filter 9.93 ms → 3.39 ms, **3×**).
This release closes 8 user-reported issues or feature requests and totals 30 contributions from 15 unique contributors.
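
The speedup factors quoted in the performance notes above are ratios of the "before" timing to the "after" timing. As a quick sanity check, the following sketch recomputes them from the figures listed in this changelog. The dictionary keys and the `speedup` helper are illustrative names, not part of Raven, and the conv2d "before" time of 1.61 s is converted to milliseconds.

```python
# Before/after timings in milliseconds, copied from the performance notes.
# The labels are illustrative; they are not benchmark identifiers from Raven.
timings_ms = {
    "conv2d B16 C64->128 16x16 K3 f64": (1610.0, 13.96),  # 1.61 s -> 13.96 ms
    "MatMulGrad Medium": (34.04, 11.91),
    "MatMulGrad Large": (190.19, 50.97),
    "Talon join": (805.35, 26.10),
    "Talon group-by": (170.80, 19.03),
    "Talon filter": (9.93, 3.39),
}


def speedup(before_ms, after_ms):
    """Return how many times faster the new timing is than the old one."""
    return before_ms / after_ms


speedups = {name: speedup(b, a) for name, (b, a) in timings_ms.items()}

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {factor:.2f}x")
```

The recomputed ratios agree with the rounded factors given in the list (115×, 2.86×, 3.73×, 31×, 9×, 3×).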
9
30
10
31
### Nx
11
32
12
-
- Add `Nx_core.Cache_dir` module with consolidated cache directory utilities respecting `RAVEN_CACHE_ROOT`, `XDG_CACHE_HOME`, and `HOME` fallback, replacing project-specific cache logic across the whole raven ecosystem (#133, @Arsalaan-Alam)
33
+
- Add `Nx_core.Cache_dir` module with consolidated cache directory utilities respecting `RAVEN_CACHE_ROOT`, `XDG_CACHE_HOME`, and `HOME` fallback, replacing project-specific cache logic across the whole raven ecosystem (#134, @Arsalaan-Alam)
13
34
- Add `Nx_io.save_txt` / `Nx_io.load_txt` with NumPy-compatible formatting, comments, and dtype support (#120, @six-shot)
35
+
- Optimize `multi_dot` for matrix chains, reducing intermediate allocations and improving performance (@tmattio)
36
+
- Add public `index_put` function for indexed updates (@tmattio)
14
37
- Clarify `reshape` documentation to match its view-only semantics (@tmattio)
15
38
- Provide `nx.top`, `rune.top`, and `hugin.top` libraries that auto-install pretty printers in the OCaml toplevel and update Quill to load them (@tmattio)
16
39
- Add `ifill` for explicit in-place fills and make `fill` return a copied tensor (@tmattio)
@@ -19,7 +42,7 @@ All notable changes to this project will be documented in this file.
19
42
- Speed up float reductions with contiguous multi-axis fast paths (@tmattio)
20
43
- Fast-path padding-free `unfold` to lower conv2d overhead (@tmattio)
21
44
- Move neural-network operations (softmax, log_softmax, relu, gelu, silu, sigmoid, tanh) from Kaun to Nx (@tmattio)
22
-
- Add public `conjugate` function for complex number conjugation (#123, @Arsalaan-Alam)
45
+
- Add public `conjugate` function for complex number conjugation (#125, @Arsalaan-Alam)
23
46
- Fix complex vdot to conjugate first tensor before multiplication, ensuring correct mathematical behavior (#123, @Arsalaan-Alam)
24
47
- Update comparison and conditional operations to use boolean tensors (#115, @nirnayroy)
25
48
- Add support for rcond parameter and underdetermined systems to `lstsq` (#102, @Shocker444)
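
The `vdot` fix listed above concerns the convention that a complex dot product conjugates its first argument, which is also what NumPy's `vdot` does. The sketch below illustrates that convention in plain Python on lists of complex numbers. It is not the Nx implementation, and the function and variable names are chosen here for illustration only.

```python
def vdot(a, b):
    """Complex dot product that conjugates the first argument.

    Computes sum(conj(a[i]) * b[i]), so vdot(a, a) is the squared norm of a
    and has no imaginary part.
    """
    if len(a) != len(b):
        raise ValueError("vdot expects vectors of equal length")
    return sum(x.conjugate() * y for x, y in zip(a, b))


a = [1 + 2j, 3 - 1j]
b = [2 - 1j, 1j]

if __name__ == "__main__":
    print(vdot(a, b))
    print(vdot(a, a))
```

Without the conjugation, `vdot(a, a)` would in general be a complex number rather than the real squared norm, which is why the order of conjugation matters.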
@@ -52,18 +75,22 @@ All notable changes to this project will be documented in this file.
52
75
### Kaun
53
76
54
77
- Add similarity and polysemy analysis to the BERT example (#137, @nirnayroy)
78
+
- Support attention masks via the new `Kaun.Attention` module (@tmattio)
79
+
- Support loading sharded Hugging Face safetensors (@tmattio)
80
+
- Fix BERT and GPT‑2 model loading (@tmattio)
55
81
- API simplification: removed type parameters from public types; `Ptree` now supports mixed‑dtype trees via packed tensors with typed getters. (@tmattio)
56
82
- Checkpointing overhaul: versioned `Train_state` with schema tagging, explicit `Checkpoint.{Snapshot,Artifact,Manifest,Repository}` (retention, tags, metadata), and simple save/load helpers for snapshots and params. (@tmattio)
57
83
- Overhaul dataset combinators: derive tensor specs from Rune dtype, fix sampling/window bugs, validate weighted sampling, and respect `drop_remainder` (@tmattio)
58
84
- Make dataset `prefetch` truly asynchronous with background domains and allow reusing an external Domainslib pool via `parallel_map ~pool` (@tmattio)
85
+
- Use `Dataset.iter` for epoch batches to reduce overhead (@tmattio)
59
86
- Update BERT and GPT-2 tokenizer cache to use `Nx.Cache` for consistent cache directory resolution (#133, @Arsalaan-Alam)
60
87
- Honor text dataset encodings via incremental Uutf decoding (#122, @Satarupa22-SD)
61
88
- Preserve empty sequential modules when unflattening so indices stay aligned for checkpoint round-tripping (@tmattio)
62
89
- Prevent `Training.fit`/`evaluate` from consuming entire datasets eagerly and fail fast when a dataset yields no batches, avoiding hangs and division-by-zero crashes (@tmattio)
63
90
- Allow metric history to tolerate metrics that appear or disappear between epochs so dynamic metric sets no longer raise during training (@tmattio)
64
91
- Make `Optimizer.clip_by_global_norm` robust to zero gradients and empty parameter trees to avoid NaNs during training (@tmattio)
65
92
- Split CSV loader into `from_csv` and `from_csv_with_labels` to retain labels when requested (#114, @Satarupa22-SD)
66
-
- Implement AUC-ROC and AUC-PR in Kaun metrics and simplify their signatures (#109, #131, @Shocker444)
93
+
- Implement AUC-ROC and AUC-PR in Kaun metrics and simplify their signatures (#124, #131, @Shocker444)
67
94
- Add mean absolute percentage error, explained variance, R² (with optional adjustment), KL-divergence, and top-k accuracy to Kaun metrics (@tmattio)
68
95
- Add NDCG, MAP, and MRR ranking metrics to Kaun metrics (@tmattio)
69
96
- Add BLEU, ROUGE, and METEOR metrics to Kaun for pre-tokenized sequences, removing tokenizer dependencies (@tmattio)
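
The `Optimizer.clip_by_global_norm` entry above mentions robustness to zero gradients and empty parameter trees. The changelog does not describe how Kaun implements this, so the following is only a minimal sketch of one common guard, written in plain Python with gradients modelled as a list of lists of floats. The representation and the early-return strategy are assumptions made for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so that their joint L2 norm is at most max_norm.

    `grads` is a list of flat gradient vectors (a stand-in for a parameter
    tree). A new list is returned; the input is left untouched.
    """
    flat = [g for tensor in grads for g in tensor]
    global_norm = math.sqrt(sum(g * g for g in flat))
    # An empty tree or all-zero gradients give a norm of 0. Return a copy
    # unchanged instead of dividing by that norm.
    if global_norm == 0.0 or global_norm <= max_norm:
        return [list(tensor) for tensor in grads]
    scale = max_norm / global_norm
    return [[g * scale for g in tensor] for tensor in grads]


if __name__ == "__main__":
    print(clip_by_global_norm([[3.0, 4.0]], 1.0))
    print(clip_by_global_norm([[0.0, 0.0]], 1.0))
    print(clip_by_global_norm([], 1.0))
```

Handling the zero-norm case before computing the scale factor keeps the division well defined, which is the kind of situation the fix above is described as addressing.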