Tags: ruvnet/RuVector
style(sparse-attn): cargo fmt over crate sources after no_std refactor

Co-Authored-By: claude-flow <ruv@ruv.net>
ADR-183 Tier 1-3 complete: vitals worker + CSI LoRA embedder + CPU encoder path

- Separability ratio 4.515× (target ≥2×): PASS
- p99 embed latency 0.002ms (target <12ms): PASS
- Smoke test 38/38 PASS
- All cluster nodes deployed: cognitum-v0/cluster-1/2/3
- Release: cognitum-one/v0-appliance v0.1.0-csi-lora
ruvllm Pi 5 + Hailo HAT cluster — ADR-179 SOTA
ruvector-hailo backend v0.2.0 (iter 227)

Branch hailo-backend snapshot at iter 227. NPU acceleration is the production default since iter 163 (~70 embeds/sec/worker, p50=55-57 ms, 9.6× over cpu-fallback on cognitum-v0 / Pi 5 + AI HAT+).

Two arcs of work since v0.1.0-iter156b:

Arc 1: security/hardening (iters 174-213)
- 8-layer DoS gate stack (byte caps, stream cap, RPC timeout, CVE-2023-44487, keepalive, batch cap, rate-limit per-item)
- HEF sha256 pin
- SEGV-on-shutdown fix
- HailoRT FFI 2s timeout
- retry short-circuit on terminal errors
- OOM-bounded operator-path file reads
- client TLS flag plumbing
- fakeworker DoS-gate parity

Arc 2: ADR-178 integration gap analysis (iters 215-227)
- EmbeddingProvider impl on both hailo embedders + workspace rejoin (Gap B)
- ruvllm-bridge deploy artifacts (Gap A)
- csi-bridge docs disambiguation (Gap C short-term)
- hailo-cluster-as-provider example (Gap D short-term)
- ADR-167 stale-stratigraphy collapsed (Gap F)
- install-bridge.sh → install-mmwave-bridge.sh rename (Gap H)

Verification:
- cargo build --workspace --release: clean (3m49s)
- cargo check --workspace: clean
- cluster lib + integration tests --features tls: 23 suites green
- hailo lib tests: 21 default + 22 cpu-fallback + 7 tokenizer pass
- cargo deny check on both crates: advisories/bans/licenses/sources ok
- cargo audit --deny warnings (with iter-224 ignores): exit 0
- Pi cognitum-v0 deployed + bit-identical embed verified (vec_head=0.0181,-0.0220,0.0451,0.0159 unchanged)

Co-Authored-By: claude-flow <ruv@ruv.net>
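The "retry short-circuit on terminal errors" item in Arc 1 can be illustrated with a minimal sketch. This is not the crate's actual code: `CallError`, its two variants, and `retry_with_short_circuit` are hypothetical names chosen for illustration, and how the real backend classifies an error as terminal is not stated in these notes.

```rust
/// Hypothetical error type: the real crate's error classification may differ.
#[derive(Debug, PartialEq)]
pub enum CallError {
    /// Worth retrying (e.g. a timeout or a briefly unavailable worker).
    Transient(String),
    /// Retrying cannot help (e.g. a rejected or malformed request).
    Terminal(String),
}

/// Calls `op` up to `max_attempts` times, passing the 1-based attempt number.
/// Returns on the first success, stops immediately on a terminal error,
/// and otherwise returns the last transient error once attempts run out.
pub fn retry_with_short_circuit<T, F>(max_attempts: u32, mut op: F) -> Result<T, CallError>
where
    F: FnMut(u32) -> Result<T, CallError>,
{
    // Only returned as-is when max_attempts == 0 and `op` is never called.
    let mut last = CallError::Terminal("max_attempts was zero".to_string());
    for attempt in 1..=max_attempts {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Short-circuit: do not spend the remaining attempts.
            Err(e @ CallError::Terminal(_)) => return Err(e),
            // Remember the transient failure and try again.
            Err(e) => last = e,
        }
    }
    Err(last)
}
```

With this shape, a request that fails terminally costs one call instead of `max_attempts` calls, which bounds the wasted work per bad request. A production version would presumably add backoff between transient retries; that is omitted here.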
perf(hailo): cache + NPU bench, 15.86M embeds/sec on cache hits (iter 168)

Iter-165 leftover #9 closed. Re-ran cluster-bench against the same Pi 5 NPU worker, this time exercising the iter-108 LRU cache at the cluster coordinator:

  cold  (unique keys):                70.2 embeds/sec      p50=56ms
  mixed (keyspace=2048, cache=1024):  74.7 embeds/sec      p50=55ms   hit=5.9%
  hot   (keyspace=32, cache=1024):    15.86 M embeds/sec   p50<1µs    hit=100%

The hot-path 15.86M figure is real: the cluster coordinator returns already-served vectors in-process without touching the gRPC stack or the NPU. For repeat-text workloads (RAG over a stable corpus, ruvllm context prefix sharing, search query autocomplete) this is the actual throughput an application sees.

Even at 5.9% hit rate (mostly-unique workload) the cache adds a small ~6% throughput improvement. The operator-facing recommendation is to enable --cache=N at any deploy where the same texts are embedded more than once.

ADR-176 status table + measurements section updated with the three-row bench. Pi worker stopped post-bench; the iter-156b HEF stays at /var/lib/ruvector-hailo/models/all-minilm-l6-v2/model.hef ready for the next start.

Co-Authored-By: claude-flow <ruv@ruv.net>
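To show why a cache hit can skip the gRPC stack and the NPU entirely, here is a minimal sketch of an LRU cache placed in front of an embed call. It is an illustration under assumptions, not the coordinator's implementation: `EmbedCache`, `get_or_embed`, and `hit_rate` are hypothetical names, the `embed` closure stands in for the remote worker call, and the linear-scan recency list is chosen for brevity rather than speed.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical coordinator-side cache keyed by input text.
pub struct EmbedCache {
    capacity: usize,
    map: HashMap<String, Vec<f32>>,
    /// Recency order: front is least recently used, back is most recent.
    order: VecDeque<String>,
    pub hits: u64,
    pub misses: u64,
}

impl EmbedCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new(), hits: 0, misses: 0 }
    }

    /// Returns the cached vector if present; otherwise calls `embed`
    /// (the stand-in for the worker round trip) and stores the result.
    pub fn get_or_embed<F>(&mut self, text: &str, embed: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        if let Some(cached) = self.map.get(text) {
            let cached = cached.clone();
            self.touch(text);
            self.hits += 1;
            return cached; // hit: `embed` is never invoked
        }
        self.misses += 1;
        let fresh = embed(text);
        if self.capacity == 0 {
            return fresh; // caching disabled
        }
        if self.map.len() >= self.capacity {
            // Evict the least recently used entry to make room.
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.map.insert(text.to_string(), fresh.clone());
        self.order.push_back(text.to_string());
        fresh
    }

    /// Moves `text` to the most-recently-used position (O(n) scan).
    fn touch(&mut self, text: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == text) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }

    /// Fraction of lookups served from the cache so far.
    pub fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}
```

On a hit the only work is a hash lookup and a vector clone, which is consistent with the hot row being orders of magnitude faster than the cold row. The mixed row's gain follows from the reported figures: 74.7 / 70.2 ≈ 1.064, i.e. roughly the ~6% quoted above.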
ruvllm-esp32 v0.3.0-rc1 — ADR-165+166 (workflow + VFS fixes)