feat(dash,pve): MemoryMap redesign — GTT-first, Proxmox-aware, sidebar + hardware page #370
Merged
Adds PveDetectionState enum + helper that combines two cheap /proc signals (kernel '-pve' tag + LXC cgroup shape). Returns DETECTED when both fire, UNCERTAIN on one, NOT_DETECTED otherwise. Never raises. Used in PR2 to surface a configure-Proxmox nudge in the memory-map widget when /etc/hal0/proxmox.json is missing on a hosted LXC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
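A minimal sketch of the two-signal detection described above, for illustration only. The names `PveDetectionState` and `detect_proxmox_host` come from the commit message; the enum values, the `_read_text` helper, the injectable path parameters, and the exact substring checks (`-pve` in `/proc/version`, `/lxc/` in `/proc/1/cgroup`) are assumptions, not the merged implementation.

```python
from enum import Enum
from pathlib import Path


class PveDetectionState(Enum):
    # Values are assumed; the PR only shows "uncertain" in an API response.
    DETECTED = "detected"
    UNCERTAIN = "uncertain"
    NOT_DETECTED = "not_detected"


def _read_text(path: Path) -> str:
    """Hypothetical helper: return file contents, or '' if unreadable."""
    try:
        # errors="replace" keeps binary content from raising a decode error.
        return path.read_text(errors="replace")
    except OSError:
        # Covers missing files and permission errors alike.
        return ""


def detect_proxmox_host(
    version_path: Path = Path("/proc/version"),
    cgroup_path: Path = Path("/proc/1/cgroup"),
) -> PveDetectionState:
    """Combine two cheap /proc signals; never raises."""
    kernel_is_pve = "-pve" in _read_text(version_path)   # kernel tag signal
    cgroup_is_lxc = "/lxc/" in _read_text(cgroup_path)   # assumed cgroup shape
    hits = int(kernel_is_pve) + int(cgroup_is_lxc)
    if hits == 2:
        return PveDetectionState.DETECTED
    if hits == 1:
        return PveDetectionState.UNCERTAIN
    return PveDetectionState.NOT_DETECTED
```

The path parameters exist only so the sketch can be exercised against temporary files instead of the real `/proc`.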
- Export detect_proxmox_host + PveDetectionState in __all__.
- Add test_never_raises_on_permission_error covering the OSError branch that the prior test's binary-content path didn't reach.
- Correct docstring wording (signals are 'strong' + 'medium', not 'two strong'); replaced with neutral 'both signals present'.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When /etc/hal0/proxmox.json is missing, the host block now carries `detected: true/false` (+ a one-line hint when DETECTED) so the dashboard MemoryMap can render a non-blocking 'Configure Proxmox →' band instead of staying silent on hosted LXCs. Shape stays backwards-compatible: `host.configured: false` still holds; old clients that don't read the new keys see no change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the third test in TestHostDetectionInStatsHardware confirming the configured: true pre-detection branch is untouched by the new code. Also guards against accidental detect_proxmox_host() calls in the configured path (raises AssertionError if reached). Closes spec-reviewer gap noted on bef248a. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- UNCERTAIN now also produces detected:true (matches pve.py docstring intent: UI nudges on both DETECTED and UNCERTAIN; only the detection field distinguishes them).
- Rename intermediate dict to host_block to avoid reusing 'slim' for two semantically different shapes.
- Extract _PVE_CONFIGURE_HINT constant so the test asserts against the same string the route emits.
- Add UNCERTAIN integration test (4th in TestHostDetectionInStatsHardware).
- Tighten configured-pass-through test to assert == project_slim(full) so additive regressions on the slim shape are caught.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the live-counter sibling of useHardware (static probe). Polls at 2.5s and surfaces gtt_used_mb, npu_status, and the host.* Proxmox block. Consumed by MemoryMap in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove tenants?: from StatsHardwareHost, since pve.project_slim() strips it before the response reaches /api/stats/hardware. Future expanded MemoryMap pulls tenants from /api/settings/proxmox via a separate hook (Task 5).
- Add per_upstream + upstream_names to StatsHardware (always emitted by the route's response builder).
- Align queryKey with useLemonade convention: ['stats', 'hardware'].
StatsHardwareTenant kept and exported for reuse by the future settings-shape hook; docblock explains where it actually appears.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the normalising hook for the MemoryMap component (renderer lands in Task 6). Fans in /api/hardware (static probe), /api/stats/hardware (live counters), /api/slots, and /api/settings/proxmox (for tenants[] which the stats endpoint slims out). Per-slot attribution: NPU shares from npu_status.model_mb evenly; GPU shares from gtt_used_mb, weighted by registry footprint when known. CPU slots self-report. Other RAM = ram_used - sum(cpu shares). Headroom = min(pool, host) - 2 GB safety margin; labelled by binding constraint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
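The per-slot attribution rules above can be sketched as follows. This is a Python illustration of the arithmetic only; the real logic lives in the `useMemoryMapModel` hook. The `Slot` dataclass, its field names, the function names, the even-split fallback when any GPU footprint is unknown, and the clamp at zero are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slot:
    # Hypothetical shape; the real slot objects come from /api/slots.
    name: str
    device: str                           # "npu", "gpu", or "cpu"
    footprint_mb: Optional[float] = None  # registry footprint, if known
    self_reported_mb: float = 0.0         # CPU slots self-report


def attribute_shares(slots: List[Slot], npu_model_mb: float,
                     gtt_used_mb: float) -> Dict[str, float]:
    shares: Dict[str, float] = {}

    # NPU: split npu_status.model_mb evenly across NPU slots.
    npu = [s for s in slots if s.device == "npu"]
    for s in npu:
        shares[s.name] = npu_model_mb / len(npu)

    # GPU: split gtt_used_mb, weighted by footprint when every one is known,
    # otherwise evenly (the fallback is an assumption).
    gpu = [s for s in slots if s.device == "gpu"]
    if gpu:
        known = all(s.footprint_mb is not None and s.footprint_mb > 0 for s in gpu)
        weights = [s.footprint_mb if known else 1.0 for s in gpu]
        total = sum(weights)
        for s, w in zip(gpu, weights):
            shares[s.name] = gtt_used_mb * w / total

    # CPU: take the slot's own number as-is.
    for s in slots:
        if s.device == "cpu":
            shares[s.name] = s.self_reported_mb
    return shares


def other_ram_mb(ram_used_mb: float, slots: List[Slot],
                 shares: Dict[str, float]) -> float:
    """Other RAM = ram_used - sum(cpu shares), clamped at zero (assumption)."""
    cpu_total = sum(shares[s.name] for s in slots if s.device == "cpu")
    return max(0.0, ram_used_mb - cpu_total)
```

For example, two GPU slots with footprints of 6000 MB and 2000 MB sharing 4000 MB of GTT would be attributed 3000 MB and 1000 MB respectively.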
- C1: drop ramTotalGb double-conversion. useHardware already returns ram.total in GB; the MB_PER_GB round-trip was a misleading no-op.
- I1: replace unifiedGb's wrong ram_used_mb fallback with ram_total_mb (which /api/stats/hardware already emits); StatsHardware interface updated to type the new field.
- I2: include pveSettings.isLoading in the model's loading flag so the renderer doesn't flash 'no tenants' on cold page loads.
- M2: delete dead StatsHardwareTenant export (useProxmoxSettings defines its own superset shape).
- M1: comment the stats-vs-settings cadence mismatch in the host block builder.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure render layer over useMemoryMapModel. Sidebar variant matches the existing side-card chrome; expanded variant uses the .card pattern with a two-tier bar (host pool + inside-LXC) and a full legend. Headroom callout names the binding constraint; PVE nudge appears when detected_unconfigured. Component lands unwired — Tasks 9-11 swap in the consumers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
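The headroom callout's rule, as stated in the model commit (min of pool and host, minus a 2 GB safety margin, labelled by the binding constraint), can be sketched in Python as below. The function name, the `None` convention for "no host data", the tie-break toward `pool`, and the clamp at zero are assumptions for illustration.

```python
from typing import Optional, Tuple

SAFETY_MARGIN_GB = 2.0  # safety margin stated in the commit message


def headroom(pool_free_gb: float,
             host_free_gb: Optional[float]) -> Tuple[float, str]:
    """Return (headroom_gb, binding_constraint).

    host_free_gb is None when no Proxmox host data is available
    (assumption), in which case the pool is the only constraint.
    """
    if host_free_gb is None or pool_free_gb <= host_free_gb:
        limit, binding = pool_free_gb, "pool"
    else:
        limit, binding = host_free_gb, "host"
    # Clamp so the callout never shows negative headroom (assumption).
    return max(0.0, limit - SAFETY_MARGIN_GB), binding
```

With 10 GB free in the pool but only 5 GB free on the host, this yields 3 GB of headroom labelled `host`.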
- Fix dead PveNudge link: #settings/proxmox -> #settings (real route
has no sub-path parser; settings view manages internal section).
- Add --mem-tenant-{1,2,3} CSS tokens to :root so the host bar's
tenant segment renders with the intended amber-grey rather than
silently falling back to var(--fg-5).
- data-loading attribute on sidebar variant root (was only on
expanded), so future loading-shimmer rules apply to both.
- Filter the self-LXC out of the expanded variant's tenants legend
by matching tenant name against Hardware.name (hostname). Avoids
double-counting hal0's own LXC alongside selfShareGb.
- Move PveNudge inline styles into .memmap-pve-nudge CSS rule.
- Add scoped .memmap .dim / .memmap-expanded .dim utility rule plus
.memmap-legend-sub spacing — removes inline marginLeft on LegendRow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
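The self-LXC filter from this commit amounts to dropping the tenant whose name equals the local hostname. A small Python sketch of that idea follows; the function name `visible_tenants`, the dict shape of a tenant, and the exact-match comparison are assumptions, and the real filter lives in the JSX component.

```python
from typing import Dict, List


def visible_tenants(tenants: List[Dict], self_hostname: str) -> List[Dict]:
    """Drop hal0's own LXC so it is not counted alongside the self share."""
    # Exact, case-sensitive name match is an assumption.
    return [t for t in tenants if t.get("name") != self_hostname]
```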
Covers the three host modes (off / detected_unconfigured / configured), the binding-constraint headroom label (pool / host), and the expanded variant's host pool + tenants legend. All describes are .skip pending the wire-up commits (Tasks 9-11) that mount MemoryMap into the actual dashboard routes. The mocks are in place; unskip when the consumers land. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deletes the inline MemoryMap that lived in dashboard.jsx and imports the shared component from ./memory-map. The new component reads from useSlots() / useHardware() / useStatsHardware() directly — no slots prop needed. Visually equivalent in the off/detected_unconfigured host modes; configured mode now surfaces host pressure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the two <MemoryMap slots={slots} /> call-sites in slots.jsx
(L713, L805) with the shared component. The slots prop is dropped —
the new MemoryMap pulls from useSlots() directly.
Resolves a latent break left by Task 9: dashboard.jsx no longer
exports MemoryMap to window, so slots.jsx's bare reference would
have resolved to the new memory-map.jsx window export by accident.
This makes the dependency explicit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Renders the full-width MemoryMap (host pool + inside-LXC two-tier bar, sortable legend, headroom callout) below HardwareSection in the main column of /dashboard. Unskips the memory-map-v3 spec — all six tests green against the new wire-up. Fix: headroom selector tests scoped to .memmap-sidebar to avoid strict mode violation now that both sidebar and expanded variants render .memmap-headroom on the same page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…describes
The .skip was removed in 43d481c when Task 11 wired the consumers; the describe titles still carried the gating hint. Cosmetic only; tests already running green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 28, 2026
Summary
Rewrite of the dashboard's Memory map so it works correctly on Strix Halo UMA and surfaces Proxmox host pressure.
Before: read only `/api/hardware` (RAM total/used) + summed each slot's self-reported `metrics.mem`. On UMA the model bytes live in GTT, not RAM: a 6 GB ROCm slot moved the bar by ~0 GB. Silent on Proxmox host pressure.
After: single `<MemoryMap variant="sidebar|expanded" />` driven by `useMemoryMapModel`. GTT-first attribution. Proxmox auto-detected when `/etc/hal0/proxmox.json` is missing. Two-tier host bar + tenants legend on the hardware page when configured.
What changed
Backend (pve / api):
- `pve.detect_proxmox_host()`: best-effort LXC-on-PVE detection from `/proc/version` + `/proc/1/cgroup`. Never raises.
- `/api/stats/hardware` `host` block now carries `{detected, detection, hint}` when unconfigured-but-detected. Configured path unchanged.
Frontend (UI):
- `useStatsHardware()` (2.5s, live counters incl. `gtt_used_mb`, `npu_status.model_mb`, `host.*`) and `useProxmoxSettings()` (10s, full tenants[] for the expanded legend that the slim stats endpoint strips).
- `ui/src/dash/memory-map.jsx`: `useMemoryMapModel` (all attribution math) + `MemoryMap` renderer (sidebar + expanded variants).
- Removes the inline MemoryMap from `dashboard.jsx`, swaps both call-sites in `slots.jsx`, adds `HardwareMemorySection` between `HardwareSection` and the side cards.
Design choices (settled in spec):
- NPU shares from `npu_status.model_mb`; GPU shares from `gtt_used_mb` weighted by `metrics.mem`; CPU self-reported. ≈ marker only when sharing a pool.
- Self-LXC is filtered out of the tenants legend by matching the tenant name against `Hardware.name`, which avoids double-counting against `selfShareGb`.
- `_PVE_CONFIGURE_HINT` constant is the single source of truth; tests import it.
Full spec: `docs/superpowers/specs/2026-05-28-memory-map-redesign-design.md`
Plan: `docs/superpowers/plans/2026-05-28-memory-map-redesign.md`
Test plan
- `uv run pytest tests/hardware tests/api -v`: 488 passed, 3 skipped (pre-existing).
- `npx playwright test --reporter=line`: 81 passed, 16 skipped (all 16 pre-existing and unrelated).
- `npx playwright test memory-map-v3`: 6/6 new specs PASS covering off / detected_unconfigured / configured / pool-limited / host-limited / expanded variant.
- `pip install -e .` + `systemctl restart hal0-api`. `/api/stats/hardware` returns `host: {configured:false, detected:true, detection:"uncertain", hint:"..."}`; dashboard renders, sidebar Memory map visible.
Notes / follow-ups (not blocking)
- `/proc/1/cgroup` in cgroup-v2 unified mode shows `/init.scope` rather than `/lxc/<vmid>/...`. UI behaviour is correct (UNCERTAIN nudges identically to DETECTED). Broaden the cgroup signal to recognise `/init.scope` + `-pve` kernel as DETECTED; file as follow-up.
- cgroup `memory.max` as a third headroom-binding-constraint candidate (after pool and host). Rare on Strix Halo; deferred.
- `--mem-tenant-{1,2,3}` palette polish.
🤖 Generated with Claude Code