@gspark-etri
Contributor

Fix Rebellions NPU detection with rbln SDK 2.0.x

Summary

Fixes Rebellions NPU detection failure when using rbln SDK 2.0.x by correcting the type of the npu field in JSON deserialization.

Problem

all-smi fails to detect Rebellions NPUs on systems running rbln SDK 2.0.x, despite rbln-stat/rbln-smi working correctly. The tool runs without errors but shows no NPU devices.

Root Cause

There is a type mismatch between all-smi's expected schema and the actual JSON output from rbln SDK 2.0.x:

all-smi expectation (src/device/readers/rebellions.rs):

struct RblnDevice {
    npu: String,  // Expects string
    ...
}

Actual rbln-stat/rbln-smi 2.0.1 output:

{
  "devices": [
    { "npu": 0, "name": "RBLN-CA22", ... }  // Returns integer
  ]
}

This type mismatch causes silent JSON deserialization failures in serde, preventing NPU detection.

Changes

Changed the npu field type from String to u32 in the RblnDevice struct:

 struct RblnDevice {
     #[allow(dead_code)]
-    npu: String,
+    npu: u32,
     name: String,
     sid: String,

This is a one-line fix that restores compatibility with current rbln SDK versions.

Testing

Tested on:

  • Hardware: 4x Rebellions ATOM RBLN-CA22 NPUs
  • Software: rbln SDK 2.0.1, rbln-stat 2.0.1, rbln-smi 2.0.1
  • OS: Ubuntu 22.04 (Linux 6.8.0-40-generic)

Before fix:

$ ./all-smi
# No Rebellions NPUs detected

After fix:

$ ./all-smi
Rebellions ATOM (4 NPUs detected):
  NPU 0: RBLN-CA22 (rbln0) - 41°C, 19.2W, 0.0% util, 0.0B/15.7GiB
  NPU 1: RBLN-CA22 (rbln1) - 37°C, 18.7W, 0.0% util, 0.0B/15.7GiB
  NPU 2: RBLN-CA22 (rbln2) - 38°C, 18.2W, 0.0% util, 0.0B/15.7GiB
  NPU 3: RBLN-CA22 (rbln3) - 39°C, 19.0W, 0.0% util, 0.0B/15.7GiB

All NPU metrics (temperature, power, memory, utilization) are correctly displayed.

Backward Compatibility

This fix ensures compatibility with rbln SDK 2.0.1 (September 2025), which is the current stable version. The change is minimal (single line) and aligns the code with the documented SDK behavior.

If older SDK versions used a different format, this could be addressed with custom deserialization that supports both types.

@cla-assistant

cla-assistant bot commented Dec 31, 2025

CLA assistant check
All committers have signed the CLA.

@inureyes inureyes self-assigned this Jan 1, 2026
@inureyes inureyes added labels Jan 1, 2026: status:review (Under review), priority:medium (Medium priority issue), mode:api (API mode related), mock-server (Mock server related), device:npu (NPU (Neural Processing Unit) related), mode:local (Local mode related)
@inureyes
Member

inureyes commented Jan 1, 2026

Backward Compatibility Concern

Thank you for this fix! The type change from String to u32 correctly addresses SDK 2.0.x compatibility.

However, I have a concern about backward compatibility with older SDK versions (1.x). If older versions output "npu": "0" (string) instead of "npu": 0 (integer), this change would break those installations.

Proposed Solution

I'll implement a custom serde deserializer that accepts both string and integer types:

#[serde(deserialize_with = "deserialize_string_or_u32")]
npu: u32,

This approach:

  • ✅ Supports SDK 2.0.x (integer format)
  • ✅ Maintains backward compatibility with SDK 1.x (string format)
  • ✅ No external dependencies required
  • ✅ Zero runtime overhead for the common case
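The conversion rule such a deserializer needs can be sketched without serde. This is a minimal, dependency-free illustration, not the code from the commit: `RawNpu` and `npu_index` are hypothetical names, and the string form for SDK 1.x is the assumption stated above. In a serde implementation the two match arms would correspond to a Visitor's `visit_u64` and `visit_str` methods.

```rust
/// Hypothetical stand-in for the two JSON shapes the `npu` field may take.
#[derive(Debug)]
enum RawNpu<'a> {
    /// SDK 2.0.x shape: `"npu": 0`
    Int(u64),
    /// Assumed SDK 1.x shape: `"npu": "0"`
    Str(&'a str),
}

/// Normalize either shape to a `u32` device index.
/// Returns an error message instead of silently dropping the device.
fn npu_index(raw: RawNpu<'_>) -> Result<u32, String> {
    match raw {
        // Integer form: only reject values that do not fit in u32.
        RawNpu::Int(n) => {
            u32::try_from(n).map_err(|_| format!("npu index {n} does not fit in u32"))
        }
        // String form: must parse as a non-negative decimal integer.
        RawNpu::Str(s) => s
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("invalid npu index {s:?}: {e}")),
    }
}
```

With this rule, `npu_index(RawNpu::Int(0))` and `npu_index(RawNpu::Str("0"))` both yield the same index, while a non-numeric string such as `"rbln0"` produces an error that the caller can log.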

Additional Note

The RblnContext struct also has an npu: String field (line 81). If SDK 2.0.x changed this to integer as well, we may need to apply the same fix there. However, since the current PR focuses on RblnDevice, I'll address that separately if needed.

I'll push a commit with the backward-compatible implementation shortly.

Implement a custom serde deserializer that accepts both string and
integer types for the `npu` field in RblnDevice. This ensures
compatibility across different SDK versions:

- SDK 1.x: outputs "npu": "0" (string format)
- SDK 2.0.x: outputs "npu": 0 (integer format)

The deserializer uses a Visitor pattern to handle both types
gracefully without external dependencies.
@inureyes
Member

inureyes commented Jan 1, 2026

Update pushed: Commit d856d62 adds the backward-compatible deserializer.

Changes made:

  • Added deserialize_string_or_u32 custom deserializer function
  • Applied #[serde(deserialize_with = ...)] attribute to the npu field
  • Added documentation explaining SDK version compatibility

The code passes cargo check, cargo clippy, and cargo fmt.

@inureyes inureyes merged commit 3c21c4d into lablup:main Jan 1, 2026
4 checks passed