
Conversation

inureyes (Member) commented on Jan 2, 2026

Summary

  • Add AllSmi client struct with high-level, ergonomic API for hardware monitoring (see the usage sketch after this list)
  • Create unified Error type using thiserror for comprehensive error handling
  • Add prelude module for convenient imports of all common types
  • Implement platform-specific initialization (macOS NativeMetricsManager, Linux HlsmiManager)
  • Ensure thread safety with Send + Sync bounds on AllSmi
  • Add comprehensive documentation with examples in lib.rs and README.md
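
A minimal usage sketch of the API summarized above, assuming the prelude re-exports AllSmi and that the get_*_info() accessors return plain collections; the exact return types are not spelled out in this description:

    use all_smi::prelude::*;

    fn main() {
        // Creating the client initializes the platform-specific managers
        // (NativeMetricsManager on macOS, HlsmiManager on Linux).
        let smi = AllSmi::new().expect("failed to initialize all-smi");

        // High-level accessors; each is assumed to return a collection that is
        // simply empty when the corresponding hardware is not present.
        let gpus = smi.get_gpu_info();
        let cpus = smi.get_cpu_info();
        let memory = smi.get_memory_info();

        println!("GPUs: {}", gpus.len());
        println!("CPU entries: {}", cpus.len());
        println!("Memory entries: {}", memory.len());
    }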

Changes

New Files

  • src/client.rs - Main AllSmi struct with get_gpu_info(), get_cpu_info(), get_memory_info(), get_process_info(), get_chassis_info()
  • src/error.rs - Unified Error enum with PlatformInit, NoDevicesFound, DeviceAccess, PermissionDenied, NotSupported, Io variants (a hedged sketch follows this list)
  • src/prelude.rs - Re-exports for AllSmi, Error, Result, core types, traits, and factory functions
  • examples/library_usage.rs - Comprehensive example demonstrating library API usage
  • tests/library_api_test.rs - Integration tests for the library API
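
A hedged sketch of what the unified Error enum in src/error.rs might look like given the variants listed above; the variant payloads and error messages are assumptions, not the PR's actual definitions:

    use thiserror::Error;

    #[derive(Debug, Error)]
    pub enum Error {
        #[error("platform initialization failed: {0}")]
        PlatformInit(String),
        #[error("no devices found")]
        NoDevicesFound,
        #[error("device access failed: {0}")]
        DeviceAccess(String),
        #[error("permission denied: {0}")]
        PermissionDenied(String),
        #[error("not supported on this platform: {0}")]
        NotSupported(String),
        #[error("I/O error")]
        Io(#[from] std::io::Error),
    }

    // Convenience alias re-exported through the prelude.
    pub type Result<T> = std::result::Result<T, Error>;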

Modified Files

  • src/lib.rs - Updated to export new public API with comprehensive documentation
  • README.md - Added "Library API" section with usage examples

Test Plan

  • Run cargo test - all 17 library API tests pass
  • Run cargo clippy - no warnings
  • Run cargo run --example library_usage - example executes successfully
  • Verify AllSmi is Send + Sync (see the compile-time assertion sketch after this list)
  • Verify graceful handling when no GPUs are detected
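
A common way to verify the Send + Sync bound at compile time is a zero-sized assertion helper; the snippet below is only a sketch of the idea, not necessarily how tests/library_api_test.rs implements it:

    use all_smi::prelude::*;

    // Compiles only if AllSmi is Send + Sync, so a regression shows up as a
    // build error rather than a runtime failure.
    fn assert_send_sync<T: Send + Sync>() {}

    #[test]
    fn all_smi_is_send_and_sync() {
        assert_send_sync::<AllSmi>();
    }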

Closes #106

Introduce high-level AllSmi client struct with ergonomic API for querying
GPU/NPU, CPU, memory, and chassis information. Add unified Error type using
thiserror, prelude module for convenient imports, and comprehensive examples
and tests for library usage.

Closes #106
inureyes added the labels type:enhancement, status:review, priority:medium, mode:api, device:nvidia-gpu, device:apple-silicon, device:npu on Jan 2, 2026

inureyes (Member, Author) commented on Jan 2, 2026

Security & Performance Review - PR #109

Analysis Summary

  • PR Branch: feature/issue-106-library-api
  • Scope: changed-files
  • Languages: Rust
  • Total issues: 5
  • CRITICAL: 0 | HIGH: 1 | MEDIUM: 3 | LOW: 1

Prioritized Issue List

HIGH (1 issue)

1. Unsafe Send + Sync implementation lacks detailed safety documentation

  • File: src/client.rs (lines 433-435)
  • Current code:
    // Safety: AllSmi uses thread-safe readers
    unsafe impl Send for AllSmi {}
    unsafe impl Sync for AllSmi {}
  • Issue: The safety comment is too brief. For unsafe impl, Rust conventions require detailed documentation explaining:
    1. Why the struct is actually Send/Sync safe
    2. What invariants are maintained
    3. What types inside make this safe
  • Impact: While the underlying traits (GpuReader, CpuReader, etc.) all require Send + Sync, the struct contains Box<dyn Trait> which by itself is not Send + Sync without the trait bounds. The current implementation IS correct because the traits require Send + Sync, but future maintainers might not understand this relationship.
  • Recommendation: Expand the safety comment:
    // SAFETY: AllSmi is Send + Sync because:
    // 1. All reader traits (GpuReader, CpuReader, MemoryReader, ChassisReader) 
    //    require Send + Sync bounds
    // 2. Box<dyn Trait + Send + Sync> is Send + Sync
    // 3. Platform-specific fields (_macos_initialized, _gaudi_initialized) are 
    //    primitive bool types which are inherently Send + Sync
    // 4. The underlying platform managers use thread-safe primitives (Arc, Mutex, RwLock)
    unsafe impl Send for AllSmi {}
    unsafe impl Sync for AllSmi {}

MEDIUM (3 issues)

2. Multiple AllSmi instances may cause race conditions with platform managers

  • File: src/client.rs (lines 177-226)
  • Issue: Creating multiple AllSmi instances could lead to:
    • On macOS: Multiple calls to initialize_native_metrics_manager (handled gracefully via Mutex guard)
    • On Linux with Gaudi: Multiple calls to initialize_hlsmi_manager (also handled gracefully)
    • On Drop: Multiple shutdowns could race
  • Current mitigation: The managers use a Lazy<Mutex<Option<Arc<...>>>> pattern, which handles this correctly.
  • Impact: LOW - The implementation correctly handles this, but documentation should clarify the behavior.
  • Recommendation: Add documentation to AllSmi::new() explaining that multiple instances share the same underlying platform managers.
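  • Sketch: For reference, a generic version of the Lazy<Mutex<Option<Arc<...>>>> guard pattern described above; the once_cell dependency and the manager/initializer shapes here are assumptions, not the crate's actual code:
    use once_cell::sync::Lazy;
    use std::sync::{Arc, Mutex};

    struct NativeMetricsManager;

    // One shared slot: every AllSmi instance goes through the same guarded
    // Option, so repeated initialization just clones the existing Arc.
    static MANAGER: Lazy<Mutex<Option<Arc<NativeMetricsManager>>>> =
        Lazy::new(|| Mutex::new(None));

    fn initialize_native_metrics_manager() -> Arc<NativeMetricsManager> {
        let mut slot = MANAGER.lock().unwrap();
        slot.get_or_insert_with(|| Arc::new(NativeMetricsManager))
            .clone()
    }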

3. Missing #[must_use] attribute on Result-returning methods

  • Files: src/client.rs, src/error.rs
  • Issue: The AllSmi::new() and AllSmi::with_config() methods return Result but lack #[must_use] attributes.
  • Impact: Users could accidentally ignore initialization errors.
  • Recommendation: Add #[must_use = "this returns a Result that should be checked"] to both methods.
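  • Example: a minimal, self-contained illustration of the suggested attribute on new() (with_config() would be analogous); the struct body, Result alias, and constructor body below are placeholders, not the PR's actual code:
    pub struct AllSmi; // placeholder: real reader fields elided
    pub type Result<T> = std::result::Result<T, std::io::Error>; // placeholder alias

    impl AllSmi {
        #[must_use = "this returns a Result that should be checked"]
        pub fn new() -> Result<Self> {
            // Real platform-specific initialization goes here.
            Ok(AllSmi)
        }
    }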

4. Documentation warnings from cargo doc

  • Files: Multiple
  • Issues found:
    • src/error.rs:88 - Ambiguous link to Error (both enum and derive macro)
    • src/device/hlsmi/parser.rs:52 - Unresolved link [suffix]
    • src/parsing/common.rs:68 - Unresolved link [a-zA-Z0-9_]
    • src/parsing/macros.rs:17,24 - Unclosed HTML tags
  • Impact: Documentation may be confusing for users.
  • Recommendation: Fix doc comments by escaping brackets and disambiguating links.
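  • Example fixes (hedged): the item names below are placeholders showing the kinds of doc-comment changes suggested above, not the crate's actual functions:
    /// Wrap literal bracket expressions in backticks so rustdoc does not treat
    /// `[suffix]` or `[a-zA-Z0-9_]` as broken intra-doc links (escaping them as
    /// \[suffix\] also works).
    pub fn parse_metric_suffix() {}

    /// When both the `Error` enum and the `thiserror::Error` derive macro are in
    /// scope, add an item-kind disambiguator to the link target, for example
    /// `[Error](enum@crate::Error)` instead of the ambiguous `[Error]`.
    pub fn error_link_example() {}

    /// Put generics such as `Option<u64>` in backticks so rustdoc does not parse
    /// the angle brackets as an unclosed HTML tag.
    pub fn html_tag_example() {}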

LOW (1 issue)

5. has_gpus() checks reader count, not actual GPU count

  • File: src/client.rs (lines 403-405)
  • Current code:
    pub fn has_gpus(&self) -> bool {
        !self.gpu_readers.is_empty()
    }
  • Issue: This returns true if any GPU reader is available, not if any GPU was actually detected. A system with NVIDIA drivers installed but no physical GPU would still have a reader, yet get_gpu_info() would return an empty collection.
  • Impact: Minor - documentation already clarifies this, but the method name is potentially misleading.
  • Recommendation: Either rename to has_gpu_readers() or change implementation to !self.get_gpu_info().is_empty() (with caching for performance).
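  • Sketch of the second option, as a drop-in body inside the existing impl block; it assumes get_gpu_info() returns a collection with is_empty() and omits the caching mentioned above:
    pub fn has_gpus(&self) -> bool {
        // Report whether any GPU was actually enumerated, not merely whether a
        // reader backend is registered for this platform.
        !self.get_gpu_info().is_empty()
    }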

Positive Findings

  1. Thread safety is correctly implemented: The underlying traits all require Send + Sync, and the platform managers use proper synchronization primitives.

  2. Error handling is comprehensive: The Error enum covers all common failure modes with descriptive messages.

  3. API design follows Rust conventions: Builder pattern for AllSmiConfig, Result type alias, prelude module for convenience.

  4. Tests are thorough: Integration tests verify Send + Sync bounds, creation, and all API methods.

  5. Documentation is extensive: Comprehensive doc comments with examples on all public items.

  6. Graceful degradation: The API doesn't fail if hardware isn't available - it simply returns empty collections.


Test Results

cargo clippy: PASSED (no warnings)
cargo test: PASSED (153 unit tests + 17 integration tests + 15 doc tests)
cargo doc: PASSED (5 minor warnings noted above)

Recommendations Summary

Priority   Action                                     Status
HIGH       Expand unsafe impl safety documentation    Suggested
MEDIUM     Document multiple instance behavior        Suggested
MEDIUM     Add #[must_use] attributes                 Suggested
MEDIUM     Fix documentation warnings                 Suggested
LOW        Clarify has_gpus() semantics               Suggested

Conclusion

This PR introduces a well-designed library API for all-smi. The implementation is sound and safe. The issues identified are primarily documentation and API ergonomics improvements rather than functional bugs. The thread safety implementation is correct despite the minimal safety comment, and the error handling is comprehensive.

Recommendation: APPROVE with minor documentation improvements suggested.

inureyes merged commit 188f817 into main on Jan 2, 2026
2 checks passed
inureyes deleted the feature/issue-106-library-api branch on January 2, 2026 at 10:08
inureyes self-assigned this on Jan 3, 2026
