- load_optimized() now catches device='cuda' failures on ROCm systems where PyTorch-ROCm is not installed and degrades gracefully to CPU.
- deploy.sh removes the CPU-only onnxruntime before installing onnxruntime-rocm, preventing the shadowing bug.
- _try_rocm() checks torch.cuda.is_available() before setting device='cuda'; if PyTorch-ROCm is not installed, the device stays 'cpu' from the start.
- load_optimized()'s fallback pre-checks torch.cuda instead of reactively catching NVIDIA driver exceptions (cleaner logs, no crash).
- Added test: without PyTorch-ROCm, detection falls back to the cpu device (15 tests total).
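The pre-check above can be probed from the deploy side as well. A minimal shell sketch (pick_device is a hypothetical helper, not part of deploy.sh; _try_rocm itself lives in the Python app):

```shell
# Hedged sketch: ask the installed PyTorch build whether a HIP/ROCm device is
# visible. Mirrors the _try_rocm() pre-check: if torch is missing or
# torch.cuda.is_available() is false, stay on CPU instead of crashing later.
pick_device() {
  python3 - <<'PY' 2>/dev/null || echo cpu
import torch  # PyTorch-ROCm exposes HIP devices through the CUDA API
print("cuda" if torch.cuda.is_available() else "cpu")
PY
}

DEVICE=$(pick_device)
echo "selected device: ${DEVICE}"
```

If the python3 probe fails for any reason (no PyTorch at all, broken install), the `|| echo cpu` branch keeps the answer at 'cpu', matching the graceful-degradation behavior described above.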
Root cause: ultralytics AutoUpdate detects onnx/onnxslim/onnxruntime as missing during ONNX export and auto-installs the CPU onnxruntime, re-shadowing onnxruntime-rocm. Three-layer defense:
- requirements_rocm.txt: pre-install onnx + onnxslim so ultralytics doesn't trigger AutoUpdate for ONNX export deps.
- deploy.sh: set YOLO_AUTOINSTALL=0 during the export step.
- deploy.sh: post-export cleanup removes the CPU onnxruntime if present.
Instead of installing the wrong packages and then cleaning up:
- Phase 1: PyTorch from the ROCm --index-url (forces the ROCm build, not CUDA).
- Phase 2: remaining packages, incl. onnxruntime-rocm, onnx, and onnxslim.
- YOLO_AUTOINSTALL=0 prevents ultralytics from auto-installing the CPU onnxruntime.
Removed: the pre-install onnxruntime cleanup and the post-export onnxruntime cleanup (no longer needed once the packages are installed correctly).
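The two-phase install can be sketched as below. This is a dry-run illustration that prints the commands rather than running them (the index URL and package set are assumptions about deploy.sh, not its exact contents):

```shell
# Hedged sketch of the two-phase install order. Printing instead of executing
# keeps the example self-contained; deploy.sh would run the pip commands.
ROCM_INDEX="https://download.pytorch.org/whl/rocm6.2"

# Phase 1: PyTorch from the ROCm wheel index, so pip cannot resolve a CUDA build.
phase1() { echo pip install torch torchvision --index-url "$ROCM_INDEX"; }

# Phase 2: everything else, including the ROCm execution provider packages.
phase2() { echo pip install onnxruntime-rocm onnx onnxslim; }

export YOLO_AUTOINSTALL=0   # keep ultralytics from re-installing CPU onnxruntime
phase1
phase2
```

Ordering matters: resolving torch against the ROCm index first means later packages that depend on torch see it as already satisfied and cannot drag in a CUDA or CPU build.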
deploy.sh now reads the ROCm version from /opt/rocm/.info/version, amd-smi, or rocminfo and constructs the PyTorch index URL dynamically (e.g. rocm7.2 instead of the hardcoded rocm6.2). Falls back to 6.2 only if version detection fails.
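The detection chain can be sketched as follows. The exact parsing in deploy.sh may differ; the amd-smi/rocminfo output patterns here are assumptions:

```shell
# Hedged sketch of ROCm version detection: version file, then amd-smi, then
# rocminfo, then the 6.2 fallback described above.
detect_rocm_version() {
  if [ -r /opt/rocm/.info/version ]; then
    cut -d- -f1 /opt/rocm/.info/version            # e.g. "7.2.0-1234" -> "7.2.0"
  elif command -v amd-smi >/dev/null 2>&1; then
    amd-smi version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+' | head -n1
  elif command -v rocminfo >/dev/null 2>&1; then
    rocminfo 2>/dev/null | grep -oE '[0-9]+\.[0-9]+' | head -n1
  else
    echo 6.2                                        # fallback when nothing is found
  fi
}

ver=$(detect_rocm_version)
major_minor=$(echo "$ver" | grep -oE '^[0-9]+\.[0-9]+')
INDEX_URL="https://download.pytorch.org/whl/rocm${major_minor}"
echo "$INDEX_URL"
```

Trimming to major.minor matters because the PyTorch index paths use two-component versions (rocm6.2, rocm7.2), while the version file carries a full patch-level string.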
PyTorch only publishes wheels for specific ROCm versions (e.g. 6.2, 7.0, 7.1), not every point release. For ROCm 7.2, deploy now tries 7.2 → 7.1 → 7.0 → 6.4 → 6.3 → 6.2 → 6.1 → 6.0, stopping at the first successful install. Falls back to PyPI CPU torch if no ROCm wheels are found at all.
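The fallback walk can be sketched as a loop. Here try_install is a stub standing in for the real pip install against the versioned index URL, so the loop logic is demonstrable offline; the stub pretends only the 7.1 index has wheels:

```shell
# Hedged sketch of the version fallback chain. CANDIDATES mirrors the order
# described above; try_install is a stand-in for:
#   pip install torch --index-url "https://download.pytorch.org/whl/rocm$1"
CANDIDATES="7.2 7.1 7.0 6.4 6.3 6.2 6.1 6.0"

try_install() {
  [ "$1" = "7.1" ]   # stub: pretend only the rocm7.1 wheels exist
}

FOUND=""
for v in $CANDIDATES; do
  if try_install "$v"; then
    echo "installed torch from rocm${v}"
    FOUND=$v
    break              # stop at the first successful install
  fi
done

if [ -z "$FOUND" ]; then
  echo "no ROCm wheels found; falling back to CPU torch from PyPI"
fi
```

In the real script the success signal is pip's exit status, which is nonzero when the index URL for that version returns no matching wheel.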
Ultralytics' ONNX loader only supports CUDAExecutionProvider (NVIDIA); on ROCm it falls back to CPU even though ROCMExecutionProvider is available. PyTorch + HIP runs natively on AMD GPUs via device='cuda'.
- Change the ROCm BackendSpec: onnx → pytorch (skip ONNX export entirely).
- Set YOLO_AUTOINSTALL=0 in detect.py to prevent ultralytics from auto-installing onnxruntime-gpu (NVIDIA) at runtime.
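The same guard can also be applied from the caller's side by exporting the flag before detect.py imports ultralytics. YOLO_AUTOINSTALL is the real ultralytics setting; the commented invocation is illustrative, not the actual CLI:

```shell
# Hedged sketch: disable ultralytics AutoUpdate for any child process, so the
# runtime path cannot pull in onnxruntime-gpu (NVIDIA) on a ROCm box.
export YOLO_AUTOINSTALL=0
echo "YOLO_AUTOINSTALL=${YOLO_AUTOINSTALL}"

# python3 detect.py --device cuda   # on ROCm, device='cuda' maps to the AMD GPU via HIP
```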