Update selectable device Profile #12353
Merged
When the model runs inference on `cuda:1`, `cuda:2`, or any device other than `cuda:0`, `Profile` still occupies GPU memory on `cuda:0` for as long as the process runs. I found in the PyTorch documentation that `torch.cuda.synchronize` accepts a device argument, so `torch.cuda.synchronize` probably falls back to `cuda:0` by default when that parameter is `None`. In my experiments, after passing the device explicitly, GPU memory on `cuda:0` is no longer held.

The code that uses `Profile` has been updated, but 2 places (export.py #L124, models/common.py #L680) are still unmodified because I was not sure about them. I guess it should also be possible to pass `model.device` there.
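For reviewers, here is a minimal sketch of the idea and not the exact merged diff. It assumes `Profile` is a context-manager timer that calls `torch.cuda.synchronize` before reading the clock. The attribute names (`t`, `dt`, `cuda`) and the guarded `torch` import are illustrative choices, made so the snippet also runs on a CPU-only machine.

```python
import contextlib
import time

try:
    import torch
except ImportError:  # assumption: keep the sketch runnable without PyTorch
    torch = None


class Profile(contextlib.ContextDecorator):
    """Accumulating timer that synchronizes only the device it was given."""

    def __init__(self, t=0.0, device=None):
        self.t = t  # accumulated time in seconds
        # Normalize strings such as "cuda:1" to torch.device when torch is present.
        if torch is not None and isinstance(device, str):
            device = torch.device(device)
        self.device = device
        # Synchronize only when a CUDA device was requested and CUDA is usable.
        self.cuda = bool(
            device is not None
            and str(device).startswith("cuda")
            and torch is not None
            and torch.cuda.is_available()
        )

    def __enter__(self):
        self.start = self._time()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.dt = self._time() - self.start  # duration of this block
        self.t += self.dt

    def _time(self):
        if self.cuda:
            # Passing the device avoids touching the default current device,
            # which is cuda:0 unless it has been changed.
            torch.cuda.synchronize(self.device)
        return time.time()


# Usage: on a multi-GPU machine one would pass e.g. device=model.device.
with Profile(device="cpu") as dt:
    sum(range(100_000))
print(f"elapsed: {dt.t:.6f}s")
```

With `device=None` or a CPU device, the sketch skips synchronization entirely, so no CUDA context is created as a side effect of timing.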
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
This PR enhances profiling by incorporating device specificity in performance measurements.
📊 Key Changes
- `Profile` class now accepts an optional `device` argument to enable device-specific synchronization.
- `Profile` instances are now created with the `device` parameter.
🎯 Purpose & Impact