Enhancement/select device #488
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #488      +/-   ##
==========================================
+ Coverage   96.48%   97.41%   +0.93%
==========================================
  Files          75       75
  Lines       15276    15019     -257
==========================================
- Hits        14739    14631     -108
+ Misses        537      388     -149
```

Continue to review the full report at Codecov.
Can you please also revert back to `ht_device` instead of `heat_device`? Looking through all of these files is painful when the refactor touches so many lines. Also, the change feels arbitrary.
CHANGELOG.md (outdated)

```diff
@@ -15,6 +15,7 @@
 - [#470](https://github.com/helmholtz-analytics/heat/pull/470) Enhancement: Accelerate distance calculations in kmeans clustering by introduction of new module spatial.distance
 - [#478](https://github.com/helmholtz-analytics/heat/pull/478) `ht.array` now typecasts the local torch tensors if the torch tensors given are not the torch version of the specified dtype + unit test updates
 - [#479](https://github.com/helmholtz-analytics/heat/pull/479) Completion of spatial.distance module to support 2D input arrays of different splittings (None or 0) and different datatypes, also if second input argument is None
+- [#488](https://github.com/helmholtz-analytics/heat/pull/488) Enhancement: test device selection
```
Can you expand on this? If someone were to read this, they likely wouldn't understand what it means. It's okay to use two sentences here.
heat/core/tests/deviceselection.py (outdated)

```python
ht.use_device("cpu")
torch.cuda.set_device(torch.device(ht.gpu.torch_device))
torch_device = ht.gpu.torch_device
heat_device = ht.gpu
```
Can we make this a function instead, and then put it in the device.py file within core?
I don't think that it should be a library function. It is only needed in the tests.
I understand, but it goes against the programming model that we have everywhere else. And although this is only used in the unit tests at the moment, it may be useful as a tool for other developers if it is a library function.
`ht_device` is also an arbitrary name 🤷♂️ When I think about it, a heat device is a heater 😀
Honestly, I like
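If this were promoted to a library helper, it might look roughly like the sketch below. Both the function name `select_test_device` and the `HEAT_TEST_DEVICE` environment variable are invented here for illustration; a real version would end by calling `ht.use_device()` with the result.

```python
import os


def select_test_device(default="cpu"):
    """Hypothetical helper: resolve which device the unit tests should use.

    Reads the (invented) HEAT_TEST_DEVICE environment variable and falls
    back to `default`. A real implementation in core/devices.py would
    then pass the result to ht.use_device().
    """
    name = os.environ.get("HEAT_TEST_DEVICE", default).lower()
    if name not in ("cpu", "gpu"):
        raise ValueError("unknown device: {}".format(name))
    return name
```

This keeps the per-file "header" down to a single call and lets CI select the device without editing test sources.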
heat/cluster/tests/test_kmeans.py (outdated)

```python
device = ht.gpu.torch_device
ht_device = ht.gpu
ht.torch.cuda.set_device(device)
from heat.core.tests.test_suites.basic_test import BasicTest
```
Should probably be called `TestCase` as well.
Should it be imported as `TestCase`, or should the name be changed in general?
The name should be changed in general in my opinion
heat/cluster/tests/test_kmeans.py (outdated)

```diff
-class TestKMeans(unittest.TestCase):
+class TestKMeans(BasicTest):
     @classmethod
     def setUpClass(cls):
```
Does this need to be explicitly called here? It should happen automatically due to inheritance.
The ones without already existing setup functions can be removed. Good.
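The inheritance behaviour under discussion can be checked with plain `unittest`, independent of heat. A self-contained sketch with made-up class names `Base` and `TestChild`: the subclass defines no `setUpClass` of its own, yet the runner still invokes the inherited one once for the class.

```python
import unittest


class Base(unittest.TestCase):
    device = None

    @classmethod
    def setUpClass(cls):
        # Runs once per test class that inherits it.
        cls.device = "cpu"


class TestChild(Base):
    # No setUpClass here: Base.setUpClass is called automatically.
    # A subclass that DOES define its own setUpClass must call
    # super().setUpClass() to keep the parent's set-up.
    def test_device_is_set(self):
        self.assertEqual(self.device, "cpu")
```

So, as noted above, the explicit calls can be dropped in test classes that have no additional set-up of their own.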
```python
ht.use_device("gpu")
torch.cuda.set_device(torch.device(ht.gpu.torch_device))
torch_device = ht.cpu.torch_device
ht_device = ht.cpu
```
The `set_device` here is sufficient. I do not think the explicit device passing, i.e. `device=self.ht_device`, is necessary in the test cases anymore.
I use the `ht_device` for the special cases `lcpu` and `lgpu`, where a tensor is created on a device different from the standard device.
Not passing the device directly does not work for me.
Do we really want to test this?
A user can still want to put a specific tensor on the CPU.
It tests that a function returns a tensor of the input's device type and not the `use_device` one.
But the opposite case in `lgpu` is okay? It sets the GPU specifically while `use_device` is CPU?
No to that as well; there should only be GPU or CPU for now.
Then why do we even offer the device parameter on tensor creation? 😕
Unless we mean different things, allocating individual tensors on a different device should very much be allowed. For the test cases it is generally
a) not necessary to do it in this way, and
b) something that should only be tested in a limited number of test cases in the factories function.
What I gather here is that you want to have different devices on different MPI ranks. This would go against the BSP model. If you want to write test cases that explicitly set the device on factory creation, that is possible without the introduction of `lcpu` and `lgpu`. If you want to test whether you can allocate a DNDarray on a device different from the global one, there is no need for `lcpu` and `lgpu`, but rather something along the lines of

```python
if ht.get_device() == ht.cpu and torch.cuda.is_available():
    self.other_device = ht.gpu
else:
    self.other_device = ht.cpu
```
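The fallback logic in that suggestion can also be isolated into a tiny pure function so it is testable without a GPU. `pick_other_device` is a hypothetical name, and devices are represented as plain strings here rather than heat device objects:

```python
def pick_other_device(current, cuda_available):
    """Return a device name different from the globally selected one,
    falling back to "cpu" when no GPU is available.

    Mirrors the if/else sketch above; `current` would be the name of
    the device chosen via ht.use_device().
    """
    if current == "cpu" and cuda_available:
        return "gpu"
    return "cpu"
```

A test case would then allocate one DNDarray on `pick_other_device(...)` and assert the result's device, without needing the `lcpu`/`lgpu` special cases.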
As you are setting the torch device here as well, we can also remove all the `self.device.torch_device` calls from the test cases, right?
```diff
@@ -1,4 +1,5 @@
 from unittest import TestCase
+import os
```
This whole file still needs an `__init__.py`.
`__init__.py` is no longer needed for Python 3.3+.
Or what do you mean?
You are right, it is not needed anymore. However, we have the `__init__.py` files in all the folders. I think we should be consistent here.
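Adding the missing files consistently could be a one-liner of this shape. A sketch only: the `demo/` layout below is invented to keep the example self-contained, and in the real repository you would run `find` from the project root instead.

```shell
# Create a throwaway layout mirroring the tests folders (illustration only).
mkdir -p demo/heat/core/tests demo/heat/cluster/tests
# Drop an empty __init__.py into every tests directory.
find demo/heat -type d -name tests -exec sh -c 'touch "$0/__init__.py"' {} \;
ls demo/heat/core/tests/__init__.py demo/heat/cluster/tests/__init__.py
```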
…ytics/heat into enhancement/select-device
```python
        return TestCase.__device

    @classmethod
    def setUpClass(cls):
```
Docstrings need to be updated here with the Raises section.
- Raises section added.
- I found that there is already an `__init__.py` in the test_suites folder.
- PyTorch doesn't have an option to set the standard device; it is always CPU. `torch.cuda.set_device()` only sets the default GPU if no ID is given.
- Can you import the test cases there as well?
- Understood.
Bump conflicts
Description

This PR changes the names used for device selection for testing purposes. The old 'header' (the device-selection lines repeated at the top of each test file) is now in `BasicTest`.

Changes proposed:

Type of change

Due Diligence

Does this change modify the behaviour of other functions? If so, which?
Yes, all test functions.