Test Suite #365
Conversation
# Conflicts:
#	heat/core/dndarray.py
#	heat/core/manipulations.py
Codecov Report
@@ Coverage Diff @@
## master #365 +/- ##
=========================================
+ Coverage 97.98% 98% +0.01%
=========================================
Files 53 55 +2
Lines 10937 11042 +105
=========================================
+ Hits 10717 10822 +105
Misses 220 220
A couple of changes are needed here. I might still have missed some.
class BasicTest(TestCase):
    __comm = MPICommunication()
This does not seem like a good idea. The comm might depend on the input variable.
So the comm should rather be passed via the assert functions? For the array_equal function I understand the point, but for func_equal the input values don't have a comm.
equal_res = np.array(compare_func(local_numpy, numpy_array[slices]))
self.comm.Allreduce(MPI.IN_PLACE, equal_res, MPI.LAND)
self.assertTrue(equal_res, 'Local tensors do not match the corresponding numpy slices.')
self.assertEqual(local_numpy.dtype, numpy_array.dtype,
The dtypes do not necessarily need to be equal; a potential float_eq check could be used instead.
So should this just be left out, relying instead on the equality check of the tensor's values?
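For context, an approximate comparison along the lines of the float_eq suggestion could look like this minimal sketch, using plain numpy (np.allclose with explicit tolerances is one common choice; the tolerance values are illustrative assumptions):

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3], dtype=np.float32)
b = np.array([0.1, 0.2, 0.3], dtype=np.float64)

# Strict dtype equality fails here, but the values agree up to
# float32 precision, so an approximate check passes.
assert a.dtype != b.dtype
assert np.allclose(a, b, rtol=1e-6, atol=1e-7)
```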
test_unique should have shape, dtype, and splits tested. It should also be tested with doubles and longs in the input array.
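The requested checks could be sketched roughly as follows, using plain numpy as a dependency-free stand-in (in the actual heat test, ht.unique would be compared against np.unique and the result's split attribute would be checked as well; the helper name here is hypothetical):

```python
import numpy as np


def check_unique(values, expected_values, expected_dtype):
    # Hypothetical sketch: verify values, shape, and dtype of np.unique's result.
    result = np.unique(values)
    assert np.array_equal(result, expected_values)  # values are the sorted uniques
    assert result.shape == expected_values.shape    # shape matches
    assert result.dtype == expected_dtype           # dtype is preserved


# Exercise the check with doubles and longs, as requested in the review.
check_unique(np.array([3.0, 1.0, 3.0, 2.0], dtype=np.float64),
             np.array([1.0, 2.0, 3.0]), np.float64)
check_unique(np.array([3, 1, 3, 2], dtype=np.int64),
             np.array([1, 2, 3], dtype=np.int64), np.int64)
```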
This looks great. Thank you for adding this!
Check if the heat_array is equivalent to the numpy_array. First, the split heat_array is compared locally to
the corresponding numpy_array slice; second, the heat_array is combined and fully compared with the
numpy_array.
Note that if the heat array is split, it also needs to be balanced.
A secondary note should be added here that the standard numpy dtype is float64. This is different from torch/heat, where the default is float32. Essentially, the numpy arrays used in the function should be created with the proper dtype.
Is this a general rule, or can there be exceptions or single functions that differ in their behaviour? If the user wants to be certain, they can always pass args where the dtype is explicitly defined. I just don't want to make any general assumptions about the behaviour of the functions which might not be correct and would result in unexpected results.
Yes, the dtype can change after a function. My concern is that the types will not match up because of the difference in the packages' standard types.
I think the thing to do would be to create the numpy array with the dtype of the heat array, and if no dtype is available then it should be created with dtype=float32. That way they have the same starting point, and I believe that numpy will only increase the precision if it is necessary, i.e. type changes which occur in the course of the function will remain at the same precision.
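The precision concern can be illustrated with plain numpy (torch appears only in a comment, to keep the sketch dependency-free): numpy defaults to float64 where torch/heat default to float32, so creating the reference array with the heat array's dtype gives both sides the same starting point.

```python
import numpy as np

# numpy's default floating dtype is float64 ...
assert np.array([1.0, 2.0]).dtype == np.float64

# ... while torch (and hence heat) defaults to float32:
#   torch.tensor([1.0, 2.0]).dtype  ->  torch.float32

# Creating the numpy reference with the heat array's dtype (assumed
# float32 here) avoids spurious dtype mismatches in the comparison:
reference = np.array([1.0, 2.0], dtype=np.float32)
assert reference.dtype == np.float32

# numpy only increases the precision when an operation requires it:
assert (reference + np.float32(1)).dtype == np.float32
assert (reference + np.array([1.0, 2.0])).dtype == np.float64
```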
Sorry, I think I don't get the problem you are describing. This function only compares two arrays provided by the user in terms of values and type. Under which circumstances would the fact that numpy creates 64-bit datatypes by default, while heat/torch creates 32-bit datatypes, affect this process?
raise TypeError('The input tensors type must be one of [tuple, list, ' +
                'numpy.ndarray, torch.tensor] but is {}'.format(type(tensor)))

np_res = numpy_func(np_array, **numpy_args)
dtype (float64 vs float32)
The user is always able to define this using the numpy_args. I want to keep the function general and therefore not make too many assumptions about what the user actually wants.
# Conflicts:
#	heat/core/manipulations.py
Most of the requested changes have been fixed or discussed. It would be great if you could take a new look.
Bump, black error.
Black got fixed.
Description
Target issue #323
This PR introduces a test suite for improved unit testing of our functions.
Most test cases are quite trivial or don't involve the splitting of tensors, because for different numbers of nodes the result looks different. Furthermore, many edge cases are left untested.
This test suite helps in creating better tests by reducing boilerplate code for comparing a heat array with the expected numpy result.
For even more simplicity, heat functions that should perform equivalent to numpy functions can be tested in one single line on random numbers or a predefined array.
The general idea is to compare a (split) heat tensor with a numpy tensor without having to worry about differing split axis or number of nodes.
Feel free to tell me which functionality is missing that would reduce your time writing unit tests.
If this PR gets merged the next step would be to apply this test suite to all existing functions.
To get an idea how unit tests with this test suite might look like, check out the test_basic_test.py file or have a look at the documentation examples.
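To illustrate the intended one-line style, here is a hedged sketch: the helper name assert_func_equal and its signature are assumptions based on this discussion, not the suite's actual API, and plain numpy stands in on both sides so the sketch runs without heat installed.

```python
import numpy as np


def assert_func_equal(heat_func, numpy_func, shape, dtype=np.float32):
    # Hypothetical stand-in for the suite's one-line helper: run both
    # functions on the same random data and compare the results.
    data = np.random.randn(*shape).astype(dtype)
    np.testing.assert_allclose(heat_func(data), numpy_func(data), rtol=1e-6)


# In a real heat test the first argument would be a heat function such as
# ht.exp; numpy is used on both sides here just to show the call pattern.
assert_func_equal(np.exp, np.exp, shape=(4, 5))
```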
Type of change