Optimize ERF with MKL math function #5
base: master
Conversation
@with_seed()
def test_math():
    ops = ['log', 'erf', 'square']
    check_value= True
nit: space after check_value
    ops = ['log', 'erf', 'square']
    check_value= True
    lshape = 1000
    rshapes = [1, 10, 100, 1000, 10000]
Is there any reason why you picked 5 shapes?
The test case was initially intended to evaluate the scalability of MKL VML functions on different tensor sizes. Since it is not appropriate to track calculation time in a unit test, I will change the shapes to 1-D, 2-D, and 3-D tensors.
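As a rough illustration of the kind of correctness check discussed above (not the PR's actual test), an elementwise erf implementation can be compared against a reference across several tensor ranks. Here the Abramowitz-Stegun rational approximation stands in for a vectorized backend such as MKL's vsErf, and plain Python lists stand in for NDArrays; all names below are hypothetical.

```python
import math

def erf_approx(x):
    """Abramowitz & Stegun 7.1.26 rational approximation of erf.

    Stands in for a vectorized backend (e.g. MKL VML) in this sketch;
    its maximum absolute error is about 1.5e-7.
    """
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
           + t * (-1.453152027 + t * 1.061405429))))
    return sign * (1.0 - poly * math.exp(-x * x))

def check_elementwise(op, ref, flat_inputs, tol=1e-6):
    """Compare op against ref elementwise, correctness-test style."""
    return all(abs(op(v) - ref(v)) < tol for v in flat_inputs)

# 1-D, 2-D and 3-D shapes, flattened to one value list each, mirroring
# the idea of testing several tensor ranks rather than timing them.
shapes = [(6,), (2, 3), (2, 3, 4)]
for shape in shapes:
    n = 1
    for d in shape:
        n *= d
    values = [(-3.0 + 6.0 * i / max(n - 1, 1)) for i in range(n)]
    assert check_elementwise(erf_approx, math.erf, values)
```

The point of the harness is that shapes vary the problem size for coverage, while the tolerance check, not a timer, decides pass or fail.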
def math_square(shape, dtype, check_value):
    np_x = np.random.rand(shape[0], shape[1])
    x = mx.nd.array(np_x, dtype=dtype)
    mx.nd.waitall()
Why do we need to frequently add waitall and wait_to_read in tests?
The test case was initially intended to measure the time cost of the calculations on tensors of different sizes. We will remove waitall and wait_to_read from the test case, since it now targets the correctness of the calculation.
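The distinction in the comment above is that explicit barriers matter for timing deferred work, while a correctness test synchronizes implicitly the moment it reads a result. A minimal sketch with a toy deferred-execution engine (loosely analogous to MXNet's async engine, not the real implementation; all names are hypothetical):

```python
class DeferredEngine:
    """Toy model of a deferred-execution engine: operations are queued
    when "called" and only run at a synchronization point."""

    def __init__(self):
        self._pending = []
        self.results = {}

    def push(self, key, fn):
        # Calling an op just enqueues it; nothing executes yet.
        self._pending.append((key, fn))

    def waitall(self):
        # Explicit barrier: runs everything queued so far. This is what
        # a timing benchmark must call before stopping the clock.
        for key, fn in self._pending:
            self.results[key] = fn()
        self._pending.clear()

    def read(self, key):
        # Reading a result forces synchronization first, which is why a
        # correctness test that inspects outputs needs no explicit waitall.
        self.waitall()
        return self.results[key]

engine = DeferredEngine()
engine.push("square", lambda: 3 * 3)
assert "square" not in engine.results   # nothing has run yet
assert engine.read("square") == 9       # read() synchronizes implicitly
```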
@@ -35,9 +35,10 @@
#include "../mxnet_op.h"
#include "../elemwise_op_common.h"
#include "../../ndarray/ndarray_function.h"

#if MSHADOW_USE_MKL == 1
I have not set USE_MKL before. Just curious: is blas=MKL also tested in mxnet CI?
MSHADOW_USE_MKL is widely used in mshadow and mxnet to indicate that MKL is used as the BLAS library. Yes, USE_BLAS=mkl is built and tested in CI:
https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L375
https://github.com/apache/incubator-mxnet/blob/master/ci/docker/runtime_functions.sh#L553
 * This macro means MXNet is compiled with MKL to accelerate math functions.
 * It registers FCompute with UnaryOp::MKL_Compute() to complete the math function.
 */
#define MXNET_MKL_OPERATOR_REGISTER_UNARY(__name$) \
How is this different from MXNET_OPERATOR_REGISTER_UNARY?
The implementations of these two macros were the same; the duplicated code has been removed.
)code" ADD_FILELINE)
.set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{"_backward_square"});
#else
Hmmm this does not scale to 100 ops. We need to think about better ways.
Yes. We need to revisit the design to make it scalable for both operator registration and kernel launching everywhere in MXNet.
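One possible direction for the scalability concern raised above is table-driven registration: instead of a per-op, per-backend macro, each op becomes one entry in a table mapping a name to a scalar fallback and an optional vectorized kernel. A minimal sketch under that assumption, with all names hypothetical and plain Python standing in for the C++ registry:

```python
import math

# Hypothetical op registry: one entry per op, holding a scalar fallback
# and an optional "vml" (vectorized) kernel. Adding an op touches only
# this table, rather than duplicating a registration macro per backend.
UNARY_OPS = {
    "erf":    {"fallback": math.erf,        "vml": None},
    "log":    {"fallback": math.log,        "vml": None},
    "square": {"fallback": lambda v: v * v, "vml": None},
}

def compute(name, data, use_mkl=False):
    """Dispatch to the vectorized kernel when available and enabled,
    else map the scalar fallback over the input (a stand-in for the
    generic kernel launch)."""
    entry = UNARY_OPS[name]
    kernel = entry["vml"] if use_mkl else None
    if kernel is not None:
        return kernel(data)
    return [entry["fallback"](v) for v in data]

# Registering a vectorized path later is a one-line table update, and
# both paths stay behind the same compute() entry point.
UNARY_OPS["square"]["vml"] = lambda xs: [v * v for v in xs]
assert compute("square", [1.0, 2.0], use_mkl=True) == [1.0, 4.0]
assert compute("square", [1.0, 2.0], use_mkl=False) == [1.0, 4.0]
```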
Fix review comments
Description
(Brief description on what this PR is about)
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments