Conversation
values[0].v_handle = const_cast<DLTensor*>(&(tblobs[0].dltensor()));

// scalar param
type_codes[1] = kDLFloat;
@yzhliu Since I need to pass a `double` param to the op func generated by TVM, I cannot use the `Call` function defined in the `TVMOpModule`. I moved the logic of preparing `TVMArgs` up here from the `Call` function to MXNet op's `FCompute` function and added an independent `CallEx` in `TVMOpModule` to just invoke the kernel. We can discuss the change of the API to cater for more use cases.
4b372cc to 595e2f7
88e63b4 to 127b036
d7f2963 to 327e0f7
Makefile (outdated)
@@ -473,11 +473,13 @@ CFLAGS += -I$(TVM_PATH)/include -DMXNET_USE_TVM_OP=1
LDFLAGS += -L$(ROOTDIR)/lib -ltvm_runtime -Wl,-rpath,'$${ORIGIN}'

TVM_USE_CUDA := OFF
TVM_OP_CUDA_ARCH := NONE
Any particular reason you are introducing a second set instead of using the arch set variable we already have? In which use case would these two differ?
Reverted. Thanks for the suggestion.
__macro$(__VA_ARGS__, int32_t); \
__macro$(__VA_ARGS__, int64_t); \
__macro$(__VA_ARGS__, bool)

#define IMPLEMENT_WORKLOAD_VALUE_FOR_TYPE(__op$, __typ$) \
Could you please add a comment to this macro to clarify what it does?
@@ -240,27 +236,38 @@ def is_int(dtype):
in_data_dim = random.choice([2, 3, 4])
shape = rand_shape_nd(in_data_dim, dim=3)
acc_type = {'float16': 'float32', 'float32': 'float64', 'float64': 'float64',
- 'int8': 'int32', 'int32': 'int64', 'int64': 'int64'}
+ 'int8': 'int32', 'int32': 'int64', 'int64': 'int64', 'bool': 'int64'}
for hybridize in [False, True]:
Would using https://docs.python.org/3.7/library/itertools.html#itertools.product help the readability of the code and make it less nested?
Thanks for the suggestion. Will consider refactoring it in the following PRs.
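For reference, the `itertools.product` suggestion in this thread could look roughly like the sketch below. It is only an illustration, not the test code from this PR: apart from `hybridize`, the parameter names and value lists (`keepdims_options`, `dtypes`) are hypothetical stand-ins for whatever the nested loops in the test iterate over.

```python
import itertools

# Hypothetical parameter grids; only the hybridize values come from the diff.
hybridize_options = [False, True]
keepdims_options = [True, False]
dtypes = ['float32', 'int32', 'bool']


def nested_cases():
    """Enumerate test cases with one loop per parameter (the nested form)."""
    cases = []
    for hybridize in hybridize_options:
        for keepdims in keepdims_options:
            for dtype in dtypes:
                cases.append((hybridize, keepdims, dtype))
    return cases


def product_cases():
    """Enumerate the same cases, in the same order, with one flat loop."""
    return list(itertools.product(hybridize_options, keepdims_options, dtypes))


for hybridize, keepdims, dtype in product_cases():
    # A real test would build the op and check its output here.
    pass
```

`itertools.product` varies the rightmost argument fastest, so the flat loop visits cases in the same order as the nested loops and the refactor would not change which combination runs first.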
c7e273c to db57be4
Approve for build system
db57be4 to b49cfce
Add np.equal implemented using tvmop
Fix setting DLDataType conversion for boolean ndarray
Add equal_gpu
Fix inputs with different ndims
Fix copying boolean ndarrays across devices
Refactor binary logic op impl by tvm
Add more logic ops
Refactor TVMOpModule::Call to CallEx
Add binary scalar logic op expr and schedule
Add binary scalar logic ops
Add free functions for logic ops
Rebase with master to fix SetDLTensor bug
Fix pylint
Add sum op for boolean ndarrays using tvm op module
Add sum boolean gpu compute
Add bool type support to boolean_mask
Boolean indexing working
Clean up
Fix merge
Sync Makefile
Rebase
Add boolean indexing test
Fix sanity
Fix gpu and add autograd test
Rebase
Fix test for windows
Fix tests
Try to fix cuda arch missing error in ci
Fix ci
Fix windows build
Try to fix cmake
Fix cmake
Fix
Revert config.mk
5dffe7a to a6dac14
LGTM
Description
`np.bool_` as their output tensors' `dtype`.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Comments
Follow-up work includes:
`ndarray` boolean indexing
Thanks to @yzhliu and @hzfan for the help with debugging.