Skip to content

Conversation

@wesm
Copy link
Member

@wesm wesm commented Feb 14, 2019

Also resolves ARROW-4110, which has been on my list for some time.

This ended up being a huge pain.

  • detail::PrimitiveAllocatingUnaryKernel can now allocate memory for any kind of fixed width type.
  • I factored out simple bitmap propagation into detail::PropagateNulls
  • I moved the null count resolution code one level down into ArrayData, since there are cases where it may be set to kUnknownNullCount (e.g. after a slice) and you need to know what it is. This isn't tested but I suggest addressing this in a follow up patch

I also moved hand-maintained macro spaghetti for instantiating CastFunctors into a Python code-generation script. This might be the most controversial change in this patch, but the problem here is that we needed to exclude 1 macro case for each numeric type -- currently they were relying on NUMERIC_CASES. This means the list of generated types is slightly different for each type, lending to poor code reuse. Rather than maintaining this code by hand, it is so much simpler to generate it with a script.

Speaking of code generation, I think we should continue to invest in code generation scripts to make generating mundane C++ code for pre-compiled kernels simpler. I checked the file in but I'm not opposed to auto-generating the files as part of the CMake build -- we could do that in a follow up PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting this stuff felt sooooo good

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yesssss.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was being maintained with templates before... that was overly elaborate. If this gets too complicated, we can easily code-generate it

@wesm
Copy link
Member Author

wesm commented Feb 14, 2019

@emkornfield @xhochy @fsaintjacques for your comments...

Copy link
Contributor

@emkornfield emkornfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a first pass through. I need to look at where the aggregate stuff landed to see if output_type makes sense there.

On code generation I think its worth experimenting with it, but we should try to have some ground rules for when it makes sense (and a CMAKE rule to regenerate automatically so generated code doesn't get out of sync with checked in code. I think I prefer seeing the generated code so I do think it should be checked in, but I think I might prefer it living in its own directory if possible).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do DCHECKS support a message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copy-pasta. I'll add a message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the memset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably isn't necessary, but it removing it breaks unit tests, could we file a follow-up JIRA to figure out why and remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: I know this is the convention in the code base if we pass in parameters that are going to be copied anyways, we might consider passing by value and using std::move inside the method instead. This allows for passing one time use parameters with std::move as well, which can avoid a copy. The performance benefits (especially for shared_ptrs) aren't clear

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK with me. done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to indicate what script generated the file? (I know grep/directory structure would make it not too hard to find but still might be useful).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NULLPTR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's only necessary in header files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to add a test for this (one of the reasons I added gmock...

I didn't on the first pass because I figured I would add it as part of the TODO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have pushed harder to get ValueCounts done :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not your change but can we use the unknown null count constant here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious why the preference for this form vs the original?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way the memory pool doesn't have to leak. I don't have a strong preference

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither do I just curious about the change.

@wesm
Copy link
Member Author

wesm commented Feb 14, 2019

I'll take care of the small comments tomorrow. I'm sort of against writing a lot more new code in this patch since the goal here is to refactor and remove dead code while leaving existing behaviors unchanged

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i left a bug in here that ASAN caught, we should also check that buffer_size > 0 (I can file a JIRA and submit a PR separately if you want me to handle it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Can you add a unit test as a follow up?

@xhochy
Copy link
Member

xhochy commented Feb 14, 2019

I can understand the need for the code generation but I still hope that someday a C++ ninja comes along and removes it again. With C++ template system and C preprocessor, I would not have expected that we need code generation.

@wesm
Copy link
Member Author

wesm commented Feb 14, 2019

So TensorFlow has a similar situation for casts, and they have avoided code generation:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/cast_op_impl.h#L36

It's easier in TensorFlow, though:

  • Only deals in numeric types, no date/time types, string types, or nested types to deal with
  • Does not appear to do zero copy optimizations (?), always produces a copy AFAICT

I view code generation as a last resort, so I don't exact us to get into the habit of it

wesm added 3 commits February 14, 2019 11:03
…nside CastKernel. Use code generation to avoid instantiating CastFunctors for identity casts that are never used

Change-Id: Ied8293612f46b953bab6ff1fdd737a43a65cba8f
Change-Id: I694a562066ff49863edf25eb81feba063390d0f6
Change-Id: I5180e072b4451ec575c6b9157679fa31dfba5b74
@wesm
Copy link
Member Author

wesm commented Feb 14, 2019

Would like to merge this once the build is passing, if there are any parting thoughts?

wesm added 2 commits February 14, 2019 13:04
Change-Id: Ia083b00f529b9fc172694417e0e42cbe9b00b480
Change-Id: I5ccbff285faf8e364d939d46269ff8ae1186509c
@wesm
Copy link
Member Author

wesm commented Feb 15, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants