Conversation


@wesm wesm commented Feb 13, 2019

I admit this feels gross, but it's less gross than what was there before. I can do some more cleanup, but I wanted to get feedback before spending any more time on it.

So, the problem partially lies with the gRPC C++ library. The obvious approach, and the first thing I tried, was to specialize SerializationTraits<protocol::FlightData> and cast between FlightData and protocol::FlightData (the proto) at the last possible moment. Unfortunately, that does not seem to be possible because of this:

https://github.com/grpc/grpc/blob/master/include/grpcpp/impl/codegen/proto_utils.h#L100

So I had to override that Googly hack and resort to some shenanigans (see protocol.h/protocol.cc) to make sure the same templates are always visible both in Flight.grpc.pb.cc and in our client.cc/server.cc.
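The difficulty comes from how the gRPC traits pattern dispatches serialization. A self-contained mock of that pattern (the names Message, SomeProto, FlightData, and the string return values here are purely illustrative, not the real gRPC API) shows why a full specialization for a non-protobuf type must be visible wherever the templates are instantiated:

```cpp
#include <string>
#include <type_traits>

// Stand-in for grpc::protobuf::Message.
struct Message {};

// Primary template: declared but intentionally not defined, so any type
// without a matching specialization fails to compile.
template <class T, class Enable = void>
struct SerializationTraits;

// gRPC's proto_utils.h provides roughly this shape of partial
// specialization, which captures every type derived from Message.
template <class T>
struct SerializationTraits<
    T, typename std::enable_if<std::is_base_of<Message, T>::value>::type> {
  static std::string Serialize(const T&) { return "protobuf path"; }
};

// Our own non-protobuf payload type with a full specialization. This only
// takes effect in translation units where the specialization is visible.
struct FlightData {};

template <>
struct SerializationTraits<FlightData, void> {
  static std::string Serialize(const FlightData&) { return "zero-copy path"; }
};

struct SomeProto : Message {};

inline std::string SerializeProto() {
  return SerializationTraits<SomeProto>::Serialize(SomeProto{});
}
inline std::string SerializeFlight() {
  return SerializationTraits<FlightData>::Serialize(FlightData{});
}
```

If a translation unit instantiates the templates without seeing the FlightData specialization, it silently gets (or fails to find) a different implementation, which is the visibility problem protocol.h/protocol.cc works around.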

…_cast on the client and server C++ types

Change-Id: I19b28f112c52cc658b4f37f9516607b85738726c

wesm commented Feb 13, 2019

cc @pitrou @lihalite


pitrou commented Feb 13, 2019

Perhaps it would be useful to ping the gRPC community to find out whether this is idiomatic.

if (!custom_writer->grpc::ClientWriter<IpcPayload>::Write(payload, grpc::WriteOptions())) {

if (!writer_->Write(*reinterpret_cast<const pb::FlightData*>(&payload),

Basically we are praying that Write doesn't do anything with the FlightData pointer except pass it to SerializationTraits<FlightData>, right?



#include "arrow/flight/protocol.h"

#include "arrow/flight/Flight.grpc.pb.cc" // NOLINT

So the trick of protocol.cc and protocol.h is to make sure you can only get the generated Protobuf code along with our template specialization?


Right


It appears that the linker (at least with gcc on Linux) was seeing the vtable generated in Flight.grpc.pb.cc and overriding the ones in server.cc/client.cc. This was the only way I could make sure that the same code is generated everywhere.
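The single-point-of-inclusion trick can be sketched as a file layout (a simplified illustration based on the snippets in this review; the real declarations live in arrow/flight/protocol.h and protocol.cc):

```cpp
// protocol.h (sketch): the only header the rest of Flight should use to
// reach the generated protobuf types. It declares our SerializationTraits
// specializations, so every translation unit that includes it sees the
// same templates:
//
//   #include "arrow/flight/Flight.pb.h"
//   namespace grpc {
//   template <>
//   class SerializationTraits<arrow::flight::protocol::FlightData> { ... };
//   }  // namespace grpc
//
// protocol.cc (sketch): compiles the generated service code in the same
// translation unit as the specializations, so the linker cannot pick up a
// vtable or template instantiation produced without them:
//
//   #include "arrow/flight/protocol.h"
//   #include "arrow/flight/Flight.grpc.pb.cc"  // NOLINT
```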


I think I'd appreciate a comment about that just so that someone else looking at it isn't confused.


done

@ghost ghost left a comment


This feels icky, but less icky than before.

grpc/grpc#15764 is the upstream issue we need, I think (it'd be ideal if whatever API they come up with eventually just gives us a byte stream/sink).


pitrou commented Feb 13, 2019

@lihalite Interesting.
Apparently, https://github.com/tensorflow/tensorflow/blob/ced4fddc0400fedce8ba15fb9a3e6e765ddda757/tensorflow/core/distributed_runtime/rpc/grpc_tensor_coding.cc#L171 shows how to do actual zero-copy writes, i.e. write directly from application data to the socket, by creating a bunch of slices that point to application data and chaining them in the ByteBuffer (which can contain any number of slices).


ghost commented Feb 13, 2019

@pitrou That is awesome, maybe we should do that in a follow-up.


wesm commented Feb 13, 2019

@pitrou I was going to propose that on the e-mail thread, but you found it already =) Composing a slice from smaller zero-copy constituent slices is the way to avoid memcpy there. The only annoyance is that we'll have to add padding slices, but c'est la vie.
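The idea of chaining non-owning slices, with small zero-filled padding slices interleaved to keep each buffer 8-byte aligned, can be sketched in plain C++ (SliceView, AppendWithPadding, and TotalSize are illustrative stand-ins, not the gRPC Slice/ByteBuffer API):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Non-owning view of application memory; queuing one copies no payload bytes,
// mimicking a zero-copy gRPC slice.
struct SliceView {
  const uint8_t* data;
  size_t size;
};

// Shared zero-filled buffer backing all padding slices (the Java side does
// something similar with its PADDING_BUFFERS table).
static const uint8_t kZeros[8] = {0};

// Append `buf` to the slice chain, followed by a padding slice that rounds
// this buffer's contribution up to the next multiple of 8 bytes.
inline void AppendWithPadding(std::vector<SliceView>* out, const uint8_t* buf,
                              size_t size) {
  out->push_back({buf, size});
  size_t remainder = size % 8;
  if (remainder != 0) {
    out->push_back({kZeros, 8 - remainder});
  }
}

// Total on-the-wire size of the chained slices.
inline size_t TotalSize(const std::vector<SliceView>& slices) {
  size_t total = 0;
  for (const auto& s : slices) total += s.size;
  return total;
}
```

A 13-byte buffer thus contributes two slices (13 payload bytes plus a 3-byte padding slice), while an already-aligned buffer contributes only itself.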


ghost commented Feb 13, 2019

FWIW, the Java side also adds padding slices (it was one of the incompatibilities between the two implementations)

// [ARROW-4213] These buffers must be aligned to an 8-byte boundary in order to be readable from C++.
if (b.readableBytes() % 8 != 0) {
  int paddingBytes = 8 - (b.readableBytes() % 8);
  assert paddingBytes > 0 && paddingBytes < 8;
  size += paddingBytes;
  allBufs.add(PADDING_BUFFERS.get(paddingBytes).retain());
}


pitrou commented Feb 13, 2019

Hmm... why do we need padding slices?

Change-Id: I2f8d64eee08c3eb68eb38da64c4729f68cd0c466

wesm commented Feb 13, 2019

I commented on the gRPC thread


wesm commented Feb 13, 2019

@pitrou a buffer's size may not be a multiple of 8, but a buffer should already be padded, so we should be able to round the buffer size up to the next multiple of 8 and avoid the padding slices:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/flight/serialization-internal.h#L302
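The round-up itself is a one-liner; a minimal sketch (PaddedLength is an illustrative name, not the function in serialization-internal.h). The assumption being relied on is that the allocator already guarantees the bytes in [size, padded length) exist, so they can be sent directly instead of appending a padding slice:

```cpp
#include <cstdint>

// Round a buffer length up to the next multiple of 8. (size + 7) & ~7 clears
// the low three bits after bumping past the next boundary.
inline int64_t PaddedLength(int64_t size) {
  return (size + 7) & ~static_cast<int64_t>(7);
}
```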


pitrou commented Feb 13, 2019

Right... but the "padding" won't necessarily be zero-filled (for example, when taking a slice of a buffer). Is that a problem?


pitrou commented Feb 13, 2019

I looked at Tensorflow's use of gRPC and it seems they don't do zero-copy on the receiving side. For example with the SendTensor method in eager_service.proto, it seems the server will use regular deserialization for the tensor bytes data sent by the client.

Also, there are weird hacks where protobuf-generated definitions are replaced/overridden with hand-written ones; I'm not sure why that is:
https://github.com/tensorflow/tensorflow/blob/ced4fddc0400fedce8ba15fb9a3e6e765ddda757/tensorflow/core/distributed_runtime/rpc/eager/grpc_eager_service.h#L43
https://github.com/tensorflow/tensorflow/blob/ced4fddc0400fedce8ba15fb9a3e6e765ddda757/tensorflow/core/distributed_runtime/rpc/grpc_worker_service_impl.h#L98


wesm commented Feb 13, 2019

I opened https://issues.apache.org/jira/browse/ARROW-4562 about the composite byte buffer optimization

@wesm wesm changed the title WIP ARROW-4558: [C++][Flight] Implement gRPC customizations without UB ARROW-4558: [C++][Flight] Implement gRPC customizations without UB Feb 13, 2019

wesm commented Feb 13, 2019

I'm sure you could ask TF about it and find the right person to talk to.

You might want to try benchmarking between two of the nodes in the UL Nashville network rather than localhost-to-localhost. They are connected with gigabit Ethernet only (I'm upgrading the network switches, but I'm not sure I can get 10-gigabit working; too many factors are out of my control), so it would be interesting to see to what extent we are saturating the network bandwidth.


wesm commented Feb 13, 2019

@pitrou regarding padding, I could imagine an application that hashes IPC payloads for some purpose (caching?) having an issue with non-zero padding. It seems like an esoteric concern, though; I'm not sure it's worth paying the price, or at least "sanitize padding" should be an option that's disabled by default.


pitrou commented Feb 13, 2019

Perhaps the price isn't that large either. Ideally the batches are large, so adding an 8-byte zero-initialized slice may be cheap...

Change-Id: I7a191986b6544310ab71ae074ef218bb776d06f9

wesm commented Feb 13, 2019

OK to merge this with a passing build? This could benefit from an IWYU pass, but I might like to do that for all of Flight to clean up all the includes and forward declarations.


wesm commented Feb 13, 2019

I should rename protocol.h to protocol-internal.h so the header is not installed.


@pitrou pitrou left a comment


LGTM, one comment.

#include <memory>

#include "grpcpp/impl/codegen/config_protobuf.h"
#undef GRPC_OPEN_SOURCE_PROTO

Comment on this? I think it's easy to miss it or to misunderstand why it's here (as in "what happens if I remove this?").


done

using google::protobuf::io::CodedInputStream;
using google::protobuf::io::CodedOutputStream;

bool ReadBytesZeroCopy(const std::shared_ptr<arrow::Buffer>& source_data,

Oh, and you needn't redeclare this (it's already defined above).


done

… feedback

Change-Id: I908aa0568ea0f75968fe37c20fa7629ce9a9af36

wesm commented Feb 13, 2019

OK I'll wait for the build to run again and then merge


wesm commented Feb 13, 2019

Merging. A couple of builds experienced transient failures.
