Remote plan execution #14012

rongrong · 2020-01-24T18:37:24Z

Resolves #14053

Depended on by https://github.com/facebookexternal/presto-facebook/pull/1149

xumingming · 2020-05-09T14:09:44Z

One minor comment: I have seen the word THRIFT spread throughout the code, e.g. in RowExpressionInterpreter:

    case THRIFT:
        // do not interpret remote functions on coordinator
        return call(node.getDisplayName(), functionHandle, node.getType(),
                toRowExpressions(argumentValues, node.getArguments()));

I'd suggest replacing THRIFT with a more general word like REMOTE. The reason is that THRIFT limits remote execution to Thrift, and I think that once this PR is merged there will be a need to support mechanisms other than Thrift, e.g. a function implemented in an HTTP server. That would require us to repeat, for each new mechanism, all the modifications we made here for THRIFT.

By using REMOTE, we are defining a remote function execution framework. Only when we actually need to call the remote function do we specify the concrete technology to use (e.g. THRIFT, HTTP, etc.), which makes adding new mechanisms other than THRIFT easier.

rongrong · 2020-06-12T20:50:23Z

By using REMOTE, we are defining a remote function execution framework. Only when we actually need to call the remote function do we specify the concrete technology to use (e.g. THRIFT, HTTP, etc.), which makes adding new mechanisms other than THRIFT easier.

I agree with the principle. That's why in planning we only distinguish LOCAL vs REMOTE. But the FunctionImplementationType is an implementation-specific property. To be able to communicate with the remote service, the engine needs to know which protocol will be used, so I think it makes sense to distinguish specific protocols in the implementation type. This is still WIP, and the details of how this should be configured are not worked out yet, but making sure the protocol is easily configurable and extensible is a goal.
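The split described in this reply can be illustrated with a minimal sketch. It is not the PR's code: the type names `SketchLocality`, `SketchImplementationType` and `SketchPlanner`, the set of constants, and the `HTTP` entry in particular are hypothetical. The point is only that planning consults a coarse LOCAL/REMOTE answer while the protocol stays a property of the implementation type.

```java
// Hypothetical sketch; these are illustrative names, not Presto's actual classes.
enum SketchLocality
{
    LOCAL, REMOTE
}

enum SketchImplementationType
{
    SQL(SketchLocality.LOCAL),
    THRIFT(SketchLocality.REMOTE),
    HTTP(SketchLocality.REMOTE); // assumed future protocol, not part of this PR

    private final SketchLocality locality;

    SketchImplementationType(SketchLocality locality)
    {
        this.locality = locality;
    }

    // The coarse answer that planning needs.
    SketchLocality getLocality()
    {
        return locality;
    }
}

class SketchPlanner
{
    // The planning decision depends on locality only, never on the concrete protocol.
    static boolean needsRemoteProjection(SketchImplementationType type)
    {
        return type.getLocality() == SketchLocality.REMOTE;
    }
}
```

Under this shape, adding a protocol would mean adding one constant and one executor, with no change to the planning branch.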

.../facebook/presto/functionNamespace/execution/SimpleAddressSqlFunctionLanguageConfigSpec.java

...n/java/com/facebook/presto/functionNamespace/execution/thrift/ThriftSqlFunctionExecutor.java

...n/java/com/facebook/presto/functionNamespace/AbstractSqlInvokedFunctionNamespaceManager.java

rongrong · 2020-09-10T03:04:52Z

Ready for review @caithagoras @highker

caithagoras · 2020-09-10T16:27:09Z

Thanks! Will take a pass today.

presto-main/src/main/java/com/facebook/presto/sql/planner/PlanFragmenter.java

caithagoras · 2020-09-10T20:55:18Z

...gers/src/main/java/com/facebook/presto/functionNamespace/execution/SqlFunctionExecutors.java

+public class SqlFunctionExecutors
+{
+    private final Map<Language, FunctionImplementationType> supportedLanguages;
+    private final ThriftSqlFunctionExecutor thriftSqlFunctionExecutor;


This field does not have a getter; are we not using it?

Added in a later commit. I can also remove it from injection, I suppose. The commits are not well separated logically (because I broke them up later and both commits modified the same files).

...src/main/java/com/facebook/presto/functionNamespace/execution/SqlFunctionLanguageConfig.java

caithagoras · 2020-09-10T20:59:06Z

presto-tests/src/test/java/com/facebook/presto/tests/TestSqlFunctions.java

+            queryRunner.enableTestFunctionNamespaces(
+                    ImmutableList.of("testing", "example"),
+                    ImmutableMap.of(
+                            "supported-function-languages", "sql, java",


nit: Comma-separated values usually do not contain spaces. We can still support spaces, but maybe remove the space here in the test example.

It's good to test that spaces are fine, though?
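For reference, a whitespace-tolerant parse of such a value could look like the sketch below. The helper `SketchLanguageListParser` is hypothetical; the PR's actual config parsing is not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; not the parsing code used by the PR.
class SketchLanguageListParser
{
    // Splits on commas, trims surrounding whitespace, and drops empty entries,
    // so "sql, java" and "sql,java" produce the same list.
    static List<String> parse(String value)
    {
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .collect(Collectors.toList());
    }
}
```

Keeping the space in the test value exercises the trimming path, which is the argument made in the reply above.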

...n/java/com/facebook/presto/functionNamespace/AbstractSqlInvokedFunctionNamespaceManager.java

wenleix

"RemoteProjectOperator.java" generally looks good.

presto-main/src/main/java/com/facebook/presto/operator/RemoteProjectOperator.java

wenleix · 2020-09-11T20:19:18Z

presto-main/src/main/java/com/facebook/presto/operator/RemoteProjectOperator.java

+    private boolean processingPage()
+    {
+        for (int i = 0; i < result.length; i++) {
+            if (result[i] != null) {


If I understand correctly, if the result array contains a non-null element, it means addInput was called after the last getOutput. This is because in getOutput the result array is filled with null (line 110).

If that's the case, does it make sense to just have an AtomicBoolean for this flag, say something like inputsAdded? For two reasons:

More efficient (probably not really important)

I was originally looking into the difference between this and resultReady, and wondering why we don't need to check result[i].isDone()

I didn't quite get the second reason. I thought about the first one. I didn't do it because performance alone doesn't feel like enough reason to add an additional variable here. Maintaining two pieces of state separately has a "maintenance cost" as well. I can add it if you prefer; I don't have a strong opinion here.

// result array will be filled with all null values after getOutput() is called
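The two options weighed in this thread can be compared side by side in a simplified sketch. `SketchRemoteOperator` is a hypothetical stand-in, not the operator from the PR: pages are plain strings and every future completes immediately. It assumes at least one projection, since with an empty array the scan and the flag would disagree.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified stand-in for the operator under review.
class SketchRemoteOperator
{
    private final CompletableFuture<?>[] result;
    // The alternative suggested in review: an explicit flag kept in sync with the array.
    private final AtomicBoolean inputsAdded = new AtomicBoolean();

    SketchRemoteOperator(int projections)
    {
        this.result = new CompletableFuture<?>[projections];
    }

    void addInput(String page)
    {
        for (int i = 0; i < result.length; i++) {
            result[i] = CompletableFuture.completedFuture(page + "#" + i);
        }
        inputsAdded.set(true);
    }

    String[] getOutput()
    {
        String[] output = new String[result.length];
        for (int i = 0; i < result.length; i++) {
            output[i] = (String) result[i].join();
        }
        // result array is filled with all null values after getOutput() is called
        Arrays.fill(result, null);
        inputsAdded.set(false);
        return output;
    }

    // Array scan, as in the diff: any non-null slot means a page is in flight.
    boolean processingPageByScan()
    {
        for (CompletableFuture<?> future : result) {
            if (future != null) {
                return true;
            }
        }
        return false;
    }

    // Flag-based variant: constant time, but a second piece of state to keep consistent.
    boolean processingPageByFlag()
    {
        return inputsAdded.get();
    }
}
```

Both checks agree here because the flag is updated in exactly the places where the array is filled or cleared, which is the maintenance cost mentioned in the reply.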

presto-main/src/main/java/com/facebook/presto/operator/RemoteProjectOperator.java

caithagoras · 2020-09-12T02:48:26Z

lgtm % @wenleix's comment and failing builds.

wenleix · 2020-09-13T19:16:48Z

...gers/src/main/java/com/facebook/presto/functionNamespace/execution/SqlFunctionExecutors.java

+
+import static java.util.Objects.requireNonNull;
+
+public class SqlFunctionExecutors


Curious: why is this called SqlFunctionExecutors instead of SqlFunctionExecutor? Looks like there is just one ThriftSqlFunctionExecutor inside? 😂

Since execution is configurable, and Presto communication is primarily over HTTP, there could potentially be an HTTP executor as well. Or maybe we will want to add other RPC protocols later. Calling it "executors" instead of "executor" is just so we don't need to rename it if any of those happens 😂
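A minimal sketch of what the plural name anticipates: a holder that maps each configured language to a protocol and dispatches to the matching executor. All names here (`SketchProtocol`, `SketchExecutor`, `SketchExecutors`) and the `HTTP` constant are hypothetical; the class in the PR holds a single Thrift executor.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; none of these types are Presto's actual classes.
enum SketchProtocol
{
    THRIFT, HTTP // HTTP is an assumed future addition
}

interface SketchExecutor
{
    CompletableFuture<String> execute(String functionName, String input);
}

class SketchExecutors
{
    private final Map<String, SketchProtocol> supportedLanguages;
    private final Map<SketchProtocol, SketchExecutor> executors;

    SketchExecutors(Map<String, SketchProtocol> supportedLanguages, Map<SketchProtocol, SketchExecutor> executors)
    {
        this.supportedLanguages = supportedLanguages;
        this.executors = executors;
    }

    // Looks up the protocol configured for the language, then delegates to its executor.
    CompletableFuture<String> execute(String language, String functionName, String input)
    {
        SketchProtocol protocol = supportedLanguages.get(language);
        if (protocol == null) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        SketchExecutor executor = executors.get(protocol);
        if (executor == null) {
            throw new IllegalStateException("No executor configured for " + protocol);
        }
        return executor.execute(functionName, input);
    }
}
```

With this shape, a second protocol is one more map entry rather than a rename of the holder class.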

wenleix

"Remote function execution with thrift executor". Skimmed, generally looks good. Some random comments.

Don't have enough context about what RoutineCharacteristics.Language represents... are we planning to use Hive in production?

A side note: there seem to be two similar "RoutineCharacteristics" classes, in com.facebook.presto.spi.function and com.facebook.presto.sql.tree?

wenleix · 2020-09-13T19:33:14Z

...n/java/com/facebook/presto/functionNamespace/execution/thrift/ThriftSqlFunctionExecutor.java

+        this.thriftUdfClient = thriftUdfClient;
+    }
+
+    public CompletableFuture<Block> executeFunction(ThriftScalarFunctionImplementation functionImplementation, Page input, List<Integer> channels, List<Type> argumentTypes, Type returnType)


Curious: why not just return ListenableFuture? I see the Guava library is included in presto-function-namespace-managers?

The API in FunctionNamespaceManager is in SPI.
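One way to read this answer, stated here as an assumption rather than something the thread confirms, is that an SPI-level API is kept to JDK types and library-specific futures are bridged at the boundary. The sketch below shows such a bridge using only the JDK. `SketchLibraryFuture` and `SketchFutures` are hypothetical and are not the `toCompletableFuture` helper the diff calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a library-specific future type.
interface SketchLibraryFuture<T>
{
    // Invokes the callback with (value, null) on success or (null, error) on failure.
    void onComplete(BiConsumer<T, Throwable> callback);
}

class SketchFutures
{
    // Bridges the library-specific future to the JDK type that an SPI method can expose.
    static <T> CompletableFuture<T> toCompletableFuture(SketchLibraryFuture<T> future)
    {
        CompletableFuture<T> result = new CompletableFuture<>();
        future.onComplete((value, error) -> {
            if (error != null) {
                result.completeExceptionally(error);
            }
            else {
                result.complete(value);
            }
        });
        return result;
    }
}
```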

...n/java/com/facebook/presto/functionNamespace/execution/thrift/ThriftSqlFunctionExecutor.java

wenleix · 2020-09-14T04:31:53Z

...n/java/com/facebook/presto/functionNamespace/execution/thrift/ThriftSqlFunctionExecutor.java

+            return toCompletableFuture(thriftUdfClient.get(Optional.of(functionImplementation.getLanguage().getLanguage())).invokeUdf(
+                    new ThriftFunctionHandle(
+                            functionId.getFunctionName().toString(),
+                            functionId.getArgumentTypes().stream()


This comment doesn't request any change. Just purely a comment: at first glance seeing "argument types" contained in FunctionId looks weird 😃

I can totally imagine how this happened: we are running out of words such as FunctionSignature, etc., so FunctionId became the best word. 😂

Hmm, now that you mention it, it does feel non-obvious. But the unique identifier of a function is name + argument types, so that's why the name is FunctionId and it includes the name and the argument types.
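The "name + argument types" identity can be sketched as a small value class. `SketchFunctionId` is a hypothetical illustration with types represented as plain strings; it is not the class from the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch of a name-plus-argument-types identifier.
final class SketchFunctionId
{
    private final String functionName;
    private final List<String> argumentTypes;

    SketchFunctionId(String functionName, List<String> argumentTypes)
    {
        this.functionName = Objects.requireNonNull(functionName, "functionName is null");
        // Defensive copy so the identity cannot change after construction.
        this.argumentTypes = Collections.unmodifiableList(new ArrayList<>(argumentTypes));
    }

    // Two ids are equal only if both the name and the ordered argument types match,
    // so overloads of the same name get distinct ids.
    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof SketchFunctionId)) {
            return false;
        }
        SketchFunctionId other = (SketchFunctionId) obj;
        return functionName.equals(other.functionName) && argumentTypes.equals(other.argumentTypes);
    }

    @Override
    public int hashCode()
    {
        return Objects.hash(functionName, argumentTypes);
    }

    @Override
    public String toString()
    {
        return functionName + "(" + String.join(", ", argumentTypes) + ")";
    }
}
```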

rongrong · 2020-09-14T17:56:10Z

Don't have enough context about what RoutineCharacteristics.Language represents... are we planning to use Hive in production?

Language is from the SQL syntax CREATE FUNCTION... LANGUAGE language. The spec defines some languages like SQL, C, etc. We decided to extend the language to be an identifier so people don't need to pollute the syntax to add support for different external function APIs. The design of the syntax should not be affected by what Facebook plans or does not plan to introduce in production.

A side note: there seem to be two similar "RoutineCharacteristics" classes, in com.facebook.presto.spi.function and com.facebook.presto.sql.tree?

Yes, one is a syntax tree construct taken directly from the AST; the other is what we use in the plan.

rongrong · 2020-09-14T19:18:29Z

Comment addressed. @caithagoras @wenleix

The default Locality when not specified is UNKNOWN, which should be illegal after planning. We have a plan sanity check rule to make sure no ProjectNode has Locality UNKNOWN. Unfortunately, this code is added after the plan sanity check, so we didn't catch the error.
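The kind of check described here can be sketched as a recursive walk that rejects any node left at UNKNOWN. The types `SketchNodeLocality`, `SketchProjectNode` and `SketchLocalityChecker` are hypothetical, and the tree contains only project nodes for brevity; this is not the sanity check rule from the codebase.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the plan sanity checker from the codebase.
enum SketchNodeLocality
{
    UNKNOWN, LOCAL, REMOTE
}

final class SketchProjectNode
{
    final String id;
    final SketchNodeLocality locality;
    final List<SketchProjectNode> sources;

    SketchProjectNode(String id, SketchNodeLocality locality, SketchProjectNode... sources)
    {
        this.id = id;
        this.locality = locality;
        this.sources = Arrays.asList(sources);
    }
}

final class SketchLocalityChecker
{
    // Fails if this node or any node below it still has UNKNOWN locality.
    static void validate(SketchProjectNode node)
    {
        if (node.locality == SketchNodeLocality.UNKNOWN) {
            throw new IllegalStateException("ProjectNode " + node.id + " has UNKNOWN locality after planning");
        }
        for (SketchProjectNode source : node.sources) {
            validate(source);
        }
    }
}
```

A check like this only sees nodes that exist when it runs, so a node introduced afterwards with the default locality would go unnoticed, which matches the failure described above.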

rongrong force-pushed the remote-plan-execution branch 4 times, most recently from 324372b to 8796b28 January 28, 2020 01:05

rongrong force-pushed the remote-plan-execution branch from 8796b28 to e58a5a7 February 4, 2020 18:29

rongrong force-pushed the remote-plan-execution branch from e58a5a7 to 3df2ad4 March 4, 2020 21:57

rongrong force-pushed the remote-plan-execution branch 3 times, most recently from e46e379 to f893ce8 March 18, 2020 21:51

rongrong force-pushed the remote-plan-execution branch 5 times, most recently from b49c01e to a459539 June 12, 2020 20:19

rongrong force-pushed the remote-plan-execution branch from a459539 to d27084d July 15, 2020 22:13

rongrong force-pushed the remote-plan-execution branch from d27084d to a46b880 September 1, 2020 20:31

rongrong changed the title ~~[POC] Remote plan execution~~ Remote plan execution Sep 1, 2020

rongrong requested review from caithagoras, highker and prithvip September 1, 2020 21:10

caithagoras reviewed Sep 2, 2020

.../facebook/presto/functionNamespace/execution/SimpleAddressSqlFunctionLanguageConfigSpec.java Outdated

caithagoras reviewed Sep 2, 2020

rongrong force-pushed the remote-plan-execution branch 6 times, most recently from 23c0ccf to 42d1dd6 September 4, 2020 19:52

rongrong requested a review from caithagoras September 10, 2020 00:56

caithagoras reviewed Sep 10, 2020

rongrong force-pushed the remote-plan-execution branch from 2a80bce to c08821c September 10, 2020 21:50

rongrong requested a review from wenleix September 10, 2020 22:23

highker removed their request for review September 11, 2020 03:14

wenleix reviewed Sep 11, 2020

rongrong force-pushed the remote-plan-execution branch from c08821c to ef1163c September 12, 2020 00:40

wenleix reviewed Sep 13, 2020

wenleix reviewed Sep 14, 2020

rongrong force-pushed the remote-plan-execution branch 2 times, most recently from a54911c to c7666be September 14, 2020 19:17

rongrong requested review from wenleix and caithagoras September 14, 2020 19:17

caithagoras approved these changes Sep 14, 2020

rongrong force-pushed the remote-plan-execution branch from c7666be to 5520210 September 15, 2020 00:58

rongrong added 3 commits September 15, 2020 12:38

Remote function execution with thrift executor (3825bdb)

Support Remote function execution (aaeff3f)

rongrong force-pushed the remote-plan-execution branch from 5520210 to aaeff3f September 15, 2020 19:38

rongrong closed this Sep 15, 2020

rongrong deleted the remote-plan-execution branch September 15, 2020 19:38

rongrong merged commit aaeff3f into prestodb:master Sep 15, 2020

This was referenced Oct 6, 2020

Add release notes for 0.242 #15270 (Merged)

[Test] Add release notes for 0.242 #15291 (Closed)

[Test-Only] Add release notes for 0.242 #15294 (Closed)
