Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote plan execution #14012

Merged
merged 3 commits into from
Sep 15, 2020
Merged

Conversation

rongrong
Copy link
Contributor

@rongrong rongrong commented Jan 24, 2020

@rongrong rongrong force-pushed the remote-plan-execution branch 4 times, most recently from 324372b to 8796b28 Compare January 28, 2020 01:05
@rongrong rongrong force-pushed the remote-plan-execution branch from 8796b28 to e58a5a7 Compare February 4, 2020 18:29
@rongrong rongrong force-pushed the remote-plan-execution branch from e58a5a7 to 3df2ad4 Compare March 4, 2020 21:57
@rongrong rongrong force-pushed the remote-plan-execution branch 3 times, most recently from e46e379 to f893ce8 Compare March 18, 2020 21:51
@xumingming
Copy link
Contributor

xumingming commented May 9, 2020

One minor comment, I have seen the word THRIFT spread out in the code, e.g. RowExpressionInterpreter:

 case THRIFT:
     // do not interpret remote functions on coordinator
      return call(node.getDisplayName(), functionHandle, node.getType(), 
                        toRowExpressions(argumentValues, node.getArguments()));

I'd suggest replace THRIFT with some more general word like REMOTE, the reason is: THRIFT limit the remote execution to use Thrift, while I think once this PR is merged, there must be needs to support mechanism other than thrift, e.g. function implemented in a HTTP server, it will requires us to repeat all the modification for the new mechanism as we did here for THRIFT.

By using REMOTE, we are defining a remote function execution framework, only when we actually need to call the remote function, we specify the concrete technology to use e.g. THRIFT/HTTP/ etc, it makes adding new mechanism other than THRIFT easier.

@rongrong rongrong force-pushed the remote-plan-execution branch 5 times, most recently from b49c01e to a459539 Compare June 12, 2020 20:19
@rongrong
Copy link
Contributor Author

By using REMOTE, we are defining a remote function execution framework, only when we actually need to call the remote function, we specify the concrete technology to use e.g. THRIFT/HTTP/ etc, it makes adding new mechanism other than THRIFT easier.

I agree with the principle. That's why in planning we only distinguish LOCAL vs REMOTE. But the FuncitonImplementationType is an implementation specific property. To be able to communicate with the remote service, the engine needs to know which protocol will be used. So i think it makes sense to distinguish specific protocols for implementation type. This is still WIP and the details of how this should be configured are not worked out yet. But making sure the protocol is easily configurable / extendible is a goal.

@rongrong rongrong force-pushed the remote-plan-execution branch from a459539 to d27084d Compare July 15, 2020 22:13
@rongrong rongrong force-pushed the remote-plan-execution branch from d27084d to a46b880 Compare September 1, 2020 20:31
@rongrong rongrong changed the title [POC] Remote plan execution Remote plan execution Sep 1, 2020
@rongrong rongrong force-pushed the remote-plan-execution branch 6 times, most recently from 23c0ccf to 42d1dd6 Compare September 4, 2020 19:52
@rongrong
Copy link
Contributor Author

Ready for review @caithagoras @highker

@caithagoras
Copy link
Contributor

Thanks! Will take a pass today.

public class SqlFunctionExecutors
{
private final Map<Language, FunctionImplementationType> supportedLanguages;
private final ThriftSqlFunctionExecutor thriftSqlFunctionExecutor;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field does not have a getter, are we not using it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in later commit. I can also remove it from injection i suppose. The commits are not well separated logically (cause I broke them up later and both commits modified same files).

queryRunner.enableTestFunctionNamespaces(
ImmutableList.of("testing", "example"),
ImmutableMap.of(
"supported-function-languages", "sql, java",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Usually comma-separate do not contains space. We can still support space, but maybe remove space here in the test example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to test that space is fine though?

@rongrong rongrong force-pushed the remote-plan-execution branch from 2a80bce to c08821c Compare September 10, 2020 21:50
@rongrong rongrong requested a review from wenleix September 10, 2020 22:23
@highker highker removed their request for review September 11, 2020 03:14
Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"RemoteProjectOperator.java" generally looks good.

private boolean processingPage()
{
for (int i = 0; i < result.length; i++) {
if (result[i] != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, if result array contains non-null element, it means addInput was called after last getOutput: this is because in getOutput, result array will be filled with null (line 110).

If that's the case, does it make sense to just have a AtomicBoolean for this flag, say something like inputsAdded ? For two reason:

  1. More efficient (probably not really important)
  2. I was originally looking into the difference between resultReady, and wondering why we don't need to check result[i].isDone()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite get the second reason. I thought about the first one. Didn't do it cause feels performance is not enough reason to add an additional variable here. Maintaining two states separately has "maintenance cost" as well. I can add it if you prefer. Don't have strong opinion here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// result array will be filled with all null values after getOutput() get called

@rongrong rongrong force-pushed the remote-plan-execution branch from c08821c to ef1163c Compare September 12, 2020 00:40
@caithagoras
Copy link
Contributor

lgtm % @wenleix 's comment and failing builds.


import static java.util.Objects.requireNonNull;

public class SqlFunctionExecutors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious: why this is called SqlFunctionExecutors instead of SqlFunctionExecutor? Looks like there is just one ThriftSqlFunctionExecutor inside ? 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since execution is configurable, and Presto communication is primarily in http there could potentially be a http executor as well. Or maybe if we want to add other RPC protocol later. Calling it "executors" instead of "executor" just so we don't need to rename it in case any of those happens 😂

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Remote function execution with thrift executor". Skimmed, generally looks good. Some random comments.

Don't have enough context about what RoutineCharacteristics.Language represents...are we plan to use hive in production?

A side note is there seems to be two similar "RoutineCharacteristics" exists in com.facebook.presto.spi.function and com.facebook.presto.sql.tree?

this.thriftUdfClient = thriftUdfClient;
}

public CompletableFuture<Block> executeFunction(ThriftScalarFunctionImplementation functionImplementation, Page input, List<Integer> channels, List<Type> argumentTypes, Type returnType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious: why not just return ListenableFuture? I see guava library is included in presto-function-namepsaces-manager?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API in FunctionNamespaceManager is in SPI.

return toCompletableFuture(thriftUdfClient.get(Optional.of(functionImplementation.getLanguage().getLanguage())).invokeUdf(
new ThriftFunctionHandle(
functionId.getFunctionName().toString(),
functionId.getArgumentTypes().stream()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't request any change. Just purely a comment: at first glance seeing "argument types" contained in FunctionId looks weird 😃

I can totally imagine how does this happen. -- We are running out of words such as FunctionSigature, etc. ... So FunctionId becomes the best word. 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm now you mention it, it does feel non-obvious. But the unique identifier of a function is name + argument types so that's why the name is FunctionId and it includes the name and the argument types.

@rongrong
Copy link
Contributor Author

rongrong commented Sep 14, 2020

Don't have enough context about what RoutineCharacteristics.Language represents...are we plan to use hive in production?

Language is from the SQL syntax CREATE FUNCTION... LANGUAGE language. The spec defined some languages like SQL, C, etc. We decided to extend the language to be an identifier so people don't need to pollute the syntax to add support to different external function APIs. The design of the syntax should not be affected by what Facebook plan or not plan to introduce in production.

A side note is there seems to be two similar "RoutineCharacteristics" exists in com.facebook.presto.spi.function and com.facebook.presto.sql.tree?

yes, one is a syntax tree construct directly from ast, the other is what we use in plan.

@rongrong rongrong force-pushed the remote-plan-execution branch 2 times, most recently from a54911c to c7666be Compare September 14, 2020 19:17
@rongrong
Copy link
Contributor Author

Comment addressed. @caithagoras @wenleix

@rongrong rongrong force-pushed the remote-plan-execution branch from c7666be to 5520210 Compare September 15, 2020 00:58
The default Locality when not specified is UNKNOWN, which should be illegal
after planning. We have a plan sanity check rule to make sure no ProjectNode
has Locality UNKOWN. Unfortunately this code is added after plan sanity check
so we didn't catch the error.
@rongrong rongrong force-pushed the remote-plan-execution branch from 5520210 to aaeff3f Compare September 15, 2020 19:38
@rongrong rongrong closed this Sep 15, 2020
@rongrong rongrong deleted the remote-plan-execution branch September 15, 2020 19:38
@rongrong rongrong merged commit aaeff3f into prestodb:master Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote UDF execution
4 participants