
Support for multi-threaded Group By reducer for SQL. #6044

Merged
merged 1 commit into from
Oct 14, 2020

Conversation

@mayankshriv (Contributor) commented Sep 22, 2020

The existing implementation of the Broker reduce phase is single-threaded.
For group-by queries where large responses are sent back from multiple servers,
this can become a bottleneck.

Given that brokers are generally light on CPU usage, making the reduce phase
multi-threaded is a good way to boost performance. This PR adds a multi-threaded
implementation of the Group-By reducer for SQL.

  • Added an executor service in BrokerReduceService that can be used for multi-threaded
    execution of the broker reduce phase. It is initialized with the number of threads set to
    `Runtime.getRuntime().availableProcessors()`.

  • Added a broker-side config, `pinot.broker.max.reduce.threads.per.query`, to specify the
    maximum number of threads per query to use in the reduce phase. Its default value matches
    the server-side combine phase:
    `Math.max(1, Math.min(10, Runtime.getRuntime().availableProcessors() / 2))`

    To revert to single-threaded reduce, set this config to 1.

  • The GroupByDataTableReducer uses the following algorithm to determine the number of
    threads to use in the reduce phase (per query):

    • If there are fewer than 2 data tables to reduce, it uses a single-threaded run.
    • Otherwise, it uses `Math.min(pinot.broker.max.reduce.threads.per.query, numDataTables)`.

  • For testing, the number of reduce threads is explicitly set to > 1 to ensure functional
    correctness is exercised.
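The per-query thread-count selection described above can be sketched as a small helper (a sketch only: the class and method names and the hard-coded defaults are illustrative, not Pinot's actual identifiers):

```java
public final class ReduceThreadsSketch {
  // Default cap, mirroring the server-side combine phase formula quoted above.
  static final int DEFAULT_MAX_REDUCE_THREADS_PER_QUERY =
      Math.max(1, Math.min(10, Runtime.getRuntime().availableProcessors() / 2));

  // Per-query thread count: single-threaded for fewer than 2 data tables,
  // otherwise capped by the broker config value.
  static int chooseNumReduceThreads(int numDataTables, int maxReduceThreadsPerQuery) {
    if (numDataTables < 2) {
      return 1;
    }
    return Math.min(maxReduceThreadsPerQuery, numDataTables);
  }

  public static void main(String[] args) {
    System.out.println(chooseNumReduceThreads(1, 8));   // 1
    System.out.println(chooseNumReduceThreads(100, 8)); // 8
    System.out.println(chooseNumReduceThreads(4, 8));   // 4
  }
}
```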


@kishoreg (Member) left a comment:

LGTM. Thanks

@@ -161,6 +161,10 @@
public static final double DEFAULT_BROKER_MIN_RESOURCE_PERCENT_FOR_START = 100.0;
public static final String CONFIG_OF_ENABLE_QUERY_LIMIT_OVERRIDE = "pinot.broker.enable.query.limit.override";

// Config for number of threads to use for Broker reduce-phase.
public static final String CONFIG_OF_NUM_REDUCE_THREADS = "pinot.broker.num.reduce.threads";
public static final int DEFAULT_NUM_REDUCE_THREADS = 1; // TBD: Change to a more appropriate default (eg numCores).
@mayankshriv (Author) replied on Sep 22, 2020:

Actually, on second thought, the default value of 1 is not good, as it makes the reduce across concurrent queries sequential. Moreover, if we add more threads, it may cause contention in high-QPS use cases.

While we tune this, perhaps the behavior should be:

  • If the config is not explicitly specified, preserve the current behavior without an executor service, or perhaps use `MoreExecutors.newDirectExecutorService()`, which uses the calling thread to execute the Runnable.
  • If the config is specified, use an executor service with the number of threads specified in the config.

Thoughts @kishoreg @Jackie-Jiang ?

(I have updated the PR with the approach above).
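The fallback behavior proposed here could look roughly like the following (a sketch using a plain JDK `Executor`; `Runnable::run` stands in for Guava's `MoreExecutors.newDirectExecutorService()` to keep the example dependency-free, and the null-means-unset config handling is illustrative):

```java
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

public final class ReduceExecutorSketch {
  // If the config is absent, run reduce work on the calling thread
  // (the same effect as Guava's direct executor), preserving the old
  // single-threaded behavior. Otherwise, use a fixed-size pool.
  static Executor buildReduceExecutor(Integer configuredNumThreads) {
    if (configuredNumThreads == null) {
      return Runnable::run; // direct executor: task runs on the caller's thread
    }
    return Executors.newFixedThreadPool(configuredNumThreads);
  }

  public static void main(String[] args) {
    Executor direct = buildReduceExecutor(null);
    Thread caller = Thread.currentThread();
    direct.execute(() -> System.out.println(Thread.currentThread() == caller)); // true
  }
}
```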

(Member) replied:

This config is right, but the implementation can be changed. It should be similar to what we have in the combine operator: the executor pool is cached or capped at a high number based on the number of cores, but the number of callables we create is based on this config.

public BrokerReduceService(PinotConfiguration config) {
  _maxReduceThreadsPerQuery = config.getProperty(CommonConstants.Broker.CONFIG_OF_MAX_REDUCE_THREADS_PER_QUERY,
      CommonConstants.Broker.MAX_REDUCE_THREADS_PER_QUERY);
  LOGGER.info("Initializing BrokerReduceService with {} reduce threads.", _maxReduceThreadsPerQuery);
(Contributor) replied:

Log both the number of worker threads and the threads per query?
Also, if it is single-threaded, there is no need to launch the executor service.

@mayankshriv (Author) replied:

Initially, I had Guava's `MoreExecutors.directExecutor()`, which uses the current thread to run the task in the single-thread case. I decided to keep it simple and have the exact same code for the single- and multi-threaded cases (with the exception of the index table). We can revisit that if needed.

long timeOutMs = reducerContext.getReduceTimeOutMs() - (System.currentTimeMillis() - start);
countDownLatch.await(timeOutMs, TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
for (Future future : futures) {
(Contributor) commented:

(Critical) You need to put the timeout exception into the query response, or the response will be wrong and there will be no way to detect that.
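A hedged sketch of what this review comment asks for (the `Response` shape and all names here are hypothetical stand-ins, not Pinot's actual `BrokerResponse` API): when the latch times out or the wait is interrupted, record an exception in the response instead of silently returning a partial result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public final class ReduceTimeoutSketch {
  // Hypothetical stand-in for the broker's query response object.
  static final class Response {
    final List<String> exceptions = new ArrayList<>();
  }

  // Wait for the reduce workers; on timeout or interrupt, attach an
  // exception to the response so callers can detect the failure.
  static void awaitReduce(CountDownLatch latch, long timeoutMs, Response response) {
    try {
      if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
        response.exceptions.add("Timed out in broker reduce phase after " + timeoutMs + " ms");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      response.exceptions.add("Interrupted in broker reduce phase");
    }
  }

  public static void main(String[] args) {
    Response response = new Response();
    awaitReduce(new CountDownLatch(1), 10, response); // latch never counts down
    System.out.println(response.exceptions.size()); // 1
  }
}
```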

@codecov-io commented Oct 13, 2020

Codecov Report

Merging #6044 into master will increase coverage by 6.41%.
The diff coverage is 60.00%.


@@            Coverage Diff             @@
##           master    #6044      +/-   ##
==========================================
+ Coverage   66.44%   72.86%   +6.41%     
==========================================
  Files        1075     1225     +150     
  Lines       54773    57848    +3075     
  Branches     8168     8528     +360     
==========================================
+ Hits        36396    42150    +5754     
+ Misses      15700    12948    -2752     
- Partials     2677     2750      +73     
Flag Coverage Δ
#integration 45.28% <49.14%> (?)
#unittests 64.03% <38.09%> (?)

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...ot/broker/broker/AllowAllAccessControlFactory.java 100.00% <ø> (ø)
.../helix/BrokerUserDefinedMessageHandlerFactory.java 52.83% <0.00%> (-13.84%) ⬇️
...ava/org/apache/pinot/client/AbstractResultSet.java 53.33% <0.00%> (-3.81%) ⬇️
.../main/java/org/apache/pinot/client/Connection.java 44.44% <0.00%> (-4.40%) ⬇️
.../org/apache/pinot/client/ResultTableResultSet.java 24.00% <0.00%> (-10.29%) ⬇️
...not/common/assignment/InstancePartitionsUtils.java 78.57% <ø> (+5.40%) ⬆️
.../apache/pinot/common/exception/QueryException.java 90.27% <ø> (+5.55%) ⬆️
...pinot/common/function/AggregationFunctionType.java 98.27% <ø> (-1.73%) ⬇️
.../pinot/common/function/DateTimePatternHandler.java 83.33% <ø> (ø)
...ot/common/function/FunctionDefinitionRegistry.java 88.88% <ø> (+44.44%) ⬆️
... and 974 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86ce7c6...bb955b5.

@mayankshriv mayankshriv force-pushed the mt-broker branch 3 times, most recently from bb955b5 to e20f784 Compare October 13, 2020 23:06
@Jackie-Jiang (Member) left a comment:

I still feel that using a fixed thread pool for the single-threaded case can add overhead to the reducer, but if the perf shows no difference, I'm okay with it.

@mayankshriv (Author) replied:

I still feel that using a fixed thread pool for the single-threaded case can add overhead to the reducer, but if the perf shows no difference, I'm okay with it.

I agree that it will add overhead. The question is how much, and whether it is worth complicating the code with multiple implementations (one for single and one for multiple threads). Let's use this as a baseline and improve it if we see a degradation.

@mayankshriv mayankshriv merged commit a910f5d into apache:master Oct 14, 2020
@mayankshriv mayankshriv deleted the mt-broker branch October 14, 2020 03:20