Support yield in bucket sort table write getout to prevent stuck driver detection #11229

Closed

xiaoxmeng wants to merge 1 commit into facebookincubator:main from xiaoxmeng:export-D64159781

Contributor

xiaoxmeng commented Oct 11, 2024

Summary:
Support yielding in the middle of the sort writer's get output processing, both to prevent stuck driver detection
and to be friendly to other concurrently running queries and threads. We found in production that a long-running
get output call from the sort writer can trigger alerts, because it sorts the data, potentially reads spilled data
from remote storage, and encodes and flushes the output to remote storage through the file writer. This can take
an hour for a small bucketed table that has only 64 buckets, which means only 64 threads in the whole cluster
run the query.

This PR adds a finish API to the data sink and the file writer so that the table writer can do incremental sort and
flush processing. The data sink's finish API calls each file writer's finish API, and both check the configured finish
time slice limit, which is set through a hive config. Both APIs return false if finish needs to continue processing,
or true when it is done. Correspondingly, when the table writer gets output, it returns null if the data sink's finish
has more work to do, and sets the ready block future and yield reason for the driver framework to check and yield.
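
To make the incremental finish contract concrete, here is a minimal C++ sketch of the pattern described above. It is not the code from this PR: `SketchFileWriter`, `SketchDataSink`, `finishWithYields`, the injected `Clock`, and the millisecond time slice parameter are all hypothetical names chosen for illustration. The only behavior taken from the description is that `finish()` returns false when the time slice is used up and work remains, and true once everything is done.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical clock returning the current time in milliseconds. It is
// injected so the time slice check can be exercised deterministically.
using Clock = std::function<uint64_t()>;

// Hypothetical stand-in for a file writer whose finish work (sort, read
// spilled data, encode, flush) is split into a number of steps.
class SketchFileWriter {
 public:
  SketchFileWriter(int pendingSteps, Clock clock, uint64_t timeSliceMs)
      : pendingSteps_(pendingSteps),
        clock_(std::move(clock)),
        timeSliceMs_(timeSliceMs) {}

  // Returns true when all steps are done, or false if the time slice ran
  // out while steps remain. The caller is expected to call finish() again.
  bool finish() {
    const uint64_t start = clock_();
    while (pendingSteps_ > 0) {
      --pendingSteps_; // One unit of sort/flush work.
      if (pendingSteps_ > 0 && clock_() - start >= timeSliceMs_) {
        return false;
      }
    }
    return true;
  }

 private:
  int pendingSteps_;
  Clock clock_;
  uint64_t timeSliceMs_;
};

// Hypothetical stand-in for a data sink that owns one writer per bucket.
class SketchDataSink {
 public:
  SketchDataSink(
      std::vector<SketchFileWriter> writers,
      Clock clock,
      uint64_t timeSliceMs)
      : writers_(std::move(writers)),
        clock_(std::move(clock)),
        timeSliceMs_(timeSliceMs) {}

  // Finishes writers one by one and remembers where it stopped, so that a
  // later call resumes instead of starting over.
  bool finish() {
    const uint64_t start = clock_();
    while (next_ < writers_.size()) {
      if (!writers_[next_].finish()) {
        return false; // The writer used up its own time slice.
      }
      ++next_;
      if (next_ < writers_.size() && clock_() - start >= timeSliceMs_) {
        return false; // The sink used up its time slice.
      }
    }
    return true;
  }

 private:
  std::vector<SketchFileWriter> writers_;
  Clock clock_;
  uint64_t timeSliceMs_;
  std::size_t next_{0};
};

// Mimics the caller side: each false result is a point where the operator
// would return null from get output and let the driver yield the thread.
int finishWithYields(SketchDataSink& sink) {
  int yields = 0;
  while (!sink.finish()) {
    ++yields;
  }
  return yields;
}
```

In this sketch the loop in `finishWithYields` retries immediately. In the design described above, the table writer instead returns null with a ready block future and a yield reason, and the driver framework decides when to resume.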

This PR also changes the data sink and file writer interfaces to add a new finish state. A new hive config
is added for the finish time slice limit. The driver framework now also reports a yield requested by an operator;
previously it only reported the yield metric when the yield was triggered by the driver framework itself. A new
histogram metric is added to track the sort writer finish time distribution for monitoring.

Differential Revision: D64159781

xiaoxmeng requested a review from majetideepak as a code owner

October 11, 2024 00:06

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Oct 11, 2024

This pull request was exported from Phabricator. Differential Revision: D64159781

facebook-github-bot added the fb-exported label

netlify bot commented Oct 11, 2024 •

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`bb8e621`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/670a16d959ce9a0008bed086

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

011477f

…er detection (facebookincubator#11229)

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from a97d7de to 011477f Compare

October 11, 2024 00:34

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

039b198

…er detection (facebookincubator#11229)

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 011477f to 039b198 Compare

October 11, 2024 03:59

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

cce4b5d

…er detection (facebookincubator#11229)

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 039b198 to cce4b5d Compare

October 11, 2024 04:16

spershin approved these changes

Contributor

spershin left a comment

Thanks for fixing this!
It makes the engine more stable.

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

c9658db

…er detection (facebookincubator#11229)

Reviewed By: spershin

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from cce4b5d to c9658db Compare

October 11, 2024 04:43

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

8ec4bfb

…er detection (facebookincubator#11229)

Reviewed By: spershin

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from c9658db to 8ec4bfb Compare

October 11, 2024 05:04

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

a88988b

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 8ec4bfb to a88988b Compare

October 11, 2024 16:54

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

f408efa

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from a88988b to f408efa Compare

October 11, 2024 16:54

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

6da9bbe

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from f408efa to 6da9bbe Compare

October 11, 2024 17:14

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

ef40fad

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 3758539 to ef40fad Compare

October 11, 2024 21:31

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

26e103c

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from ef40fad to 26e103c Compare

October 11, 2024 22:36

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

9ea576f

…er detection (facebookincubator#11229)

Summary:
Pull Request resolved: facebookincubator#11229

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 26e103c to 9ea576f Compare

October 11, 2024 22:40

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

f3cfc93

…er detection (facebookincubator#11229)

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 9ea576f to f3cfc93 Compare

October 11, 2024 23:00

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

4cc56f8

…er detection (facebookincubator#11229)

bypass-github-export-checks

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from f3cfc93 to 4cc56f8 Compare

October 12, 2024 03:01

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

9a349e6

…er detection (facebookincubator#11229)

bypass-github-export-checks

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 4cc56f8 to 9a349e6 Compare

October 12, 2024 04:29

xiaoxmeng added a commit to xiaoxmeng/velox that referenced this pull request


          Support yield in bucket sort table write getout to prevent stuck driv…

de567c5

…er detection (facebookincubator#11229)

bypass-github-export-checks

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from 9a349e6 to de567c5 Compare

October 12, 2024 04:40


          Support yield in bucket sort table write getout to prevent stuck driv…

bb8e621

…er detection (facebookincubator#11229)

bypass-github-export-checks

Reviewed By: Yuhta, spershin, oerling

Differential Revision: D64159781

xiaoxmeng force-pushed the export-D64159781 branch from de567c5 to bb8e621 Compare

October 12, 2024 06:27

facebook-github-bot closed this in

b00751e

facebook-github-bot added the Merged label

Contributor

facebook-github-bot commented Oct 12, 2024

This pull request has been merged in b00751e.

conbench-facebook bot commented Oct 12, 2024

Conbench analyzed the 1 benchmark run on commit b00751e2.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Labels

CLA Signed fb-exported Merged