Support fragment result caching #15155

Merged (5 commits) on Oct 1, 2020

@shixuan-fan shixuan-fan commented Sep 10, 2020

Test plan - Added unit tests for CanonicalPlanGenerator, FileFragmentResultCacheManager and Driver. Ran shadow tests and verifier tests on internal queries.

== RELEASE NOTES ==

General Changes
* Add support for fragment result caching. When enabled, if the same plan fragment and the same connector split hit the same worker, the engine fetches the result directly from the cache and skips the computation. Currently only partial aggregation is supported. The cache can be enabled by setting ``fragment-result-cache.enabled`` to ``true`` and tuned with the other configs starting with ``fragment-result-cache``.
 A query can use the fragment result cache by setting the config ``experimental.fragment-result-caching-enabled`` or the session property ``fragment_result_caching_enabled`` to ``true``.
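
To make the hit/miss behavior described above concrete, here is a minimal sketch, assuming a simple in-memory map keyed by a canonical plan string and a split identifier string. The class `FragmentCacheSketch` and its methods are hypothetical and are not taken from this PR; the actual cache manager tested here is `FileFragmentResultCacheManager`, whose interface is not shown in this conversation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, single-threaded sketch of fragment result caching; not code from this PR.
final class FragmentCacheSketch
{
    // Key: [canonical plan, split identifier]. Lists compare by value, so equal pairs collide as intended.
    private final Map<List<String>, List<Long>> cache = new HashMap<>();
    private int computations;

    // Returns the cached result on a hit; on a miss, runs the computation once and stores its result.
    List<Long> getOrCompute(String canonicalPlan, String splitIdentifier, Supplier<List<Long>> computation)
    {
        List<String> key = List.of(canonicalPlan, splitIdentifier);
        List<Long> cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        computations++;
        List<Long> result = List.copyOf(computation.get());
        cache.put(key, result);
        return result;
    }

    // Number of times a computation actually ran (that is, the number of cache misses).
    int computations()
    {
        return computations;
    }

    public static void main(String[] args)
    {
        FragmentCacheSketch cache = new FragmentCacheSketch();
        cache.getOrCompute("partial-agg", "file-1:0:100", () -> List.of(42L));
        cache.getOrCompute("partial-agg", "file-1:0:100", () -> List.of(42L));
        System.out.println("computations = " + cache.computations());
    }
}
```

The second call with the same plan and split returns the stored result without running its supplier, which is the "skip the computation" step from the release note. A different split, or a different plan fragment, produces a different key and therefore a miss.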

SPI changes
* Add ``getSplitIdentifier`` to ``ConnectorSplit``. The split identifier is used in fragment result caching to determine whether two splits are identical.
* Add ``getIdentifier`` to ``ConnectorTableLayoutHandle``. The layout identifier is used in fragment result caching to construct the canonical plan.
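
As an illustration of why a separate identifier is useful, the sketch below uses a hypothetical `FileSplitSketch` class (a stand-in, not the real `ConnectorSplit` interface). It assumes that the identifier covers only the fields that determine which data is read, so two splits that differ only in a scheduling hint still compare as identical. The return type and the choice of fields are assumptions; the PR text does not state them.

```java
import java.util.List;

// Hypothetical stand-in for a connector split; the real SPI signature may differ.
final class FileSplitSketch
{
    private final String path;
    private final long start;
    private final long length;
    // Scheduling hint only; assumed not to affect which rows the split produces.
    private final String preferredHost;

    FileSplitSketch(String path, long start, long length, String preferredHost)
    {
        this.path = path;
        this.start = start;
        this.length = length;
        this.preferredHost = preferredHost;
    }

    String getPreferredHost()
    {
        return preferredHost;
    }

    // Assumption: the identifier is built from the data-defining fields only.
    // List.of gives value-based equals/hashCode, so it can serve as part of a cache key.
    Object getSplitIdentifier()
    {
        return List.of(path, start, length);
    }

    public static void main(String[] args)
    {
        Object a = new FileSplitSketch("/data/part-0", 0L, 128L, "worker-1").getSplitIdentifier();
        Object b = new FileSplitSketch("/data/part-0", 0L, 128L, "worker-2").getSplitIdentifier();
        System.out.println("same identifier: " + a.equals(b));
    }
}
```

A layout identifier would play the analogous role on the plan side: it would let two table layouts that describe the same scan map to the same canonical plan.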

@highker highker left a comment

both Driver & LocalExecutionPlanner look clean

@shixuan-fan shixuan-fan force-pushed the frc branch 2 times, most recently from 803fa1b to 85bb0de Compare September 13, 2020 18:29
@shixuan-fan shixuan-fan force-pushed the frc branch 9 times, most recently from 1ee7806 to 001db72 Compare September 19, 2020 17:02
@shixuan-fan shixuan-fan force-pushed the frc branch 9 times, most recently from 22c01b6 to a421eb9 Compare September 23, 2020 22:31
@shixuan-fan shixuan-fan changed the title [WIP] Support fragment result caching Support fragment result caching Sep 23, 2020
@shixuan-fan shixuan-fan marked this pull request as ready for review September 23, 2020 23:04
@shixuan-fan shixuan-fan requested a review from a team September 24, 2020 17:05

@highker highker left a comment

"Introduce CanonicalPlanGenerator": one comment

@highker highker left a comment

"Introduce split identifier" LGTM

@highker highker left a comment

"Introduce fragment result cache manager": minor comments

@highker highker left a comment

"Use static import and lambda in TestDriver": LGTM

@highker highker left a comment

"Support fragment result caching": overall logic looks good to me. Question on missing aggregation support

@shixuan-fan shixuan-fan force-pushed the frc branch 8 times, most recently from 353aea4 to 3d9b1e4 Compare October 1, 2020 04:37
@shixuan-fan shixuan-fan requested a review from highker October 1, 2020 05:30

@highker highker left a comment

lgtm

}
}

private static <T> Iterator<T> closeWhenExhausted(Iterator<T> iterator, Closeable resource)

this part of code looks so similar to FileSingleStreamSpiller lol.
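
For readers without the diff at hand, a helper with the signature quoted above could look roughly like the following. This is a plain-Java sketch written for illustration; it is not the body from this PR or from FileSingleStreamSpiller, and the wrapper class name `CloseWhenExhaustedSketch` is hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Illustrative sketch only; the actual method body in the PR is not shown in this conversation.
final class CloseWhenExhaustedSketch
{
    // Wraps an iterator so that the resource is closed exactly once, when the iterator runs out.
    static <T> Iterator<T> closeWhenExhausted(Iterator<T> iterator, Closeable resource)
    {
        return new Iterator<T>()
        {
            private boolean closed;

            @Override
            public boolean hasNext()
            {
                if (closed) {
                    return false;
                }
                boolean hasNext = iterator.hasNext();
                if (!hasNext) {
                    closed = true;
                    try {
                        resource.close();
                    }
                    catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
                return hasNext;
            }

            @Override
            public T next()
            {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return iterator.next();
            }
        };
    }
}
```

Note that this variant closes the resource only on exhaustion; a caller that abandons the iterator early would still need to close the resource itself.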

@zhengxingmao
Would this feature give the same performance improvement if it were ported to the Iceberg connector?
Are there plans to support the Iceberg connector?
