Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Spilling Strategies #15173

Merged
merged 1 commit into from
Sep 29, 2020
Merged

Conversation

sachdevs
Copy link
Contributor

@sachdevs sachdevs commented Sep 15, 2020

This PR adds multiple different spilling strategies that can be swapped between using the config experimental.spiller.task-spilling-strategy.

ORDER_BY_CREATE_TIME - current default strategy. Watch memory pools for revocable memory exceeding threshold, sort tasks by create time, revoke individual operators until we reach the lower threshold.

ORDER_BY_REVOCABLE_BYTES - NEW. Watch memory pools for revocable memory exceeding threshold, sort tasks by most allocated revocable bytes, revoke individual operators until we reach the lower threshold.

PER_TASK_MEMORY_THRESHOLD - NEW. Watch revocable memory pool for memory exceeding per task memory threshold defined by experimental.spiller.max-revocable-task-memory. Spill operators in task until it lowers to this threshold.

TODO

  • finish integration tests
  • better commit message
  • Release note
== RELEASE NOTES ==

General Changes
* Add config `experimental.spiller.task-spilling-strategy` for choosing different spilling strategy to use.

@sachdevs sachdevs changed the title Prototype different task spilling strategies for performance testing [WIP] Prototype different task spilling strategies for performance testing Sep 15, 2020
@sachdevs sachdevs requested a review from highker September 15, 2020 15:48
@sachdevs
Copy link
Contributor Author

I'll actually update this with PER_TASK_MEMORY_THRESHOLD as it's own class. Right now MemoryRevokingScheduler is doing too much conditional logic.

@sachdevs
Copy link
Contributor Author

Still finishing up integration tests, updating revocable memory at the task level without having an SqlTaskManager has been a bit of a challenge. Everything else is complete however.

@sachdevs sachdevs requested a review from highker September 24, 2020 03:54
@sachdevs sachdevs changed the title [WIP] Prototype different task spilling strategies for performance testing [WIP] Implement Spilling Strategies Sep 24, 2020
import static java.util.Objects.requireNonNull;
import static java.util.concurrent.TimeUnit.SECONDS;

public class TaskThresholdMemoryRevokingScheduler
Copy link
Contributor

@wenleix wenleix Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if it's a dumb question. I just realized memory revoking scheduler (and MemoryRevokingScheduler) doesn't implement any interface. Curious how are they get called? I assume it's related to the @PostConstruct and @PreDestroy annotation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup you got it. They run as a separate thread and are created at injection time. On creation (@PostConstruct), we register a periodically executing method to check if revoking is necessary (based on whichever spilling strategy we are using). As for which scheduler is created, that is determined in ServerMainModule, using installModuleIf. I was wondering too if it's worth having an interface but figured there wasn't enough reason to abstract it yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sachdevs : I see. Is this a mechanism from Guice? (e.g. automatically call methods annotated with @PostConstruct)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sachdevs sachdevs force-pushed the task-based-spilling branch 2 times, most recently from 0b33842 to 46ed059 Compare September 28, 2020 18:48
@sachdevs sachdevs changed the title [WIP] Implement Spilling Strategies Implement Spilling Strategies Sep 28, 2020
public enum TaskSpillingStrategy
{
ORDER_BY_CREATE_TIME, // When spilling is triggered, revoke tasks in order of oldest to newest
ORDER_BY_REVOCABLE_BYTES, // When spilling is triggered, revoke tasks by most allocated revocable memory to least allocated revocable memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we treat join and aggregation spill the same way? The cost of spilling them is not uniform.

With join if you spill it then the entire probe side has to be spilled incurring a 3x IO cost.

With aggregation if you spill it will have to merge in one more run, but it can continue to use memory to avoid creating more sorted runs.

In practice whether this distinction matters often???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two levels of spilling strategy:

  1. Prioritizing which operator to spill within a given task.
    This would be easy to add on top of the existing implementation. Modifying the VoidTraversingQueryContextVisitor to make a list of operators and then rank them by priority would accomplish this. This refers to how we choose to spill operators within a task. So far, these implementations do not try to distinguish between operators when choosing to spill which isn't ideal. I can look into adding this in as we see fit depending on how spilling works in production for our workload as we start rolling out soon.

  2. Prioritizing which task to spill in a list of currently running tasks.
    As for distinguishing between revocable bytes allocated by join operator vs agg operator and using that to prioritize which tasks to spill, that would be a bit more work. Let's circle back on this if we see issues with this in during shadow.

@sachdevs sachdevs requested a review from wenleix September 28, 2020 19:33
Copy link
Contributor

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments only. Can @wenleix also help to take a look?

private final long maxRevocableMemoryPerTask;

@Nullable
private ScheduledFuture<?> scheduledFuture;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this variable is not thread-safe? Maybe have synchronized methods/blocks + "GuardedBy("this")"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's just copied from existing MemoryRevokingScheduler 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment on lines 95-98

Comment on lines +95 to +102
if (scheduledFuture != null) {
scheduledFuture.cancel(true);
scheduledFuture = null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread-safe

Copy link
Contributor Author

@sachdevs sachdevs Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this class is injected as a SINGLETON + this part is only called on @PreDestroy, this method won't be called until this class is destroyed (We never call stop() manually). This won't happen until Presto shuts down? I can't imagine any other case in which we will destroy these schedulers. This method only exists to stop leaking this future for tests I'm guessing (since a similar method is also present in MemoryRevokingScheduler). Let me know if we still need thread safety for this variable because of this context.

@highker highker self-assigned this Sep 29, 2020
@sachdevs sachdevs force-pushed the task-based-spilling branch 2 times, most recently from c56cc07 to 43a9b80 Compare September 29, 2020 17:10
Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Publish some comments since there are changes to the commit

private final long maxRevocableMemoryPerTask;

@Nullable
private ScheduledFuture<?> scheduledFuture;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's just copied from existing MemoryRevokingScheduler 😂

Copy link
Contributor

@wenleix wenleix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@sachdevs sachdevs force-pushed the task-based-spilling branch 2 times, most recently from d6e2c5b to 4be43f5 Compare September 29, 2020 19:12
@sachdevs
Copy link
Contributor Author

Rebase master due to unrelated test fail.

This adds ordering tasks by revocable bytes and per task memory
threshold alongside ordering by create time.
@sachdevs
Copy link
Contributor Author

Tests are green, only added comment. Good to go 👍

@highker highker merged commit 544b5a4 into prestodb:master Sep 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants