Promoting or inlining platform properties #166
Comments
Interesting side note: Platform used to be in Action, and was moved out here
<f42e4bb#diff-4153f76ba92d8d30764c0251177105e8>
citing potential performance improvements. I suspect those benefits were
material for output files, but incidental for Platform.
The Action/Command split was designed to conserve bandwidth, on the
theory that the Command (arguments, output files, platform) is both large
and relatively stable, while the Action (particularly the input root)
changes frequently. It's much more bandwidth-efficient (and
storage-efficient) to reference stable parameters by digest, although the
vast majority of the savings here is from output files. (Indeed, taken to
the limit, the Platform should be its own message because it's frequently
constant for an entire build, and in some cases even for an entire RE
deployment. But even our own analysis points out that Platform is also
generally small, so it doesn't really matter how we handle it.)
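As a rough illustration of the bandwidth argument above, here is a minimal Python sketch. It is not the real protocol: JSON stands in for protobuf serialization, the `hash/size` digest string is an illustrative format, and the message shapes, the 500 output files, and the 100 input roots are all made-up assumptions.

```python
import hashlib
import json


def encode(message: dict) -> bytes:
    """Stand-in for protobuf serialization: canonical JSON bytes."""
    return json.dumps(message, sort_keys=True).encode("utf-8")


def digest(blob: bytes) -> str:
    """Content digest in an illustrative 'hash/size' form."""
    return f"{hashlib.sha256(blob).hexdigest()}/{len(blob)}"


# A Command that is large but identical across many actions (hypothetical data).
command = {
    "arguments": ["cc", "-c", "main.c", "-o", "main.o"],
    "output_files": [f"out/file_{i}.o" for i in range(500)],
    "platform": {"OSFamily": "linux"},
}
command_blob = encode(command)


def action_by_digest(input_root: str) -> bytes:
    """Action that references the stable Command by digest."""
    return encode({"command_digest": digest(command_blob),
                   "input_root_digest": input_root})


def action_inlined(input_root: str) -> bytes:
    """Action that carries the whole Command inline."""
    return encode({"command": command, "input_root_digest": input_root})


# 100 actions that differ only in their input root.
input_roots = [digest(f"tree-{i}".encode("utf-8")) for i in range(100)]

# By digest: upload the Command once, then send 100 small Actions.
bytes_by_digest = len(command_blob) + sum(len(action_by_digest(r)) for r in input_roots)
# Inlined: every Action repeats the full Command.
bytes_inlined = sum(len(action_inlined(r)) for r in input_roots)

print(bytes_by_digest, bytes_inlined)
```

In this toy setup nearly all of the Command's bytes come from the output file list, which matches the point that the savings are mostly from output files rather than from the Platform.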
In general, I'm hesitant about optimizing the API structure around specific
scheduling implementations because I think it's opening up a can of
worms--I could imagine schedulers that would want to route based on
arguments or output files, for example. That said, I agree that Platform
may be special because one of its chief functions is to enable routing. I'm
curious what the actual impact is here--in our case, we're generally
talking about Actions with an expected duration of O(seconds), such that
the overhead of a one-time fetch of the command is trivial.
Of the proposed options, I think inlining the Platform into the Action is
the best. There are very good reasons to keep Action and Command separate,
and I think that any option that leaves an optional feature (e.g., allowing
Platform in either Action or Command) in place long-term is worse than
settling on a single location.
…On Fri, Aug 28, 2020 at 8:55 AM Ed Baunton ***@***.***> wrote:
We discussed this during a monthly meeting (I seem to recall that @ulfjack
<https://github.com/ulfjack> initially raised it), adding here for
tracking.
The current design of platform properties is that they are indirectly
embedded in the Command property of the Action. The Action does not send
the Command directly but rather sends a Command digest.
The upshot of this is that additional blob uploads are required from the
client before submitting an action, as well as additional CAS interactions
on the server side if any data is required from the Command, for example
the platform properties.
I think the initial discussion specifically mentioned that if the server
wanted to make routing decisions for an action based on the platform
properties, it would require an additional hit to the CAS to determine
those for the action.
It seems to me that some of this extra CAS interaction overhead could be
avoided if we inlined the command or platform properties into the action.
I can see that we could probably:
1. Extend the action message to have either command_digest or the actual
command
2. Inline the platform properties
3. Something else?
I am concerned about the schedulers having to fetch multiple blobs from the CAS sequentially, and having to parse them into local memory (especially because we expect the Command proto to be large), both because this is awkward for our implementation and because it may enable various denial-of-service vectors. Especially for small clusters, having to overprovision schedulers to safeguard against this can incur significant compute / memory costs. (I'd also be slightly concerned about the Platform proto growing significantly in size.)
I don't think this is over-designing for a specific scheduler implementation. The current protocol requires that the scheduler reads the Action & Command protos, which clearly imposes restrictions on the scheduler design.
I think this may also tie into the proposal by @EdSchouten about making the scheduler untrusted - the more CAS reads are required in the scheduler for routing, the more holes you need to poke into the security model. Personally, I think untrusted schedulers are only viable from a security perspective if either a) the platform proto is sent to the scheduler explicitly, i.e., no direct CAS reads from the scheduler, or b) the Platform proto (and every proto between the execute proto and the platform proto) is stored in a separate CAS service (public CAS / private CAS distinction). Requiring the Command to be public seems like a fairly large information leak.
I'm not sure what the best place to put the Platform proto is. I have a strong preference for moving it out of Command. Ideally, it would not be stored in the CAS at all: that would allow a scheduler design that does routing with the Platform proto only and is also untrusted, without requiring a separate public CAS (or complicating the CAS protocol to allow a public / private distinction).
However, I'd be concerned about the platform proto growing to be significantly larger. Over time, I can see us defining hundreds of settings in the platform proto. Maybe a compromise would be to allow the platform proto to be inlined in the execute request or referenced via digest? Too much flexibility?
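To make the sequential-fetch concern concrete, here is a small Python sketch. It assumes an in-memory `CountingCAS` stand-in, dict-shaped messages, and a hypothetical `pool` platform property used as the routing key. None of these names come from the real API.

```python
import hashlib
import json


class CountingCAS:
    """Toy in-memory content-addressable store that counts reads."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self.reads = 0

    def put(self, message: dict) -> str:
        blob = json.dumps(message, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(blob).hexdigest()
        self._blobs[key] = blob
        return key

    def get(self, key: str) -> dict:
        self.reads += 1
        return json.loads(self._blobs[key])


def route_via_command(cas: CountingCAS, action_digest: str) -> str:
    """Current layout: two dependent CAS reads before routing is possible."""
    action = cas.get(action_digest)              # read 1: the Action
    command = cas.get(action["command_digest"])  # read 2: the (large) Command
    return command["platform"]["pool"]


def route_via_request(request: dict) -> str:
    """Platform sent explicitly with the request: no CAS read at all."""
    return request["platform"]["pool"]


cas = CountingCAS()
command_digest = cas.put({"arguments": ["cc", "main.c"], "platform": {"pool": "gpu"}})
action_digest = cas.put({"command_digest": command_digest})

pool_from_cas = route_via_command(cas, action_digest)
reads_needed = cas.reads
pool_from_request = route_via_request({"action_digest": action_digest,
                                       "platform": {"pool": "gpu"}})
print(pool_from_cas, reads_needed, pool_from_request)
```

The second read cannot start until the first has been parsed, which is the sequential dependency described above. The explicit variant also never touches the Command, so a scheduler built that way would not need read access to it.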
On Fri, Aug 28, 2020 at 11:05 AM Ulf Adams ***@***.***> wrote:
I am concerned about the schedulers having to fetch multiple blobs from
the CAS sequentially, and having to parse them into local memory
(especially because we *expect* the Command proto to be large). Both
because this is awkward for our implementation, and also because it may
enable various denial-of-service vectors. Especially for small clusters,
having to overprovision schedulers to safeguard against this can incur
significant compute / memory costs. (I'd also be slightly concerned about
the Platform proto growing significantly in size.)
I don't think this is over-designing for a specific scheduler
implementation. The current protocol *requires* that the scheduler reads
the Action & Command protos, which clearly imposes restrictions on the
scheduler design.
Yeah, I think I can get behind this specifically for the platform. The
original move to the Command was mostly incidental, not strongly principled.
I think this may also tie into the proposal by @EdSchouten
<https://github.com/EdSchouten> about making the scheduler untrusted -
the more CAS reads are required in the scheduler for routing, the more
holes you need to poke into the security model. Personally, I think
untrusted schedulers are only viable from a security perspective if either
a) the platform proto is sent to the scheduler explicitly, i.e., no
direct CAS reads from the scheduler, or b) the Platform proto (and every
proto between the execute proto and the platform proto) is stored in a
separate CAS service (public CAS / private CAS distinction). Requiring the
Command to be public seems like a fairly large information leak.
I'm not sure what the best place to put the Platform proto is. I have a
strong preference for moving it out of Command. Ideally, it would not be
stored in the CAS at all: that would allow a scheduler design that does
routing with the Platform proto only and is also untrusted, without
requiring a separate public CAS (or complicating the CAS protocol to allow
a public / private distinction).
In our experience, having the Action (and by extension, the Command)
explicitly stored in the CAS provides significant benefits. For example, it
allows re-triggering the same action, or downloading the entire Action to
re-create it locally (this is the basis for tools_remote). We use the fact
that it's stored in the CAS in many other ways, too.
However, I'd be concerned about the platform proto growing to be
significantly larger. Over time, I can see us defining hundreds of settings
in the platform proto. Maybe a compromise would be to *allow* the
platform proto to be inlined in the execute request or referenced via
digest? Too much flexibility?
I believe the "allow" option is too much flexibility. I'd prefer to just
move the Platform (obviously with a temporary "allow" option during the
transition).
Platform properties are currently a member of the `command` message, which is referred to in the action by digest. This requires the Execution Service to make a call to the CAS to retrieve the contents of the command if it wishes to inspect it. The platform properties are commonly used for making routing decisions about the action. Therefore, in order to route an action, common execution service implementations must introduce an additional call to the CAS to fully hydrate the action and determine where it should be routed.
This commit promotes the platform properties from the command to the action. We deprecate the platform properties contained within the command and bump the minor version to version 2.2, following the model for `output_paths`.
Fixes #166
Signed-off-by: Ed Baunton <ebaunton1@bloomberg.net>
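One way a server might handle the transition described in this commit message is sketched below in Python. This is an assumption-laden illustration, not the project's code: messages are plain dicts, `effective_platform` and `fetch_command` are hypothetical names, and the "prefer the Action, fall back to the Command" rule is one plausible reading of the deprecation.

```python
from typing import Callable


def effective_platform(action: dict, fetch_command: Callable[[str], dict]) -> dict:
    """Return the platform properties to route on.

    Prefer the platform promoted onto the Action. Only when it is absent
    (an older client) fall back to fetching the Command from the CAS and
    reading its deprecated platform field.
    """
    platform = action.get("platform")
    if platform:
        return platform
    command = fetch_command(action["command_digest"])
    return command.get("platform", {})


# Hypothetical stand-in for a CAS lookup that records each call.
fetched: list[str] = []
commands = {"cmd-1": {"arguments": ["cc"], "platform": {"OSFamily": "linux"}}}


def fetch_command(command_digest: str) -> dict:
    fetched.append(command_digest)
    return commands[command_digest]


new_style = {"command_digest": "cmd-1", "platform": {"OSFamily": "linux"}}
old_style = {"command_digest": "cmd-1"}

platform_new = effective_platform(new_style, fetch_command)
fetches_after_new = len(fetched)
platform_old = effective_platform(old_style, fetch_command)
fetches_after_old = len(fetched)
print(platform_new, fetches_after_new, platform_old, fetches_after_old)
```

With this shape, clients that populate the Action's platform avoid the extra CAS call, while older clients keep working at the cost of one fetch.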
I think #167 satisfies the above discussion except for the case of a large set of platform properties that @ulfjack mentions. I think we would need to do something much more sophisticated with platform properties to support this case, such as distinguishing those that are placed inline from those that reside in the command.
We discussed this during a monthly meeting (I seem to recall that @ulfjack initially raised it), adding here for tracking.
The current design of platform properties is that they are indirectly embedded in the Command property of the Action. The Action does not send the Command directly but rather sends a Command digest.
The upshot of this is that additional blob uploads are required from the client before submitting an action, as well as additional CAS interactions on the server side if any data is required from the Command, for example the platform properties.
I think the initial discussion specifically mentioned that if the server wanted to make routing decisions for an action based on the platform properties, it would require an additional hit to the CAS to determine those for the action.
It seems to me that some of this extra CAS interaction overhead could be avoided if we inlined the command or platform properties into the action.
I can see that we could probably:
1. Extend the action message to have either command_digest or the actual command
2. Inline the platform properties
3. Something else?