Daniel`s comments 1
wojtek-t committed May 12, 2017
1 parent bd94b4d commit aedd6c2
Showing 1 changed file with 56 additions and 50 deletions.
106 changes: 56 additions & 50 deletions contributors/design-proposals/bulk_watch.md
Expand Up @@ -37,12 +37,14 @@ appearing in the community and now we have good usecase to proceed with it.

Once a bulk watch is set up, we also need to periodically verify its ACLs.
Whenever a user (in our case Kubelet) loses access to a given resource, the
watch should be closed within some bounded time. the rationaly behind this
requirement is that by using (bulk) watch we can't regress the current behavior
where a polling Kubelet would eventually be unable to access a secret.
watch should be closed within some bounded time. The rationale behind this
requirement is that by using (bulk) watch we still want to enforce ACLs
similarly to how we enforce them with get operations (and that Kubelet would
eventually be unable to access a secret no matter if it is watching or
polling it).

That said, periodic verification of ACLs isn't specific to bulk watch and
needs to be solved also in `reqular` watch (e.g. user watching just a single
needs to be solved also in `regular` watch (e.g. user watching just a single
secret may also lose access to it and such watch should also be closed in this
case). So this requirement is common for both regular and bulk watch. We
just need to solve this problem on low enough level that would allow us to
Expand All @@ -56,11 +58,12 @@ As a result, below we are describing requirements that the end solution has to
meet to satisfy our needs:
- a single bulk request has to support multiple resource types (e.g. get a
node and all pods associated with it)
- if the bulk requests filters asks only for object of a single resource type,
the returned result ideally should be compatible with what is returned from
the regular API
- the API has to support aggregation if different resource types are served
by different apiservers
- the wrappers for aggregating multiple objects (in case of list we can return
a number of objects of different kinds) should be `similar` to lists in core
API (by lists I mean e.g. `PodList` object)
- the API has to be implemented also in aggregator so that bulk operations
are supported also if different resource types are served by different
apiservers
- clients have to be able to alter their watch subscriptions incrementally (this
may not be implemented in the initial version, but it has to be designed for)

Expand All @@ -75,9 +78,11 @@ Spanning multiple resources, resources types or conditions will be more and more
important for large number of watches. As an example, federation will be adding
watches for every type it federates. With that in mind, bypassing aggregation
at the resource type level and going to aggregation over objects with different
resource types will allow us to more aggresively optimize in the future.
resource types will allow us to more aggressively optimize in the future (it
doesn't mean you have to watch resources of different types in a single watch,
but we would like to make it possible).

Moreover, out current REST API doesn't even offer an easy way to handle
Moreover, our current REST API doesn't even offer an easy way to handle
"multiple watches of a given type" within a single request. As a result, instead
of inventing new top level pattern per type, we can introduce a new resource
type that follows normal RESTful rules and solves even more generic problem
Expand All @@ -90,36 +95,37 @@ of spanning multiple different resource types.
that underneath will have a completely separate implementation.

In all text below, we are assuming v1 version of the API, but it will obviously
start as v1alpha1 version.
go through alpha and beta stages before that (it will start as v1alpha1).

In this design, we will focus only on bulk list (get) and watch operations.
Later, we would like to introduce new resources to support bulk create, update
and delete operations, but that's not part of this design.

We will start with introducing `list` resource and supporting the following
two operation:
operation:
```
POST /apis/bulk.k8s.io/v1/list <body defines filtering>
POST /apis/bulk.k8s.io/v1/list?watch=1 <body defines filtering>
```
We can't simply make this an http GET request, due to limitations of GET for
the size (length) of the url (in which we would have to pass filter options).
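As a rough illustration of the POST-based list call, here is a minimal Go sketch that builds such a request. The helper name `newBulkListRequest`, the base URL, and the JSON content type are assumptions made for the example. The filter body is passed through as opaque bytes, because its schema is defined elsewhere in the proposal.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newBulkListRequest builds a POST request for the bulk list endpoint.
// The helper is hypothetical; the filter is treated as opaque bytes because
// this sketch does not assume any particular filter schema.
func newBulkListRequest(baseURL string, filter []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/apis/bulk.k8s.io/v1/list", bytes.NewReader(filter))
	if err != nil {
		return nil, err
	}
	// Assumption: the filter is JSON-encoded. The filter travels in the body,
	// so it is not subject to URL length limits.
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder host and an empty JSON object as the filter.
	req, err := newBulkListRequest("https://apiserver.example", []byte(`{}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Carrying the filter in the request body is what keeps the request size independent of URL length limits.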
This makes the API look similar to our regular API.
Note that once we have list fully implemented, adding this kind of watch will
be faily simple.

The main drawback of this approach is that it won't allow for dynamic altering
of watch subscriptions (which we would like to also support).
We could consider adding `watch` operation using the same pattern with just
`?watch=1` parameter. However, the main drawback of this approach is that it
won't allow for dynamic altering of watch subscriptions (which we definitely
also need to support).
As a result, we need another API for watch that will also support incremental
subscribtions. That's why we will also introduce the following operation:
subscriptions. It will look as follows:
```
websocket /apis/bulk.k8s.io/v1/list?watch=1
```
A more detailed description of this protocol is in the following subsection.

*TODO: For consistency, we also considered introducing websocket API for
*Note: For consistency, we also considered introducing websocket API for
handling LIST requests, where first client sends a filter definition over the
channel and then server sends back the response, but we dropped this for now.
channel and then server sends back the response, but we dropped this for now.*

*Note: We also considered implementing a POST-based watch handler that doesn't
allow for altering subscriptions, which should be very simple once we have list
implemented. But since the websocket API is needed anyway, we also dropped it.*


### Filtering definition
Expand Down Expand Up @@ -164,33 +170,6 @@ bound to a given node.
ACLs for watches and breaking the watch whenever ACLs change and user is no
longer allowed to watch all requested objects.

In the "http POST" approach, whenever a user wants to start watching for
another object (or set of objects) or drop one (or a set), he needs to break the
watch and initiate a new one with a different selector. That isn't perfect, but
still can solve different usecases.
As an example, we can solve watching secrets in Kubelet as following:
- create a store that will also support the following checks on add, update and
delete operations: "if the resource version of object is older than what is
currently cached, ignore it"
- use this store underneath the reflector framework (though the reflector will
require some changes)
- whenever a set of secrets to watch changes:
- break the current watch
- retrieve current state of secrets that are new to be watched and add them
to the local cache
- remove from local cache secrets that are no longer supposed to be watched
- send new watch request with the updated filter (with RV from the moment
when we broke the previous watch)
- even if watch is lagging and we will get some previous version of secret
(after it was retrieved), it won't affect the local cache
Depending on the load on the cluster, this approach may either be better or
worse that periodic polling that we currently have (it will be less http
requests, but may be more ACL checking).

We can get much better performance in the websocket-based approach where we
will allow for dynamic changing of what (sets of) objects are watched. This
approach is descibed in Dynamic Watch subsection below.


### Watch semantics

Expand Down Expand Up @@ -451,3 +430,30 @@ events for different object types for us. Having exactly one watch responsible
for delivering all objects (pods, secrets, ...) will guarantee that if we are
currently at resource version "rv", we processed objects of all types up to rv
and nothing with resource version greater than rv. Which is exactly what we need.


## Other Notes

If we decide on the "http POST" approach as an additional way to implement
watch (not supporting altering subscriptions), whenever a user wants to start
watching another object (or set of objects) or drop one (or a set), they need
to break the watch and initiate a new one with a different selector.
That isn't perfect, but it can still solve different use cases.
As an example, we can solve watching secrets in Kubelet as follows:
- create a store that will also support the following check on add, update and
  delete operations: "if the resource version of the object is older than what
  is currently cached, ignore it"
- use this store underneath the reflector framework (though the reflector will
  require some changes)
- whenever the set of secrets to watch changes:
  - break the current watch
  - retrieve the current state of secrets that are newly watched and add them
    to the local cache
  - remove from the local cache secrets that are no longer supposed to be
    watched
  - send a new watch request with the updated filter (with the RV from the
    moment when we broke the previous watch)
  - even if the watch is lagging and we get some previous version of a secret
    (after it was retrieved), it won't affect the local cache

Depending on the load on the cluster, this approach may be either better or
worse than the periodic polling that we currently have (there will be fewer
HTTP requests, but there may be more ACL checking).
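
The store described above can be sketched in Go roughly as follows. This is a minimal illustration, not the reflector's actual store: the `Secret` and `VersionedStore` types are hypothetical, and the sketch assumes resource versions can be parsed and compared as unsigned integers.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// Secret is a hypothetical stand-in for the real API object.
type Secret struct {
	Name            string
	ResourceVersion string
	Data            string
}

// VersionedStore ignores events whose resource version is older than the
// version currently cached for the same object.
type VersionedStore struct {
	mu    sync.Mutex
	items map[string]Secret
}

func NewVersionedStore() *VersionedStore {
	return &VersionedStore{items: map[string]Secret{}}
}

// isStale reports whether obj is older than the cached copy.
// Assumption: resource versions are numeric strings that can be ordered.
func (s *VersionedStore) isStale(obj Secret) (bool, error) {
	cached, ok := s.items[obj.Name]
	if !ok {
		return false, nil
	}
	newRV, err := strconv.ParseUint(obj.ResourceVersion, 10, 64)
	if err != nil {
		return false, err
	}
	oldRV, err := strconv.ParseUint(cached.ResourceVersion, 10, 64)
	if err != nil {
		return false, err
	}
	return newRV < oldRV, nil
}

// Upsert handles both add and update; it returns true if obj was stored.
func (s *VersionedStore) Upsert(obj Secret) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	stale, err := s.isStale(obj)
	if err != nil || stale {
		return false, err
	}
	s.items[obj.Name] = obj
	return true, nil
}

// Delete removes the object unless the delete event is older than the cache.
func (s *VersionedStore) Delete(obj Secret) (bool, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	stale, err := s.isStale(obj)
	if err != nil || stale {
		return false, err
	}
	delete(s.items, obj.Name)
	return true, nil
}

// Get returns the cached object, if any.
func (s *VersionedStore) Get(name string) (Secret, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	obj, ok := s.items[name]
	return obj, ok
}

func main() {
	store := NewVersionedStore()
	store.Upsert(Secret{Name: "token", ResourceVersion: "5", Data: "fresh"})
	// A lagging watch delivers an older version; the store ignores it.
	applied, _ := store.Upsert(Secret{Name: "token", ResourceVersion: "3", Data: "old"})
	got, _ := store.Get("token")
	fmt.Println(applied, got.Data)
}
```

The sketch keeps no tombstones, so a lagging update that arrives after a delete would re-add the object; a real implementation would need to handle that case.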
