File: `docs/proposals/1023-data-layer-architecture/README.md` (new file, 176 additions)
# Data Layer Architecture Proposal

Author(s): @elevran @nirrozenbaum

## Proposal Status

***Draft***

## Summary

The EPP Architecture proposal identifies the need for an extensible
[Data Layer](../0683-epp-architecture-proposal/README.md#data-layer).
Recently, the scheduling subsystem underwent a major [architecture change](../0845-scheduler-architecture-proposal/README.md)
to allow easier extension and pluggability. This proposal aims to apply
similar extensibility to the Data Layer subsystem, allowing custom inference
gateways to extend the Gateway API Inference Extension (GIE) for their use
cases without modifying the core GIE code base.

See [this document](https://docs.google.com/document/d/1eCCuyB_VW08ik_jqPC1__z6FzeWO_VOlPDUpN85g9Ww/edit?usp=sharing) for additional context and reference.

## Goals

The Data Layer pluggability effort aims to address the following goals and
requirements:

- Make endpoint attributes used by GIE components accessible via well defined
Data Layer interfaces.
- Enable collection of an additional (or different) subset of attributes from an
  existing data source (e.g., the `/metrics` endpoint scraper).
- Add a new data source that collects attributes not already collected.
- Follow best practices and experience from the Scheduling subsystem
pluggability effort. For example, extending the system to support the above
should be through implementing well defined Plugin interfaces and registering
them in the GIE Data Layer subsystem; any configuration would be done in the
same way (e.g., code and/or configuration file), etc.
> **Reviewer (Collaborator),** on "configuration file": Just to sanity check, we would extend the same API, correct?
>
> **Author:** Yes, that's the intent. Any data layer plugin configuration would be done via the same API and configuration mechanism.

- Be efficient (RAM, CPU, concurrency) in collecting and storing attributes.
- Limit the change blast radius in GIE when making the above changes. Core GIE code
should not need to be modified in order to support collecting and storing new
attributes. Affected code should be scoped only to modules that make use of
the new attributes.
- The extensions should not increase coupling between GIE subsystems and
Kubernetes (i.e., the environment specific code should be encapsulated and
not “leaked” into the subsystem and its users).
- (Future) Allow non-uniform data collection (i.e., not all endpoints share the
same data).

## Non-Goals
> **Reviewer (Contributor):** Worth clarifying as a non-goal: I assume we don't intend to introduce a new mechanism for scraping, meaning the one scraper thread per endpoint is a common framework for collecting data from the endpoint, and the extensibility we are introducing here is not meant to allow other ways to scrape, but to extend what can be scraped via this framework.
>
> **Author (@elevran, Jun 25, 2025):** If I understand your statement correctly, then yes: the DataSource implementation would spawn a goroutine per endpoint, so we would have one goroutine per data source, per endpoint. Or did you mean that the same goroutine would be used for all data sources operating on an endpoint? I think the shared goroutine per endpoint is doable but adds complexity (i.e., since it runs each source in turn, delays/errors in one could impact other sources). It's a complexity vs. resource overhead tradeoff.
>
> **Reviewer (Contributor):** I was hoping for the latter, one goroutine per endpoint. But we can discuss on the implementation.
>
> **Author:** Sounds good. Will implement one goroutine for all DataSources operating on an endpoint, and we can refine if needed during implementation.

- Modify existing GIE abstractions, such as `InferencePool`, to conform to the
Data Layer pluggability design. They are to remain first class concepts, as
today.
- Enable reconciliation or modification of external state. The data sources are
  strictly read-only. For example, a data source accessing Kubernetes state as
  part of data collection would be registered for `Watch()` notifications and
  shall not receive access to a k8s client.
- Inference scheduler Plugins that rely on custom data collection accept that
  the [Model Server Protocol](../003-model-server-protocol/README.md) no longer
  provides guarantees on portability of a model server out of the box.

## Proposal

### Overview

There are two existing Data Sources in the Data Layer: a Pod reconciler that
collects Pod IP address(es) and labels, copying them to endpoint attributes,
and a metrics scraper that collects a defined set of metric values from the
`/metrics` endpoint of each Pod. Note that the `InferencePool` reconciler is
*not* considered part of the Data Layer.

### Components

The proposal is to make the Data Layer more extensible by introducing
these two interfaces:

- An **Attribute Collection** plugin interface responsible for extracting relevant
  attributes from a data source and storing them in the Data Layer for consumption
  by other components. The plugin can be registered with existing or new
  *Data Sources* (see below), and sources call their registered plugins
  periodically or on change to process attributes.
- A **Data Source** plugin interface that can be added to an inference gateway
  system, and on which *Attribute Collection* plugins can be registered to enrich
  the data model.

### Implementation Phases

In order to make iterative progress and validate the design along the way, we
propose to implement and evolve the Data Layer extensibility over several
phases:

1. Extend the backend's per-endpoint storage with a map from a name (i.e., the
   attribute collection plugin's registered name) to the data it collected.
   Existing attributes, such as IP address or Pod labels, are not modified.
1. Introduce a Data Source registry where new data sources can be registered, and
   bootstrap it by wrapping the existing `/metrics` scraper with a Data Source API.
   At this point, the metrics scraping code implements only the `DataSource`
   interface; the `DataCollection` interface is not used/exposed.
1. Refactor the metrics scraping code into separate Data Source and Data Collection
plugin interfaces.
1. Following that, and based on any lessons learnt, refactor the existing
   Kubernetes Pod reconciliation loop to the new plugin interfaces.
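Phase 1 could look roughly like the following sketch. The type and field names (`EndpointState`, the `data` map) are illustrative; the actual `backend` package types may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// EndpointState sketches the extended per-endpoint storage: existing
// well-known attributes stay as typed fields, while per-collector data
// lives in a map keyed by the collector's registered name.
type EndpointState struct {
	Address string            // existing attribute, unchanged
	Labels  map[string]string // existing attribute, unchanged

	mu   sync.RWMutex
	data map[string]interface{} // new: collector name -> collected data
}

// StoreAttributes records data on behalf of the named collection plugin.
func (e *EndpointState) StoreAttributes(collector string, data interface{}) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.data == nil {
		e.data = make(map[string]interface{})
	}
	e.data[collector] = data
	return nil
}

// GetAttributes retrieves what the named collection plugin stored.
func (e *EndpointState) GetAttributes(collector string) (interface{}, error) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	d, ok := e.data[collector]
	if !ok {
		return nil, fmt.Errorf("no attributes stored by collector %q", collector)
	}
	return d, nil
}
```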

### Suggested Data Layer Plugin Interfaces

```go
// DataCollection consumes data updates from sources and stores
// them in the data layer for consumption by other components.
// The plugin should not assume a deterministic invocation behavior beyond
// "the data layer believes the state should be updated"
type DataCollection interface {
// Extract is called by data sources with (possibly) updated
// data per endpoint. Extracted attributes are added to the
// Endpoint.
Extract(ep Endpoint, data interface{}) error // or Collect?
```

> **Reviewer (@ahg-g, Jun 25, 2025):** Shouldn't this be a generic Source type instead of Endpoint? The collector would know what to cast it to based on the data source it registered for. This is to allow data sources other than endpoints.
>
> **Author (@elevran):** A generic source could be more flexible; I'll think about that more. However, there is possibly some confusing naming on my part, so let me clarify. Regarding the specific interface function: `Extract()` receives relevant data from a source (e.g., the response to `GET /metrics`), extracts the needed attributes (e.g., the specific metrics), and then stores them for later use on `ep`. The current design uses a shared storage layer, with standard data per endpoint (e.g., `scheduling/types.Pod` referencing `backendmetrics.MetricsState` and `backend.Pod`, IIUC). If we want to continue using this approach (and I'm suggesting we do, at least initially), the extended data would need to be stored in the same place (e.g., a new `backend.Extended` object or whatever we converge on). Another option is for each new DataCollection to maintain its own internal storage scheme and only store "handles" (or tags) on the shared object (e.g., `backend.Pod`). The caller would use the handle (a data collection name and a tag) to retrieve the data from the relevant DataCollection. This adds flexibility but also complexity (more moving parts, a more complicated mental model, more indirection and calls). It could also make non-Endpoint Sources easier to support (Plugins would retrieve a reference to a DataCollection and get whatever data they need). Thoughts?
>
> **Reviewer (Contributor):** I am OK to start with endpoint-focused interfaces and adjust them as a follow-up as we learn how the implementation looks. My sense is that to support data source types other than Endpoints we will need to change this.
>
> **Reviewer (Contributor):** +1. I think there is agreement on the general direction. I'm in favor of starting to implement and iterating as needed.
>
> **Reviewer (Collaborator):** Same, I share the same hesitation with Endpoint, but I think it's fine for an initial implementation.

```go
}

// Endpoint interface allows setting and retrieving of attributes
// by a data collector.
// Note that the actual endpoint structure would be something like (pseudocode):
// type EndpointState struct {
//     address
//     ...
//     data map[string]interface{}
// }
// The plugin interface would only mutate the `data` map
type Endpoint interface {
// StoreAttributes sets the data for the Endpoint on behalf
// of the named collection Plugin
StoreAttributes(collector string, data interface{}) error

// GetAttributes retrieves the attributes of the named collection
// plugin for the Endpoint
GetAttributes(collector string) (interface{}, error)
}

// DataLayerSourcesRegistry include the list of available
// Data Sources (interface defined below) in the system.
// It is accompanied by functions (not shown) to register
// and retrieve sources
type DataLayerSourcesRegistry map[string]DataSource
```

> **Reviewer (Collaborator):** Just thinking aloud. Should we generate a new string type here to be very concrete about valid DataLayerSources? i.e., `type DataSource string`
>
> **Author:** Sounds reasonable. What assurances would we want from the type? We expect a new Data Layer source to define its "tag" when registered in the system, and the same name would be expected by Plugins (to access the extended data). I can see using hierarchical names (e.g., with "DNS domain" prefixes) to reduce the chance of collisions (not sure we'll reach that point...), so maybe something like a k8s NamespacedName could be useful?

```go

// DataSource interface represents a data source that tracks
// pods/resources and notifies data collection plugins to
// extract relevant attributes.
type DataSource interface {
// Type of data available from this source
Type() string

// Start begins the data collection and notification loop
Start(ctx context.Context) error

// Stop terminates data collection
Stop() error

// Subscribe a collector to receive updates for tracked endpoints
Subscribe(collector DataCollection) error

// UpdateEndpoints replaces the set of pods/resources tracked by
// this source.
// Alternative: add/remove individual endpoints?
UpdateEndpoints(epIDs []string) error
```

> **Reviewer (Collaborator):** Out of scope for this proposal since we are going to iterate, but should we always assume that data collection is for endpoints? That seems reasonable for now. I can't think of something explicit that would be collected pool-wide. Again, just open thoughts.
>
> **Author:** I've seen some discussion of using CRDs (to define LoRA loading and availability) or receiving scheduling hints from an external optimizer process (over k8s CRs or gRPC, possibly providing hints for which inference pool is optimal in case of multiple pools on different HW). These would not be directly from endpoints or even attached to endpoints. Having all of our data layer storage keyed off endpoints would be a challenge for those use cases. Not sure how to model that yet; we probably need to see the real use cases and refactor accordingly.

```go
}
```
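To make the intended flow concrete, here is a hypothetical collector written against the interfaces above (trimmed to what the example needs). The collector name, the `vllm:num_requests_waiting` metric, and the toy `memEndpoint` are illustrative assumptions, not part of the proposal:

```go
package main

import (
	"fmt"
)

// Endpoint and DataCollection are trimmed copies of the proposed interfaces.
type Endpoint interface {
	StoreAttributes(collector string, data interface{}) error
	GetAttributes(collector string) (interface{}, error)
}

type DataCollection interface {
	Extract(ep Endpoint, data interface{}) error
}

// collectorName uses a hierarchical ("DNS domain" prefixed) name, per the
// registry discussion above. Illustrative only.
const collectorName = "example.com/queue-depth"

// queueDepthCollector is a hypothetical DataCollection plugin that pulls a
// single metric out of a scraped metrics map and stores it on the endpoint.
type queueDepthCollector struct{}

func (q *queueDepthCollector) Extract(ep Endpoint, data interface{}) error {
	metrics, ok := data.(map[string]float64)
	if !ok {
		return fmt.Errorf("unexpected data type %T", data)
	}
	depth, ok := metrics["vllm:num_requests_waiting"] // illustrative metric name
	if !ok {
		return fmt.Errorf("queue depth metric not found")
	}
	return ep.StoreAttributes(collectorName, depth)
}

// memEndpoint is a toy in-memory Endpoint implementation for the example.
type memEndpoint struct {
	data map[string]interface{}
}

func (m *memEndpoint) StoreAttributes(collector string, data interface{}) error {
	if m.data == nil {
		m.data = make(map[string]interface{})
	}
	m.data[collector] = data
	return nil
}

func (m *memEndpoint) GetAttributes(collector string) (interface{}, error) {
	d, ok := m.data[collector]
	if !ok {
		return nil, fmt.Errorf("no data stored for %q", collector)
	}
	return d, nil
}
```

A data source would call `Extract` with the scraped payload whenever it believes the endpoint's state should be updated; consumers later read the value back via `GetAttributes(collectorName)`.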

## Open Questions

1. Should we include `Plugin` in interface names? While it seems redundant,
   we may want to be consistent with the Scheduling layer.
1. Type safety in extensible data collection: `map[string]interface{}` seems
   like the simplest option to start with, but we may want to evolve it to
   support type safety using generics or codegen.
1. Should we design a separate interface specifically for k8s object watching
under GIE control or do we want these to be managed as yet another data source?
This affects the design (e.g., who owns the k8s caches, clients, etc.).
With a GIE controlled data source, collectors just register the types (and
other constraints? Labels, namespaces, …) with GIE core, and all k8s
functionality is under GIE control.
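One possible direction for the type-safety question above is a small generic helper over the untyped accessor. This is a sketch under assumed names (`GetAttributesAs`, the toy `mapEndpoint`), not a committed API:

```go
package main

import (
	"fmt"
)

// Endpoint matches the untyped accessor from the proposed interfaces.
type Endpoint interface {
	GetAttributes(collector string) (interface{}, error)
}

// GetAttributesAs wraps the untyped lookup with a caller-chosen result
// type, turning a stored-type mismatch into a runtime error instead of
// a panic at the call site.
func GetAttributesAs[T any](ep Endpoint, collector string) (T, error) {
	var zero T
	raw, err := ep.GetAttributes(collector)
	if err != nil {
		return zero, err
	}
	typed, ok := raw.(T)
	if !ok {
		return zero, fmt.Errorf("collector %q stored %T, not %T", collector, raw, zero)
	}
	return typed, nil
}

// mapEndpoint is a toy Endpoint implementation used to exercise the helper.
type mapEndpoint map[string]interface{}

func (m mapEndpoint) GetAttributes(collector string) (interface{}, error) {
	v, ok := m[collector]
	if !ok {
		return nil, fmt.Errorf("missing attributes for %q", collector)
	}
	return v, nil
}
```

Codegen could achieve the same effect without generics, at the cost of a build step; either way the underlying storage stays `map[string]interface{}`.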