Description
The current product-config architecture has a few issues that we keep bumping into:
- It's confusing to understand which values come from where
- Validation error messages are produced in the context of the product's configuration format, which is confusing to users who configure it in terms of our `ProductCluster` abstraction
- PC only works on unstructured key/value maps that correspond directly to the product's configuration structure, making it difficult to feed config back from PC to the operator itself (for example: port values need to match between the configuration's listen option and the generated discovery ConfigMaps)
  - This often leads us to bypass PC for these options
- We have no solid way to get back typed Rust values when using the `Configuration` mechanism to allow fields to be overridden across the `ProductCluster`/`Role`/`RoleGroup` hierarchy (without implementing the same precedence mechanism in the operator itself)
- There is a lot of repetitive boilerplate when linking Rust structs up to the PC machinery
To address this, I propose reorienting PC so that the Rust structs are the source of truth for the structure, and so that precedence rules and validations are applied while still working with typed Rust values. We would still need to serialize to the product's native configuration format (and apply overrides to that) eventually, but this would be delayed until it is required.
From the definition side, the API is currently expected to look something like this:
```rust
#[derive(ProductConfig)]
struct ZookeeperConfig {
    #[pc(default = 1000)]
    #[pc(file("hdfs-site.xml", "tick.limit.ms"))]
    tick_limit_ms: i32,
}
```
This would expand to something like the following:
```rust
struct ZookeeperConfigFragment {
    tick_limit_ms: Option<i32>,
}

enum ZookeeperConfigValidationError {
    NoTickLimitMs,
}

impl ProductConfigFragment for ZookeeperConfigFragment {
    type Validated = ZookeeperConfig;
    type ValidationError = ZookeeperConfigValidationError;

    fn merge(self, other: &Self) -> Self {
        Self {
            // `self` takes precedence; fall back to `other` when unset
            tick_limit_ms: self.tick_limit_ms.or(other.tick_limit_ms),
            // ...remaining fields are merged the same way
        }
    }

    fn default() -> Self {
        Self {
            tick_limit_ms: Some(1000),
        }
    }

    fn validate(self) -> Result<ZookeeperConfig, ZookeeperConfigValidationError> {
        match self {
            ZookeeperConfigFragment { tick_limit_ms: Some(tick_limit_ms) } => Ok(ZookeeperConfig { tick_limit_ms }),
            ZookeeperConfigFragment { tick_limit_ms: None, .. } => Err(ZookeeperConfigValidationError::NoTickLimitMs),
        }
    }
}
```
This would allow the reconciler to process already fully merged and validated `ZookeeperConfig` objects, while still preserving the flexibility of the override hierarchy.
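To make the intent concrete, here is a minimal usage sketch for the reconciler side, assuming fragments for each level of the hierarchy have already been obtained; the function and parameter names are illustrative, not part of the proposal:

```rust
// A sketch (not the proposed API surface) of how a reconciler could resolve
// the RoleGroup -> Role -> ProductCluster -> default precedence chain on
// typed fragments and only then validate into the final config.
fn resolve_config(
    role_group: ZookeeperConfigFragment,
    role: &ZookeeperConfigFragment,
    cluster: &ZookeeperConfigFragment,
) -> Result<ZookeeperConfig, ZookeeperConfigValidationError> {
    role_group
        // `merge` keeps values already set on `self`, so the most specific
        // level comes first and broader levels only fill in the gaps
        .merge(role)
        .merge(cluster)
        .merge(&ZookeeperConfigFragment::default())
        .validate()
}
```

The reconciler would then work entirely with typed values; serialization into the product's native configuration format could happen as a separate, later step.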
Product-config YAMLs would be relegated to serving as progressive enhancements and overrides to the metadata. For example, this could be used to improve backwards compatibility with older product versions, or to supply more thorough documentation. The current proposal for these looks like this, but this part is still fairly up in the air:
```yaml
mixins:
  - appliesTo:
      product:
        - kafka
      versionRange:
        - product: ">=1.0.0"
      properties:
        - tick_limit_ms
    apply:
      description: |
        The maximum allowed clock skew between cluster members
      default: foo
```
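Since this format is still in flux, the following is only a rough sketch of how such a file could be deserialized in Rust with serde; every type and field choice here is an assumption made for illustration, not part of the proposal:

```rust
use serde::Deserialize;

// Illustrative only: one possible serde model for the draft mixin format above.
#[derive(Deserialize, Debug)]
struct MixinFile {
    mixins: Vec<Mixin>,
}

#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
struct Mixin {
    applies_to: AppliesTo,
    apply: Apply,
}

#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
struct AppliesTo {
    product: Vec<String>,
    version_range: Vec<VersionRange>,
    properties: Vec<String>,
}

#[derive(Deserialize, Debug)]
struct VersionRange {
    product: String,
}

#[derive(Deserialize, Debug)]
struct Apply {
    description: Option<String>,
    default: Option<String>,
}
```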
This is part of stackabletech/issues#198.