Add an ID field to filters #174

iffyio · 2021-01-27T17:14:03Z

We want to be able to assign an id (name is already taken) to filters - this would allow us to e.g filter in metrics only coming from a particular filter.

Update all current filter configs to include id field of type string - the field is optional and we use a default uuid otherwise.
We should enforce that any user provided id is at least a valid prometheus label
https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
Update all metrics exported by current filters to include an id set to the filter's assigned ID.

The text was updated successfully, but these errors were encountered:

markmandel · 2021-01-27T17:24:57Z

Update all metrics exported by current filters to include an id set to the filter's assigned ID.

Should it be a label on the Filter's assigned metric(s), rather than part of the name? Might be good to see the total metrics in for a given filter, then dig down into each individual one?

I'm assuming here that ids for Filters are probably going to be a small set -- but we can't guarantee it. WDYT?

iffyio · 2021-01-27T17:36:52Z

Not entirely sure I follow, do you mean rather than e.g packets_dropped{id="foo"} (label), it'll be foo_packets_dropped (name)? Both cases generate the same number of time series so they'll have the same effect with the same number of filters

markmandel · 2021-01-27T18:21:22Z

Ah ignore me - you were already saying to use a label, I just didn't quite grok that. We're on the same page. 👍

XAMPPRocky · 2021-05-11T06:46:04Z

Update all current filter configs to include id field of type string - the field is optional and we use a default uuid otherwise.

Looking at this now, would it make sense if instead there was some kind of wrapper newtype struct around a filter? I'm just trying to think if there are cases where you explicitly wouldn't want a filter to have an ID, and if not it might be easier to keep it in one place, rather than duplicating the field across each filter. I was thinking something like the following:

struct TrackedFilter<F> {
  id: Option<Uuid>,
  filter: F
}

impl<F: Filter> Filter for TrackedFilter<F> {
    fn read(&self, ctx: ReadContext) -> Option<ReadResponse> {
        self.filter.read(ctx)
    }
    
    fn write(&self, ctx: WriteContext) -> Option<WriteResponse> {
        self.filter.write(ctx)
    }
}

iffyio · 2021-05-11T08:03:15Z

Yeah it would be nice if we can have it in one place. At least all of the builtin filters would include an ID either to tag metrics/log info and a filter that doesn't need it can ignore the value. though custom filters won't necessarily have one.
I'm not sure if the wrapper in this case would be sufficient though since the underlying filter implementation needs access to the ID? both during normal flow but also in the filter factory when initializing metrics

XAMPPRocky · 2021-05-11T08:37:05Z

Good points.

though custom filters won't necessarily have one.

Well we could make it a requirement for all filters by adding fn id(&self) -> &Uuid to the Filter trait as a required method, then you would have to always provide an ID, even when you're writing a custom one. That doesn't reduce the boilerplate on the filter creation side as we'll still need fields but it does let us access the ID of the filter without knowing what filter it is.

I think it probably depends on whether you want the encapsulation to be in the filters or in metrics. Like do we want metrics to work with potentially any filter? Or do we want filters supposed to be mostly opaque and have complete control over metrics?

iffyio · 2021-05-11T13:51:15Z

I think it probably depends on whether you want the encapsulation to be in the filters or in metrics. Like do we want metrics to work with potentially any filter? Or do we want filters supposed to be mostly opaque and have complete control over metrics?

Its up to the filter to export the metric they choose to - we pass in the metrics registry and they register any implementation specific metrics, in that sense its mostly opaque yeah.

Currently its not required that a filter has an id since the current use cases of metrics/logs are implementation specific (e.g a filter that doesn't export any metrics doesn't need one). Not sure if there'll be a use case that requires quilkin to know an ID of a filter and we can add it to the trait if that shows up

XAMPPRocky · 2021-05-17T07:45:27Z

Something that came up when I was implementing this, is "What ID should a FilterChain get?" since it itself is a Filter.

iffyio · 2021-05-17T07:56:13Z

The FilterChain doesn't need an ID since its mostly an implementation detail, the ID isn't part of the filter api though so that should be fine to ignore it?

XAMPPRocky · 2021-05-17T08:26:50Z

The FilterChain doesn't need an ID since its mostly an implementation detail, the ID isn't part of the filter api though so that should be fine to ignore it?

Well FilterChain is part of the public API, and it not having an ID would mean that wouldn't be able to filter the chain out, only its components. I was imagining something like the following "pipeline" of filters, you wouldn't be able to hide all of the filters in the inner filter chain without also hiding the Debug that comes at the start.

FilterChain::new([
    Debug::new(),
    LoadBalancer::new(),
    Authentication::new(),
    FilterChain::new([
        Compression::new(),
        Debug::new(),
        Encryption::new(),
    ]),
    Scriptable::new(),
])

Where as if Filter::id is part of the public API, and FilterChain's also have IDs, they can be treated like any other filter, and you can filter/hide "chains" of filters just as easily. This introspect-ability also relates to #167 as it would be quite easy to implement processing metrics in terms of FilterChain if it can inspect a Filter's ID to map elapsed time stats to an ID, than needing every filter to implement processing metrics manually, it can be given for free if we can inspect them.

iffyio · 2021-05-17T08:52:17Z

That makes sense, I don't think FilterChain was intended to be public (if it is today it shouldn't be since its not mentioned in the docs for example) but yeah we could expose it.
If we do that, then we should do so as a proper filter that isn't special in anyway (so to move it under filters/ etc) - we'll essentially be saying filters are nestable. Also since this isn't dependent on the id issue, I think we can tackle it separately and e.g give filterchain an empty id or so to begin with.

markmandel · 2021-05-18T17:32:41Z

Should the id be a String or &str - rather an a Uuid ? Just so that users can specify their own human readable ids?

XAMPPRocky · 2021-07-05T12:45:22Z

Is this resolved now?

markmandel · 2021-07-09T21:22:20Z

Just peeking through this - we have a name field on the Filter Config:

(source)

/// Filter is the configuration for a single filter
#[derive(Debug, Deserialize, Serialize, Clone)]
#[serde(deny_unknown_fields)]
pub struct Filter {
    pub name: String,
    pub config: Option<serde_yaml::Value>,
}

But I don't think this information makes its way to the Filter when it gets created, at first glance it looks like it should be added to CreateFilterArgs

(source)

/// Arguments needed to create a new filter.
pub struct CreateFilterArgs<'a> {
    /// Configuration for the filter.
    pub config: Option<ConfigType<'a>>,
    /// metrics_registry is used to register filter metrics collectors.
    pub metrics_registry: Registry,
}

So it looks like that work is outstanding at this point.

markmandel · 2021-07-09T21:26:14Z

....and as soon as I hit "submit" I realise that was wrong. The name field is for the Filter is for the filter name 🤦🏻

So it doesn't look like any of this work has been done yet.

XAMPPRocky · 2021-07-10T01:17:15Z

@markmandel yes but I thought from the discussion in #263 we didn’t want instance specific IDs, and Filter::name already serves as a type identifier.

markmandel · 2021-07-10T02:35:56Z

Oh that's right! If a user did something like dynamically generating a uuid or something for each id, it becomes an possible footgun if they are running lots of proxies.

It would be useful though to be able to look at a specific configuration of a filter in a chain and see that that is the one that's causing a performance issue (for example).

But maybe this is one of those things where we close this for now and revisit if people want to explore it in the future -- it's entirely possible that in practice people won't actually use multiple instances of a filter that often.

@jahrlin any suggestions on how to prevent users from creating their own cardinality bomb accidentally? Do we think documentation is good enough (put some big warnings on it)?

XAMPPRocky · 2021-07-10T08:47:35Z

It would be useful though to be able to look at a specific configuration of a filter in a chain and see that that is the one that's causing a performance issue (for example).

One way would be to key by the values of the config rather than by the instance, that would provide some kind bound as long as the configuration itself isn’t unbounded.

XAMPPRocky · 2021-10-12T12:56:47Z

Closing this issue, as it seems resolved, and we can always open in a new issue if there's more work.

iffyio added kind/feature New feature or request good first issue Good for newcomers labels Jan 27, 2021

markmandel added the area/operations Installation, updating, metrics etc label Jan 27, 2021

iffyio mentioned this issue May 20, 2021

review metrics to prevent high-cardinality data #263

Closed

XAMPPRocky removed the good first issue Good for newcomers label Jul 16, 2021

XAMPPRocky closed this as completed Oct 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add an ID field to filters #174

Add an ID field to filters #174

iffyio commented Jan 27, 2021

markmandel commented Jan 27, 2021

iffyio commented Jan 27, 2021

markmandel commented Jan 27, 2021

XAMPPRocky commented May 11, 2021 •

edited

Loading

iffyio commented May 11, 2021

XAMPPRocky commented May 11, 2021 •

edited

Loading

iffyio commented May 11, 2021

XAMPPRocky commented May 17, 2021 •

edited

Loading

iffyio commented May 17, 2021

XAMPPRocky commented May 17, 2021

iffyio commented May 17, 2021

markmandel commented May 18, 2021

XAMPPRocky commented Jul 5, 2021

markmandel commented Jul 9, 2021

markmandel commented Jul 9, 2021

XAMPPRocky commented Jul 10, 2021

markmandel commented Jul 10, 2021

XAMPPRocky commented Jul 10, 2021

XAMPPRocky commented Oct 12, 2021

Add an ID field to filters #174

Add an ID field to filters #174

Comments

iffyio commented Jan 27, 2021

markmandel commented Jan 27, 2021

iffyio commented Jan 27, 2021

markmandel commented Jan 27, 2021

XAMPPRocky commented May 11, 2021 • edited Loading

iffyio commented May 11, 2021

XAMPPRocky commented May 11, 2021 • edited Loading

iffyio commented May 11, 2021

XAMPPRocky commented May 17, 2021 • edited Loading

iffyio commented May 17, 2021

XAMPPRocky commented May 17, 2021

iffyio commented May 17, 2021

markmandel commented May 18, 2021

XAMPPRocky commented Jul 5, 2021

markmandel commented Jul 9, 2021

markmandel commented Jul 9, 2021

XAMPPRocky commented Jul 10, 2021

markmandel commented Jul 10, 2021

XAMPPRocky commented Jul 10, 2021

XAMPPRocky commented Oct 12, 2021

XAMPPRocky commented May 11, 2021 •

edited

Loading

XAMPPRocky commented May 11, 2021 •

edited

Loading

XAMPPRocky commented May 17, 2021 •

edited

Loading