Skip to content

Commit

Permalink
Refactor metric provider implementation
Browse files Browse the repository at this point in the history
* Remove metric provider config to avoid introducing new public
  interfaces. Since there is only one provider (prometheus) and it
  doesn't have any configurable settings, remove the configuration
  changes for now. We can always add these in the future.

* Remove dummy metric provider implementation. This isn't needed now
  that we're using the metrics.Metrics interface instead of
  metrics.GlobalMetrics.

* Remove metrics.GlobalMetrics in favour of metrics.Metrics. Move the
  HTTP handler instrumentation interfaces into the server package to
  avoid coupling the metrics package to the net/http package.

* Refactor the prometheus provider to implement the metrics.Metrics
  interface. Since the prometheus registry can error on Gather()
  calls, the provider has been updated to accept a logger and use ti
  when the Gather() call fails. This doesn't affect any public
  interfaces so it can be revisited in future if needed. Alteratnively
  we could add a Gather() interface onto metrics.Metrics which could
  return the error.

* Refactor status plugin to include metrics in status update by
  default. Users implementing the status API are likely to need
  performance metrics to gauge the OPA's health. Moreover if they are
  implementing the status API it's unlikely they will want to poll the
  /metrics endpoint on the OPA HTTP API (which may not even be
  exposed.)

* Move the prometheus endpoint test case into the e2e package so the
  server package has no dependencies on prometheus anymore.

Signed-off-by: Torin Sandall <torinsandall@gmail.com>
  • Loading branch information
tsandall committed Aug 15, 2019
1 parent f8442c1 commit b052346
Show file tree
Hide file tree
Showing 16 changed files with 340 additions and 305 deletions.
11 changes: 0 additions & 11 deletions config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,13 +27,6 @@ type Config struct {
Plugins map[string]json.RawMessage `json:"plugins"`
DefaultDecision *string `json:"default_decision"`
DefaultAuthorizationDecision *string `json:"default_authorization_decision"`
MetricsProvider *MetricsProviderConfig `json:"metrics_provider"`
}

// MetricsProviderConfig represents metrics_provider config section
type MetricsProviderConfig struct {
Name string `json:"name"`
Config json.RawMessage `json:"config"`
}

// ParseConfig returns a valid Config object with defaults injected. The id
Expand Down Expand Up @@ -93,10 +86,6 @@ func (c *Config) validateAndInjectDefaults(id string) error {
c.Labels["id"] = id
c.Labels["version"] = version.Version

if c.MetricsProvider == nil {
c.MetricsProvider = &MetricsProviderConfig{Name: "prometheus"}
}

return nil
}

Expand Down
16 changes: 0 additions & 16 deletions docs/content/configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,6 @@ status:
service: acmecorp

default_decision: /http/example/authz/allow

metrics_provider:
name: prometheus
```
## Environment Variable Substitution
Expand Down Expand Up @@ -346,7 +343,6 @@ server provenance, etc.
| --- | --- | --- | --- |
| `status.service` | `string` | Yes | Name of service to use to contact remote server. |
| `status.partition_name` | `string` | No | Path segment to include in status updates. |
| `status.include_metrics` | `boolean` | (default: `false`) | Include Prometheus metrics in status updates. |

## Decision Logs

Expand All @@ -371,15 +367,3 @@ server provenance, etc.
| `discovery.decision` | `string` | No (default: value of `discovery.name` configuration field) | Name of the OPA query that will be used to calculate the configuration |
| `discovery.polling.min_delay_seconds` | `int64` | No (default: `60`) | Minimum amount of time to wait between configuration downloads. |
| `discovery.polling.max_delay_seconds` | `int64` | No (default: `120`) | Maximum amount of time to wait between configuration downloads. |


## Metrics provider

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `metrics_provider.name` | `string` | No | Name of the metrics provider to use. |
| `metrics_provider.config` | `object` | No | Provider-specific configuration (not used as of now). |

Available metrics providers:
* `prometheus` (default)
* (empty string): do not collect metrics
181 changes: 112 additions & 69 deletions docs/content/status.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,75 +47,119 @@ on the agent, updates will be sent to `/status`.
"last_successful_activation": "2018-01-01T00:00:00.000Z"
}
},
"metrics": {
"prometheus": [
{
"help": "A summary of the GC invocation durations.",
"name": "go_gc_duration_seconds",
"type": 2,
"metric": [
{
"summary": {
"quantile": [
{
"quantile": 0,
"value": 0.000044358
},
{
"quantile": 0.25,
"value": 0.000045003
},
{
"quantile": 0.5,
"value": 0.000049726
},
{
"quantile": 0.75,
"value": 0.000219553
},
{
"quantile": 1,
"value": 0.000219553
}
],
"sample_count": 4,
"sample_sum": 0.00035864
}
}
]
},
{
"help": "Number of goroutines that currently exist.",
"name": "go_goroutines",
"type": 1,
"metric": [
{
"gauge": {
"value": 11
}
}
]
},
{
"help": "Information about the Go environment.",
"name": "go_info",
"type": 1,
"metric": [
{
"gauge": {
"value": 1
},
"label": [
{
"name": "version",
"value": "go1.12.7"
}
]
}
]
"metrics": {
"prometheus": {
"go_gc_duration_seconds": {
"help": "A summary of the GC invocation durations.",
"metric": [
{
"summary": {
"quantile": [
{
"quantile": 0,
"value": 0.000011799
},
{
"quantile": 0.25,
"value": 0.000011905
},
{
"quantile": 0.5,
"value": 0.000040002
},
{
"quantile": 0.75,
"value": 0.000065238
},
{
"quantile": 1,
"value": 0.000104897
}
],
"sample_count": 7,
"sample_sum": 0.000309117
}
]
}
],
"name": "go_gc_duration_seconds",
"type": 2
},
------------------------------8< SNIP 8<------------------------------
"http_request_duration_seconds": {
"help": "A histogram of duration for requests.",
"metric": [
{
"histogram": {
"bucket": [
{
"cumulative_count": 1,
"upper_bound": 0.005
},
{
"cumulative_count": 1,
"upper_bound": 0.01
},
{
"cumulative_count": 1,
"upper_bound": 0.025
},
{
"cumulative_count": 1,
"upper_bound": 0.05
},
{
"cumulative_count": 1,
"upper_bound": 0.1
},
{
"cumulative_count": 1,
"upper_bound": 0.25
},
{
"cumulative_count": 1,
"upper_bound": 0.5
},
{
"cumulative_count": 1,
"upper_bound": 1
},
{
"cumulative_count": 1,
"upper_bound": 2.5
},
{
"cumulative_count": 1,
"upper_bound": 5
},
{
"cumulative_count": 1,
"upper_bound": 10
}
],
"sample_count": 1,
"sample_sum": 0.003157399
},
"label": [
{
"name": "code",
"value": "200"
},
{
"name": "handler",
"value": "v1/data"
},
{
"name": "method",
"value": "get"
}
]
}
],
"name": "http_request_duration_seconds",
"type": 4
}
}
}
}
```

Expand All @@ -137,8 +181,7 @@ Status updates contain the following fields:
| `discovery.active_revision` | `string` | Opaque revision identifier of the last successful discovery activation. |
| `discovery.last_successful_download` | `string` | RFC3339 timestamp of last successful discovery bundle download. |
| `discovery.last_successful_activation` | `string` | RFC3339 timestamp of last successful discovery bundle activation. |
| `metrics` | `object` | Application metrics. Optional, single key object. |
| `metrics[provider_name]` | JSON (`interface{}`) | Metrics in provider-dependent format. |
| `metrics.prometheus` | `object` | Global performance metrics for the OPA instance. |

If the bundle download or activation failed, the status update will contain
the following additional fields.
Expand Down
25 changes: 0 additions & 25 deletions internal/metrics/dummy.go

This file was deleted.

26 changes: 0 additions & 26 deletions internal/metrics/metrics.go

This file was deleted.

Loading

0 comments on commit b052346

Please sign in to comment.