| 1 | +--- |
| 2 | +layout: post |
| 3 | +toc: true |
| 4 | +title: Eventstore Observability |
| 5 | +description: Eventstore Observability with Micrometer, Prometheus and Grafana |
| 6 | +date: 2025-11-29 08:00:00 |
| 7 | +categories: [Eventstore Documentation,Eventstore deployment] |
| 8 | +tags: [observability,monitoring,micrometer,prometheus,grafana] |
| 9 | +--- |
| 10 | + |
| 11 | +## EventStore Observability |
| 12 | + |
| 13 | +Understanding how your EventStore deployment performs in production is critical for maintaining a healthy event-sourced system. Observability enables you to: |
| 14 | + |
| 15 | +- **Track operational health**: Monitor append and query rates to detect unusual activity patterns |
| 16 | +- **Identify performance bottlenecks**: Measure operation durations to find slow queries or contention |
| 17 | +- **Optimize resource usage**: Understand which event streams are most active and resource-intensive |
| 18 | +- **Debug production issues**: Correlate metrics with application behavior during incident investigation |
| 19 | +- **Plan capacity**: Use historical metrics to predict growth and plan infrastructure scaling |
| 20 | + |
| 21 | +## Micrometer Integration |
| 22 | + |
| 23 | +EventStore uses [Micrometer](https://micrometer.io/) as its metrics collection framework. Micrometer provides a vendor-neutral facade similar to SLF4J for logging, allowing you to emit metrics once and send them to various monitoring backends (Prometheus, Grafana Cloud, Datadog, etc.). |
| 24 | + |
| 25 | +When creating an EventStore instance, pass a `MeterRegistry` to control where metrics are collected. If you omit it, the global registry is used: |
| 26 | + |
| 27 | +```java |
| 28 | +// Option 1: Use the global registry (simplest approach) |
| 29 | +EventStorage storage = PostgresEventStorage.newBuilder().build(); |
| 30 | +EventStore store = EventStoreFactory.get().eventStore(storage); |
| 31 | + |
| 32 | +// Option 2: Provide a custom registry with specific configuration |
| 33 | +MeterRegistry registry = new SimpleMeterRegistry(); |
| 34 | +EventStore storeWithRegistry = EventStoreFactory.get().eventStore(storage, registry); |
| 35 | +``` |
| 36 | + |
| 37 | +### Adding Custom Tags for Drill-Down Analysis |
| 38 | + |
| 39 | +To enable drill-down analysis by deployment context, add common tags to your registry: |
| 40 | + |
| 41 | +```java |
| 42 | +MeterRegistry registry = new SimpleMeterRegistry(); |
| 43 | + |
| 44 | +// Add tags for deployment context |
| 45 | +registry.config().commonTags( |
| 46 | + "instance", "eventstore-01", // Instance identifier |
| 47 | + "deployment", "production-eu-west", // Deployment unit/region |
| 48 | + "app.version", "1.2.3", // Application version |
| 49 | + "environment", "production" // Environment name |
| 50 | +); |
| 51 | + |
| 52 | +EventStore store = EventStoreFactory.get().eventStore(storage, registry); |
| 53 | +``` |
| 54 | + |
| 55 | +These tags are automatically applied to all metrics, enabling you to: |
| 56 | +- Compare performance across different instances |
| 57 | +- Identify version-specific issues after deployments |
| 58 | +- Separate production from staging metrics |
| 59 | +- Analyze regional performance differences |
| 60 | + |
| 61 | +## Available Metrics |
| 62 | + |
| 63 | +EventStore exposes the following metrics through Micrometer. All metrics include these automatic tags: |
| 64 | + |
| 65 | +| Tag | Description | Example Values | |
| 66 | +|-----|-------------|----------------| |
| 67 | +| `context` | Event stream context | `"customer"`, `"order"`, `""` (empty for any-context) | |
| 68 | +| `purpose` | Event stream purpose | `"123"`, `"aggregate-id"`, `""` (empty for any-purpose) | |
| 69 | +| `typed` | Whether stream uses typed or raw events | `"true"`, `"false"` | |
| 70 | +| `storage` | Storage backend name | `"postgres"`, `"inmemory"` | |
| 71 | + |
| 72 | +### Counters |
| 73 | + |
| 74 | +| Metric Name | Description | Unit | |
| 75 | +|-------------|-------------|------| |
| 76 | +| `sliceworkz.eventstore.stream.create` | Number of event stream objects created | count | |
| 77 | +| `sliceworkz.eventstore.append` | Number of successful append operations | count | |
| 78 | +| `sliceworkz.eventstore.append.event` | Total number of events appended | count | |
| 79 | +| `sliceworkz.eventstore.append.optimisticlock` | Number of append operations rejected due to optimistic locking conflicts | count | |
| 80 | +| `sliceworkz.eventstore.query` | Number of query operations executed | count | |
| 81 | +| `sliceworkz.eventstore.query.event` | Total number of events retrieved by queries | count | |
| 82 | +| `sliceworkz.eventstore.get.event` | Number of individual event lookups by ID | count | |
| 83 | +| `sliceworkz.eventstore.bookmark.place` | Number of bookmark updates | count | |
| 84 | +| `sliceworkz.eventstore.bookmark.get` | Number of bookmark retrievals | count | |
| 85 | + |
| 86 | +### Timers |
| 87 | + |
| 88 | +| Metric Name | Description | Unit | |
| 89 | +|-------------|-------------|------| |
| 90 | +| `sliceworkz.eventstore.append.duration` | Time taken to append events (including optimistic locking check) | time (exported in the registry's base unit: seconds for Prometheus) | |
| 91 | +| `sliceworkz.eventstore.query.duration` | Time taken to execute queries | time (exported in the registry's base unit: seconds for Prometheus) | |
| 92 | + |
| 93 | +### Gauges |
| 94 | + |
| 95 | +| Metric Name | Description | Unit | |
| 96 | +|-------------|-------------|------| |
| 97 | +| `sliceworkz.eventstore.append.position` | Highest event position appended to this stream | position | |
| 98 | + |
| 99 | +## Example Configuration: Prometheus |
| 100 | + |
| 101 | +To expose metrics to Prometheus, add the Prometheus Micrometer registry dependency and configure an HTTP endpoint. |
| 102 | + |
| 103 | +### Maven Dependencies |
| 104 | + |
| 105 | +```xml |
| 106 | +<dependency> |
| 107 | + <groupId>io.micrometer</groupId> |
| 108 | + <artifactId>micrometer-registry-prometheus</artifactId> |
| 109 | + <version>1.16.0</version> |
| 110 | +</dependency> |
| 111 | +<dependency> |
| 112 | + <groupId>io.javalin</groupId> |
| 113 | + <artifactId>javalin</artifactId> |
| 114 | + <version>6.4.0</version> |
| 115 | +</dependency> |
| 116 | +``` |
| 117 | + |
| 118 | +### Java Configuration with Javalin |
| 119 | + |
| 120 | +```java |
| 121 | +import io.javalin.Javalin; |
| 122 | +import io.micrometer.core.instrument.MeterRegistry; |
| 123 | +import io.micrometer.prometheusmetrics.PrometheusConfig; |
| 124 | +import io.micrometer.prometheusmetrics.PrometheusMeterRegistry; |
| 125 | + |
| 126 | +public class EventStoreApp { |
| 127 | + public static void main(String[] args) { |
| 128 | + // Create Prometheus registry |
| 129 | + PrometheusMeterRegistry prometheusRegistry = |
| 130 | + new PrometheusMeterRegistry(PrometheusConfig.DEFAULT); |
| 131 | + |
| 132 | + // Add common tags for drill-down |
| 133 | + prometheusRegistry.config().commonTags( |
| 134 | + "instance", System.getenv().getOrDefault("HOSTNAME", "unknown"), // HOSTNAME may be unset; tag values must not be null |
| 135 | + "app.version", "1.2.3" |
| 136 | + ); |
| 137 | + |
| 138 | + // Create EventStore with Prometheus metrics |
| 139 | + EventStorage storage = PostgresEventStorage.newBuilder().build(); |
| 140 | + EventStore eventStore = EventStoreFactory.get() |
| 141 | + .eventStore(storage, prometheusRegistry); |
| 142 | + |
| 143 | + // Expose metrics endpoint via Javalin |
| 144 | + Javalin app = Javalin.create().start(8080); |
| 145 | + |
| 146 | + app.get("/metrics", ctx -> { |
| 147 | + ctx.contentType("text/plain; version=0.0.4"); |
| 148 | + ctx.result(prometheusRegistry.scrape()); |
| 149 | + }); |
| 150 | + |
| 151 | + // Your application logic here... |
| 152 | + } |
| 153 | +} |
| 154 | +``` |
| 155 | + |
| 156 | +### Prometheus Scrape Configuration |
| 157 | + |
| 158 | +Add this job to your `prometheus.yml`: |
| 159 | + |
| 160 | +```yaml |
| 161 | +scrape_configs: |
| 162 | + - job_name: 'eventstore' |
| 163 | + static_configs: |
| 164 | + - targets: ['localhost:8080'] |
| 165 | + metrics_path: '/metrics' |
| 166 | + scrape_interval: 15s |
| 167 | +``` |
| 168 | + |
| 169 | +## Example Reporting: Grafana |
| 170 | + |
| 171 | +Grafana provides powerful visualization and alerting capabilities for EventStore metrics using Prometheus as a datasource. |
| 172 | + |
| 173 | +### Setting Up Grafana with Prometheus |
| 174 | + |
| 175 | +1. **Add Prometheus datasource** in Grafana: |
| 176 | + - Navigate to Connections → Data sources (Configuration → Data Sources in older Grafana versions) |
| 177 | + - Select "Prometheus" |
| 178 | + - Set URL to your Prometheus instance (e.g., `http://localhost:9090`) |
| 179 | + - Click "Save & Test" |
| 180 | + |
| 181 | +2. **Create EventStore dashboard** with useful panels: |
| 182 | + |
| 183 | +**Panel: Append Rate by Stream Context** |
| 184 | +```promql |
| 185 | +sum by (context) (rate(sliceworkz_eventstore_append_total[5m])) |
| 186 | +``` |
| 187 | + |
| 188 | +**Panel: Query Duration (95th Percentile)** |
| 189 | +```promql |
| 190 | +# Needs the _bucket series (percentile histograms must be enabled for the timer) |
| 191 | +histogram_quantile(0.95, |
| 192 | + sum by (le) (rate(sliceworkz_eventstore_query_duration_seconds_bucket[5m]))) |
| 193 | +``` |
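To make the percentile panel easier to interpret, here is a minimal sketch of how a quantile can be estimated from cumulative histogram buckets by linear interpolation, which is the general idea behind `histogram_quantile`. The class name, bucket bounds and counts are made up for illustration, and the sketch skips edge cases that Prometheus handles (such as quantiles outside 0..1), so treat it as an approximation of the technique rather than a description of Prometheus internals.

```java
// Illustrative only: estimates a quantile from cumulative bucket counts.
class HistogramQuantileSketch {

    /**
     * upperBounds must be sorted ascending and end with +Infinity.
     * cumulativeCounts[i] is the number of observations <= upperBounds[i].
     */
    static double quantile(double q, double[] upperBounds, double[] cumulativeCounts) {
        int n = upperBounds.length;
        if (n < 2) return Double.NaN;
        double total = cumulativeCounts[n - 1];
        if (total == 0) return Double.NaN;

        // Rank of the requested observation among all observations.
        double rank = q * total;

        // Find the first bucket whose cumulative count reaches the rank.
        int i = 0;
        while (i < n - 1 && cumulativeCounts[i] < rank) i++;

        // The rank falls into the +Inf bucket: the best available answer
        // is the upper bound of the highest finite bucket.
        if (i == n - 1) return upperBounds[n - 2];

        double lower = (i == 0) ? 0.0 : upperBounds[i - 1];
        double countBelow = (i == 0) ? 0.0 : cumulativeCounts[i - 1];
        double inBucket = cumulativeCounts[i] - countBelow;
        if (inBucket == 0) return lower;

        // Assume observations are spread evenly inside the bucket.
        return lower + (upperBounds[i] - lower) * (rank - countBelow) / inBucket;
    }

    public static void main(String[] args) {
        // Hypothetical query durations in seconds.
        double[] bounds = {0.01, 0.05, 0.1, 0.5, Double.POSITIVE_INFINITY};
        double[] counts = {50, 80, 90, 100, 100};
        System.out.println("estimated p95 (s): " + quantile(0.95, bounds, counts));
    }
}
```

Because the estimate is interpolated inside a bucket, its accuracy depends on how finely the buckets cover the range of durations you care about.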
| 194 | + |
| 195 | +**Panel: Optimistic Locking Conflict Rate** |
| 196 | +```promql |
| 197 | +rate(sliceworkz_eventstore_append_optimisticlock_total[5m]) |
| 198 | +``` |
| 199 | + |
| 200 | +**Panel: Events Appended per Second** |
| 201 | +```promql |
| 202 | +rate(sliceworkz_eventstore_append_event_total[5m]) |
| 203 | +``` |
| 204 | + |
| 205 | +**Panel: Highest Event Position by Stream** |
| 206 | +```promql |
| 207 | +sliceworkz_eventstore_append_position |
| 208 | +``` |
| 209 | + |
| 210 | +### Key Metrics to Monitor |
| 211 | + |
| 212 | +- **High optimistic locking conflicts**: May indicate contention on specific aggregates requiring architectural review |
| 213 | +- **Slow query durations**: Could signal missing indexes, inefficient queries, or database resource constraints |
| 214 | +- **Append rate spikes**: Unusual activity patterns that might indicate bugs or attacks |
| 215 | +- **Event position growth**: Helps predict storage requirements and identify most active streams |
| 216 | + |
| 217 | +With Grafana, you can set up alerts on these metrics to proactively detect issues before they impact users. |
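As a rough illustration of the kind of condition such an alert might evaluate, the sketch below computes the share of append attempts rejected by optimistic locking between two counter readings. It assumes, based on the counter descriptions above, that `sliceworkz.eventstore.append` counts only successful appends and `sliceworkz.eventstore.append.optimisticlock` counts only rejected ones. The `Sample` type, the sample values and the 5% threshold are hypothetical, and counter resets are ignored.

```java
// Illustrative only: conflict ratio between two readings of the counters.
final class ConflictRatioSketch {

    /** Readings of both counters at one point in time (hypothetical helper type). */
    record Sample(double appends, double conflicts) {}

    /** Fraction of append attempts rejected between the two samples. */
    static double conflictRatio(Sample earlier, Sample later) {
        double succeeded = later.appends() - earlier.appends();
        double rejected = later.conflicts() - earlier.conflicts();
        double attempts = succeeded + rejected;
        // No attempts in the window means there is nothing to alert on.
        return attempts <= 0 ? 0.0 : rejected / attempts;
    }

    public static void main(String[] args) {
        Sample t0 = new Sample(1_000, 10);
        Sample t1 = new Sample(1_450, 60);
        double ratio = conflictRatio(t0, t1);
        double threshold = 0.05; // hypothetical alert threshold
        System.out.printf("conflict ratio = %.2f, alert = %b%n", ratio, ratio > threshold);
    }
}
```

A ratio is often a more stable alert signal than the raw conflict rate, because it stays comparable when overall append traffic grows or shrinks.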