EPA 4 builds upon EPA 3 while adopting ideas from E-Series SANtricity Collector.
EPA 4 is essentially an opinionated Prometheus exporter (or a solution stack, if the provided third-party components are deployed together with it).
Each EPA Collector monitors exactly one SANtricity system using the lowest-privilege monitor account. Each storage administrator can spin up their own Collector and have its metrics scraped by their own or a centralized Prometheus scraper.
You can find more about its positioning and direction in my post about EPA 4.
Requirements:
- NetApp E-Series SANtricity >=11.90
- Python >=3.12
NOTE: the master branch may be ahead of Releases. You can download and unpack any EPA 4 or EPA 3 release from Releases.
| EPA version | Where to go |
|---|---|
| 4 | stay on this page |
| 3 | click here |
EPA Collector defaults to using SANtricity's built-in monitor account unless you override that in arguments or Compose.
The simplest approach is to set a password for the SANtricity monitor account and use those credentials.
After you start Collector, check its Prometheus exporter on the EPA 4 host (localhost or another host, with the firewall allowing access).
EPA Collector runs its Prometheus exporter on HTTP port 9080 by default, which can be changed in Compose or with Collector's `--prometheus-port` option.

    curl -v https://HOSTNAME:9080/metrics 2>&1 | grep -E "(Date:|Last-Modified:|< HTTP)"

If you run multiple instances of Collector on the same system, VM or Compose stack, make sure each exposes a different external Prometheus port.
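The same check can be scripted. The following is a minimal sketch, not part of EPA: it fetches a metrics endpoint with the Python standard library and counts samples per metric name in the Prometheus text exposition format. The function names and the example URL are assumptions made for illustration.

```python
import ssl
import urllib.request


def count_samples(text: str) -> dict[str, int]:
    """Count samples per metric name in Prometheus text exposition output."""
    counts: dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at '{' (labels) or at the first whitespace.
        name = line.split("{", 1)[0].split()[0]
        counts[name] = counts.get(name, 0) + 1
    return counts


def fetch_metrics(url: str, verify_tls: bool = True, timeout: float = 10.0) -> str:
    """Download the raw metrics page from a Collector endpoint."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for self-signed certificates on trusted networks.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return resp.read().decode("utf-8")


# Example (hypothetical host name):
# print(count_samples(fetch_metrics("https://HOSTNAME:9080/metrics", verify_tls=False)))
```

An empty result from `count_samples` would suggest the exporter is up but has not collected anything yet.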
Users are encouraged to run their own Prometheus scraper, database and Grafana, but a ready-made stack is available.

    TAG="v4.0.0"
    git clone --depth 1 --branch ${TAG} https://github.com/scaleoutsean/eseries-perf-analyzer/
    cd eseries-perf-analyzer
    cat ./scripts/SCRIPTS.md           # Read what these scripts do and how to use them; you need a venv, etc.
    ./scripts/gen_ca_tls_certs.py all  # REQUIRED, unless you supply own TLS certificates. Answer "N" for E-Series with factory TLS certs
    ./scripts/setup-data-dirs.sh       # REQUIRED; creates data directories for Grafana, VM
    make vendor                        # REQUIRED; downloads SANtricity Client (Python) to ./epa/santricity_client
    vim .env                           # optional, for Docker Compose Grafana version or non-default initial credentials
    vim docker-compose.yml             # REQUIRED; you must provide correct SANtricity API IP/FQDN, credentials

Run Compose:

    docker compose up -d

Note that `make` and the TLS certificate and data directory scripts are mandatory for Docker Compose users without their own Grafana or database.
If you want to use pre-built GHCR container images rather than build your own, set the version with `:{TAG}` (`:4.0.0`, for example) and use the same version for both the collector and grafana-init images:
- collector
- grafana-init - this one just uploads reference dashboards to Grafana, so you don't need it if you upload own dashboards or do it manually
You can use the same docker-compose.yml file: just change these two images to use the GHCR links, and provide the password for your monitor user on SANtricity and the E-Series management IP address(es). The rest (not shown) should be able to remain as-is.

    services:
      collector:
        image: ghcr.io/scaleoutsean/eseries-perf-analyzer/collector:4.0.0
        environment:
          - PASSWORD=monitor123 # non-production pass, thank you very much
          - API=2.2.2.2 # your E-Series
      grafana-init:
        image: ghcr.io/scaleoutsean/eseries-perf-analyzer/grafana-init:4.0.0

You still need to run the scripts, but you don't need "make vendor" because you won't build the container.
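Because the collector and grafana-init images are meant to carry the same tag, a small check can catch a mismatch before you start the stack. This is a hypothetical helper, not something shipped with EPA, and it assumes tag-based image references (digest references with `@sha256:` are not handled).

```python
def image_tag(image: str) -> str:
    """Return the tag of a container image reference, or '' if none is given."""
    # Look only at the last path component so a registry port is not mistaken for a tag.
    last = image.rsplit("/", 1)[-1]
    return last.split(":", 1)[1] if ":" in last else ""


def tags_match(images: list[str]) -> bool:
    """True when every image has an explicit tag and all tags are identical."""
    tags = {image_tag(image) for image in images}
    return len(tags) == 1 and "" not in tags


# Example with the image names used in this README:
# tags_match([
#     "ghcr.io/scaleoutsean/eseries-perf-analyzer/collector:4.0.0",
#     "ghcr.io/scaleoutsean/eseries-perf-analyzer/grafana-init:4.0.0",
# ])
```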
Services:
| Service | Exposed | URL | Note |
|---|---|---|---|
| traefik | Yes | https://HOSTNAME:9080/metrics, https://HOSTNAME:3443 | Reverse proxy for external access to collector, grafana (HTTPS-only) |
| collector | No | http://collector:9080/metrics | Accessible within Compose (i.e. vm, traefik) |
| grafana | No | https://grafana:3443 | Grafana service, proxied by Traefik |
| grafana-init | No | - | Deploys reference dashboards to Grafana |
| vm | No | https://vm:8428 | Victoria Metrics, scrapes metrics directly from collector |
For multiple E-Series systems, it's best to create multiple collector-only Docker Compose files, although you can have all of them in the same place (in which case the exposed Prometheus ports and container names must be different for each). Finally, you'd have to scrape each Prometheus metrics endpoint and manage Victoria Metrics yourself, either from the UI or API/CLI. See CONFIGURATION for more.
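To illustrate scraping several collectors, here is a minimal sketch that renders a Prometheus-style `scrape_configs` section from a list of collector endpoints. It assumes your scraper accepts standard Prometheus scrape configuration; the job name and host names are placeholders, and the function is not part of EPA.

```python
def scrape_config(targets: list[str], job_name: str = "epa", scheme: str = "http") -> str:
    """Render a Prometheus-style scrape config with one static target per collector."""
    if len(set(targets)) != len(targets):
        # Each collector must expose its own host:port combination.
        raise ValueError("duplicate target in list")
    lines = [
        "scrape_configs:",
        f"  - job_name: {job_name}",
        f"    scheme: {scheme}",
        "    metrics_path: /metrics",
        "    static_configs:",
        "      - targets:",
    ]
    lines += [f'          - "{target}"' for target in targets]
    return "\n".join(lines) + "\n"


# Example (hypothetical host, one port per collector):
# print(scrape_config(["epa-host:9080", "epa-host:9081"]))
```

Rejecting duplicate targets up front mirrors the requirement that each collector exposes a different Prometheus port.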
This only runs EPA Collector, which gathers SANtricity metrics and serves them at http://HOSTNAME:9080/metrics. Open or close the host's external HTTP access to the port as needed.

    TAG="v4.0.0"
    git clone --depth 1 --branch ${TAG} https://github.com/scaleoutsean/eseries-perf-analyzer/
    cd eseries-perf-analyzer
    make vendor                            # REQUIRED; downloads SANtricity client to epa/santricity_client directory
    pip install -r ./epa/requirements.txt  # REQUIRED for CLI (requests library, Prometheus client)
    python3 ./epa/collector.py -h

Using default username monitor with SANtricity Web/API address 2.2.2.2:

    python3 ./epa/collector.py --api 2.2.2.2 --password monitor123 --prometheus-port 9080 --no-verify-ssl

Open the browser and navigate to http://HOSTNAME:9080/metrics (or http://127.0.0.1:9080/metrics) to see if Collector's Prometheus exporter is working. If running from Docker, Traefik is used and HTTPS must be specified instead.
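If you run several CLI collectors on one host, each needs its own Prometheus port. The sketch below builds one command line per SANtricity system using only the options shown above; the helper name, the sequential port assignment and the shared password are assumptions made for illustration.

```python
import shlex


def collector_commands(apis: list[str], password: str,
                       base_port: int = 9080, verify_tls: bool = True) -> list[list[str]]:
    """Build one collector.py command per SANtricity system, each on its own port."""
    commands = []
    for offset, api in enumerate(apis):
        cmd = [
            "python3", "./epa/collector.py",
            "--api", api,
            "--password", password,
            "--prometheus-port", str(base_port + offset),  # unique port per instance
        ]
        if not verify_tls:
            cmd.append("--no-verify-ssl")
        commands.append(cmd)
    return commands


# Example: print shell-ready commands for two systems (second address is hypothetical).
# for cmd in collector_commands(["2.2.2.2", "3.3.3.3"], "monitor123", verify_tls=False):
#     print(shlex.join(cmd))
```

Each list can be passed to `subprocess.Popen` directly, which avoids shell quoting problems with passwords that contain special characters.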
- SCRIPTS has more details on running the helper scripts
- CONFIGURATION has extra details about the configuration workflow
- SCREENSHOTS has example screenshots and some details about installing reference dashboards
- FAQs - mostly EPA 3-focused at the moment, with some EPA 4-related content
- 4.0.0 (May 15, 2026)
  - Breaking changes: EPA 3 users cannot upgrade to EPA 4. Fresh installation is required. EPA 4 beta can be upgraded to EPA 4
  - Export more volume performance metrics
  - All changes from the 4.0.0 beta releases with the addition of Traefik-based reverse HTTPS proxy for Collector, Grafana and optionally Victoria Metrics
  - Minor adjustments to the helper scripts to generate TLS certificates for Traefik
  - Update README and other documents to reflect the addition of Traefik
- 4.0.0 (May 9, 2026)
  - Breaking changes: EPA 3 users cannot upgrade to EPA 4. Fresh installation is required. EPA 4 beta can be upgraded to EPA 4
  - Collector now only exports Prometheus metrics for any Prometheus scraper and Victoria Metrics is included in reference stack
  - Live ("realtime") performance metrics (optional in EPA 3) replace averaged (default in EPA 3). Averaged are no longer available
  - SANtricity Major Event Log ("MEL") no longer collected (users may use syslog forwarding for that; also current failures are available in Collector Prometheus metrics for alerting)
  - Collector collects snapshot- and linked clone-related metrics and configuration information
  - SANtricity Client library included in EPA to avoid duplication of API queries
  - Grafana 13 (12.4+ recommended) with several dashboards included in reference EPA stack
- 4.0.0beta3 (May 02, 2026)
  - Breaking changes: do not "upgrade" from EPA 3 - deploy version 4 alongside version 3 if you want to try EPA 4
  - Fixes for packaging (fix auto-configuration of Grafana Data Source)
  - Security: do not log SANtricity password in Docker container
- 4.0.0beta2 (April 27, 2026)
  - Breaking changes: do not "upgrade" from EPA 3 - deploy version 4 alongside version 3 if you want to try EPA 4
  - Fix automated Grafana dashboard upload in `grafana-init`, update `graphana-client` dependency, which now supports Grafana 13
  - Set Grafana version to 13 in `./grafana/Dockerfile`
  - Exclude HDDs from SSD wear level metrics
  - Add Victoria Metrics "EPA" data source configuration to Grafana container
  - Add `Makefile` for easy SANtricity client library download and use newer SANtricity client library to work around bad SANtricity API response
  - Add missing but required install steps to README, add SCRIPTS document
  - Add installation instructions for Docker Compose with pre-made GHCR images
- 4.0.0beta1 (April 26, 2026)
  - Breaking changes: EPA now provides Prometheus-only output with breaking changes compared to Prometheus output from EPA 3. Use any Prometheus-compatible scraper to scrape. EPA 3 users who want to keep data and dashboards from EPA 3 should not "upgrade". EPA 3 will be maintained for months and bugs fixed.
  - Collector's direct dependencies are down to three (Requests, Prometheus Client, SANtricity Client)
  - New feature: detailed collection of snapshots-related configuration and metrics
  - Removed features: MEL events (which belong to logging, not performance or even configuration monitoring)
  - Third party stack components: Docker Compose now includes Grafana 13 (and several reference dashboards which should work on v12 as well) and Victoria Metrics as reference Prometheus scraper and database
- 3.5.5 (April 27, 2026)
  - Address Grafana dashboard initialization issues in `grafana-init`, fix mappings
  - Address various linting errors in the Python collector
  - Update InfluxDB container image to `1.12.4-alpine`
  - Fix minor aesthetic issues in several dashboards (Controllers, Other, SSD Flash Cache)
  - EPA 3 GHCR container image releases now tagged with version (e.g. `:3.5.5`) since version 4 is also available
  - Source code: v3.5.5
- 3.5.4 (April 6, 2026)
  - Upgrade Grafana from last v8 release to v12.4.1, update existing dashboards to work with v12
  - Minor update to InfluxDB (from 1.12.2 to 1.12.3) and requests library (2.33.1)
  - Collector Python base image update to `python:3.15.0a7-alpine3.23` (fewer base image vulnerabilities)
  - Test stack with SANtricity 12.00 and 11.95
  - Add `TLS_VERIFY` option to EPA collector and Docker Compose environment variables
  - Make Prometheus service port configurable
  - Parse ports from Fibre Channel host objects
  - Add SSD Flash Cache metrics and example dashboard
  - Collect snapshot and volume count, and repository volumes' total capacity
  - Bug fixes and improvements (better handling of unavailable metrics, drop repository volumes from volume collection, avoid duplicate upload of reference dashboards, re-fix SSD wear level stats (11.90 and 12.00, SAS and NVMe SSDs), initiator count, volume capacity)
- 3.5.3 (January 20, 2026)
  - Add Prometheus alerts for downed interfaces
  - Add optional "point-in-time" volume performance metrics (default: off) for use cases where default (rolling 5 minute average) is not enough. Enable with `--realtime`
  - Minor bug fixes and improvements (including GHCR container builds)
- 3.5.2 (October 8, 2025)
  - Update dependencies (container base image to 3.14-alpine3.22 and requests library v2.32.5)
- 3.5.1 (October 8, 2025)
  - Export unresolved system failures as Prometheus alerts
  - Upgrade InfluxDB to latest and greatest v1.12.2
- 3.5.0 (September 2, 2025)
  - Add several array configuration objects: hosts, volumes, disk groupings, drives. Now the monitoring of hardware configuration and - more importantly - disk group/pool and volume capacity should be easy. Existing EPA 3 users with tight DB disk space upgrading to 3.5.0+ should use `--include` and add `config_` collectors only gradually until they're sure their DB can handle it
  - InfluxDB: expose RPC service on Docker-internal network for convenient access from the utilities container
  - Collector: improve database down-sampling/pruning for records older than 30d. No real testing has been done (it'll take 31d to find out), but it's unlikely to be worse than in earlier releases. Still, 3.5.0+ collects a lot more with `config_` measurements added, so monitor the size of your `influxdb` volume in Docker
  - Add experimental Prometheus exporter (enabled by default; disable with `--output influxdb`). Docker Compose has Prometheus port closed by default
  - Grafana: sample dashboard for `config_` measurements added to /epa/grafana-init/dashboards/
  - Collector: add Prometheus client for export of only performance-related measurements. It is on by default, but can be disabled. When enabled, it requires an open host firewall and/or an exposed Docker Compose port
  - Collector: various small improvements and small fixes discovered in testing
  - Docker Compose: raise maximum InfluxDB RAM to 4GB (docker-compose.yaml) as EPA may need to handle more data
  - Docker Compose: "named" InfluxDB and Grafana volumes have been moved to directory-style volumes in the `epa` directory (although you can change that) because "named" didn't behave better
- 3.4.2 (August 29, 2025)
  - Change Docker Compose network type to `bridge` for automated setup/tear-down by Docker
  - Fix: a missing `--include <all-measurements>` resulted in nothing being collected
  - Add `--debug` switch to make troubleshooting bugs like the one with `--include` easier
- 3.4.1 (August 27, 2025)
  - Add volume group tag to physical disks (lets you filter disks by (RAID) group or (DPP) pool)
  - Add `--include <measurement>` for filtered writes to InfluxDB (non-included measurements don't get written). Default: include everything
- 3.4.0 (August 24, 2025)
  - Remove `dbmanager` container and its JSON configuration file (one less container to worry about)
  - Add "create database" feature to Collector to replace `dbmanager`
  - Minor update of version tags for various images (InfluxDB, Python, Alpine)
  - Docker Compose with InfluxDB 1.11.8 necessitates `user` key addition and change to InfluxDB volume ownership
  - Complete removal of the pre-fork bloat (epa/Makefile, epa/ansible, epa/blackduck and the rest of it)
  - Merge two docker-compose.yaml files into one (`epa/docker-compose.yaml`)
  - Add `grafana-init` container to replace what epa/ansible used to do in a more complicated way
  - Add `utils` container with InfluxDB v1 client for easy management of InfluxDB
  - Remove "internal images" build feature that epa/Makefile was using - builds are now much faster and easier to maintain
  - Small error handling improvements in EPA Collector noted in Issues
  - Add checks and fixes for handling inconsistent API responses from SANtricity API that may have caused dropped inserts in InfluxDB in some situations
  - Multiple fixes related to built-in dashboards (Grafana data source set to `EPA`, `WSP` has been removed, dashboards can be imported without issues)
  - Dashboards are now imported to "EPA" folder. Find them in Grafana with Dashboards > Browse
  - Remove direct import of `urllib3` and let `requests` deal with it (and requests now defaults to v2.5.0). Prior versions of EPA use `urllib3` v1, which has a minor vulnerability that doesn't impact EPA, since EPA connects to a trusted SANtricity API endpoint over a trusted network
  - See upgrade-related Q&A in the FAQs. There are no new features, and apart from the weak `urllib3` vulnerability there's no reason to install this if your EPA < 3.4.0 is running fine
- 3.3.1 (June 1, 2024):
  - Dependency update (requests library)
- 3.3.0 (April 15, 2024):
  - collector now collects controller shelf's total power consumption metric (sum of PSUs' consumption) and temperature sensors' values
  - Security-related updates of various components
- 3.2.0 (Jan 30, 2023):
  - No new features vs. v3.1.0
  - No changes to Grafana container, Grafana charts, and InfluxDB container
  - collector and dbmanager are now completely independent of containers built by InfluxDB and Grafana Makefile
  - New Kubernetes folder with Kubernetes-related instructions and sample YAML files
  - collector and dbmanager can work on both AMD64 and ARM64 systems
- 3.1.0 (Jan 12, 2023):
  - No changes to Grafana dashboards
  - Updated Grafana v8 (8.5.15), Python Alpine image (3.10-alpine3.17) and certifi (2022.12.7)
  - Remove SANtricity Web Services Proxy (WSP) and remove WSP-related code from collector
  - Make InfluxDB listen on public (external) IP address, so that collectors from remote locations can send data in
  - Add the ability to alternate between two E-Series controllers to collector (in upstream v3.0.0 the now-removed WSP would do that)
  - Add collection of SSD wear level for flash media (panel(s) haven't been added, it's up to the user to add them if they need 'em)
  - Expand the number of required arguments in `collector.py` to avoid unintentional mistakes
  - Collector can run in Kubernetes and Nomad
  - Add dbmanager container for the purpose of uploading array configuration to InfluxDB (and potentially other DB-related tasks down the road)
  - Add simple Makefile for collector containers (collector itself, and dbmanager)
  - Old unit tests are no longer maintained
