@Habdel-Edenfield

Summary

Restores the OpenTelemetry/Prometheus metrics feature that was disabled in commit 0c0e5467b due to compilation issues. This PR fixes API compatibility with OpenTelemetry v1.20+, adds automation scripts, and provides comprehensive documentation.

Background

The metrics feature was originally implemented in #1966 but was disabled by default after API breaking changes in OpenTelemetry caused compilation errors. This PR addresses those issues and makes the feature production-ready.

Changes Made

Code Fixes

  • Fixed OpenTelemetry v1.20+ API compatibility (unique_ptr -> shared_ptr conversion)
  • Added missing account_repository.hpp include in iologindata.cpp
  • Updated vcpkg.json to properly declare OpenTelemetry features
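
As a rough illustration of the ownership change behind the first fix, the sketch below shows how a provider created with unique ownership can be handed to a registry that expects shared ownership. Every type and function here (MeterProviderBase, SdkMeterProvider, makeProvider, ProviderRegistry, installProvider) is a hypothetical stand-in written with the standard library only; none of them is the OpenTelemetry API, and the actual call sites in this PR may look different.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; these are NOT OpenTelemetry types.
struct MeterProviderBase {
    virtual ~MeterProviderBase() = default;
    virtual std::string name() const = 0;
};

struct SdkMeterProvider : MeterProviderBase {
    std::string name() const override { return "sdk"; }
};

// Hypothetical factory that hands back unique ownership.
std::unique_ptr<SdkMeterProvider> makeProvider() {
    return std::make_unique<SdkMeterProvider>();
}

// Hypothetical global registry whose setter requires shared ownership.
class ProviderRegistry {
public:
    static void set(std::shared_ptr<MeterProviderBase> provider) {
        slot() = std::move(provider);
    }
    static std::shared_ptr<MeterProviderBase> get() { return slot(); }

private:
    static std::shared_ptr<MeterProviderBase>& slot() {
        static std::shared_ptr<MeterProviderBase> provider;
        return provider;
    }
};

void installProvider() {
    auto owned = makeProvider();

    // std::shared_ptr has a converting constructor from an rvalue
    // std::unique_ptr, so ownership is transferred without copying the object.
    std::shared_ptr<MeterProviderBase> shared = std::move(owned);

    // A moved-from unique_ptr is guaranteed to be empty.
    assert(owned == nullptr);

    ProviderRegistry::set(std::move(shared));
}
```

Calling installProvider() once at startup leaves the registry holding the only reference, so the provider lives until the registry slot is reset or the program exits.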

Automation

  • scripts/build_with_metrics.sh: Automated build script with metrics enabled
  • scripts/start_monitoring.sh: One-command Prometheus + Grafana stack launcher

Documentation

  • docs/METRICS.md: Quick-start guide for the metrics feature
  • docs/METRICS_ENABLE.md: Detailed enablement and troubleshooting guide

Testing

  • ✅ Compiles successfully on Linux x64 with GCC 14.2.0
  • ✅ Prometheus exporter starts correctly on port 9464
  • ✅ Metrics endpoint accessible at http://localhost:9464/metrics
  • ✅ Server runs normally with metrics disabled (default behavior)
  • ✅ No regression in core functionality
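
A smoke test for the endpoint check above could count the sample lines in the response body. The sketch below assumes the body has already been fetched by some other means (for example, an HTTP client of your choice) and only relies on the Prometheus text exposition format, where lines starting with "#" are HELP/TYPE comments and blank lines are ignored. The function name countSamples is hypothetical and is not part of this PR.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts metric sample lines in a Prometheus
// text-format body, skipping blank lines and '#' comment lines.
std::size_t countSamples(const std::string& body) {
    std::istringstream in(body);
    std::string line;
    std::size_t samples = 0;

    while (std::getline(in, line)) {
        // Ignore leading whitespace and a trailing '\r' from CRLF bodies.
        const auto first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue;  // blank line
        }
        if (line[first] == '#') {
            continue;  // "# HELP ..." or "# TYPE ..." comment
        }
        ++samples;
    }
    return samples;
}
```

A body with zero samples would suggest the exporter is listening but nothing has been registered yet, which is a different failure from the port being unreachable.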

Compatibility

  • Default behavior: FEATURE_METRICS=OFF (no breaking changes)
  • Optional feature: Enable via CMake flag for observability
  • Requirements: vcpkg, CMake 3.22+, C++20
  • Tested on: Ubuntu 22.04, GCC 14.2.0

Configuration

To enable metrics:

# Build time
cmake -DFEATURE_METRICS=ON ..

# Runtime (config.lua)
metricsEnablePrometheus = true
metricsPrometheusPort = 9464

Security & Performance

  • No security implications (metrics endpoint is read-only)
  • Minimal performance overhead when disabled (compile-time checks)
  • ~1-2% CPU overhead when enabled (typical for observability)
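
The "compile-time checks" point can be pictured with a preprocessor guard. The sketch below assumes the CMake option is exposed to the compiler as a FEATURE_METRICS definition, which the PR text does not state explicitly; the namespace metrics_sketch and the functions countLogin and loginCount are hypothetical names for illustration only.

```cpp
#include <atomic>
#include <cstdint>

namespace metrics_sketch {

#ifdef FEATURE_METRICS  // assumed to be defined when built with -DFEATURE_METRICS=ON
constexpr bool kEnabled = true;

// Relaxed atomics keep the hot-path cost to a single increment.
inline std::atomic<std::uint64_t> loginCounter{0};

inline void countLogin() {
    loginCounter.fetch_add(1, std::memory_order_relaxed);
}

inline std::uint64_t loginCount() {
    return loginCounter.load(std::memory_order_relaxed);
}
#else
constexpr bool kEnabled = false;

// With the feature off, calls become empty inline functions that the
// optimizer can remove entirely.
inline void countLogin() {}

inline std::uint64_t loginCount() { return 0; }
#endif

}  // namespace metrics_sketch
```

Call sites stay identical in both configurations, so no #ifdef is needed outside the metrics header.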

Future Work

  • Add Grafana dashboard templates
  • Implement custom business metrics (player count, loot rates, etc.)
  • Add OTLP HTTP exporter support
  • Create integration tests for metrics collection

Checklist

  • Code compiles without errors
  • Feature works as expected when enabled
  • Default behavior unchanged (OFF)
  • Documentation provided
  • Scripts tested and working
  • No breaking changes introduced

Note: This is an optional feature that requires explicit opt-in. No changes to default server behavior.

- Added vcpkg override to pin opentelemetry-cpp at version 1.2.0
- Created docs/METRICS_ENABLE.md with complete setup guide
- Updated docs/metrics-investigation.md with compilation findings
- Reason: OpenTelemetry v1.20+ has breaking API changes (unique_ptr -> shared_ptr)
- This allows FEATURE_METRICS to compile without code changes

Add helper scripts and documentation to enable metrics feature:

New files:
- scripts/build_with_metrics.sh: Automated build with FEATURE_METRICS=ON
- scripts/start_monitoring.sh: Start Prometheus + Grafana stack
- docs/METRICS.md: Quick start guide for metrics

Features:
- Validates prerequisites (VCPKG_ROOT, docker)
- Compiles with OpenTelemetry v1.2.0 (pinned version)
- Clear instructions and error messages
- Zero changes to existing code or default behavior

Usage:
  ./scripts/build_with_metrics.sh
  ./scripts/start_monitoring.sh

This is a safe, non-intrusive addition that only adds tooling.
No existing files modified. Default build unchanged.

Resolved compilation issues with FEATURE_METRICS:

API Changes:
- Fixed SetMeterProvider to use shared_ptr instead of unique_ptr (OpenTelemetry v1.20+)
- Added missing include account_repository.hpp in iologindata.cpp

Dependencies:
- Removed version override (now uses v1.20.0 from vcpkg baseline)
- Kept otlp-http and prometheus features enabled

Build Scripts:
- Fixed binary path detection in build_with_metrics.sh

Result:
- ✅ Compiles successfully with FEATURE_METRICS=ON
- ✅ Binary tested and working (131MB)
- ✅ Compatible with current vcpkg baseline

Tested with:
- OpenTelemetry-cpp v1.20.0
- GCC 14.2.0
- vcpkg baseline d6995a0cf3cafda5e9e52749fad075dd62bfd90c

@philippelo

I'll test it, but I had trouble with port 9464 that I couldn't resolve, at least not in the last PR for this.
