`tmpnet`: Enable collection of logs and metrics #2820
Merged
14 commits
a0f75c3 `tmpnet`: Write config enabling metrics collection by prometheus (maru-ava)
2496a58 fixup: Add links to metrics (maru-ava)
50b7e23 fixup: Further refine metrics links (maru-ava)
c99de8a fixup: Enable filtering metrics by owner (maru-ava)
0783d59 fixup: Ensure network-shutdown-delay reflects the prometheus scrape i… (maru-ava)
90191fc fixup: Add mention of metrics configuration to tmpnet README (maru-ava)
e88b83d fixup: Avoid collecting prometheus.yaml in artifact (maru-ava)
d78c0cc fixup: Fix shellcheck error (maru-ava)
6d73484 `tmpnet`: Enable log collection with promtail (maru-ava)
bbeb4b3 fixup: s/grafana_url/prometheus_url/ (maru-ava)
9380137 fixup: More promtail cleanup (maru-ava)
fbd8386 fixup: Ensure the correct filename on x86 (maru-ava)
4d72be9 fixup: Update shutdown delay to use time.Duration (maru-ava)
762032e fixup: Respond to review feedback (maru-ava)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
# Timestamps are in seconds | ||
from_timestamp="$(date '+%s')" | ||
monitoring_period=900 # 15 minutes | ||
to_timestamp="$((from_timestamp + monitoring_period))" | ||
|
||
# Grafana expects milliseconds, so pad timestamps with 3 zeros | ||
metrics_url="${GRAFANA_URL}&var-filter=gh_job_id%7C%3D%7C${GH_JOB_ID}&from=${from_timestamp}000&to=${to_timestamp}000" | ||
|
||
# Optionally ensure that the link displays metrics only for the shared | ||
# network rather than mixing it with the results for private networks. | ||
if [[ -n "${FILTER_BY_OWNER:-}" ]]; then | ||
metrics_url="${metrics_url}&var-filter=network_owner%7C%3D%7C${FILTER_BY_OWNER}" | ||
fi | ||
|
||
echo "::notice links::metrics ${metrics_url}" |
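The link construction above can be sketched as a reusable function. This is only an illustration and not part of the PR: the name `build_metrics_url` and its positional arguments are hypothetical, the example URL is a placeholder, and the sketch assumes the base URL already carries a query string, as the script's use of `&` implies. Grafana's `from` and `to` parameters take epoch milliseconds, hence the multiplication by 1000.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build a Grafana link for a monitoring window.
# Args: base_url job_id from_seconds period_seconds [owner]
build_metrics_url() {
  local base_url="$1" job_id="$2" from_s="$3" period_s="$4" owner="${5:-}"
  local to_s=$((from_s + period_s))
  # Convert seconds to milliseconds for the from/to parameters.
  # %7C%3D%7C is the URL-encoded form of "|=|".
  local url="${base_url}&var-filter=gh_job_id%7C%3D%7C${job_id}&from=$((from_s * 1000))&to=$((to_s * 1000))"
  # Optionally restrict the view to a single network owner.
  if [[ -n "${owner}" ]]; then
    url="${url}&var-filter=network_owner%7C%3D%7C${owner}"
  fi
  echo "${url}"
}

# Example with placeholder values (the URL is not a real dashboard).
build_metrics_url "https://grafana.example/d/abc?orgId=1" "12345" 1700000000 900
```

Multiplying rather than appending `000` gives the same result for whole-second timestamps, but makes the unit conversion explicit.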
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
#!/usr/bin/env bash | ||
|
||
set -euo pipefail | ||
|
||
# Starts a prometheus instance in agent-mode, forwarding to a central | ||
# instance. Intended to enable metrics collection from temporary networks running | ||
# locally and in CI. | ||
# | ||
# The prometheus instance will remain running in the background and will forward | ||
# metrics to the central instance for all tmpnet networks. | ||
# | ||
# To stop it: | ||
# | ||
# $ kill -9 `cat ~/.tmpnet/prometheus/run.pid` && rm ~/.tmpnet/prometheus/run.pid | ||
# | ||
|
||
# e.g., | ||
# PROMETHEUS_ID=<id> PROMETHEUS_PASSWORD=<password> ./scripts/run_prometheus.sh | ||
if ! [[ "$0" =~ scripts/run_prometheus.sh ]]; then | ||
echo "must be run from repository root" | ||
exit 255 | ||
fi | ||
|
||
PROMETHEUS_WORKING_DIR="${HOME}/.tmpnet/prometheus" | ||
PIDFILE="${PROMETHEUS_WORKING_DIR}"/run.pid | ||
|
||
# First check if an agent-mode prometheus is already running. A single instance can collect | ||
# metrics from all local temporary networks. | ||
if pgrep --pidfile="${PIDFILE}" -f 'prometheus.*enable-feature=agent' &> /dev/null; then | ||
echo "prometheus is already running locally with --enable-feature=agent" | ||
exit 0 | ||
fi | ||
|
||
PROMETHEUS_URL="${PROMETHEUS_URL:-https://prometheus-experimental.avax-dev.network}" | ||
if [[ -z "${PROMETHEUS_URL}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_URL" | ||
exit 1 | ||
fi | ||
|
||
PROMETHEUS_ID="${PROMETHEUS_ID:-}" | ||
if [[ -z "${PROMETHEUS_ID}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_ID" | ||
exit 1 | ||
fi | ||
|
||
PROMETHEUS_PASSWORD="${PROMETHEUS_PASSWORD:-}" | ||
if [[ -z "${PROMETHEUS_PASSWORD}" ]]; then | ||
echo "Please provide a value for PROMETHEUS_PASSWORD" | ||
exit 1 | ||
fi | ||
|
||
# This was the LTS version when this script was written. Probably not | ||
# much reason to update it unless something breaks since the usage | ||
# here is only to collect metrics from temporary networks. | ||
VERSION="2.45.3" | ||
|
||
# Ensure the prometheus command is locally available | ||
CMD=prometheus | ||
if ! command -v "${CMD}" &> /dev/null; then | ||
# Try to use a local version | ||
CMD="${PWD}/bin/prometheus" | ||
if ! command -v "${CMD}" &> /dev/null; then | ||
echo "prometheus not found, attempting to install..." | ||
|
||
# Determine the arch | ||
if which sw_vers &> /dev/null; then | ||
echo "on macos, only amd64 binaries are available so rosetta is required on apple silicon machines." | ||
echo "to avoid using rosetta, install via homebrew: brew install prometheus" | ||
DIST=darwin | ||
else | ||
ARCH="$(uname -m)" | ||
if [[ "${ARCH}" != "x86_64" ]]; then | ||
echo "on linux, only amd64 binaries are available. manual installation of prometheus is required." | ||
exit 1 | ||
else | ||
DIST="linux" | ||
fi | ||
fi | ||
|
||
# Install the specified release | ||
PROMETHEUS_FILE="prometheus-${VERSION}.${DIST}-amd64" | ||
URL="https://github.com/prometheus/prometheus/releases/download/v${VERSION}/${PROMETHEUS_FILE}.tar.gz" | ||
curl -s -L "${URL}" | tar zxv -C /tmp > /dev/null | ||
mkdir -p "$(dirname "${CMD}")" | ||
cp /tmp/"${PROMETHEUS_FILE}/prometheus" "${CMD}" | ||
fi | ||
fi | ||
|
||
# Configure prometheus | ||
FILE_SD_PATH="${PROMETHEUS_WORKING_DIR}/file_sd_configs" | ||
mkdir -p "${FILE_SD_PATH}" | ||
|
||
echo "writing configuration..." | ||
cat >"${PROMETHEUS_WORKING_DIR}"/prometheus.yaml <<EOL | ||
# my global config | ||
global: | ||
  # Make sure this value takes into account the network-shutdown-delay in tests/fixture/e2e/env.go | ||
  scrape_interval: 10s # Default is every 1 minute. | ||
  evaluation_interval: 10s # The default is every 1 minute. | ||
  scrape_timeout: 5s # The default is 10s. | ||
|
||
scrape_configs: | ||
  - job_name: "avalanchego" | ||
    metrics_path: "/ext/metrics" | ||
    file_sd_configs: | ||
      - files: | ||
          - '${FILE_SD_PATH}/*.json' | ||
|
||
remote_write: | ||
  - url: "${PROMETHEUS_URL}/api/v1/write" | ||
    basic_auth: | ||
      username: "${PROMETHEUS_ID}" | ||
      password: "${PROMETHEUS_PASSWORD}" | ||
EOL | ||
|
||
echo "starting prometheus..." | ||
cd "${PROMETHEUS_WORKING_DIR}" | ||
nohup "${CMD}" --config.file=prometheus.yaml --web.listen-address=localhost:0 --enable-feature=agent > prometheus.log 2>&1 & | ||
echo $! > "${PIDFILE}" | ||
echo "running with pid $(cat "${PIDFILE}")" |
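The `file_sd_configs` entry in the configuration above makes prometheus scrape whatever targets appear in JSON files under `${FILE_SD_PATH}`. The code that writes those files is not shown in this diff, so the following is only a sketch of the standard Prometheus file-based service discovery format. The function name, file name, target address, and label value are hypothetical; the `network_owner` label name is assumed from the Grafana filter used by the link script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write one file_sd target file.
# Args: sd_dir name target owner
write_sd_config() {
  local sd_dir="$1" name="$2" target="$3" owner="$4"
  mkdir -p "${sd_dir}"
  # Write to a temp file and rename it, so a reader never sees a partial file.
  # The temp name does not end in .json, so the '*.json' glob ignores it.
  local tmp
  tmp="$(mktemp "${sd_dir}/.${name}.XXXXXX")"
  cat > "${tmp}" <<EOF
[
  {
    "targets": ["${target}"],
    "labels": {"network_owner": "${owner}"}
  }
]
EOF
  mv "${tmp}" "${sd_dir}/${name}.json"
}

# Example using a throwaway directory rather than ~/.tmpnet/prometheus.
SD_DIR="$(mktemp -d)/file_sd_configs"
write_sd_config "${SD_DIR}" "node1" "127.0.0.1:9650" "alice"
cat "${SD_DIR}/node1.json"
```

Removing a node's file from the directory would, under this scheme, stop prometheus from scraping it on the next discovery refresh.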
It probably makes sense to simplify this so that the test scripts optionally start prometheus and promtail internally. Having this many steps and env vars to worry about does not seem ideal, and each job in all our repos that wants to collect metrics will need similar configuration.
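One way to read this suggestion is that the test script itself decides whether to start the collectors, keyed on whether credentials are present, so individual CI jobs need no extra steps. A rough sketch of that idea follows; `maybe_start_collector` is a hypothetical name, `echo` stands in for the real start scripts, and the eventual change might look quite different.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper: run the given start command only when credentials exist.
# Args: id password command...
maybe_start_collector() {
  local id="$1" password="$2"
  shift 2
  if [[ -z "${id}" || -z "${password}" ]]; then
    # No credentials: collection is optional, so skip without failing the job.
    echo "skipped"
    return 0
  fi
  "$@"
}

# Examples; `echo` stands in for ./scripts/run_prometheus.sh.
maybe_start_collector "" "" echo "started"
maybe_start_collector "my-id" "my-password" echo "started"
```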