51 changes: 28 additions & 23 deletions .github/actions/run-monitored-tmpnet-cmd/action.yml
@@ -41,26 +41,16 @@ inputs:
runs:
using: composite
steps:
# Ensure promtail and prometheus are available
- name: Install nix
uses: ./.github/actions/install-nix
- name: Start prometheus
# Only run for the original repo; a forked repo won't have access to the monitoring credentials
if: (inputs.prometheus_username != '')
shell: bash
# Assumes calling project has a nix flake that ensures a compatible prometheus
run: nix develop --impure --command bash -x ./scripts/run_prometheus.sh
env:
PROMETHEUS_USERNAME: ${{ inputs.prometheus_username }}
PROMETHEUS_PASSWORD: ${{ inputs.prometheus_password }}
- name: Start promtail
if: (inputs.prometheus_username != '')
# - Ensure promtail and prometheus are available
# - Avoid using the install-nix custom action since a relative
# path wouldn't be resolvable from other repos and an absolute
# path would require setting a version.
- uses: cachix/install-nix-action@v30
with:
github_access_token: ${{ inputs.github_token }}
- run: nix develop --command echo "dependencies installed"
shell: bash
# Assumes calling project has a nix flake that ensures a compatible promtail
run: nix develop --impure --command bash -x ./scripts/run_promtail.sh
env:
LOKI_USERNAME: ${{ inputs.loki_username }}
LOKI_PASSWORD: ${{ inputs.loki_password }}
# TODO(marun) Stop emitting this annotation so that any annotation that appears can be assumed to be actionable
Collaborator

What annotation is this referring to?

Contributor Author

The script that this step calls creates a GitHub Actions 'annotation' that appears on the summary page of the run. I think it does more harm than good, though, since the only other annotations are typically deprecation warnings from actions, which we need to notice and respond to.

- name: Notify of metrics availability
if: (inputs.prometheus_username != '')
shell: bash
@@ -71,17 +61,32 @@ runs:
FILTER_BY_OWNER: ${{ inputs.filter_by_owner }}
- name: Run command
shell: bash
run: ${{ inputs.run_env }} ${{ inputs.run }}
# --impure ensures the env vars are accessible to the command
run: ${{ inputs.run_env }} nix develop --impure --command bash -x ${{ inputs.run }}
env:
TMPNET_DELAY_NETWORK_SHUTDOWN: true # Ensure shutdown waits for a final metrics scrape
TMPNET_START_COLLECTORS: true
LOKI_USERNAME: ${{ inputs.loki_username }}
LOKI_PASSWORD: ${{ inputs.loki_password }}
PROMETHEUS_USERNAME: ${{ inputs.prometheus_username }}
PROMETHEUS_PASSWORD: ${{ inputs.prometheus_password }}
GH_REPO: ${{ inputs.repository_owner }}/${{ inputs.repository_name }}
GH_WORKFLOW: ${{ inputs.workflow }}
GH_RUN_ID: ${{ inputs.run_id }}
GH_RUN_NUMBER: ${{ inputs.run_number }}
GH_RUN_ATTEMPT: ${{ inputs.run_attempt }}
GH_JOB_ID: ${{ inputs.job }}
- name: Upload tmpnet network dir
uses: ./.github/actions/upload-tmpnet-artifact
# This step is duplicated from upload-tmpnet-artifact for the same
# reason as the nix installation. There doesn't appear to be an
# easy way to compose custom actions for use by other repos
# without running into versioning issues.
- name: Upload tmpnet data
uses: actions/upload-artifact@v4
if: always()
with:
name: ${{ inputs.artifact_prefix }}-tmpnet-data
path: |
~/.tmpnet/networks
~/.tmpnet/prometheus/prometheus.log
~/.tmpnet/promtail/promtail.log
if-no-files-found: error
# TODO(marun) Check that collection is working by querying prometheus and loki with the GH_* labels above
13 changes: 13 additions & 0 deletions bin/tmpnetctl
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

set -euo pipefail

# Ensure the go command is run from the root of the repository
AVALANCHE_PATH=$(cd "$( dirname "${BASH_SOURCE[0]}" )"; cd .. && pwd )
cd "${AVALANCHE_PATH}"

# Build if needed
if [[ ! -f ./build/tmpnetctl ]]; then
  ./scripts/build_tmpnetctl.sh
fi
# Forward all arguments so subcommands such as start-collectors reach the binary
./build/tmpnetctl "${@}"
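
The README added in this PR invokes the wrapper with subcommands such as `tmpnetctl start-collectors`, so a wrapper of this kind has to pass its arguments through to the built binary. The sketch below illustrates the difference between forwarding with `"${@}"` and dropping the arguments. `fake_binary`, `wrapper_forwarding` and `wrapper_dropping` are hypothetical names used only for illustration; `fake_binary` stands in for `./build/tmpnetctl`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical stand-in for ./build/tmpnetctl: reports how many arguments it received.
fake_binary() {
  echo "argc=$#"
}

# Forwards every argument unchanged; quoting "${@}" keeps arguments with spaces intact.
wrapper_forwarding() {
  fake_binary "${@}"
}

# Drops its arguments, so a subcommand never reaches the binary.
wrapper_dropping() {
  fake_binary
}

wrapper_forwarding start-collectors --flag "value with spaces"  # prints argc=3
wrapper_dropping start-collectors --flag "value with spaces"    # prints argc=0
```
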
93 changes: 0 additions & 93 deletions scripts/run_prometheus.sh

This file was deleted.

91 changes: 0 additions & 91 deletions scripts/run_promtail.sh

This file was deleted.

29 changes: 29 additions & 0 deletions tests/e2e/README.md
@@ -107,3 +107,32 @@ these bootstrap checks during development, set the
```bash
E2E_SKIP_BOOTSTRAP_CHECKS=1 ./bin/ginkgo -v ./tests/e2e ...
```

## Monitoring

Logs and metrics from the temporary networks used for e2e testing
can be collected in either of two ways:

- Supplying `--start-collectors` as an argument to the test suite
- Starting collectors in advance of a test run with
  `tmpnetctl start-collectors`

Both methods require:

- Auth credentials to be supplied as env vars:
  - `PROMETHEUS_USERNAME`
  - `PROMETHEUS_PASSWORD`
  - `LOKI_USERNAME`
  - `LOKI_PASSWORD`
- Binaries for promtail and prometheus to be available in the path
  - Starting a development shell with `nix develop` is one way to
    ensure this and requires the [installation of
    nix](https://github.com/DeterminateSystems/nix-installer?tab=readme-ov-file#install-nix).
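
The requirements above can be checked before starting the collectors. The following is a minimal sketch, not part of this repository: `check_collector_prereqs` is a hypothetical helper, and it assumes the binaries are named `prometheus` and `promtail` on the path.

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints one line per missing prerequisite and
# returns non-zero if anything is missing.
check_collector_prereqs() {
  local missing=0
  local name

  # Credentials listed above; printenv only sees exported variables,
  # which is what a child process such as a collector would inherit.
  for name in PROMETHEUS_USERNAME PROMETHEUS_PASSWORD LOKI_USERNAME LOKI_PASSWORD; do
    if [ -z "$(printenv "${name}")" ]; then
      echo "missing env var: ${name}"
      missing=1
    fi
  done

  # Binaries that must be resolvable via PATH (names are an assumption).
  for name in prometheus promtail; do
    if ! command -v "${name}" > /dev/null 2>&1; then
      echo "missing binary: ${name}"
      missing=1
    fi
  done

  return "${missing}"
}
```

With the function loaded into the current shell, a guarded start could look like `check_collector_prereqs && ./bin/tmpnetctl start-collectors`.
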

Once started, the collectors will continue to run in the background
until stopped by `tmpnetctl stop-collectors`.

The results of collection will be viewable at
https://grafana-poc.avax-dev.network.

For more detail, see the [tmpnet docs](../tmpnet/README.md#monitoring).
10 changes: 10 additions & 0 deletions tests/fixture/e2e/env.go
@@ -79,6 +79,11 @@ func (te *TestEnvironment) Marshal() []byte {
func NewTestEnvironment(tc tests.TestContext, flagVars *FlagVars, desiredNetwork *tmpnet.Network) *TestEnvironment {
require := require.New(tc)

// Start collectors for any command but stop
if flagVars.StartCollectors() && !flagVars.StopNetwork() {
require.NoError(tmpnet.StartCollectors(tc.DefaultContext(), tc.Log()))
}

var network *tmpnet.Network
// Need to load the network if it is being stopped or reused
if flagVars.StopNetwork() || flagVars.ReuseNetwork() {
@@ -147,6 +152,11 @@ func NewTestEnvironment(tc tests.TestContext, flagVars *FlagVars, desiredNetwork
)
}

// Once one or more nodes are running it should be safe to wait for promtail to report readiness
if flagVars.StartCollectors() {
require.NoError(tmpnet.WaitForPromtailReadiness(tc.DefaultContext(), tc.Log()))
}

if flagVars.StartNetwork() {
os.Exit(0)
}