Add Heartbeat workflow for health checks and sync#33
Conversation
This workflow checks the health of specified domains and syncs agent instructions to the central dashboard.
❌ Deploy Preview for nicholai failed.
|
There was a problem hiding this comment.
Pull request overview
Adds a scheduled “Heartbeat” job intended to (1) sync agent instructions to a central dashboard and (2) perform periodic health checks against two domains, alerting an external service on failures.
Changes:
- Introduces a new scheduled/manual workflow definition for instruction sync + dual-domain health checks.
- Posts sync/alert requests to an external Railway-hosted service.
- Fails the job when either monitored domain does not return HTTP 200.
| name: Dual-Domain Health & Agent Sync | ||
|
|
||
| on: | ||
| push: | ||
| branches: [ "main" ] | ||
| schedule: | ||
| - cron: '0 * * * *' # Automated health check every hour | ||
| workflow_dispatch: # Allows you to manually trigger a sync | ||
|
|
||
| jobs: | ||
| health-and-sync: | ||
| runs-on: ubuntu-latest |
There was a problem hiding this comment.
This file contains a GitHub Actions workflow, but it’s being added under .github/instructions/ with a .instructions.md extension. GitHub Actions will not load workflows from this path, so the health check/sync job will never run. Move this YAML to .github/workflows/ (e.g. heartbeat.yml) if the intent is to create an actual workflow.
| curl -X POST https://your-railway-app.up.railway.app/update-agent-logic \ | ||
| -H "Authorization: Bearer ${{ secrets.GOVERNANCE_TOKEN }}" \ | ||
| -F "instructions=@Agent.Instructions.pdf" |
There was a problem hiding this comment.
The sync step uploads Agent.Instructions.pdf, but that file doesn’t exist in this repository, so this step will fail every run. Either add the file to the repo/artifact, or update the step to point at the actual instructions source (and consider failing fast with a clear error if it’s missing).
| curl -X POST https://your-railway-app.up.railway.app/update-agent-logic \ | |
| -H "Authorization: Bearer ${{ secrets.GOVERNANCE_TOKEN }}" \ | |
| -F "instructions=@Agent.Instructions.pdf" | |
| INSTRUCTIONS_FILE=".github/instructions/Heartbeat.instructions.md" | |
| if [ ! -f "$INSTRUCTIONS_FILE" ]; then | |
| echo "Expected instructions file not found: $INSTRUCTIONS_FILE" | |
| exit 1 | |
| fi | |
| curl --fail -X POST https://your-railway-app.up.railway.app/update-agent-logic \ | |
| -H "Authorization: Bearer ${{ secrets.GOVERNANCE_TOKEN }}" \ | |
| -F "instructions=@${INSTRUCTIONS_FILE}" |
| status_code=$(curl -o /dev/null -s -w "%{http_code}" https://nicholai.org) | ||
| if [ $status_code -ne 200 ]; then | ||
| echo "Arcade (nicholai.org) is down! Status: $status_code" | ||
| curl -X POST https://your-railway-app.up.railway.app/alert \ | ||
| -d '{"domain": "nicholai.org", "status": "down", "contact": "nicholaimadias@gmail.com"}' | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
In GitHub Actions, run: steps execute with bash -e, so if curl fails due to DNS/timeout/etc, this step will exit immediately and skip the alert POST. To ensure alerts fire on network failures, capture the curl exit code (or default the status code to 000) and branch on that instead of letting set -e abort the script.
| status_code=$(curl -o /dev/null -s -w "%{http_code}" https://amazinggracehomeliving.com) | ||
| if [ $status_code -ne 200 ]; then | ||
| echo "Listings (Amazing Grace) is down! Status: $status_code" | ||
| curl -X POST https://your-railway-app.up.railway.app/alert \ | ||
| -d '{"domain": "amazinggracehomeliving.com", "status": "down", "contact": "nicholaimadias@gmail.com"}' | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
Same issue as above: if curl fails, the step will terminate before reaching the alert POST due to bash -e. Handle curl failures explicitly so the workflow reliably sends an alert when the domain is unreachable.
| curl -X POST https://your-railway-app.up.railway.app/alert \ | ||
| -d '{"domain": "nicholai.org", "status": "down", "contact": "nicholaimadias@gmail.com"}' | ||
| exit 1 |
There was a problem hiding this comment.
The alert payload includes a hard-coded personal email address and the workflow hard-codes the external alerting endpoint. If this repo is public, this leaks PII and makes it easy to accidentally notify the wrong contact. Prefer storing contact info and the alert base URL in repository/environment secrets or variables, and avoid committing personal contact details into the workflow.
| curl -X POST https://your-railway-app.up.railway.app/alert \ | ||
| -d '{"domain": "nicholai.org", "status": "down", "contact": "nicholaimadias@gmail.com"}' |
There was a problem hiding this comment.
The alert request sends a JSON string via curl -d but does not set Content-Type: application/json. Many endpoints will reject or mis-parse this. Add the appropriate Content-Type header (and consider -sS / timeouts) so failures are surfaced clearly and the payload is parsed as intended.
| curl -X POST https://your-railway-app.up.railway.app/alert \ | ||
| -d '{"domain": "amazinggracehomeliving.com", "status": "down", "contact": "nicholaimadias@gmail.com"}' |
There was a problem hiding this comment.
Same JSON alert issue here: the request body is JSON, but the Content-Type: application/json header isn’t set, which can cause the alerting service to misinterpret the payload. Set the correct content type and add basic curl hardening (timeouts / -sS) to improve reliability.
This workflow checks the health of specified domains and syncs agent instructions to the central dashboard.