764 changes: 320 additions & 444 deletions .github/workflows/ci-coach.lock.yml

Large diffs are not rendered by default.

552 changes: 54 additions & 498 deletions .github/workflows/ci-coach.md

Large diffs are not rendered by default.

936 changes: 370 additions & 566 deletions .github/workflows/copilot-session-insights.lock.yml

Large diffs are not rendered by default.

385 changes: 20 additions & 365 deletions .github/workflows/copilot-session-insights.md

Large diffs are not rendered by default.

173 changes: 173 additions & 0 deletions .github/workflows/shared/ci-data-analysis.md
@@ -0,0 +1,173 @@
---
# CI Data Analysis
# Shared module for analyzing CI run data
#
# Usage:
#   imports:
#     - shared/ci-data-analysis.md
#
# This import provides:
#   - Pre-downloaded CI runs and artifacts
#   - A built and tested project
#   - Collected performance metrics

imports:
  - shared/jqschema.md

tools:
  cache-memory: true
  bash: ["*"]

steps:
  - name: Download CI workflow runs from last 7 days
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: |
      # Download workflow runs for the ci workflow
      gh run list --repo ${{ github.repository }} --workflow=ci.yml --limit 100 --json databaseId,status,conclusion,createdAt,updatedAt,displayTitle,headBranch,event,url,workflowDatabaseId,number > /tmp/ci-runs.json

      # Create directory for artifacts
      mkdir -p /tmp/ci-artifacts

      # Download artifacts from recent runs (last 5 successful runs)
      echo "Downloading artifacts from recent CI runs..."
      gh run list --repo ${{ github.repository }} --workflow=ci.yml --status success --limit 5 --json databaseId | jq -r '.[].databaseId' | while read -r run_id; do
        echo "Processing run $run_id"
        gh run download "$run_id" --repo ${{ github.repository }} --dir "/tmp/ci-artifacts/$run_id" 2>/dev/null || echo "No artifacts for run $run_id"
      done

      echo "CI runs data saved to /tmp/ci-runs.json"
      echo "Artifacts saved to /tmp/ci-artifacts/"

  - name: Set up Node.js
    uses: actions/setup-node@v6
    with:
      node-version: "24"
      cache: npm
      cache-dependency-path: actions/setup/js/package-lock.json

  - name: Set up Go
    uses: actions/setup-go@v6
    with:
      go-version-file: go.mod
      cache: true

  - name: Install dev dependencies
    run: make deps-dev

  - name: Run linter
    run: make lint

  - name: Lint error messages
    run: make lint-errors

  - name: Install npm dependencies
    run: npm ci
    working-directory: ./actions/setup/js

  - name: Build code
    run: make build

  - name: Rebuild lock files
    env:
      GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    run: make recompile

  - name: Run unit tests
    continue-on-error: true
    run: |
      mkdir -p /tmp/gh-aw
      go test -v -json -count=1 -timeout=3m -tags '!integration' -run='^Test' ./... | tee /tmp/gh-aw/test-results.json
---

# CI Data Analysis

Pre-downloaded CI run data and artifacts are available for analysis:

## Available Data

1. **CI Runs**: `/tmp/ci-runs.json`
- Last 100 workflow runs with status, timing, and metadata

2. **Artifacts**: `/tmp/ci-artifacts/`
- Coverage reports and benchmark results from recent successful runs

3. **CI Configuration**: `.github/workflows/ci.yml`
- Current CI workflow configuration

4. **Cache Memory**: `/tmp/cache-memory/`
- Historical analysis data from previous runs

5. **Test Results**: `/tmp/gh-aw/test-results.json`
- JSON output from Go unit tests with performance and timing data

## Test Case Locations

Go test cases are located throughout the repository:
- **Command tests**: `./cmd/gh-aw/*_test.go`
- **Workflow tests**: `./pkg/workflow/*_test.go`
- **CLI tests**: `./pkg/cli/*_test.go`
- **Parser tests**: `./pkg/parser/*_test.go`
- **Campaign tests**: `./pkg/campaign/*_test.go`
- **Other package tests**: Various `./pkg/*/*_test.go` files

## Environment Setup

The workflow has already completed:
- ✅ **Linting**: Dev dependencies installed, linters run successfully
- ✅ **Building**: Code built with `make build`, lock files compiled with `make recompile`
- ✅ **Testing**: Unit tests run (with performance data collected in JSON format)

This means you can:
- Make changes to code or configuration files
- Validate changes immediately by running `make lint`, `make build`, or `make test-unit`
- Ensure proposed optimizations don't break functionality before creating a PR

## Analyzing Run Data

Parse the downloaded CI runs data:

```bash
# Analyze run data
cat /tmp/ci-runs.json | jq '{
  total_runs: length,
  by_status: (group_by(.status) | map({status: .[0].status, count: length})),
  by_conclusion: (group_by(.conclusion) | map({conclusion: .[0].conclusion, count: length})),
  by_branch: (group_by(.headBranch) | map({branch: .[0].headBranch, count: length})),
  by_event: (group_by(.event) | map({event: .[0].event, count: length}))
}'
```

**Metrics to extract:**
- Success rate per job
- Average duration per job
- Failure patterns (which jobs fail most often)
- Cache hit rates from step summaries
- Resource usage patterns
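
For run-level numbers, a minimal jq sketch (per-job figures require fetching job details with `gh run view --json jobs`; the field names below follow the `--json` flags used in the download step):

```bash
# Sketch: success rate and durations across the downloaded runs.
# updatedAt - createdAt approximates duration (includes queue time).
jq '
  map(. + {seconds: ((.updatedAt | fromdateiso8601) - (.createdAt | fromdateiso8601))})
  | {
      success_rate_pct: ((map(select(.conclusion == "success")) | length) * 100 / length),
      avg_seconds: ((map(.seconds) | add) / length),
      slowest: (sort_by(-.seconds) | .[0:5] | map({displayTitle, seconds}))
    }
' /tmp/ci-runs.json
```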

## Review Artifacts

Examine downloaded artifacts for insights:

```bash
# List downloaded artifacts
find /tmp/ci-artifacts -type f \( -name "*.txt" -o -name "*.html" -o -name "*.json" \)

# Analyze coverage reports if available
# Check benchmark results for performance trends
```
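
The test-results JSON is also worth mining here; a hedged sketch for surfacing the slowest tests (event fields follow the `go test -json` format):

```bash
# Sketch: ten slowest tests from the go test -json event stream.
# Per-test pass/fail events carry Elapsed in seconds; package-level
# summary events have a null Test field and are filtered out.
jq -s '
  map(select((.Action == "pass" or .Action == "fail") and .Test != null))
  | sort_by(-.Elapsed)
  | .[0:10]
  | map({test: .Test, package: .Package, seconds: .Elapsed})
' /tmp/gh-aw/test-results.json
```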

## Historical Context

Check cache memory for previous analyses:

```bash
# Read previous optimization recommendations
if [ -f /tmp/cache-memory/ci-coach/last-analysis.json ]; then
  cat /tmp/cache-memory/ci-coach/last-analysis.json
fi

# Check if previous recommendations were implemented
# Compare current metrics with historical baselines
```
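
To keep that baseline useful, a sketch for writing the current summary back (the `ci-coach` path mirrors the read above; the summary fields are illustrative):

```bash
# Sketch: persist a summary of this analysis for the next run.
mkdir -p /tmp/cache-memory/ci-coach
jq '{
  analyzed_at: (now | todate),
  runs_analyzed: length,
  success_rate_pct: ((map(select(.conclusion == "success")) | length) * 100 / length)
}' /tmp/ci-runs.json > /tmp/cache-memory/ci-coach/last-analysis.json
```
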
193 changes: 193 additions & 0 deletions .github/workflows/shared/ci-optimization-strategies.md
@@ -0,0 +1,193 @@
---
# CI Optimization Analysis Strategies
# Reusable analysis patterns for CI optimization workflows
#
# Usage:
#   imports:
#     - shared/ci-optimization-strategies.md
#
# This import provides:
#   - Test coverage analysis patterns
#   - Performance bottleneck identification
#   - Matrix strategy optimization techniques
---

# CI Optimization Analysis Strategies

Comprehensive strategies for analyzing CI workflows to identify optimization opportunities.

## Phase 1: CI Configuration Study

Read and understand the current CI workflow structure:

```bash
# Read the CI workflow configuration
cat .github/workflows/ci.yml

# Understand the job structure
# - lint (runs first)
# - test (depends on lint)
# - integration (depends on test, matrix strategy)
# - build (depends on lint)
# etc.
```

**Key aspects to analyze:**
- Job dependencies and parallelization opportunities
- Cache usage patterns (Go cache, Node cache)
- Matrix strategy effectiveness
- Timeout configurations
- Concurrency groups
- Artifact retention policies
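
A quick way to eyeball most of these without a YAML parser, assuming ci.yml uses the usual two-space indentation for job ids:

```bash
# Sketch: job ids, dependencies, timeouts, and concurrency at a glance.
# Rough by design: two-space-indented keys can also match non-job sections.
grep -nE '^  [A-Za-z0-9_-]+:|needs:|timeout-minutes:|concurrency:' .github/workflows/ci.yml
```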

## Phase 2: Test Coverage Analysis

### Critical: Ensure ALL Tests are Executed

**Step 1: Get complete list of all tests**
```bash
# List all test functions in the repository
go test -list='^Test' ./... 2>&1 | grep -E '^Test' > /tmp/all-tests.txt

# Count total tests
TOTAL_TESTS=$(wc -l < /tmp/all-tests.txt)
echo "Total tests found: $TOTAL_TESTS"
```

**Step 2: Analyze unit test coverage**
```bash
# Unit tests run all non-integration tests
# Verify the test job's command captures all non-integration tests
# Current: go test -v -parallel=8 -timeout=3m -tags '!integration' -run='^Test' ./...

# Get list of integration tests (tests with integration build tag)
grep -r "//go:build integration" --include="*_test.go" . | cut -d: -f1 | sort -u > /tmp/integration-test-files.txt

# Estimate number of integration tests
echo "Files with integration tests:"
wc -l < /tmp/integration-test-files.txt
```

**Step 3: Analyze integration test matrix coverage**
```bash
# The integration job has a matrix with specific patterns
# Each matrix entry targets specific packages and test patterns

# CRITICAL CHECK: Are there tests that don't match ANY pattern?

# Extract all integration test patterns from ci.yml
grep 'pattern:' .github/workflows/ci.yml > /tmp/matrix-patterns.txt

# Check for catch-all groups (empty patterns) and capture their names
grep -B 2 'pattern: ""' .github/workflows/ci.yml | grep 'name:' > /tmp/catchall-groups.txt
```

**Step 4: Identify coverage gaps**
```bash
# Check if each package with tests is covered by at least one matrix group
# Compare packages with tests vs. packages in CI matrix
# Identify any "orphaned" tests not executed by any job
```
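
One concrete sketch for that comparison, assuming matrix entries list package paths verbatim (wildcard entries such as `./pkg/...` would need extra handling):

```bash
# Sketch: packages that contain tests but appear in no matrix entry.
MOD=$(go list -m)
go list -f '{{if or .TestGoFiles .XTestGoFiles}}{{.ImportPath}}{{end}}' ./... \
  | sed '/^$/d' | sed "s|^$MOD|.|" > /tmp/pkgs-with-tests.txt
grep -oE '\./(pkg|cmd)/[A-Za-z0-9_/-]+' .github/workflows/ci.yml | sort -u > /tmp/pkgs-in-matrix.txt
echo "Packages with tests but no matrix entry:"
grep -vxFf /tmp/pkgs-in-matrix.txt /tmp/pkgs-with-tests.txt || true
```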

**Required Action if Gaps Found:**
If any tests are not covered by the CI matrix, propose adding:
1. **Catch-all matrix groups** for packages with specific patterns but no catch-all
2. **New matrix entries** for packages not in the matrix at all

Example fix for missing catch-all (add to `.github/workflows/ci.yml`):
```yaml
# Add to the integration job's matrix.include section:
- name: "CLI Other" # Catch-all for tests not matched by specific patterns
packages: "./pkg/cli"
pattern: "" # Empty pattern runs all remaining tests
```

## Phase 3: Test Performance Optimization

### A. Test Splitting Analysis
- Review current test matrix configuration
- Analyze if test groups are balanced in execution time
- Suggest rebalancing to minimize longest-running group
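
Per-package timing for that rebalancing can be approximated from the unit-test JSON, as in this sketch (fields per `go test -json`):

```bash
# Sketch: total test seconds per package, input for rebalancing groups.
jq -s '
  map(select((.Action == "pass" or .Action == "fail") and .Test != null))
  | group_by(.Package)
  | map({package: .[0].Package, seconds: (map(.Elapsed // 0) | add)})
  | sort_by(-.seconds)
' /tmp/gh-aw/test-results.json
```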

### B. Test Parallelization Within Jobs
- Check if tests run sequentially when they could run in parallel
- Suggest using `go test -parallel=N` to increase parallelism
- Analyze if `-count=1` is necessary for all tests
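
A minimal before/after check on one slow package (the package path is a placeholder; note that `-parallel` only helps tests that call `t.Parallel()`):

```bash
# Sketch: does extra parallelism actually help this package?
time go test -count=1 -parallel=4 ./pkg/workflow
time go test -count=1 -parallel=16 ./pkg/workflow
```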

### C. Test Selection Optimization
- Suggest path-based test filtering to skip irrelevant tests
- Recommend running only affected tests for non-main branch pushes
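
A sketch of path-based selection, assuming origin/main as the comparison base (dependent packages are not included, so this complements rather than replaces a full run on main):

```bash
# Sketch: test only packages touched relative to origin/main.
CHANGED=$(git diff --name-only origin/main... -- '*.go' | xargs -rn1 dirname | sort -u | sed 's|^|./|')
[ -n "$CHANGED" ] && go test $CHANGED
```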

### D. Test Timeout Optimization
- Review current timeout settings
- Check if timeouts are too conservative or too tight
- Suggest adjusting per-job timeouts based on historical data

### E. Test Dependencies Analysis
- Examine test job dependencies
- Suggest removing unnecessary dependencies to enable more parallelism

### F. Selective Test Execution
- Suggest running expensive tests only on main branch or on-demand
- Recommend running security scans conditionally
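
A hedged YAML sketch of such a gate (the job name and make target are illustrative, not taken from ci.yml):

```yaml
integration-slow:
  # Sketch: run the expensive suite only on main pushes or manual dispatch.
  if: github.ref == 'refs/heads/main' || github.event_name == 'workflow_dispatch'
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: make test-integration # illustrative target
```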

### G. Matrix Strategy Optimization
- Analyze if all integration test matrix jobs are necessary
- Check if some matrix jobs could be combined or run conditionally
- Suggest reducing matrix size for PR builds vs. main branch builds

## Phase 4: Resource Optimization

### Job Parallelization
- Identify jobs that could run in parallel but currently don't
- Restructure dependencies to reduce critical path
- Example: Could some test jobs start earlier?

### Cache Optimization
- Analyze cache hit rates
- Suggest caching more aggressively (dependencies, build artifacts)
- Check if cache keys are properly scoped
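
For contrast, an explicit cache step sketch; the paths are Go's defaults, which `actions/setup-go` with `cache: true` already covers, so this applies only where that action isn't used:

```yaml
- uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/go/pkg/mod
    key: go-${{ runner.os }}-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      go-${{ runner.os }}-
```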

### Resource Right-Sizing
- Check if timeouts are set appropriately
- Evaluate if jobs could run on faster runners
- Review concurrency groups

### Artifact Management
- Check if retention days are optimal
- Identify unnecessary artifacts
- Example: Coverage reports only need 7 days retention

### Dependency Installation
- Check for redundant dependency installations
- Suggest using dependency caching more effectively
- Example: Sharing `node_modules` between jobs

## Phase 5: Cost-Benefit Analysis

For each potential optimization:
- **Impact**: How much time/cost savings?
- **Effort**: How difficult to implement?
- **Risk**: Could it break the build or miss issues?
- **Priority**: High/Medium/Low

## Optimization Categories

1. **Job Parallelization** - Reduce critical path
2. **Cache Optimization** - Improve cache hit rates
3. **Test Suite Restructuring** - Balance test execution
4. **Resource Right-Sizing** - Optimize timeouts and runners
5. **Artifact Management** - Reduce unnecessary uploads
6. **Matrix Strategy** - Balance breadth vs. speed
7. **Conditional Execution** - Skip unnecessary jobs
8. **Dependency Installation** - Reduce redundant work

## Expected Metrics

Track these metrics before and after optimization:
- Total CI duration (wall clock time)
- Critical path duration
- Cache hit rates
- Test execution time
- Resource utilization
- Cost per CI run
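
Per-job baselines for the most recent run can be pulled with a sketch like this (job fields per `gh run view --json jobs`; assumes the run's jobs have completed):

```bash
# Sketch: per-job wall-clock durations for the newest downloaded run.
RUN_ID=$(jq -r '.[0].databaseId' /tmp/ci-runs.json)
gh run view "$RUN_ID" --repo "$GITHUB_REPOSITORY" --json jobs \
  | jq '.jobs
        | map({name, conclusion,
               seconds: ((.completedAt | fromdateiso8601) - (.startedAt | fromdateiso8601))})
        | sort_by(-.seconds)'
```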