Commit cb2690f

Add automated workflow for pid (#855)

* Set up automated test for PID controller
  Reset
  Update experiment graphs
  Delete existing image
  Done
  Update experiment graphs
  Update experiment graphs
  Refactor experiments folder and CI
  Update scope of commit
  Update experiment graphs
  Update experiment graphs
  Update Gemfile
  Update experiment graphs
  Move windup file into test folder
  Update experiment graphs
  CI should only commit main graphs
  Update experiment graphs
  New bot
  Fail fast
  Test
  Delete comment
  Update experiment graphs
  test
  Revert test
  Add the commit
  Test
  Test
* Rework CI check
  Done
  Final

1 parent 2cf2f23 commit cb2690f

82 files changed: +320 -242 lines changed

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+---
+name: "Automated Experiment Result Checker"
+
+# yamllint disable-line rule:truthy
+on:
+  pull_request:
+    types: [opened, reopened, synchronize]
+
+concurrency:
+  group: ${{ github.ref }}-automated-experiment-result-checker
+  cancel-in-progress: true
+
+permissions:
+  contents: write
+  pull-requests: write
+jobs:
+  automated-experiment-result-checker:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+        with:
+          ref: ${{ github.event.pull_request.head.sha }}
+          fetch-depth: 0
+
+      - name: Check for updated experiment result graphs
+        run: |
+          set -e
+          cd "$(git rev-parse --show-toplevel)"
+
+          # TODO: Include lower bound windup experiment once we have a way to make it run in a reasonable time.
+          # Find all PNGs, excluding those with "windup" in their filename
+          mapfile -t all_pngs < <(find experiments/results/main_graphs experiments/results/throughput_graphs experiments/results/duration_graphs -type f -name '*.png' ! -name '*windup*.png' | sort)
+
+          # Find all changed PNGs in the latest commit
+          mapfile -t changed_pngs < <(git diff --name-only --diff-filter=AM HEAD~1..HEAD | grep -E '^experiments/results/(main_graphs|throughput_graphs|duration_graphs)/.*\.png$' | grep -v windup | sort)
+
+          # Report any PNGs that are not updated in the latest commit
+          declare -a not_updated=()
+          for file in "${all_pngs[@]}"; do
+            if ! printf "%s\n" "${changed_pngs[@]}" | grep -qx "$file"; then
+              not_updated+=("$file")
+            fi
+          done
+
+          if [ ${#not_updated[@]} -gt 0 ]; then
+            echo "❌ The following result graph PNG files have NOT been updated in the latest commit:"
+            for f in "${not_updated[@]}"; do
+              echo " - $f"
+            done
+            echo ""
+            echo "Every commit must update all non-windup experiment result graphs. You may be missing updates."
+            echo "Run:"
+            echo ""
+            echo " cd experiments"
+            echo " bundle install"
+            echo " bundle exec ruby run_all_experiments.rb"
+            echo ""
+            echo "Commit the updated graphs to resolve this check."
+            exit 1
+          fi
+
+          echo "✅ All non-windup experiment result graphs are up to date for this commit!"
+
+
+
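
The check in this workflow is a set difference: every tracked graph PNG that exists in the tree, minus the tracked PNGs touched by the latest commit. The sketch below restates that logic in Ruby on plain arrays so it can be read and exercised without a git checkout. It is not part of the commit; `tracked_graph?`, `stale_graphs`, and the sample file names (including `lower_bound_windup.png`) are hypothetical. One simplification: the windup exclusion here is applied to the file name only, as the `find` call does, whereas the workflow's `grep -v windup` drops any changed path containing "windup".

```ruby
# Hypothetical Ruby restatement of the workflow's shell check (not from the commit).
GRAPH_DIRS = %w[main_graphs throughput_graphs duration_graphs].freeze
GRAPH_PATTERN = %r{\Aexperiments/results/(#{GRAPH_DIRS.join("|")})/.*\.png\z}

# A path is tracked when it is a PNG under one of the three result folders
# and its file name does not contain "windup".
def tracked_graph?(path)
  path.match?(GRAPH_PATTERN) && !File.basename(path).include?("windup")
end

# Returns the tracked graphs that exist but were not changed in the latest commit.
def stale_graphs(all_files, changed_files)
  expected = all_files.select { |f| tracked_graph?(f) }.sort
  updated = changed_files.select { |f| tracked_graph?(f) }
  expected - updated
end

# Sample data, invented for illustration.
all_files = [
  "experiments/results/main_graphs/sudden_error_spike_100.png",
  "experiments/results/duration_graphs/duration-sudden_error_spike_100.png",
  "experiments/results/main_graphs/lower_bound_windup.png",
  "experiments/README.md",
]
changed_files = ["experiments/results/main_graphs/sudden_error_spike_100.png"]

stale = stale_graphs(all_files, changed_files)
puts stale
# prints experiments/results/duration_graphs/duration-sudden_error_spike_100.png
```

A non-empty result corresponds to the `exit 1` branch of the workflow step; an empty result corresponds to the success message.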
Binary file not shown.
Binary file not shown.

experiments/example_output.png

-81.3 KB
Binary file not shown.

experiments/example_with_circuit_breaker.rb

Lines changed: 0 additions & 93 deletions
This file was deleted.

experiments/test_helpers.rb renamed to experiments/experiment_helpers.rb

Lines changed: 31 additions & 21 deletions
@@ -2,8 +2,9 @@
 
 module Semian
   module Experiments
-    # Test runner for circuit breaker experiments (both adaptive and classic)
+    # Experiment runner for circuit breaker experiments (both adaptive and classic)
     # Handles all the common logic: service creation, threading, monitoring, analysis, and visualization
+    require "fileutils"
     class DegradationPhase
       attr_reader :healthy, :error_rate, :latency
 

@@ -14,11 +15,11 @@ def initialize(healthy: nil, error_rate: nil, latency: nil)
       end
     end
 
-    class CircuitBreakerTestRunner
-      attr_reader :test_name, :resource_name, :degradation_phases, :phase_duration, :graph_title, :graph_filename, :service_count, :target_service
+    class CircuitBreakerExperimentRunner
+      attr_reader :experiment_name, :resource_name, :degradation_phases, :phase_duration, :graph_title, :graph_filename, :service_count, :target_service
 
       def initialize(
-        test_name:,
+        experiment_name:,
         resource_name:,
         degradation_phases:,
         phase_duration:,
@@ -32,18 +33,24 @@ def initialize(
         graph_bucket_size: nil,
         base_error_rate: nil
       )
-        @test_name = test_name
+        @experiment_name = experiment_name
         @resource_name = resource_name
         @degradation_phases = degradation_phases
         @phase_duration = phase_duration
         @graph_title = graph_title
         @semian_config = semian_config
         @is_adaptive = semian_config[:adaptive_circuit_breaker] == true
         @graph_filename = graph_filename || "#{resource_name}.png"
+        @main_results_path = File.join(File.dirname(__FILE__), "results/main_graphs")
+        @duration_results_path = File.join(File.dirname(__FILE__), "results/duration_graphs")
+        @throughput_results_path = File.join(File.dirname(__FILE__), "results/throughput_graphs")
+        FileUtils.mkdir_p(@main_results_path) unless File.directory?(@main_results_path)
+        FileUtils.mkdir_p(@duration_results_path) unless File.directory?(@duration_results_path)
+        FileUtils.mkdir_p(@throughput_results_path) unless File.directory?(@throughput_results_path)
         @num_threads = num_threads
         @requests_per_second_per_thread = requests_per_second_per_thread
         @x_axis_label_interval = x_axis_label_interval || phase_duration
-        @test_duration = degradation_phases.length * phase_duration
+        @experiment_duration = degradation_phases.length * phase_duration
         @service_count = service_count
         @target_service = nil
         @graph_bucket_size = graph_bucket_size || (@is_adaptive ? 10 : 1)
@@ -232,12 +239,12 @@ def subscribe_to_state_changes
       end
 
       def execute_phases
-        puts "\n=== #{@test_name} (ADAPTIVE) ==="
+        puts "\n=== #{@experiment_name} (ADAPTIVE) ==="
         puts "Error rate: #{@degradation_phases.map { |r| r.error_rate ? "#{(r.error_rate * 100).round(1)}%" : "N/A" }.join(" -> ")}"
         puts "Latency: #{@degradation_phases.map { |r| r.latency ? "#{(r.latency * 1000).round(1)}ms" : "N/A" }.join(" -> ")}"
         puts "Phase duration: #{@phase_duration} seconds (#{(@phase_duration / 60.0).round(1)} minutes) per phase"
-        puts "Duration: #{@test_duration} seconds (#{(@test_duration / 60.0).round(1)} minutes)"
-        puts "Starting test...\n"
+        puts "Duration: #{@experiment_duration} seconds (#{(@experiment_duration / 60.0).round(1)} minutes)"
+        puts "Starting experiment...\n"
 
         @start_time = Time.now
 
@@ -281,7 +288,7 @@ def wait_for_completion
       end
 
       def generate_analysis
-        puts "\n\n=== Test Complete ==="
+        puts "\n\n=== Experiment Complete ==="
         puts "Actual duration: #{(@end_time - @start_time).round(2)} seconds"
         puts "\nGenerating analysis..."
 
@@ -306,7 +313,7 @@ def display_summary_statistics
 
       def display_time_based_analysis
         bucket_size = @phase_duration
-        num_buckets = (@test_duration / bucket_size.to_f).ceil
+        num_buckets = (@experiment_duration / bucket_size.to_f).ceil
 
         puts "\n=== Time-Based Analysis (#{bucket_size}-second buckets) ==="
         (0...num_buckets).each do |bucket_idx|
@@ -351,7 +358,7 @@ def display_thread_timing_statistics
         avg_utilization = (avg_thread_time / total_wall_time * 100)
 
         puts "Total threads: #{@thread_timings.size}"
-        puts "Test wall clock duration: #{total_wall_time.round(2)}s"
+        puts "Experiment wall clock duration: #{total_wall_time.round(2)}s"
         puts "\nTime spent making requests per thread:"
         puts " Min: #{min_thread_time.round(2)}s"
         puts " Max: #{max_thread_time.round(2)}s"
@@ -450,7 +457,7 @@ def generate_visualization
 
         # Aggregate data into buckets for detailed visualization
         bucket_size = @graph_bucket_size
-        num_buckets = (@test_duration / bucket_size.to_f).ceil
+        num_buckets = (@experiment_duration / bucket_size.to_f).ceil
 
         bucketed_data = []
         (0...num_buckets).each do |bucket_idx|
@@ -503,8 +510,9 @@ def generate_visualization
           add_state_transition_markers(graph, bucketed_data, bucket_size, num_buckets)
         end
 
-        graph.write(@graph_filename)
-        puts "Graph saved to #{@graph_filename}"
+        main_graph_path = File.join(@main_results_path, @graph_filename)
+        graph.write(main_graph_path)
+        puts "Graph saved to #{main_graph_path}"
 
         # Generate duration graph
         duration_graph = Gruff::Line.new(1400)
@@ -518,8 +526,9 @@ def generate_visualization
         duration_graph.data("Total Request Duration", bucketed_data.map { |d| d[:sum_request_duration] })
 
         duration_filename = @graph_filename.sub(%r{([^/]+)$}, 'duration-\1')
-        duration_graph.write(duration_filename)
-        puts "Duration graph saved to #{duration_filename}"
+        duration_graph_path = File.join(@duration_results_path, duration_filename)
+        duration_graph.write(duration_graph_path)
+        puts "Duration graph saved to #{duration_graph_path}"
 
         # Generate throughput graph
         throughput_graph = Gruff::Line.new(1400)
@@ -533,18 +542,19 @@ def generate_visualization
         throughput_graph.data("Total Request Throughput", bucketed_data.map { |d| d[:throughput] })
 
         throughput_filename = @graph_filename.sub(%r{([^/]+)$}, 'throughput-\1')
-        throughput_graph.write(throughput_filename)
-        puts "Throughput graph saved to #{throughput_filename}"
+        throughput_graph_path = File.join(@throughput_results_path, throughput_filename)
+        throughput_graph.write(throughput_graph_path)
+        puts "Throughput graph saved to #{throughput_graph_path}"
       end
 
       def add_state_transition_markers(graph, bucketed_data, bucket_size, num_buckets)
         return if @state_transitions.empty?
 
-        test_start = @outcomes.keys[0]
+        experiment_start = @outcomes.keys[0]
 
         @state_transitions.each_with_index do |transition, idx|
           # Calculate which bucket this transition falls into
-          elapsed = transition[:timestamp] - test_start
+          elapsed = transition[:timestamp] - experiment_start
           bucket_idx = (elapsed / bucket_size).to_i
 
           next if bucket_idx < 0 || bucket_idx >= num_buckets
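
After this change the three graphs are no longer written next to the script: each goes into its own folder under `results/`, and the duration and throughput variants get a prefix on the final path segment. The sketch below reproduces that path logic in isolation. It is not the runner's code; `graph_paths` is a hypothetical helper, and a temporary directory stands in for `File.dirname(__FILE__)`. Note that `FileUtils.mkdir_p` already succeeds when the directory exists, so the sketch calls it without the `File.directory?` guard used in the diff.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper mirroring the path handling added in this diff.
def graph_paths(base_dir, graph_filename)
  dirs = {
    main: File.join(base_dir, "results/main_graphs"),
    duration: File.join(base_dir, "results/duration_graphs"),
    throughput: File.join(base_dir, "results/throughput_graphs"),
  }
  # mkdir_p creates missing parents and is a no-op for existing directories.
  dirs.each_value { |dir| FileUtils.mkdir_p(dir) }

  {
    main: File.join(dirs[:main], graph_filename),
    # The regex captures the last path segment and prepends the prefix to it.
    duration: File.join(dirs[:duration], graph_filename.sub(%r{([^/]+)$}, 'duration-\1')),
    throughput: File.join(dirs[:throughput], graph_filename.sub(%r{([^/]+)$}, 'throughput-\1')),
  }
end

# Strip the temporary base directory so the output is stable across runs.
paths = Dir.mktmpdir do |tmp|
  graph_paths(tmp, "sudden_error_spike_100.png").transform_values do |path|
    path.delete_prefix("#{tmp}/")
  end
end

paths.each { |kind, path| puts "#{kind}: #{path}" }
# main: results/main_graphs/sudden_error_spike_100.png
# duration: results/duration_graphs/duration-sudden_error_spike_100.png
# throughput: results/throughput_graphs/throughput-sudden_error_spike_100.png
```

These three folders are the same ones the workflow's `find` and `grep -E` patterns scan, which is why the runner and the CI check have to agree on the layout.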

experiments/test_error_spike_100.rb renamed to experiments/experiments/experiment_error_spike_100.rb

Lines changed: 8 additions & 8 deletions
@@ -1,15 +1,15 @@
 # frozen_string_literal: true
 
-$LOAD_PATH.unshift(File.expand_path("../lib", __dir__))
+$LOAD_PATH.unshift(File.expand_path("../../lib", __dir__))
 
 require "semian"
-require_relative "mock_service"
-require_relative "experimental_resource"
-require_relative "test_helpers"
+require_relative "../mock_service"
+require_relative "../experimental_resource"
+require_relative "../experiment_helpers"
 
-# Sudden error spike test: 1% -> 100% -> 1%
-runner = Semian::Experiments::CircuitBreakerTestRunner.new(
-  test_name: "Sudden Error Spike Test (Classic) - 100% for 20 seconds",
+# Sudden error spike experiment: 1% -> 100% -> 1%
+runner = Semian::Experiments::CircuitBreakerExperimentRunner.new(
+  experiment_name: "Sudden Error Spike Experiment (Classic) - 100% for 20 seconds",
   resource_name: "protected_service_sudden_error_spike_100",
   degradation_phases: [Semian::Experiments::DegradationPhase.new(healthy: true)] * 3 +
     [Semian::Experiments::DegradationPhase.new(error_rate: 1.00)] +
@@ -22,7 +22,7 @@
     error_timeout: 15,
     bulkhead: false,
   },
-  graph_title: "Sudden Error Spike Test (Classic) - 100% for 20 seconds",
+  graph_title: "Sudden Error Spike Experiment (Classic) - 100% for 20 seconds",
   graph_filename: "sudden_error_spike_100.png",
   x_axis_label_interval: 60,
 )
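
Because the script moved one directory deeper (from `experiments/` to `experiments/experiments/`), every relative reference needs one extra `..`. The sketch below shows how the old and new arguments resolve. The `/repo` prefix is an illustrative assumption, and the printed values assume a POSIX file system; only the relative layout comes from the diff.

```ruby
# Illustrative directories: /repo is a made-up checkout location.
old_dir = "/repo/experiments"             # where test_error_spike_100.rb lived
new_dir = "/repo/experiments/experiments" # where experiment_error_spike_100.rb lives now

old_lib = File.expand_path("../lib", old_dir)      # before the move
new_lib = File.expand_path("../../lib", new_dir)   # after the move, as in the diff
stale_lib = File.expand_path("../lib", new_dir)    # what the old argument would give now

# require_relative resolves against the requiring file's directory in the same way.
helpers = File.expand_path("../experiment_helpers", new_dir)

puts old_lib   # /repo/lib
puts new_lib   # /repo/lib
puts stale_lib # /repo/experiments/lib
puts helpers   # /repo/experiments/experiment_helpers
```

Both the old and new forms point at the same `lib` directory, while leaving the argument unchanged after the move would have pointed at a `lib` folder inside `experiments/`.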

experiments/test_error_spike_100_adaptive.rb renamed to experiments/experiments/experiment_error_spike_100_adaptive.rb

Lines changed: 8 additions & 8 deletions
@@ -1,15 +1,15 @@
 # frozen_string_literal: true
 
-$LOAD_PATH.unshift(File.expand_path("../lib", __dir__))
+$LOAD_PATH.unshift(File.expand_path("../../lib", __dir__))
 
 require "semian"
-require_relative "mock_service"
-require_relative "experimental_resource"
-require_relative "test_helpers"
+require_relative "../mock_service"
+require_relative "../experimental_resource"
+require_relative "../experiment_helpers"
 
-# Sudden error spike test: 1% -> 100% -> 1%
-runner = Semian::Experiments::CircuitBreakerTestRunner.new(
-  test_name: "Sudden Error Spike Test (Adaptive) - 100% for 20 seconds",
+# Sudden error spike experiment: 1% -> 100% -> 1%
+runner = Semian::Experiments::CircuitBreakerExperimentRunner.new(
+  experiment_name: "Sudden Error Spike Experiment (Adaptive) - 100% for 20 seconds",
   resource_name: "protected_service_sudden_error_spike_100_adaptive",
   degradation_phases: [Semian::Experiments::DegradationPhase.new(healthy: true)] * 3 +
     [Semian::Experiments::DegradationPhase.new(error_rate: 1.00)] +
@@ -19,7 +19,7 @@
     adaptive_circuit_breaker: true,
     bulkhead: false,
   },
-  graph_title: "Sudden Error Spike Test (Adaptive) - 100% for 20 seconds",
+  graph_title: "Sudden Error Spike Experiment (Adaptive) - 100% for 20 seconds",
   graph_filename: "sudden_error_spike_100_adaptive.png",
   x_axis_label_interval: 60,
 )
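
The runner derives its total duration and graph bucket count from the phase list, so a small worked example may help when reading these experiment definitions. The sketch below is standalone and makes several assumptions: the `Struct` is a stand-in for `Semian::Experiments::DegradationPhase`, `phase_duration = 20` is inferred from the "for 20 seconds" title, and the three trailing healthy phases are invented because the rest of the list is not visible in this diff. Only the formulas and the adaptive default bucket size of 10 come from `experiment_helpers.rb`.

```ruby
# Stand-in for Semian::Experiments::DegradationPhase (same keyword arguments).
DegradationPhase = Struct.new(:healthy, :error_rate, :latency, keyword_init: true)

phase_duration = 20    # assumption, inferred from the experiment title
graph_bucket_size = 10 # default for adaptive runs in the diff (classic runs use 1)

phases = [DegradationPhase.new(healthy: true)] * 3 +
  [DegradationPhase.new(error_rate: 1.00)] +
  [DegradationPhase.new(healthy: true)] * 3 # tail assumed, not shown in the diff

# Same formulas as in CircuitBreakerExperimentRunner.
experiment_duration = phases.length * phase_duration
num_buckets = (experiment_duration / graph_bucket_size.to_f).ceil

# Bucket for an event 65.4 seconds after the start, as in add_state_transition_markers.
bucket_idx = (65.4 / graph_bucket_size).to_i

puts "duration: #{experiment_duration}s, buckets: #{num_buckets}, bucket for 65.4s: #{bucket_idx}"
# duration: 140s, buckets: 14, bucket for 65.4s: 6

# Array#* repeats one object rather than copying it; that is fine for read-only phases.
puts phases[0].equal?(phases[1]) # true
```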
