Merged
14 changes: 7 additions & 7 deletions README.md
@@ -35,9 +35,9 @@ the workload and relay results back to a central master.
 - `TEST_QUEUE_RELAY`: relay results back to a central master, specified as tcp `address:port`
 - `TEST_QUEUE_STATS`: `path` to cache build stats in-build CI runs (default: `.test_queue_stats`)
 - `TEST_QUEUE_FORCE`: comma separated list of suites to run
-- `TEST_QUEUE_RELAY_TIMEOUT`: when using remote workers, the amount of time a worker will try to reconnect to start work
-- `TEST_QUEUE_RELAY_TOKEN`: when using remote workers, this must be the same on both workers and the server for remote workers to run tests.
-- `TEST_QUEUE_SLAVE_MESSAGE`: when using remote workers, set this on a slave worker and it will appear on the slave's connection message on the master.
+- `TEST_QUEUE_RELAY_TIMEOUT`: when using distributed builds, the amount of time a remote master will try to reconnect to start work
+- `TEST_QUEUE_RELAY_TOKEN`: when using distributed builds, this must be the same on remote masters and the central master for remote masters to be able to connect.
+- `TEST_QUEUE_REMOTE_MASTER_MESSAGE`: when using distributed builds, set this on a remote master and it will appear in that master's connection message on the central master.
 - `TEST_QUEUE_SPLIT_GROUPS`: split tests up by example rather than example group. Faster for tests with short setup time such as selenium. RSpec only. Add the :no_split tag to ExampleGroups you don't want split.

### usage
@@ -83,8 +83,8 @@ class MyAppTestRunner < TestQueue::Runner::MiniTest
# ...
end

-# If this is a remote slave, tell the master something about us
-@slave_message = "Output for slave 123: http://myhost.com/build/123"
+# If this is a remote master, tell the central master something about us
+@remote_master_message = "Output for remote master 123: http://myhost.com/build/123"
end

def around_filter(suite)
@@ -100,8 +100,8 @@ MyAppTestRunner.new.execute
### distributed mode

To use distributed mode, the central master must listen on a tcp port. Additional masters can be booted
-in relay mode to connect to the central master. Workers must provide a TEST_QUEUE_RELAY_TOKEN to match
-the master's.
+in relay mode to connect to the central master. Remote masters must provide a TEST_QUEUE_RELAY_TOKEN
+to match the central master's.
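
The token requirement described above could be sketched roughly as follows. This is a minimal illustration, not the code in `lib/test_queue/runner.rb`: the runner's `TOKEN_REGEX` is defined outside the hunks shown in this diff, so the pattern below is an assumption, and `token_accepted?` is a hypothetical helper introduced only for this example.

```ruby
# Assumed pattern: the real TOKEN_REGEX lives outside the hunks shown in this diff.
TOKEN_REGEX = /\ATOKEN=(\S+)/

# Hypothetical helper: true only when the first line sent by a remote master
# carries the same token the central master was started with.
def token_accepted?(first_line, run_token)
  first_line.to_s[TOKEN_REGEX, 1] == run_token
end

puts token_accepted?("TOKEN=123", "123")   # remote master from the same run
puts token_accepted?("TOKEN=999", "123")   # remote master from a different run
puts token_accepted?("POP host 42", "123") # no token line was sent
```

Per the comment in `distribute_queue`, a mismatch is answered with "WRONG RUN" so that the remote master treats the run as finished.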

```
$ TEST_QUEUE_RELAY_TOKEN=123 TEST_QUEUE_SOCKET=0.0.0.0:12345 bundle exec minitest-queue ./test/sample_test.rb
25 changes: 12 additions & 13 deletions lib/test_queue/runner.rb
@@ -85,7 +85,7 @@ def initialize(test_framework, concurrency=nil, socket=nil, relay=nil)
raise ArgumentError, "Worker count (#{@concurrency}) must be greater than 0"
end

-@slave_connection_timeout =
+@relay_connection_timeout =
(ENV['TEST_QUEUE_RELAY_TIMEOUT'] && ENV['TEST_QUEUE_RELAY_TIMEOUT'].to_i) ||
30

@@ -100,7 +100,7 @@ def initialize(test_framework, concurrency=nil, socket=nil, relay=nil)
relay ||
ENV['TEST_QUEUE_RELAY']

-@slave_message = ENV["TEST_QUEUE_SLAVE_MESSAGE"] if ENV.has_key?("TEST_QUEUE_SLAVE_MESSAGE")
+@remote_master_message = ENV["TEST_QUEUE_REMOTE_MASTER_MESSAGE"] if ENV.has_key?("TEST_QUEUE_REMOTE_MASTER_MESSAGE")

if @relay == @socket
STDERR.puts "*** Detected TEST_QUEUE_RELAY == TEST_QUEUE_SOCKET. Disabling relay mode."
@@ -265,10 +265,10 @@ def start_relay
return unless relay?

sock = connect_to_relay
-message = @slave_message ? " #{@slave_message}" : ""
+message = @remote_master_message ? " #{@remote_master_message}" : ""
message.gsub!(/(\r|\n)/, "") # Our "protocol" is newline-separated
sock.puts("TOKEN=#{@run_token}")
-sock.puts("SLAVE #{@concurrency} #{Socket.gethostname} #{message}")
+sock.puts("REMOTE MASTER #{@concurrency} #{Socket.gethostname} #{message}")
response = sock.gets.strip
unless response == "OK"
STDERR.puts "*** Got non-OK response from master: #{response}"
@@ -485,7 +485,7 @@ def distribute_queue
cmd = sock.gets.strip

token = token[TOKEN_REGEX, 1]
-# If we have a slave from a different test run, respond with "WRONG RUN", and it will consider the test run done.
+# If we have a remote master from a different test run, respond with "WRONG RUN", and it will consider the test run done.
if token != @run_token
message = token.nil? ? "Worker sent no token to master" : "Worker from run #{token} connected to master"
STDERR.puts "*** #{message} for run #{@run_token}; ignoring."
@@ -495,7 +495,6 @@ def distribute_queue

case cmd
when /^POP (\S+) (\d+)/
-# If we have a slave from a different test run, don't respond, and it will consider the test run done.
hostname = $1
pid = Integer($2)
if awaiting_suites?
@@ -505,16 +504,16 @@ def distribute_queue
sock.write(data)
@assignments[obj] = [hostname, pid]
end
-when /^SLAVE (\d+) ([\w\.-]+)(?: (.+))?/
+when /^REMOTE MASTER (\d+) ([\w\.-]+)(?: (.+))?/
num = $1.to_i
-slave = $2
-slave_message = $3
+remote_master = $2
+remote_master_message = $3

sock.write("OK\n")
remote_workers += num

-message = "*** #{num} workers connected from #{slave} after #{Time.now-@start_time}s"
-message << " " + slave_message if slave_message
+message = "*** #{num} workers connected from #{remote_master} after #{Time.now-@start_time}s"
+message << " " + remote_master_message if remote_master_message
STDERR.puts message
when /^WORKER (\d+)/
data = sock.read($1.to_i)
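
To make the renamed wire command in the hunk above concrete, here is a small sketch that applies the same regular expression to a few sample lines. The regex is copied from the new `when` branch; `parse_remote_master` and the sample hostnames are hypothetical and exist only for this illustration.

```ruby
# Regex copied from the new `when /^REMOTE MASTER .../` branch in this diff.
REMOTE_MASTER_CMD = /^REMOTE MASTER (\d+) ([\w\.-]+)(?: (.+))?/

# Hypothetical helper (not part of test-queue): split an announcement line into
# worker count, hostname, and optional free-form message; nil when it does not match.
def parse_remote_master(line)
  m = REMOTE_MASTER_CMD.match(line)
  return nil unless m

  { workers: m[1].to_i, host: m[2], message: m[3] }
end

p parse_remote_master("REMOTE MASTER 4 build-01.example.com hello from remote master")
p parse_remote_master("REMOTE MASTER 2 build-02") # the message part is optional
p parse_remote_master("SLAVE 2 build-03")         # the old command name no longer matches
```

The last call suggests that a process still sending the old `SLAVE` command would not reach this branch, so the central master and remote masters presumably need to run the same version.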
@@ -546,12 +545,12 @@ def relay?
def connect_to_relay
sock = nil
start = Time.now
-puts "Attempting to connect for #{@slave_connection_timeout}s..."
+puts "Attempting to connect for #{@relay_connection_timeout}s..."
while sock.nil?
begin
sock = TCPSocket.new(*@relay.split(':'))
rescue Errno::ECONNREFUSED => e
-raise e if Time.now - start > @slave_connection_timeout
+raise e if Time.now - start > @relay_connection_timeout
puts "Master not yet available, sleeping..."
sleep 0.5
end
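
The retry behaviour governed by `@relay_connection_timeout` can be illustrated in isolation. The sketch below is not the runner's code: `retry_until_deadline` is a hypothetical helper that takes the connection attempt as a block, so the loop can be exercised without a real socket.

```ruby
# Hypothetical helper mirroring the loop in connect_to_relay: keep retrying the
# block while it raises ECONNREFUSED, and re-raise once `timeout` seconds have passed.
def retry_until_deadline(timeout:, interval: 0.5)
  start = Time.now
  begin
    yield
  rescue Errno::ECONNREFUSED
    raise if Time.now - start > timeout
    sleep interval
    retry
  end
end

# Simulated central master that only starts accepting on the third attempt.
attempts = 0
result = retry_until_deadline(timeout: 5, interval: 0.01) do
  attempts += 1
  raise Errno::ECONNREFUSED if attempts < 3
  :connected
end
puts "#{result} after #{attempts} attempts"
```

In real use the block would wrap something like `TCPSocket.new(host, port)`, with `timeout` taken from `TEST_QUEUE_RELAY_TIMEOUT` (30 seconds by default, per the `initialize` hunk).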
10 changes: 10 additions & 0 deletions test/minitest5.bats
@@ -128,6 +128,16 @@ assert_test_queue_force_ordering() {
assert_output_contains "MiniTestFailure#test_fail"
}

+@test "multi-master central master prints out remote master messages" {
+export TEST_QUEUE_RELAY_TOKEN=$(date | cksum | cut -d' ' -f1)
+TEST_QUEUE_RELAY=0.0.0.0:12345 TEST_QUEUE_REMOTE_MASTER_MESSAGE="hello from remote master" bundle exec minitest-queue ./test/samples/sample_minitest5.rb &
+TEST_QUEUE_SOCKET=0.0.0.0:12345 run bundle exec minitest-queue ./test/samples/sample_minitest5.rb
+wait
+
+assert_status 0
+assert_output_contains "hello from remote master"
+}

@test "recovers from child processes dying in an unorderly way" {
export KILL=1
run bundle exec minitest-queue ./test/samples/sample_minitest5.rb