Distributed Parallel Tests on CI systems
Use the --only-group option if you want parallel_tests to handle grouping your tests but need to run them across multiple machines. With this option, parallel_tests splits your tests (based on filesize) into the number of groups you specify with the -n option, but only runs the groups specified with the --only-group option. Files are grouped by filesize to ensure the grouping is consistent across different machines.
For example, let's say you have access to many small machines (read: single core) to run your test suite. It doesn't make sense to run the tests in parallel on one machine, since it probably won't speed things up much on a single core. Travis recommends using 2 processes for their 1.5 cores per box. In this scenario you can use the --only-group option to run the tests in parallel across a number of machines, with each machine running a slightly different command:
Machine one:
parallel_test test -n 6 --only-group 1,2
Machine two:
parallel_test test -n 6 --only-group 3,4
Machine three:
parallel_test test -n 6 --only-group 5,6
Of course it's up to you to collect and aggregate the results of the tests at this point.
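One way to aggregate the results could look like the sketch below. It assumes a hypothetical convention that is not part of parallel_tests: each machine writes the exit status of its parallel_test run to a file named results/machine-<n>.status, and those files are then copied to one place. The function name and file layout are illustrative only.

```shell
#!/bin/sh
# Assumed (hypothetical) convention, run on each machine:
#   parallel_test test -n 6 --only-group 1,2; echo $? > results/machine-1.status

# aggregate_statuses DIR
# Reads every *.status file in DIR, reports the machines whose run failed,
# and returns 0 only if all recorded exit statuses are 0.
aggregate_statuses() {
  local dir="$1" failed=0 file status
  for file in "$dir"/*.status; do
    if [ ! -e "$file" ]; then
      echo "no status files found in $dir" >&2
      return 2
    fi
    status="$(cat "$file")"
    if [ "$status" != "0" ]; then
      echo "FAILED: $(basename "$file" .status) (exit status $status)"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling aggregate_statuses results as the last step of a pipeline would then fail the overall build whenever any machine reported a failure.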
Note that enabling the --only-group option means that EVERY group is treated as being the first parallel test (and TEST_ENV_NUMBER is blank). This is because if you're running the tests on separate machines, there's no need to configure a database, etc. for every test group.
The --only-group option makes it extremely easy to parallelize your builds on Travis CI. Simply specify your matrix and script like so (example with rspec):
...
env:
- "TEST_GROUP=1"
- "TEST_GROUP=2"
- "TEST_GROUP=3"
- "TEST_GROUP=4"
- "TEST_GROUP=5"
- "TEST_GROUP=6"
script:
- bundle exec parallel_test spec/ -n 6 --only-group $TEST_GROUP --group-by filesize --type rspec
Now parallel_tests will take care of grouping your tests for you, and run one group per build worker.
You can also specify multiple groups per worker, e.g. TEST_GROUP=1,2, to execute 2 processes on Travis at once (workers have ~1.5 CPUs). For more details, including code coverage and runtime logging, see Big and Fast Tests: Taking our Travis build from 4 hours to 13 minutes.
With GitLab CI's parallel feature, an advanced parallel_tests/rspec setup can be fairly straightforward. Example .gitlab-ci.yml file:
parallel_tests:
  parallel: 8
  variables:
    # if you use https://docs.gitlab.com/ee/ci/yaml/#parallel
    # instead of running parallel per-core, this setting lets you choose
    # how many jobs should run per-container in parallel
    # fallback/default: PARALLEL_STEPS=2
    # which means that two parallel jobs are running per container
    PARALLEL_STEPS: ""
  image: ruby
  stage: test
  services:
    - name: postgres:13
      alias: postgresql
  script:
    - |
      set -x
      # CI_NODE_INDEX is a GitLab variable
      # telling you which CI_NODE you are currently running on
      # when you are using https://docs.gitlab.com/ee/ci/yaml/#parallel
      # (falls back to 1 so the arithmetic below still works without 'parallel')
      INDEX="${CI_NODE_INDEX:-1}"
      # CI_NODE_TOTAL is always set by GitLab-CI-runner
      # if you have not configured 'parallel', CI_NODE_TOTAL=1
      # otherwise it equals the value of 'parallel'
      TOTAL="${CI_NODE_TOTAL:-1}"
      # how many parallel_steps should run per CI_NODE-host
      steps="${PARALLEL_STEPS:-2}"
      # the index, corrected to count from 0 instead of 1
      index="$((${INDEX}-1))"
      # first and last group number to run on this node
      START="$(($steps * $index + 1))"
      STOP="$(($steps * $index + $steps))"
      # generate the list of groups this node should run
      # generates something like "1,2" for the first node, "3,4" for the second
      groups="$(seq -s, "${START}" 1 "${STOP}")"
      # the total number of groups across all nodes
      TOTAL_JOBS="$(( ${TOTAL} * ${steps} ))"
      rake parallel:create["${steps}"]
      rake parallel:rake[db:structure:load,"${steps}"]
      rake parallel:seed["${steps}"]
      export DB_SEED_ALREADY_DONE=1
      rake parallel:rake[sphinx:parallel_setup,"${steps}"]
      parallel_rspec -n "${TOTAL_JOBS}" --only-group "${groups}" ./spec
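To make the index arithmetic in the script above easier to follow, here is a small standalone sketch of the same calculation. The function name groups_for_node is hypothetical, and the list is built with a plain loop rather than seq so that the output does not depend on which seq implementation is installed.

```shell
#!/bin/sh
# groups_for_node NODE_INDEX STEPS
# NODE_INDEX is 1-based (like GitLab's CI_NODE_INDEX);
# STEPS is the number of groups each node runs (PARALLEL_STEPS above).
# Prints the comma-separated group list to pass to --only-group.
groups_for_node() {
  local node_index="$1" steps="$2"
  local g=$(( steps * (node_index - 1) + 1 ))   # first group on this node
  local stop=$(( steps * node_index ))          # last group on this node
  local list=""
  while [ "$g" -le "$stop" ]; do
    list="${list:+$list,}$g"
    g=$(( g + 1 ))
  done
  printf '%s\n' "$list"
}

# Show the assignment for 4 nodes running 2 groups each (8 groups in total).
for node in 1 2 3 4; do
  echo "node ${node}: --only-group $(groups_for_node "$node" 2)"
done
# node 1: --only-group 1,2
# node 2: --only-group 3,4
# node 3: --only-group 5,6
# node 4: --only-group 7,8
```

Every node must pass the same -n value (nodes times steps) so that parallel_tests builds identical groups everywhere; only the --only-group list differs.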
The --only-group option makes it possible to parallelise your builds on GitHub Actions. Specify your strategy matrix and script like so (example with rspec on standard GitHub-hosted runners for private repositories). Standard GitHub-hosted runners for private repositories offer 2 CPUs and 7 GB of RAM, while those for public repositories offer 4 CPUs and 16 GB of RAM, so adjust accordingly. Larger runners offer additional configurations at an additional cost:
...
rspec:
  name: RSpec groups ${{ matrix.ci_job_index }}
  runs-on: ubuntu-latest
  services:
    postgres:
      ...
  env:
    BUNDLE_WITHOUT: "development"
    CI_TOTAL_JOBS: ${{ matrix.ci_total_jobs }}
    CI_JOB_INDEX: ${{ matrix.ci_job_index }}
  strategy:
    fail-fast: false
    matrix:
      # Set N, the number of parallel jobs you want to run.
      # Normally equal to the number of CPU cores, but in this case it is the
      # total number of test groups to be run across all runners.
      ci_total_jobs: [24]
      # Remember to update ci_job_index below to cover 0..N-1.
      # When you run 2 parallel jobs, the first job has index 0, the second job
      # has index 1, etc. For larger runners with more CPUs, adjust accordingly.
      ci_job_index:
        [
          "0, 1",
          "2, 3",
          .....
          "22, 23"
        ]
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4
    - name: Setup Ruby
      uses: ruby/setup-ruby@v1
      with:
        bundler-cache: true
    - name: Run RSpec test on group ${{ matrix.ci_job_index }}
      env:
        ...
      run: |
        echo "::group::parallel:setup"
        bundle exec rake parallel:setup[2]
        echo "::endgroup::"
        echo "::group::parallel:spec"
        bundle exec parallel_rspec -n "${CI_TOTAL_JOBS}" --only-group "${CI_JOB_INDEX}" ./spec
        echo "::endgroup::"
In the above example parallel_tests will take care of grouping your tests for you, and run two groups per runner. The above example runs 15000+ tests in 5m 47s with a billable time of 56m.
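Writing the ci_job_index entries by hand gets tedious as the number of groups grows. The sketch below prints entries in the same "first, second" style, following the 0-based numbering used in the matrix above. The matrix_entries helper is hypothetical and not part of parallel_tests or GitHub Actions, and it is shown with a small total of 6 groups purely for illustration.

```shell
#!/bin/sh
# matrix_entries TOTAL PER_RUNNER
# Prints one quoted matrix entry per runner, each listing PER_RUNNER
# consecutive 0-based group indices (the last entry may be shorter
# if TOTAL is not a multiple of PER_RUNNER).
matrix_entries() {
  local total="$1" per_runner="$2"
  local first=0 g entry
  while [ "$first" -lt "$total" ]; do
    entry=""
    g="$first"
    while [ "$g" -lt $(( first + per_runner )) ] && [ "$g" -lt "$total" ]; do
      entry="${entry:+$entry, }$g"
      g=$(( g + 1 ))
    done
    printf '"%s"\n' "$entry"
    first=$(( first + per_runner ))
  done
}

matrix_entries 6 2
# "0, 1"
# "2, 3"
# "4, 5"
```

The output still has to be pasted into the workflow file with commas between the entries; the first argument should match ci_total_jobs and the second the number of groups each runner executes.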