Each run goes through all tasks, i.e. for models A, B, C the execution order is ABCABC (*not* AABBCC).

- [x] add runs as CLI arg
- [x] do multiple runs
- [x] ensure the scores per run are simply summed up
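
The run ordering and score summing described above can be sketched as follows. This is only an illustration in Python, not the project's actual code: the `--runs` flag name, `run_interleaved`, `parse_args`, and the `evaluate` callback are all hypothetical names chosen for the example.

```python
import argparse
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def run_interleaved(
    models: Sequence[str],
    runs: int,
    evaluate: Callable[[str, int], float],
) -> Tuple[List[str], Dict[str, float]]:
    """Execute every model once per run (ABCABC) and sum scores per model."""
    order: List[str] = []
    totals: Dict[str, float] = defaultdict(float)
    # The outer loop is over runs, so each run covers all models
    # before the next run starts (ABCABC, not AABBCC).
    for run in range(runs):
        for model in models:
            order.append(model)
            # Scores from separate runs are simply added together.
            totals[model] += evaluate(model, run)
    return order, dict(totals)


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # Hypothetical flag name; the real CLI may spell it differently.
    parser = argparse.ArgumentParser()
    parser.add_argument("--runs", type=int, default=1, help="number of runs")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    # Placeholder evaluation: every model scores 1.0 in every run.
    order, totals = run_interleaved(["A", "B", "C"], args.runs, lambda m, r: 1.0)
    print("".join(order), totals)
```

Keeping the run loop on the outside is what produces the interleaved order; swapping the two loops would give AABBCC instead.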