Always use SequentialSampler during evaluation
When evaluating, shouldn't we always use the SequentialSampler instead of DistributedSampler? Evaluation only runs on 1 GPU no matter what, so if you use the DistributedSampler with N GPUs, I think you'll only evaluate on 1/N of the evaluation set. That's at least what I'm finding when I run an older/modified version of this repo.
ethanjperez authored and LysandreJik committed Dec 3, 2019
1 parent 3b48806 commit 96e8350
Showing 1 changed file with 1 addition and 1 deletion.
examples/run_squad.py

@@ -216,7 +216,7 @@ def evaluate(args, model, tokenizer, prefix=""):

     args.eval_batch_size = args.per_gpu_eval_batch_size * max(1, args.n_gpu)
     # Note that DistributedSampler samples randomly
-    eval_sampler = SequentialSampler(dataset) if args.local_rank == -1 else DistributedSampler(dataset)
+    eval_sampler = SequentialSampler(dataset)
     eval_dataloader = DataLoader(dataset, sampler=eval_sampler, batch_size=args.eval_batch_size)

     # multi-gpu evaluate
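To illustrate the point made in the commit message, here is a minimal sketch (not part of the patch) of how DistributedSampler partitions a dataset: with num_replicas=N, each rank only iterates over roughly 1/N of the indices, whereas SequentialSampler yields every index in order. The toy dataset and the explicit num_replicas/rank arguments are purely for illustration.

```python
import torch
from torch.utils.data import TensorDataset, SequentialSampler, DistributedSampler

# A toy "evaluation set" of 8 examples.
dataset = TensorDataset(torch.arange(8))

# SequentialSampler visits every example, in order.
seq_indices = list(SequentialSampler(dataset))

# DistributedSampler splits the indices across ranks; here we simulate
# rank 0 of a 2-GPU job without initializing a process group.
dist_indices = list(DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=False))

print(seq_indices)   # [0, 1, 2, 3, 4, 5, 6, 7] -> the full evaluation set
print(dist_indices)  # [0, 2, 4, 6]             -> only half the examples on this rank
```

Since evaluation in run_squad.py is driven by a single process, forcing SequentialSampler regardless of args.local_rank means that process sees the whole evaluation set rather than its 1/N shard.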
