Description
Currently the mini-batch size N is limited by memory. For example, when training a large model I cannot use a large mini-batch size, because my GPU cannot hold N training samples at once.
Could Caffe support an effective mini-batch size that is a multiple of the input data batch size? My understanding is that it only needs to accumulate the gradients over several input batches before performing a model update step. Right? (A rough sketch of what I mean is below.)
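To be concrete, here is a minimal sketch of the accumulation idea in plain NumPy, not Caffe code; names like `compute_gradient`, `accum_steps`, and `input_batch_size` are made up for illustration. Dividing the accumulated gradient by `accum_steps` makes the update equivalent to averaging over the larger effective batch of `accum_steps * input_batch_size` samples.

```python
import numpy as np

def compute_gradient(weights, batch_x, batch_y):
    # Toy example: gradient of mean squared error for a linear model y = X @ w.
    preds = batch_x @ weights
    return 2.0 * batch_x.T @ (preds - batch_y) / len(batch_y)

def train(weights, data_x, data_y, input_batch_size=8, accum_steps=4, lr=0.01):
    # Accumulate gradients over `accum_steps` small input batches, then apply
    # one SGD update, so only `input_batch_size` samples sit in memory at once.
    grad_accum = np.zeros_like(weights)
    steps_taken = 0
    for start in range(0, len(data_x), input_batch_size):
        bx = data_x[start:start + input_batch_size]
        by = data_y[start:start + input_batch_size]
        grad_accum += compute_gradient(weights, bx, by)
        steps_taken += 1
        if steps_taken == accum_steps:
            # One model update for an effective batch of
            # accum_steps * input_batch_size samples.
            weights -= lr * (grad_accum / accum_steps)
            grad_accum[:] = 0.0
            steps_taken = 0
    return weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 5))
    true_w = np.arange(5, dtype=float)
    y = X @ true_w
    w = train(np.zeros(5), X, y)
    print(w)  # should approach true_w
```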
I wonder whether Caffe plans to support this functionality, or whether it already does (I am new to Caffe, so I may have missed something). Or is there some difficulty I have overlooked in implementing it?