
Mini-batch Size vs. Memory Limit #1929

Closed
@jimmie33

Description

Currently the mini-batch size N is subject to the memory limit. For example, when training a large model I cannot use a large mini-batch size, because my GPU cannot hold N training samples at once.

Is it possible for Caffe to support a mini-batch size that is a multiple of the input data batch size? My understanding is that the solver just needs to accumulate the gradients over several batches of input data before performing a single model update step. Right? A small sketch of what I mean is below.
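
To make the idea concrete, here is a minimal, framework-agnostic sketch in plain NumPy (not Caffe's solver code) of accumulating gradients over several small batches of a toy linear least-squares model and then applying one update; `effective_batch` and `micro_batch` are illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(5)                          # parameters of a toy linear model
X = rng.normal(size=(64, 5))             # 64 training samples
y = X @ np.array([1., -2., 0.5, 3., 0.]) + 0.1 * rng.normal(size=64)

effective_batch = 32                     # desired mini-batch size (too big for memory)
micro_batch = 8                          # what actually fits on the GPU at once
lr = 0.1

grad_sum = np.zeros_like(w)
for start in range(0, effective_batch, micro_batch):
    xb = X[start:start + micro_batch]
    yb = y[start:start + micro_batch]
    residual = xb @ w - yb               # forward pass on a small batch
    grad_sum += xb.T @ residual          # accumulate the sum of per-sample gradients

# one update, as if the full effective_batch had been processed at once
w -= lr * grad_sum / effective_batch
```

The single update at the end is mathematically equivalent to one SGD step over the full effective batch, so only `micro_batch` samples ever need to be resident in memory at a time.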

I wonder whether Caffe will support this functionality, or whether it already does (I am new to Caffe, so I may have missed something). Or is there some difficulty I have overlooked in implementing it?
