Description
Sometimes encodings need to be able to take into account batch information, as in a sequence learning task where samples in a batch should be padded to the length of the longest sequence.
Currently, all `Encoding`s transform individual samples, which is great for simplicity and composability, but doesn't allow implementing these batch-level transformations.
Encodings are used in basically every training loop through `taskdataloaders`, which always yields batches of encoded data. We could have it use a new function `encodebatch(encoding, context, block, samples)`
that transforms multiple samples at a time. This would operate on vectors of samples, not a collated batch, since not all kinds of data can be collated (e.g. different-sized images).
By default, it would simply delegate to the single-sample `encode` function:
```julia
function encodebatch(encoding, context, block, observations::AbstractVector)
    map(obs -> encode(encoding, context, block, obs), observations)
end
```
But it could be overridden by individual encodings:
```julia
function encodebatch(encoding::PadSequences, context, block, observations::AbstractVector)
    # Dummy padding code: pad every sequence to the length of the longest one
    n = maximum(length, observations)
    return map(obs -> pad(obs, n), observations)
end
```
Tagging relevant parties @Chandu-4444 @darsnack @ToucheSir for discussion.