how to configure tokenization for inference time with rest api #155

@tmills

Description

The hierarchical model has special data preparation that chunks the input into a certain number of chunks of a certain length each. The maximum sequence length is the product of those two numbers. But the chunk length is constrained only by the base encoder (say ~512), and the number of chunks isn't built into the network because attention will average over them. So it isn't strictly required to process the data the same way at inference as during training, and so we don't even put those parameters in the model config. Without them in the config it's hard to even suggest good numbers, but maybe we want to stay flexible enough to allow them to change? Eh, maybe not.
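
For concreteness, here is a minimal sketch of the kind of chunked data preparation described above. The parameter names (`chunk_len`, `num_chunks`) and the flat token-id input are illustrative assumptions, not the project's actual preprocessing API:

```python
# Hedged sketch: split a flat token-id sequence into num_chunks chunks of
# chunk_len tokens each, so the effective max sequence length is their product.
# Names and signature are illustrative, not the real cnlp_transformers API.
from typing import List

def chunk_token_ids(token_ids: List[int], chunk_len: int, num_chunks: int,
                    pad_id: int = 0) -> List[List[int]]:
    max_len = chunk_len * num_chunks
    token_ids = token_ids[:max_len]          # truncate to chunk_len * num_chunks
    chunks = []
    for start in range(0, max_len, chunk_len):
        chunk = token_ids[start:start + chunk_len]
        chunk += [pad_id] * (chunk_len - len(chunk))  # pad short/empty chunks
        chunks.append(chunk)
    return chunks

# chunk_len is bounded by the base encoder (~512); num_chunks is only a
# preprocessing choice, since attention averages over chunks, so inference
# could in principle use a different value than training.
batch = chunk_token_ids(list(range(1200)), chunk_len=512, num_chunks=3)
assert len(batch) == 3 and all(len(c) == 512 for c in batch)
```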
