The hierarchical model needs special data preparation: the input is split into a fixed number of chunks of a fixed length each, so the maximum sequence length is the product of those two numbers. The chunk length is constrained only by the base encoder (say ~512 tokens), and the number of chunks isn't baked into the network, because attention averages over them. So the data doesn't strictly have to be chunked the same way at inference as during training, and we don't even put those parameters in the model config. Without them in the config it's harder to suggest good defaults, but do we really want to stay flexible enough to let them change? Eh, maybe not.
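
As a rough illustration, here is a minimal sketch of that chunking step. The names (`chunk_tokens`, `chunk_len`, `num_chunks`, `pad_id`) are placeholders, not the actual preprocessing code, and it assumes the input is already a flat list of token ids:

```python
import torch

def chunk_tokens(token_ids, chunk_len=512, num_chunks=4, pad_id=0):
    """Split a flat token id list into a (num_chunks, chunk_len) tensor.

    Truncates anything past chunk_len * num_chunks and pads the tail chunk.
    """
    max_len = chunk_len * num_chunks
    ids = list(token_ids[:max_len])
    ids += [pad_id] * (max_len - len(ids))  # pad up to a whole number of chunks
    return torch.tensor(ids).view(num_chunks, chunk_len)

# Training might use num_chunks=4; nothing stops inference from using 8,
# since the base encoder only ever sees chunk_len tokens at a time and the
# chunk embeddings are averaged afterwards, so num_chunks is never a learned
# dimension.
chunks = chunk_tokens(list(range(3000)), chunk_len=512, num_chunks=8)
print(chunks.shape)  # torch.Size([8, 512])

# Downstream (sketch): the base encoder maps each chunk to an embedding of
# shape (num_chunks, hidden_dim), and the document embedding is e.g. the mean
# over chunks: doc_embed = chunk_embeds.mean(dim=0)
```

This is only a sketch of why the two numbers can differ between training and inference; the real preparation pipeline may pad, truncate, or pool differently.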