Multiple beam results slowdown: "return_beams=True" increases decode time

Currently, the T2T serving (export.py + query.py) returns one result per query. 

How would you go about changing it to return all the beam results? For example, if the beam size is 4, return all 4 results together with their log probabilities, similar to what the HParam write_beam_scores=True does in t2t_decoder.py.

I assume both export.py and query.py need to change. What exactly should be modified in each to support this?
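
To make the desired output concrete, here is a rough sketch in plain Python of the response shape I have in mind. It does not use any T2T or TensorFlow Serving API: the function name `format_beam_results` and the input layout (one decoded string and one log probability per beam) are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Sequence, Union


def format_beam_results(
    beams: Sequence[str], log_probs: Sequence[float]
) -> List[Dict[str, Union[str, float]]]:
    """Pair each beam with its log probability, best beam first.

    Hypothetical helper: `beams` and `log_probs` stand in for whatever the
    exported model would return for a single query.
    """
    if len(beams) != len(log_probs):
        raise ValueError("expected one log probability per beam")
    # Higher (less negative) log probability means a more likely beam.
    ranked = sorted(zip(beams, log_probs), key=lambda pair: pair[1], reverse=True)
    return [{"output": text, "log_prob": score} for text, score in ranked]


# Example with a beam size of 4 (made-up values).
results = format_beam_results(
    ["hallo welt", "hallo, welt", "hallo erde", "hi welt"],
    [-0.4, -1.2, -2.5, -3.1],
)
for rank, result in enumerate(results, start=1):
    print(rank, result["output"], result["log_prob"])
```

So instead of a single string per query, query.py would print a ranked list like this.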

Thanks.


Multiple beam results slowdown: "return_beams=True" increases decode time #601
