This repository was archived by the owner on Jul 7, 2023. It is now read-only.
Multiple beam results slowdown: "return_beams=True" increases decode time #601
Open
Description
Currently, the T2T serving (export.py + query.py) returns one result per query.
How would you go about changing it to return all the beam results? For example, if the beam size is 4, it would return all 4 results together with their log probabilities, similar to what the HParam write_beam_scores=True does in t2t_decoder.py.
I assume the change needs to be made in both export.py and query.py. My question is what exactly should be changed to support it.
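To make the requested output concrete, here is a minimal sketch of the per-query response shape, assuming the serving side can hand back one token-id sequence and one log probability per beam. The function name `unpack_beams`, the dictionary keys, and the default `eos_id=1` are hypothetical choices for illustration and are not taken from export.py or query.py.

```python
from typing import Dict, List, Sequence, Union

BeamResult = Dict[str, Union[List[int], float]]


def unpack_beams(
    beam_outputs: Sequence[Sequence[int]],
    beam_scores: Sequence[float],
    eos_id: int = 1,  # assumed end-of-sequence id, adjust to your vocabulary
) -> List[BeamResult]:
    """Pair each beam's token ids with its log probability.

    beam_outputs holds one padded token-id sequence per beam and
    beam_scores holds one log probability per beam, in the same order.
    """
    if len(beam_outputs) != len(beam_scores):
        raise ValueError("expected exactly one score per beam")

    results: List[BeamResult] = []
    for ids, score in zip(beam_outputs, beam_scores):
        ids = list(ids)
        # Drop the end-of-sequence token and any padding that follows it.
        if eos_id in ids:
            ids = ids[: ids.index(eos_id)]
        results.append({"token_ids": ids, "log_prob": float(score)})

    # Most likely hypothesis first.
    results.sort(key=lambda r: r["log_prob"], reverse=True)
    return results
```

With a beam size of 4, a call such as `unpack_beams(outputs, scores)` would return a list of 4 entries for the query, which the client could then decode back to text with whatever encoder the problem uses.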
Thanks.