Add an hparam use_global_position_in_packed_sequence to mtf_transformer2.
If True (default), then we use the global position in the packed example as the input to the positional embedding. If False, then we use the position in the individual sequence.
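The distinction between the two settings can be sketched as follows. This is an illustrative example, not the mtf_transformer2 implementation; the function name and the use of a plain list of segment ids are assumptions for the sketch.

```python
def position_ids(segment_ids, use_global_position_in_packed_sequence=True):
    """Return one position id per token in a packed example.

    segment_ids marks which packed sequence each token belongs to
    (hypothetical representation; the real code operates on tensors).
    """
    if use_global_position_in_packed_sequence:
        # Global position: index within the whole packed example.
        return list(range(len(segment_ids)))
    # Per-sequence position: restart counting at each segment boundary.
    positions = []
    pos = 0
    prev = None
    for seg in segment_ids:
        if seg != prev:
            pos = 0
            prev = seg
        positions.append(pos)
        pos += 1
    return positions

# Two sequences of lengths 3 and 2 packed into one example.
segs = [1, 1, 1, 2, 2]
print(position_ids(segs, True))   # [0, 1, 2, 3, 4]
print(position_ids(segs, False))  # [0, 1, 2, 0, 1]
```

With True, the second sequence's tokens see positions 3 and 4; with False, their positions restart at 0, matching what they would be in an unpacked batch.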
Making True the default is counterintuitive, since False seems to make more sense. However, the previously submitted CL had the effect of changing the behavior from True to False, which caused some models to diverge. This CL restores the previous working state.
TODO(noam): investigate why the models diverge with False.
PiperOrigin-RevId: 233427027