ref: https://github.com/google-research/google-research/blob/master/dselect_k_moe/dselect_k_moe.py#L254