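In the paper, the "content" is derived from two speech inputs, but the code uses only one speech input to generate both "content" representations, namely hidden_states_cont1 and hidden_states_cont12. Could you explain this discrepancy?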