
Mask in TokenAttentionPooling #226

Closed
@lhqing


Report

Hi, very nice work!

I'm trying to understand the code of TokenAttentionPooling. It seems to me that the class token always attends to every payload token, and since there is only a single attention call and the class token itself is returned as the result, the provided input mask should have no effect. I tried several different masks and the output of this class was identical. Or maybe I missed something here?

class TokenAttentionPooling(BaseModule):
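For context, here is a minimal sketch (not the repo's actual implementation; the class name, signature, and mask convention are assumptions) of class-token attention pooling. It illustrates the point above: the mask only changes the pooled output if it is actually forwarded into the attention call.

```python
import torch
import torch.nn as nn

class TokenAttentionPoolingSketch(nn.Module):
    """Hypothetical sketch: a learnable class token attends over payload tokens."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor, key_padding_mask=None) -> torch.Tensor:
        # x: (batch, n_tokens, dim); key_padding_mask: (batch, n_tokens), True = ignore.
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        # The mask has an effect only because it is passed through here;
        # if the attention call dropped it, every mask would give the same result.
        out, _ = self.attn(cls, x, x, key_padding_mask=key_padding_mask)
        return out.squeeze(1)

torch.manual_seed(0)
pool = TokenAttentionPoolingSketch(dim=8)
x = torch.randn(2, 5, 8)
mask = torch.zeros(2, 5, dtype=torch.bool)
mask[:, 3:] = True  # mask out the last two payload tokens

no_mask = pool(x)
with_mask = pool(x, key_padding_mask=mask)
print(torch.allclose(no_mask, with_mask))  # False: the mask changes the pooled output
```

If the real class produces identical outputs for different masks, that suggests the mask is never reaching the underlying attention call, which would match the behavior described above.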

Besides, I wonder how, in practice, you choose between TokenAttentionPooling and SeedAttentionPooling?

Thanks!

Version information

No response

Labels: bug (Something isn't working)