Applying efficient attention to cross-attention #13

Hello, I recently implemented cross-attention for multi-modal fusion, but because the image resolution is very large, I get a CUDA OOM error when computing the product of q and k. I found your paper and hope to use it to reduce the memory and compute cost. Can your method be applied to cross-attention? Would it be equivalent to computing k and v from input2 in advance, combining them into a single matrix, and then multiplying that matrix by the q computed from input1? Thank you.
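To make the question concrete, here is a minimal NumPy sketch of the computation being asked about. It assumes the formulation in which queries and keys are normalized separately (queries over channels, keys over positions), so that the keys and values of input2 can be folded into one small context matrix before the queries of input1 are applied. The function name `efficient_cross_attention` and all shapes are hypothetical and chosen only for illustration; this is not the repository's code.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def efficient_cross_attention(q, k, v):
    """Hypothetical sketch. q: (n1, d_k) from input1;
    k: (n2, d_k) and v: (n2, d_v) from input2."""
    q = softmax(q, axis=1)   # normalize each query over its channels
    k = softmax(k, axis=0)   # normalize each key channel over input2 positions
    context = k.T @ v        # (d_k, d_v): depends only on input2
    return q @ context       # (n1, d_v): no (n1, n2) matrix is ever formed

rng = np.random.default_rng(0)
n1, n2, d_k, d_v = 4096, 1024, 32, 64   # illustrative sizes only
q = rng.standard_normal((n1, d_k))
k = rng.standard_normal((n2, d_k))
v = rng.standard_normal((n2, d_v))
out = efficient_cross_attention(q, k, v)
print(out.shape)  # (4096, 64)
```

The largest intermediate here is the `(d_k, d_v)` context, so memory grows linearly with the number of positions in each input rather than with `n1 * n2`. Note that this is not numerically identical to standard softmax attention, because the normalization is applied to q and k separately instead of to their product.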
