
Linformer: Self-Attention with Linear Complexity #46


Description

@jinglescode

Paper

Link: https://arxiv.org/abs/2006.04768
Year: 2020

Summary

  • The self-attention mechanism can be approximated by a low-rank matrix, reducing the overall self-attention complexity from O(n^2) to O(n) in both time and space.

Contributions and Distinctions from Previous Works

  • The standard self-attention mechanism of the Transformer requires O(n^2) time and
    space with respect to the sequence length n; a minimal sketch of where the quadratic cost comes from is shown below.
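
A minimal, illustrative sketch (PyTorch, single head; not the paper's code) of standard scaled dot-product attention. The n × n score matrix is what makes both time and memory quadratic in the sequence length:

```python
import torch

def standard_attention(Q, K, V):
    # Q, K, V: (batch, n, d) -- n is the sequence length, d the head dimension
    d = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d ** 0.5   # (batch, n, n): quadratic in n
    P = torch.softmax(scores, dim=-1)             # full n x n attention matrix
    return P @ V                                  # (batch, n, d)

# n = 4096 already means a 4096 x 4096 score matrix per head
Q = K = V = torch.randn(1, 4096, 64)
print(standard_attention(Q, K, V).shape)  # torch.Size([1, 4096, 64])
```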

Methods


  • If the attention weights can be approximated with a low-rank projection, the number of keys and values each query attends over can be reduced from n to a fixed k; see the sketch below.
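
A minimal sketch of the Linformer idea (single head, illustrative only; the paper uses multi-head attention and optionally shares the projections across heads and layers). Learned projections E, F of shape (k, n) compress the keys and values along the sequence dimension, so the score matrix is n × k instead of n × n, giving O(nk) cost, i.e. linear in n for fixed k:

```python
import torch
import torch.nn as nn

class LinformerSelfAttention(nn.Module):
    # Single-head sketch; initialization and exact layout here are assumptions,
    # not the authors' implementation.
    def __init__(self, n, d, k):
        super().__init__()
        self.E = nn.Parameter(torch.randn(k, n) / n ** 0.5)  # projects keys:   (k, n)
        self.F = nn.Parameter(torch.randn(k, n) / n ** 0.5)  # projects values: (k, n)
        self.d = d

    def forward(self, Q, K, V):
        # Q, K, V: (batch, n, d)
        K_proj = self.E @ K          # (batch, k, d) -- sequence length reduced to k
        V_proj = self.F @ V          # (batch, k, d)
        scores = Q @ K_proj.transpose(-2, -1) / self.d ** 0.5  # (batch, n, k): O(n*k)
        P = torch.softmax(scores, dim=-1)
        return P @ V_proj            # (batch, n, d)

n, d, k = 4096, 64, 256
attn = LinformerSelfAttention(n, d, k)
Q = K = V = torch.randn(1, n, d)
print(attn(Q, K, V).shape)  # torch.Size([1, 4096, 64])
```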

Results

  • For the standard Transformer, inference time grows with sequence length; for Linformer it stays roughly constant as the sequence length grows, and only increasing the projected dimension k increases inference time.


  • Results show that Linformer performs well on some tasks but not on others, and some Linformer variants outperform other variants.

