Bottleneck Transformers for Visual Recognition #64


Paper

Link: https://arxiv.org/pdf/2101.11605.pdf
Year: 2021

Summary

  • Incorporates self-attention into ResNet's bottleneck blocks by replacing the 3×3 spatial convolution with global multi-head self-attention, improving instance segmentation and object detection while reducing the parameter count (see the sketch after this list).
  • Combining convolution and self-attention can achieve strong results on the ImageNet benchmark; pure-attention ViT models struggle in the small-data regime but shine in the large-data regime.
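
A minimal PyTorch sketch of that idea, not the authors' implementation: a bottleneck block in which the 3×3 spatial convolution is swapped for global multi-head self-attention over the feature map. The class name `BoTBlock` and the use of `torch.nn.MultiheadAttention` (without the paper's relative position term, sketched under Methods) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BoTBlock(nn.Module):
    """Bottleneck block with the 3x3 conv replaced by global self-attention (sketch)."""
    def __init__(self, in_ch, bottleneck_ch, heads=4):
        super().__init__()
        self.reduce = nn.Sequential(                    # 1x1 conv: channel reduction
            nn.Conv2d(in_ch, bottleneck_ch, 1, bias=False),
            nn.BatchNorm2d(bottleneck_ch), nn.ReLU(inplace=True))
        # Global all-to-all self-attention stands in for the usual 3x3 conv.
        self.mhsa = nn.MultiheadAttention(bottleneck_ch, heads, batch_first=True)
        self.bn = nn.BatchNorm2d(bottleneck_ch)
        self.expand = nn.Sequential(                    # 1x1 conv: channel expansion
            nn.Conv2d(bottleneck_ch, in_ch, 1, bias=False),
            nn.BatchNorm2d(in_ch))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x
        out = self.reduce(x)
        B, C, H, W = out.shape
        seq = out.flatten(2).transpose(1, 2)            # (B, H*W, C) token sequence
        attn_out, _ = self.mhsa(seq, seq, seq)          # global self-attention
        out = attn_out.transpose(1, 2).view(B, C, H, W)
        out = self.relu(self.bn(out))
        out = self.expand(out)
        return self.relu(out + residual)                # residual connection

# Usage: a feature map at the final ResNet stage, e.g. (batch, 1024, 14, 14).
block = BoTBlock(in_ch=1024, bottleneck_ch=512, heads=4)
y = block(torch.randn(2, 1024, 14, 14))                # -> (2, 1024, 14, 14)
```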


Methods


Uses 2D relative position encodings in the self-attention layers, which give gains over absolute position encodings.
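
A minimal sketch of 2D relative-position self-attention consistent with that description: the attention logits are the usual content-content term q·k plus a content-position term q·r built from separate learned embeddings for relative height and relative width offsets. The module name `RelPosSelfAttention2d`, the single-head formulation, and the fixed feature-map size are assumptions for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosSelfAttention2d(nn.Module):
    """Single-head self-attention over an H x W feature map with 2D relative positions (sketch)."""
    def __init__(self, dim, H, W):
        super().__init__()
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
        # Learned relative embeddings: one per possible height / width offset.
        self.rel_h = nn.Parameter(torch.randn(2 * H - 1, dim) * dim ** -0.5)
        self.rel_w = nn.Parameter(torch.randn(2 * W - 1, dim) * dim ** -0.5)
        self.scale = dim ** -0.5

    def forward(self, x):                        # x: (B, C, H, W)
        B, C, H, W = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)    # each (B, C, H, W)
        q = q.flatten(2).transpose(1, 2)         # (B, HW, C)
        k = k.flatten(2)                         # (B, C, HW)
        v = v.flatten(2).transpose(1, 2)         # (B, HW, C)

        content = torch.bmm(q, k) * self.scale   # (B, HW, HW) content-content logits

        # Relative offsets between every pair of rows / columns, shifted to be >= 0.
        ys = torch.arange(H, device=x.device)
        xs = torch.arange(W, device=x.device)
        dy = (ys[:, None] - ys[None, :]) + (H - 1)   # (H, H), values in [0, 2H-2]
        dx = (xs[:, None] - xs[None, :]) + (W - 1)   # (W, W), values in [0, 2W-2]

        # Content-position logits: q · r_h and q · r_w, broadcast over the grid.
        qg = q.view(B, H, W, C)
        logits_h = torch.einsum('bijc,ikc->bijk', qg, self.rel_h[dy])   # (B, H, W, H)
        logits_w = torch.einsum('bijc,jlc->bijl', qg, self.rel_w[dx])   # (B, H, W, W)
        position = logits_h[..., :, None] + logits_w[..., None, :]      # (B, H, W, H, W)
        position = position.reshape(B, H * W, H * W) * self.scale

        attn = F.softmax(content + position, dim=-1)
        out = torch.bmm(attn, v)                 # (B, HW, C)
        return out.transpose(1, 2).view(B, C, H, W)

# Usage on a 14x14 feature map with 512 channels.
attn = RelPosSelfAttention2d(dim=512, H=14, W=14)
y = attn(torch.randn(2, 512, 14, 14))            # -> (2, 512, 14, 14)
```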

Results

