Question about the attention implementation #2
Comments
Yes, I noticed the same phenomenon: as long as this hierarchical structure is preserved, performance turns out well! Thanks for your reply.
Haha, exactly. That is the key finding of this paper: it does not matter which efficient version you use. That is also why the code provides a block-wise variant; the two perform about the same, and the visualizations are identical as well.
Hello, is the hierarchical attention you mentioned referring to band attention (as shown in the figure below), except that the window size grows exponentially as the number of layers increases? If so, should the body of the for loop in that function in model.py be changed to `window_mask[:, i, i:i+self.bl] = 1`?
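For reference, here is a minimal sketch of the band-attention mask the proposed line would produce. The function name `band_mask` and the standalone 2-D shape are illustrative (the actual loop in model.py operates on a batched `window_mask` and uses `self.bl`); this only shows the banding pattern, where each query position attends to the next `bl` key positions:

```python
import numpy as np

def band_mask(seq_len: int, bl: int) -> np.ndarray:
    """Band-attention mask: query i may attend to keys in [i, i + bl).

    In the hierarchical scheme discussed above, bl would grow
    exponentially with layer depth (e.g. bl = 2 ** layer).
    """
    mask = np.zeros((seq_len, seq_len), dtype=np.int64)
    for i in range(seq_len):
        # The commenter's proposed assignment, minus the batch dimension:
        mask[i, i:i + bl] = 1
    return mask

m = band_mask(6, 2)
# Each row has a band of width <= 2 starting on the diagonal, e.g.
# row 0 -> [1, 1, 0, 0, 0, 0], row 5 -> [0, 0, 0, 0, 0, 1]
```

Slicing with `i:i + bl` clamps automatically at the sequence boundary, so no explicit bounds check is needed near the end of the sequence.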