I tested self-attention on Fashion-MNIST classification: the basic model reached an accuracy of 0.913, and the self-attention model reached 0.912.
There may be details I have overlooked; if you spot any errors, feel free to contact me! This is a simple implementation of Non-Local Neural Networks for image classification (Fashion-MNIST).
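
For readers unfamiliar with the idea, here is a minimal NumPy sketch of the forward pass of a non-local (self-attention) block in its embedded Gaussian form. It is not the code from this repo: the function name `non_local_block`, the weight names, the 7x7x16 feature-map size, and the random weights are all assumptions chosen for illustration, and the 1x1 convolutions are written as plain matrix multiplications over flattened positions.

```python
import numpy as np


def softmax(a, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_block(x, w_theta, w_phi, w_g, w_z):
    """Forward pass of a non-local block (illustrative sketch).

    x:                 feature map of shape (H, W, C)
    w_theta, w_phi,
    w_g:               projection weights of shape (C, C_inner)
    w_z:               output projection of shape (C_inner, C)
    Returns the output feature map (H, W, C) and the attention
    matrix (H*W, H*W).
    """
    h, w, c = x.shape
    flat = x.reshape(h * w, c)        # every spatial position becomes a row

    theta = flat @ w_theta            # queries, shape (N, C_inner)
    phi = flat @ w_phi                # keys,    shape (N, C_inner)
    g = flat @ w_g                    # values,  shape (N, C_inner)

    # Pairwise similarity between all positions, normalized per row.
    attn = softmax(theta @ phi.T, axis=-1)   # shape (N, N)

    y = attn @ g                      # weighted sum over all positions
    z = y @ w_z + flat                # project back and add the residual
    return z.reshape(h, w, c), attn


# Hypothetical sizes: a 7x7 feature map with 16 channels, 8 inner channels.
rng = np.random.default_rng(0)
c, c_inner = 16, 8
x = rng.standard_normal((7, 7, c))
w_theta = rng.standard_normal((c, c_inner)) * 0.1
w_phi = rng.standard_normal((c, c_inner)) * 0.1
w_g = rng.standard_normal((c, c_inner)) * 0.1
w_z = np.zeros((c_inner, c))          # zero output projection: block starts as identity

out, attn = non_local_block(x, w_theta, w_phi, w_g, w_z)
print(out.shape, attn.shape)
```

With a zero output projection the block passes its input through unchanged, so it can be dropped into an existing network without altering its initial behavior. That may be one reason the two accuracies above end up so close, although this has not been checked here.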