-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GEMM operator calculates the c
output shape incorrectly when input a
is transpose?
#1583
Comments
Hello, |
Hi @minhthuc2502, thanks for answering.
I would argue that this is not true. Take a look at this line. Notice how
I don't doubt that, but I still think that What do you think? |
You are right, I think we never handle the case where a is transpose. I will do the fix asap. Thank you for your report. |
I have to reconfirm that your initial idea is correct. However, the problem is gemm operation in this project is used for |
Hey everyone!
I believe I have found a bug in the GEMM operator.
To the best of my knowledge, the output shape of the
c
StorageView in the GEMM operator should always be:{m, n}
.However,
Gemm::compute()
calcultes the shape ofc
incorrectly ifa
is transpose (a
has shape{k, m}
instead of{m, k}
).Take a look at this piece of code:
CTranslate2/src/ops/gemm.cc
Lines 84 to 88 in 4f8a4f3
Shape output_shape(a.shape());
: setsoutput_shape
to:{k, m}
output_shape[output_shape.size() - 1] = n;
: setsoutput_shape
to:{k, n}
c.resize(std::move(output_shape));
: resizesc
to{k, n}
.Am I missing something here?
Bare in mind that there are no unit tests to catch this currently.
The text was updated successfully, but these errors were encountered: