Description
(Related issue: mozilla/translations#249)
When I run:
/firefox-translations-training/3rd_party/browsermt-marian-dev/build/marian-decoder -m /data/models/spoken-signed/spoken_to_signed/student-finetuned/final.model.npz.best-chrf.npz -v /data/models/spoken-signed/spoken_to_signed/vocab/vocab.spm /data/models/spoken-signed/spoken_to_signed/vocab/vocab.spm -c decoder.yml -i /data/data/spoken-signed/spoken_to_signed/original/devset.spoken.gz -o /data/models/spoken-signed/spoken_to_signed/speed/output.signed --shortlist /data/data/spoken-signed/spoken_to_signed/alignment/lex.s2t.pruned.gz false --dump-quantmult
I get:
[2023-11-08 12:41:15] Error: Rows of matrix: param must be multiple of 8.
[2023-11-08 12:41:15] Error: Aborted from marian::cpu::integer::PrepareBNodeOp::PrepareBNodeOp(marian::Expr, marian::Expr, float, bool) [with marian::Type vtype = (marian::Type)257u; marian::Expr = IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >] in /firefox-translations-training/3rd_party/browsermt-marian-dev/src/tensors/cpu/intgemm_interface.h:92
I suspect this happens because my vocab size is 1668, which is not divisible by 8 (1668 = 8 × 208 + 4).
Is there a way to pad the vocabulary specifically for this step? Up until this step, I can train browsermt fully.
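To make the size mismatch concrete, here is a minimal Python sketch of the arithmetic. It only computes how many padding entries a 1668-entry vocabulary would need to reach a multiple of 8. The helper name `pad_to_multiple` is hypothetical and not part of Marian or the training pipeline, and I am assuming that the rows of `param` in the error correspond to the vocab size.

```python
def pad_to_multiple(size: int, multiple: int = 8) -> int:
    """Return the smallest value >= size that is divisible by `multiple`.

    Hypothetical helper for illustration only; not a Marian API.
    """
    remainder = size % multiple
    if remainder == 0:
        return size
    return size + (multiple - remainder)


vocab_size = 1668  # vocab size reported above
padded_size = pad_to_multiple(vocab_size)

print(f"remainder mod 8: {vocab_size % 8}")
print(f"padded size:     {padded_size}")
print(f"entries to add:  {padded_size - vocab_size}")
```

So four extra entries would be enough, if padding at this step is possible at all.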