Closed
Hi, great work so far. Just one thing: I've noticed that even though the original wav2vec2 base model did not use normalization (it was enabled only for large), the author has suggested that future models, even small ones, use normalization.
So I would suggest that any future training run for CLSRIL-23 be done with normalization, as suggested. I tried using CLSRIL-23 as a base for pretraining, and it seems to work fine with normalization=True,
but if I also add model.extractor_mode='layer_norm',
as suggested, I get an error.
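
For context, here is a minimal sketch of what input normalization means here, assuming normalization=True refers to per-utterance zero-mean, unit-variance scaling of the raw waveform. The function name, the eps value, and the sample values are hypothetical, chosen only for illustration. As far as I understand, extractor_mode='layer_norm' is a separate setting that changes the norm layers inside the convolutional feature extractor, so it may not match the weights of a checkpoint trained without it, which could be related to the error.

```python
import math


def normalize_waveform(samples, eps=1e-5):
    """Scale one utterance to zero mean and (approximately) unit variance.

    Hypothetical helper for illustration; not taken from any library.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Population variance over the whole utterance.
    var = sum((x - mean) ** 2 for x in samples) / n
    # eps (assumed value) guards against division by zero on silent audio.
    scale = math.sqrt(var + eps)
    return [(x - mean) / scale for x in samples]


wave = [0.5, 1.5, 2.5, 3.5]  # hypothetical raw audio samples
normalized = normalize_waveform(wave)
```

The same scaling is applied independently to every utterance, so a model pretrained with it expects normalized input at fine-tuning and inference time as well.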