This directory contains CTC posterior labeling distributions for a Spanish phoneme recognition task. These were generated using the Persephone toolkit trained on a subset of a crowdsourced high-quality Argentinian Spanish speech data set and evaluated on a small held-out subset.
The labeling distributions are presented in two equivalent formats:
- In the directory `logits`, as 2-dimensional NumPy arrays of shape (F, L), where F is the number of frames and L is the size of the labeling alphabet. Each row holds the per-frame logits of the L-dimensional categorical distribution over labels.
- In the directory `fsts`, as serialized finite automata in OpenFst format.
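
As a minimal sketch of how one of the (F, L) arrays might be used, the snippet below loads an array and applies a row-wise softmax to turn the per-frame logits into probabilities. It assumes the arrays are stored as `.npy` files readable with `np.load`. The file name `utt0.npy` and the helper `logits_to_posteriors` are hypothetical, and a random array stands in for real data so that the snippet runs on its own.

```python
import os
import tempfile

import numpy as np


def logits_to_posteriors(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax: map (F, L) logits to (F, L) probabilities."""
    # Subtract the per-row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Stand-in for a file from the logits directory: a random (F, L) array
# saved to a temporary .npy file. The name "utt0.npy" is hypothetical.
rng = np.random.default_rng(0)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "utt0.npy")
np.save(path, rng.normal(size=(5, 4)))

logits = np.load(path)                    # shape (F, L)
num_frames, num_labels = logits.shape     # F frames, L labels
posteriors = logits_to_posteriors(logits)
best_labels = posteriors.argmax(axis=1)   # most likely label index per frame

print(num_frames, num_labels)
print(posteriors.sum(axis=1))             # each row sums to 1
```

The softmax gives the same result whether the stored values are unnormalized scores or log-probabilities, so the sketch does not depend on which of the two the arrays contain.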