Joseph Ho, Karina Buttram
A. The speech recordings are located here. They consist of a reading of the alphabet, plus the sounds /f/ and /k/ for comparison.
B. The code for the speech analysis is located here.
C. The speech analysis code extracts 10 features from the speech recordings. These features are the average amplitudes of the frequency ranges determined by 10 fifth-order Butterworth bandpass filters that together cover 0 to 8000 Hz. Every 5 ms, a Fast Fourier Transform (FFT) is computed on the speech signal, and the average amplitude of each frequency range is calculated.
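The per-frame calculation described above can be sketched in a few lines. This is a minimal illustration, not the analysis code itself: it skips the Butterworth filtering stage and simply averages FFT magnitudes over the bins that fall inside each range. The function name `band_features`, the 16 kHz sampling rate, and the band edges (taken from the ranges named in this report, 120-800 Hz through 7200-8000 Hz) are assumptions made for the example.

```python
import numpy as np

# Assumed band edges in Hz, based on the ranges named in this report:
# 120-800, 800-1600, ..., 7200-8000 (11 edges give 10 bands).
BAND_EDGES = [120] + list(range(800, 8001, 800))


def band_features(signal, fs, frame_ms=5.0, edges=BAND_EDGES):
    """Return an (n_frames, n_bands) array of mean FFT magnitude per band.

    Hypothetical helper: the signal is cut into non-overlapping frames of
    frame_ms milliseconds, and each frame's FFT magnitudes are averaged
    within each band [lo, hi).
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(round(fs * frame_ms / 1000.0))
    n_frames = len(signal) // frame_len
    # Drop the incomplete trailing frame, then stack frames as rows.
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)

    feats = np.zeros((n_frames, len(edges) - 1))
    for b, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        mask = (freqs >= lo) & (freqs < hi)
        if mask.any():  # leave the band at zero if no FFT bin falls inside it
            feats[:, b] = spectrum[:, mask].mean(axis=1)
    return feats


# Usage: a 1000 Hz test tone sampled at an assumed 16 kHz for 0.1 s.
fs = 16000
t = np.arange(fs // 10) / fs
tone = np.sin(2 * np.pi * 1000 * t)
features = band_features(tone, fs)
strongest_band = int(np.argmax(features.mean(axis=0)))
print(features.shape, strongest_band)
```

With 5 ms frames at 16 kHz the FFT bins are 200 Hz apart, so each band holds only three or four bins. A longer or overlapping window would give finer frequency resolution at the cost of time resolution.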
Shown below are the spectrogram, average amplitude of individual frequency ranges, and average amplitude of combined frequency ranges for the alphabet.
D. The /f/ and /k/ sounds differ markedly in both their frequencies and their amplitudes.
For the /f/ sound, the highest average amplitudes occur in the ranges of 7200-8000 Hz, 6400-7200 Hz, and 5600-6400 Hz. In these ranges, the average amplitude peaks in the middle of the /f/ sound. By comparison, the /k/ sound has its highest average amplitudes in the ranges of 120-800 Hz, 800-1600 Hz, and 4000-4800 Hz, and the average amplitude peaks near the beginning of the /k/ sound.
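A comparison like the one above can be made from a frames-by-bands amplitude matrix. The sketch below is one possible way to do it, not the method used in the analysis code: the helper `summarize_bands` is hypothetical, and averaging the strongest bands together to locate the peak is an assumption about how the combined ranges are formed.

```python
import numpy as np


def summarize_bands(feats, edges, frame_ms=5.0, top_n=3):
    """Rank bands by mean amplitude and locate the amplitude peak in time.

    feats: (n_frames, n_bands) average amplitudes, one row per frame.
    edges: band edges in Hz, with len(edges) == n_bands + 1.
    Returns (top_ranges, peak_time_ms, peak_position), where peak_position
    runs from 0.0 (start of the sound) to 1.0 (end of the sound).
    """
    feats = np.asarray(feats, dtype=float)
    mean_per_band = feats.mean(axis=0)
    # Band indices sorted from highest to lowest mean amplitude.
    top = np.argsort(mean_per_band)[::-1][:top_n]
    top_ranges = [(edges[b], edges[b + 1]) for b in top]

    # Assumed combination: average the strongest bands frame by frame.
    combined = feats[:, top].mean(axis=1)
    peak_frame = int(np.argmax(combined))
    peak_position = peak_frame / max(len(combined) - 1, 1)
    return top_ranges, peak_frame * frame_ms, peak_position


# Usage with made-up numbers: 3 frames, 3 bands, peak in the middle frame.
demo = [[1, 0, 5],
        [2, 0, 9],
        [1, 0, 5]]
ranges, peak_ms, position = summarize_bands(demo, [0, 100, 200, 300], top_n=2)
print(ranges, peak_ms, position)
```

A peak position near 0.5 would correspond to the mid-sound peak described for /f/, and a value near 0.0 to the early peak described for /k/.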
/f/ sound spectrogram, average amplitude of individual frequency ranges, and average amplitude of combined frequency ranges.
/k/ sound spectrogram, average amplitude of individual frequency ranges, and average amplitude of combined frequency ranges.
E. Conceptual Algorithm for a Cochlear Implant