This repository has been archived by the owner on Sep 10, 2022. It is now read-only.
_STEPS TO FOLLOW.txt
ASSIGNMENT 3: Vowel Recognition
Prerequisite:
A sample file of 320 samples is attached. Calculate Ri, Ai and Ci for it and validate your results.
The expected Ri, Ai and Ci values for the sample file are also uploaded, so you can check whether your values are correct.
The expected output was obtained directly, i.e. no Hamming window or DC shift was applied (only normalization was done).
(Note: the sample Ci's given in the file were calculated without inverting the Ai values.
So, when checking your Ci's against the file, do not invert the Ai's.
In the assignment itself, however, you have to invert the Ai's.)
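The prerequisite check can be sketched as below. This is only a minimal sketch, not the expected implementation: it assumes an LPC order of 12 (taken from the 12 Ci columns mentioned later), the autocorrelation method with the Levinson-Durbin recursion, the usual recursion from Ai to Ci, and a normalization peak of 5000 (the handout does not give the target value). The function names are made up for illustration. The handout does not define what "inverting" the Ai's means, so the sketch leaves the Ai's as computed.

```python
P = 12  # assumed LPC order, taken from the 12 Ci columns in the steps


def normalize(samples, peak=5000.0):
    """Scale samples so the largest absolute amplitude equals `peak` (assumed value)."""
    max_abs = max(abs(s) for s in samples)
    if max_abs == 0:
        return list(samples)
    return [s * peak / max_abs for s in samples]


def autocorrelation(frame, p=P):
    """Ri: R[k] = sum over n of frame[n] * frame[n + k], for k = 0..p."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(p + 1)]


def levinson_durbin(r, p=P):
    """Ai: predictor coefficients a[1..p] from R[0..p]; a[0] is unused."""
    if r[0] == 0:
        raise ValueError("frame has zero energy")
    e = r[0]
    a = [0.0] * (p + 1)
    for i in range(1, p + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / e
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= 1.0 - k * k
    return a


def cepstral(a, p=P):
    """Ci: c[m] = a[m] + sum_{k=1}^{m-1} (k/m) * c[k] * a[m-k], for m = 1..p."""
    c = [0.0] * (p + 1)
    for m in range(1, p + 1):
        c[m] = a[m] + sum((k / m) * c[k] * a[m - k] for k in range(1, m))
    return c[1:]
```

For the 320-sample file, the values to compare would come from something like r = autocorrelation(normalize(samples)), a = levinson_durbin(r) and c = cepstral(a).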
For vowel recognition assignment, you need to record several utterances of each vowel.
Procedure to be followed:
Record each vowel at least 20 times: 10 recordings will be used for training and 10 for testing.
Trim the vowel part manually in Cooledit, keeping a little bit of silence before and after the vowel.
Name your recordings properly.
E.g.: RollNo_vowel_utteranceNo.txt
All of you must follow this naming convention. For example, if your roll number is 214XXXXXX, the vowel is u and the utterance number is 7, use 214XXXXXX_u_7.txt
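A small helper can keep the file names consistent. The sketch below is only an illustration: the function names are hypothetical, and it assumes the five vowels are a, e, i, o and u (the handout only says there are 5 vowels and uses u as an example).

```python
import re

# Assumed vowel set; the handout only states that there are 5 vowels.
NAME_PATTERN = re.compile(r"^(?P<roll>[^_]+)_(?P<vowel>[aeiou])_(?P<utterance>\d+)\.txt$")


def recording_name(roll_no, vowel, utterance_no):
    """Build a file name of the form RollNo_vowel_utteranceNo.txt."""
    return f"{roll_no}_{vowel}_{utterance_no}.txt"


def parse_recording_name(name):
    """Split a file name into (roll number, vowel, utterance number)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid recording name: {name}")
    return match["roll"], match["vowel"], int(match["utterance"])
```

For instance, recording_name("214XXXXXX", "u", 7) gives 214XXXXXX_u_7.txt, and parse_recording_name reverses it.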
....................................................................
Steps for vowel recognition:
Steps to generate reference file for 1 vowel
1) Take the first recording of the vowel and apply DC shift and normalization.
2) Select 5 frames from the steady part.
3) Compute the Ri's, Ai's and Ci's of these frames and apply the raised sine window to the Ci's.
4) Repeat the above steps for 9 more recordings of the same vowel (so now you will have 50 rows of Ci values)
and average these recordings frame by frame, i.e. frame 1 of all 10 recordings
is averaged together, so 10 rows produce 1 row of Ci's.
Do the same for the other frames, so that you finally get 5 rows of Ci values (5 rows and 12 columns).
Dump these values in a text file.
5) Similarly, generate a text file for each of the remaining 4 vowels. These 5 text files will be used
as the reference files for the vowels.
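The reference-file steps above can be sketched as follows. Everything here is an assumption made for illustration and not a prescribed method: a frame size of 320 samples (borrowed from the prerequisite file), a normalization peak of 5000, picking the "steady part" as the 5 consecutive frames around the highest-energy frame, and the common raised sine window w[m] = 1 + (Q/2) sin(pi m / Q) with Q = 12. Computing the Ci's themselves is left out; the sketch starts from Ci rows that are already available.

```python
import math

FRAME_SIZE = 320  # assumed frame length, borrowed from the 320-sample prerequisite file
NUM_FRAMES = 5    # frames taken from the steady part (from the steps above)
P = 12            # number of Ci's per frame (from the steps above)


def dc_shift_and_normalize(samples, peak=5000.0):
    """Subtract the mean, then scale so the largest magnitude equals `peak` (assumed value)."""
    mean = sum(samples) / len(samples)
    shifted = [s - mean for s in samples]
    max_abs = max(abs(s) for s in shifted)
    if max_abs == 0:
        return shifted
    return [s * peak / max_abs for s in shifted]


def steady_frames(samples, frame_size=FRAME_SIZE, count=NUM_FRAMES):
    """Assumed heuristic: `count` consecutive frames centred on the highest-energy frame."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    if len(frames) < count:
        raise ValueError("recording is too short")
    energies = [sum(x * x for x in f) for f in frames]
    peak_idx = energies.index(max(energies))
    start = min(max(peak_idx - count // 2, 0), len(frames) - count)
    return frames[start:start + count]


def raised_sine_weights(p=P):
    """Raised sine window: w[m] = 1 + (p/2) * sin(pi * m / p), for m = 1..p."""
    return [1.0 + (p / 2.0) * math.sin(math.pi * m / p) for m in range(1, p + 1)]


def apply_window(ci_row, weights):
    """Multiply each Ci by its window weight."""
    return [c * w for c, w in zip(ci_row, weights)]


def average_reference(recordings):
    """recordings[r][f] is the Ci row of frame f in recording r; average over r for each f."""
    n = len(recordings)
    return [[sum(rec[f][j] for rec in recordings) / n
             for j in range(len(recordings[0][f]))]
            for f in range(len(recordings[0]))]


def save_reference(rows, path):
    """Dump the averaged rows to a text file, one frame per line."""
    with open(path, "w") as fh:
        for row in rows:
            fh.write(" ".join(f"{v:.6f}" for v in row) + "\n")
```

With 10 recordings of 5 windowed Ci rows each, average_reference returns the 5 x 12 table that save_reference writes out as one vowel's reference file.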
Now for testing:
1) Take the input files for testing (10 test files per vowel) and pass them through a loop, so that you can check how many of the 10 files are recognized correctly.
2) For each test file, take 5 frames from the stable part and compute their Ci's as in the reference steps.
3) Calculate Tokhura's distance from each reference file.
(Note: you have 5 stable frames and each reference file has 5 rows, so calculate Tokhura's distance between corresponding frames.
This gives 5 distances; take their average to get the final distance for that reference file.)
Do the same with each reference file.
4) The vowel whose reference file gives the minimum distance is the recognized vowel.
Please find the Tokhura weights:
1.0, 3.0, 7.0, 13.0, 19.0, 22.0, 25.0, 33.0, 42.0, 50.0, 56.0, 61.0
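The testing steps can be sketched as below, using the weights listed above. The handout gives only the weights, so the sketch assumes the usual weighted squared-difference form of Tokhura's distance, d = sum over i of w[i] * (Ct[i] - Cr[i])^2. The function names and the data layout (a dict from vowel to its 5 x 12 reference rows) are hypothetical.

```python
TOKHURA_WEIGHTS = [1.0, 3.0, 7.0, 13.0, 19.0, 22.0, 25.0, 33.0, 42.0, 50.0, 56.0, 61.0]


def tokhura_distance(test_row, ref_row, weights=TOKHURA_WEIGHTS):
    """Assumed form: weighted sum of squared differences between two Ci rows."""
    return sum(w * (t - r) ** 2 for w, t, r in zip(weights, test_row, ref_row))


def average_distance(test_frames, ref_frames):
    """Distance between corresponding frames, averaged over the 5 frames."""
    distances = [tokhura_distance(t, r) for t, r in zip(test_frames, ref_frames)]
    return sum(distances) / len(distances)


def recognize(test_frames, references):
    """Return the vowel whose reference rows give the minimum average distance."""
    return min(references, key=lambda v: average_distance(test_frames, references[v]))


def count_correct(test_set, references):
    """test_set is a list of (true vowel, Ci rows); return (correct, total)."""
    correct = sum(1 for vowel, frames in test_set
                  if recognize(frames, references) == vowel)
    return correct, len(test_set)
```

Running count_correct over the 10 test files of a vowel gives the "how many out of 10" figure asked for in step 1.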
....................................................................
Submission:
Upload your vowel recognition assignment.
Note:
Upload the source code with all required files by 8th September 2021 EOD (extended to the 13th).
Your project should contain a ReadMe file with proper instructions to execute your code.
Your code should be properly commented and indented.
Naming convention:
yourRollNo_vowelRecognition.zip
**Please follow the naming convention properly.
A 10% bonus is given for early submission of the source code.
**Please don’t upload multiple times.
Link to upload your assignment: