-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add job to compute the context label from alignment #507
Conversation
mm/context_label.py
Outdated
|
||
pastLabel = np.mod(popFutureLabel, n_contexts) | ||
centerState = np.floor_divide(popFutureLabel, n_contexts) | ||
centerState = np.floor_divide(centerState - hmm_state, 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct. Please see also the comment above.
Once the identity of right and left are popped, one is left with the center phoneme state information. Depending on how you dumped the state-tying
options in RASR
, you would do further steps to pop also word end class and hmm state info.
My suggestion here is that the class has a dictionary that has the label info, such as whether end of word or boundary classes are used, and what is the number of hmm state per phoneme. And then to apply the exact inverse function with respect to the implementation of RASR here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just parse the string representation of the allophone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see the single comment.
mm/context_label.py
Outdated
|
||
pastLabel = np.mod(popFutureLabel, n_contexts) | ||
centerState = np.floor_divide(popFutureLabel, n_contexts) | ||
centerState = np.floor_divide(centerState - hmm_state, 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just parse the string representation of the allophone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
The code looks ok on the surface after a quick glance, but as I am not familiar with the logic I would prefer if others approve that have more knowledge of this. |
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments.
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After my final small comment, looks ok from my side.
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with the code. We could still discuss if the sparse option should be true by default, or if your use case actually needs the dense representation. (?)
mm/context_label.py
Outdated
|
||
:param alignment_cache_path: path to alginment cache | ||
:param allophone_path: path to allohone | ||
:param dense_tying_path: path to denser tying file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: denser -> dense
No description provided.