Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add job to compute the context label from alignment #507

Merged
merged 24 commits into from
May 8, 2024

Conversation

Judyxujj
Copy link
Contributor

No description provided.

Jingjing Xu added 2 commits April 29, 2024 16:00
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved

pastLabel = np.mod(popFutureLabel, n_contexts)
centerState = np.floor_divide(popFutureLabel, n_contexts)
centerState = np.floor_divide(centerState - hmm_state, 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. Please see also the comment above.

Once the identity of right and left are popped, one is left with the center phoneme state information. Depending on how you dumped the state-tying options in RASR, you would do further steps to pop also word end class and hmm state info.
My suggestion here is that the class has a dictionary that has the label info, such as whether end of word or boundary classes are used, and what is the number of hmm state per phoneme. And then to apply the exact inverse function with respect to the implementation of RASR here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just parse the string representation of the allophone?

Copy link
Contributor

@Marvin84 Marvin84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see the single comment.

mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved

pastLabel = np.mod(popFutureLabel, n_contexts)
centerState = np.floor_divide(popFutureLabel, n_contexts)
centerState = np.floor_divide(centerState - hmm_state, 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just parse the string representation of the allophone?

mm/context_label.py Outdated Show resolved Hide resolved
Jingjing Xu and others added 7 commits May 3, 2024 12:21
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
@Judyxujj Judyxujj requested review from Marvin84 and michelwi May 3, 2024 13:17
Copy link
Contributor

@Marvin84 Marvin84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@JackTemaki
Copy link
Contributor

The code looks ok on the surface after a quick glance, but as I am not familiar with the logic I would prefer if others approve that have more knowledge of this.

mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
Judyxujj and others added 2 commits May 6, 2024 22:46
Co-authored-by: michelwi <michelwi@users.noreply.github.com>
@Judyxujj Judyxujj requested a review from michelwi May 6, 2024 15:25
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
mm/context_label.py Outdated Show resolved Hide resolved
Copy link
Member

@albertz albertz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments.

Judyxujj and others added 4 commits May 6, 2024 17:58
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
mm/context_label.py Outdated Show resolved Hide resolved
Copy link
Member

@albertz albertz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After my final small comment, looks ok from my side.

Jingjing Xu added 2 commits May 6, 2024 16:17
mm/context_label.py Outdated Show resolved Hide resolved
Co-authored-by: Albert Zeyer <zeyer@cs.rwth-aachen.de>
Copy link
Contributor

@michelwi michelwi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with the code. We could still discuss if the sparse option should be true by default, or if your use case actually needs the dense representation. (?)


:param alignment_cache_path: path to alginment cache
:param allophone_path: path to allohone
:param dense_tying_path: path to denser tying file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: denser -> dense

Jingjing Xu added 2 commits May 8, 2024 05:54
@Judyxujj Judyxujj merged commit 8c0daa1 into main May 8, 2024
4 checks passed
@Judyxujj Judyxujj deleted the jing-context-label branch May 8, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants