Feature/conjugation preserving normalize for subword #35

Merged
merged 12 commits into from
Dec 27, 2021

t-yamamura
Collaborator

No description provided.

@t-yamamura t-yamamura self-assigned this Dec 24, 2021
        }
        self._format = self.word_form_types[self.word_form_type]

    def format(self, m: Morpheme):
Collaborator

At the moment the format function is not very useful. It also adds one more Python function call to the hot path, and such calls are unfortunately not free. Returning a callable instead of making this a class would probably be better.
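A minimal sketch of the "return a callable" idea, under stated assumptions: `FakeMorpheme`, `WORD_FORM_TYPES`, `make_formatter`, and the three keys are hypothetical names invented for illustration and are not the code in this PR. The stand-in class only mirrors the `surface()`, `normalized_form()`, and `dictionary_form()` accessors of SudachiPy's `Morpheme`, so the example runs without a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FakeMorpheme:
    """Hypothetical stand-in for SudachiPy's Morpheme (assumption, not the real class)."""
    surface_text: str
    normalized_text: str
    dictionary_text: str

    def surface(self) -> str:
        return self.surface_text

    def normalized_form(self) -> str:
        return self.normalized_text

    def dictionary_form(self) -> str:
        return self.dictionary_text


# Hypothetical lookup table. The values are plain functions (unbound methods),
# so calling the chosen formatter goes straight to the accessor with no wrapper.
WORD_FORM_TYPES: Dict[str, Callable[[FakeMorpheme], str]] = {
    "surface": FakeMorpheme.surface,
    "normalized": FakeMorpheme.normalized_form,
    "dictionary": FakeMorpheme.dictionary_form,
}


def make_formatter(word_form_type: str) -> Callable[[FakeMorpheme], str]:
    """Resolve the word form once and return the callable itself."""
    try:
        return WORD_FORM_TYPES[word_form_type]
    except KeyError:
        raise ValueError(f"unknown word_form_type: {word_form_type!r}") from None


# The lookup happens once, outside the loop; the loop body is a single call.
fmt = make_formatter("normalized")
morphemes = [
    FakeMorpheme("附属", "付属", "附属"),
    FakeMorpheme("食べ", "食べる", "食べる"),
]
formatted = [fmt(m) for m in morphemes]
```

Compared with a class whose `format` method forwards to `self._format`, this shape removes one Python-level call per morpheme, which is the cost the comment above refers to. Whether the saving matters in practice would need to be measured.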

Collaborator

Support for processing data in parallel still needs to be added, and it will probably use the PreTokenizer from SudachiPy, so the fix can be delayed until then.

@t-yamamura t-yamamura requested a review from katsutan December 27, 2021 03:05
@eiennohito eiennohito self-requested a review December 27, 2021 07:22
Collaborator

@eiennohito eiennohito left a comment

LGTM

@t-yamamura t-yamamura merged commit 6b8be88 into main Dec 27, 2021
@t-yamamura t-yamamura deleted the feature/conjugation_preserving_normalize_for_subword branch January 17, 2022 04:51
3 participants