
Optimize English g2p function for improved speed and efficiency #124

Merged — 2 commits, May 14, 2024

Conversation

gordon0414

Overview

This merge request optimizes the English g2p function to significantly improve its performance, resolving Issue #104. The primary changes include moving global variables to the top-level scope to avoid redundant initialization, simplifying conditions, and streamlining functions.

Changes:

  • Global Variables:
    • Moved ARPA, _g2p, eng_dict, and tokenizer to the top-level scope to prevent redundant initialization.
  • Function Optimization:
    • g2p Function:
      • Simplified conditions for words with apostrophes.
      • Refactored word-to-phone mapping logic for clarity.
    • __post_replace_ph Function:
      • Removed redundant checks and streamlined mapping replacement.
    • __refine_syllables Function:
      • Replaced explicit loop index with direct iteration.
    • __text_to_words Function:
      • Avoided re-initialization of the tokenizer by moving it to global scope.
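The core of the global-variable change can be sketched as below. This is a simplified stand-in, not the real g2p.py: the dictionary contents and function bodies are placeholders so the example is self-contained, but the structural difference (per-call construction vs. one-time module-level initialization) is the one the PR describes.

```python
# Before: every call rebuilt its working data, paying the construction
# cost on each invocation.
def g2p_before(text):
    arpa = {"AH0", "S", "AH1"}  # rebuilt on every call (placeholder set)
    eng_dict = {"HELLO": [["HH", "AH0", "L", "OW1"]]}  # placeholder dict
    return [ph for w in text.upper().split()
            for ph in eng_dict.get(w, [["UNK"]])[0]]

# After: initialize once at import time and reuse on every call.
ARPA = {"AH0", "S", "AH1"}
ENG_DICT = {"HELLO": [["HH", "AH0", "L", "OW1"]]}

def g2p_after(text):
    return [ph for w in text.upper().split()
            for ph in ENG_DICT.get(w, [["UNK"]])[0]]
```

With the real `_g2p` model and CMU-style `eng_dict`, the per-call construction is far more expensive than in this toy, which is where the speedup comes from.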

Performance Impact:

These modifications yield a significant speedup in the g2p function. As a test, a single sentence was inferenced 100 times sequentially:

sentence = "In this paper, we propose 1 DSPGAN, a GAN-based universal vocoder."
g2p_results = [g2p(sentence) for _ in range(100)]

Before modification:

real	2m59.692s
user	3m8.842s
sys	2m1.637s

After modification:

real	0m10.850s
user	0m12.223s
sys	0m24.916s

- Moved global variables (ARPA, _g2p, eng_dict, tokenizer) to top-level scope to avoid redundant initialization.
- Simplified conditions for words with apostrophes.
- Streamlined __post_replace_ph function by removing redundant checks.
- Optimized __refine_syllables function with direct iteration.
- Avoided re-initialization of tokenizer in __text_to_words by moving it to global scope.
- Added performance test in __main__ to validate improvements.
- Resolves issue litagin02#104.
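The "performance test in __main__" mentioned above could look roughly like the following sketch. The `g2p` body here is a trivial placeholder standing in for the real function; only the benchmarking shape is the point.

```python
import time

def g2p(text):
    # Placeholder for the real style_bert_vits2 English g2p.
    return text.split()

def benchmark(n=100):
    sentence = "In this paper, we propose 1 DSPGAN, a GAN-based universal vocoder."
    start = time.perf_counter()
    results = [g2p(sentence) for _ in range(n)]
    elapsed = time.perf_counter() - start
    return elapsed, results

if __name__ == "__main__":
    elapsed, results = benchmark()
    print(f"{len(results)} runs in {elapsed:.3f}s")
```

Using `time.perf_counter()` inside the script measures only the g2p loop itself, which complements the shell-level `time` figures reported above (those also include interpreter startup and model loading).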
@tsukumijima

tsukumijima commented May 9, 2024

@gordon0414
I was responsible for a major refactoring in Style-Bert-VITS v2.4.
I took a look at the pull request, but there were a few things I was concerned about.

With this change, the tokenizer loading process has been moved to the global scope. However, this is something to avoid.
bert_models.load_tokenizer() caches loaded tokenizers for each language and keeps them in memory until explicitly unloaded.
Also, (Style-)Bert-VITS2 has different tokenizers for each language.

Moving bert_models.load_tokenizer() to the global scope loads the English tokenizer the moment style_bert_vits2/nlp/english/g2p.py is imported.
As a result, in a use case where speech synthesis is performed only in Japanese, the otherwise unnecessary English tokenizer and its BERT pre-trained model are loaded (and, if not yet present, downloaded) the moment the module is imported.

The current Style-Bert-VITS2 code imports style_bert_vits2/nlp/english/g2p.py only when performing English speech synthesis.
However, if you manually import the module separately, this may cause an impact.
In my opinion, heavy operations like loading BERT models/tokenizers should be able to be loaded selectively and at appropriate times as needed.
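The "load selectively, at the appropriate time" pattern recommended here can be sketched with a cached lazy loader. Note this is a hypothetical illustration: `_heavy_load` and `get_tokenizer` are placeholder names standing in for the per-language caching that `bert_models.load_tokenizer()` already performs, not its real API.

```python
from functools import lru_cache

def _heavy_load(language):
    # Stand-in for expensive work: downloading/reading a tokenizer or
    # BERT model from disk.
    return {"language": language}

@lru_cache(maxsize=None)
def get_tokenizer(language):
    # Nothing is loaded at import time. The first call per language pays
    # the loading cost; subsequent calls return the cached object.
    return _heavy_load(language)
```

Because loading happens on first use rather than at import, importing the English g2p module in a Japanese-only pipeline would cost nothing.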

@gordon0414
Author

@tsukumijima I moved the tokenizer loading inside the __text_to_words function to preserve the code's integrity.

The performance improvement is preserved!
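The final shape of the fix can be sketched as follows. This is a hedged stand-in, not the real module: `load_tokenizer`, `_TOKENIZER_CACHE`, and `text_to_words` are placeholder names for `bert_models.load_tokenizer()` (which caches per language) and the private `__text_to_words` helper.

```python
_TOKENIZER_CACHE = {}

def load_tokenizer(language="EN"):
    # Mimics the real loader's behavior: per-language cache, so repeated
    # calls after the first are cheap.
    if language not in _TOKENIZER_CACHE:
        _TOKENIZER_CACHE[language] = {"language": language}  # pretend heavy load
    return _TOKENIZER_CACHE[language]

def text_to_words(text):
    # The load now happens inside the function: importing this module
    # triggers nothing, and the loader's cache avoids re-initialization.
    tokenizer = load_tokenizer("EN")
    return text.split()  # placeholder for the real tokenization
```

This keeps the import-time behavior tsukumijima asked for while retaining the speedup, since the tokenizer is constructed at most once per language rather than once per call.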
