📚 [Want to contribute?] ebook2audiobookxtts roadmap #32

Open
4 of 54 tasks
DrewThomasson opened this issue Oct 11, 2024 · 2 comments
Labels
documentation (Improvements or additions to documentation), help wanted (Extra attention is needed), open to public contributions, TODOs

Comments

@DrewThomasson
Owner

DrewThomasson commented Oct 11, 2024

All Features open to public Contributions ⭐

Wanted Extra Parameters

  • Add a parameter for specifying the output audio file format
  • The F5-TTS model referenced here
  • Make the ebook input parameter accept a list of files.
  • Allow audio for multiple lines to be generated at the same time by running multiple instances of Coqui TTS on more powerful hardware.
  • Make the ebook input parameter accept a folder containing ebook files to run through automatically.
  • OCR for PDF files (as a parameter). Talked about here
  • Add a parameter to force the device (CPU or GPU); this will set the device at the top of the script. Talked about here (currently being added by @ROBERT-MCDOWELL), ref here
  • Use DeepFilterNet2 to de-noise any reference audio for voice cloning (demo Hugging Face Space using it). Talked about here
  • Custom model dir input pointing to a folder that contains all of the custom model files, if available, instead of having to point to each model file individually
  • Change voices per chapter parameter. Talked about here
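
As a rough illustration of how a few of the parameters above could fit together, here is a minimal sketch using argparse. The flag names (--ebook, --output_format, --device), the list of output formats, the default format, and the set of ebook extensions are all assumptions made up for this example; they are not the project's actual CLI.

```python
import argparse
from pathlib import Path

# Hypothetical extension list; the real project may support a different set.
EBOOK_EXTENSIONS = {".epub", ".pdf", ".mobi", ".txt"}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical flags for the roadmap items."""
    parser = argparse.ArgumentParser(description="Roadmap parameter sketch")
    # One or more paths; each may be a single ebook file or a folder of ebooks.
    parser.add_argument("--ebook", nargs="+", required=True,
                        help="ebook file(s) and/or folder(s) to convert")
    # Hypothetical format choices and default.
    parser.add_argument("--output_format", default="m4b",
                        choices=["m4b", "mp3", "wav", "flac"],
                        help="output audio file format")
    # None means "let the script auto-detect the device".
    parser.add_argument("--device", choices=["cpu", "gpu"], default=None,
                        help="force the device instead of auto-detecting")
    return parser


def collect_ebooks(inputs):
    """Expand a mixed list of files and folders into a flat list of ebook paths."""
    ebooks = []
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            # Take only files with a known ebook extension, in a stable order.
            ebooks.extend(sorted(
                p for p in path.iterdir()
                if p.is_file() and p.suffix.lower() in EBOOK_EXTENSIONS
            ))
        else:
            ebooks.append(path)
    return ebooks
```

With this shape, a single --ebook flag covers both the "list of files" and the "folder of ebooks" items, so the conversion loop only ever sees a flat list of paths.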

My Other Repos I Want to Integrate into the App for Extra Options :)

Create a standard function for load_model() and inference_model() for:

  • ⓍXTTSv2
  • Styletts2
  • 🪈 Piper-tts
  • 🐢 Bark tts
# Standard functions should be:
def load_model():
    """Load the model, downloading it first if it is not available locally."""

def inference_model():
    """Run inference with the pre-loaded model."""
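
One possible way to keep load_model() and inference_model() uniform across backends is a small abstract base class plus a registry. This is only a sketch: TTSBackend, EchoBackend, BACKENDS, and get_backend are hypothetical names invented here, and EchoBackend is a stand-in that returns the input text as bytes rather than calling any real TTS library.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class TTSBackend(ABC):
    """Hypothetical common interface every TTS backend would implement."""

    def __init__(self, model_dir):
        self.model_dir = Path(model_dir)
        self.model = None  # set by load_model()

    @abstractmethod
    def load_model(self):
        """Load the model, downloading it first if it is missing locally."""

    @abstractmethod
    def inference_model(self, text: str) -> bytes:
        """Run inference on the pre-loaded model and return audio bytes."""


class EchoBackend(TTSBackend):
    """Stand-in backend for illustration; it does not synthesize speech."""

    def load_model(self):
        # A real backend would check self.model_dir here and download if needed.
        self.model = "echo"

    def inference_model(self, text: str) -> bytes:
        if self.model is None:
            raise RuntimeError("call load_model() before inference_model()")
        return text.encode("utf-8")


# Registry mapping a backend name to its class; real backends would be added here.
BACKENDS = {"echo": EchoBackend}


def get_backend(name, model_dir="models"):
    """Look up a backend class by name and instantiate it."""
    return BACKENDS[name](model_dir)
```

The calling code would then only depend on the two standard methods, for example `backend = get_backend("echo"); backend.load_model(); audio = backend.inference_model("Hello")`, regardless of which engine sits behind the name.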

Create Readme in these languages

  • English (en)
  • Spanish (es)
  • French (fr)
  • German (de)
  • Italian (it)
  • Portuguese (pt)
  • Polish (pl)
  • Turkish (tr)
  • Russian (ru)
  • Dutch (nl)
  • Czech (cs)
  • Arabic (ar)
  • Chinese (zh-cn)
  • Japanese (ja)
  • Hungarian (hu)
  • Korean (ko)

Binary builds

Working PyInstaller script for:

  • 🍎 Mac Intel x86
  • 🪟 Windows x86
  • 🐧 Linux x86
  • 🖥️🍏 Apple Silicon Mac
  • 🪟💪 ARM Windows
  • 🐧💪 ARM Linux

🐍 Single pip command install that works for:

  • being overseen by @ROBERT-MCDOWELL
  • 🍎 Mac Intel x86
  • 🪟 Windows x86
  • 🐧 Linux x86
  • 🖥️🍏 Apple Silicon Mac
  • 🪟💪 ARM Windows
  • 🐧💪 ARM Linux

Extra Overkill for training models and such (All supported Coqui TTS models and piper-tts in one easy command)

Wanted Auto-testing scripts for development

Contact @DrewThomasson if you want to help out at all! 😃

DrewThomasson added the documentation label Oct 11, 2024
DrewThomasson pinned this issue Oct 11, 2024
DrewThomasson changed the title from "📚 ebook2audiobookxtts roadmap" to "📚 [Want to contribute?] ebook2audiobookxtts roadmap" Oct 12, 2024
DrewThomasson added the help wanted, open to public contributions, and TODOs labels and removed the documentation label Oct 13, 2024
DrewThomasson added the documentation label Oct 14, 2024
@ROBERT-MCDOWELL
Contributor

ROBERT-MCDOWELL commented Oct 15, 2024

Another interesting option would be to change voices between chapters, for example:
--voice_mapping {"chapters": {1:"john.wav",2:"stella.wav",3:"child.wav",4:"random"} }
The selected chapters would use their mapped voice, while all other chapters would keep the main --voice.
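
A minimal sketch of how such a mapping could be resolved, under a few assumptions: the value is passed as strict JSON (so chapter numbers have to be quoted as string keys, unlike in the example above), "random" means picking from a pool of available voice files, and the function names parse_voice_mapping and voice_for_chapter are hypothetical.

```python
import json
import random


def parse_voice_mapping(raw):
    """Parse the JSON value of --voice_mapping into {chapter_number: voice}."""
    data = json.loads(raw)
    # JSON object keys are always strings, so convert them to chapter numbers.
    return {int(ch): voice for ch, voice in data.get("chapters", {}).items()}


def voice_for_chapter(chapter, mapping, default_voice, voice_pool, rng=random):
    """Return the voice for a chapter, falling back to the main --voice."""
    voice = mapping.get(chapter, default_voice)
    if voice == "random":
        # Assumed meaning of "random": pick any voice from the available pool.
        return rng.choice(voice_pool)
    return voice


mapping = parse_voice_mapping(
    '{"chapters": {"1": "john.wav", "2": "stella.wav", "4": "random"}}'
)
pool = ["john.wav", "stella.wav", "child.wav"]
for chapter in range(1, 6):
    print(chapter, voice_for_chapter(chapter, mapping, "main.wav", pool))
```

Chapters 3 and 5 are not in the mapping, so they keep main.wav, which matches the "others will keep the main --voice intact" behavior described above.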

@DrewThomasson
Owner Author

Another interesting option would be to change voices between chapters, for example: --voice_mapping {"chapters": {1:"john.wav",2:"stella.wav",3:"child.wav",4:"random"} } The selected chapters would use their mapped voice, while all other chapters would keep the main --voice.

@ROBERT-MCDOWELL Added to roadmap checklist
