Update klite.embd #719
Added AllTalk support, a more recent and better-performing XTTSv2 implementation that supports streaming mode, narration mode, DeepSpeed, multiple inference endpoints, and many more features: https://github.com/erew123/alltalk_tts

Implemented new code for:
- retrieving available voices using AllTalk's available-voices API endpoint: "/api/voices"
- sending TTS generation requests using AllTalk's TTS generation endpoint: "/api/tts-generate"
- new default settings for: the XTTS default base URL (using the AllTalk base URL by default), the XTTS default voice language, and the XTTS default setting for streaming mode

Compatibility with legacy XTTS mode (and legacy XTTS code) has been kept throughout all changed parts. The only missing part needed to fully support both XTTS implementations is in the request payload sending block, where a simple condition should be added to select the correct payload based on the selected XTTS implementation (but we currently lack a UI setting for that as well, so...)

Other than that, the only other changes in this commit are a couple of variable renames for consistency and some minor CSS typo fixes.
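For readers following along, here is a minimal Python sketch of how the two AllTalk endpoints named in the description might be addressed from a configurable base URL. It is not the code from this PR (which lives in the Lite JavaScript). The default address, the helper names, and the `{"voices": [...]}` response shape are all assumptions made for illustration.

```python
import json

# Assumption: a local AllTalk address; the PR only says a default base URL was added.
ALLTALK_BASE_URL = "http://localhost:7851"

VOICES_PATH = "/api/voices"              # endpoint named in the PR description
TTS_GENERATE_PATH = "/api/tts-generate"  # endpoint named in the PR description


def endpoint(base_url: str, path: str) -> str:
    """Join a base URL and an API path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def parse_voices(body: str) -> list[str]:
    """Extract voice names from a response body.

    Assumption: the body is a JSON object shaped like {"voices": ["a.wav", ...]}.
    """
    data = json.loads(body)
    voices = data.get("voices", [])
    # Keep only string entries so a malformed item cannot break a voice dropdown.
    return [v for v in voices if isinstance(v, str)]
```

Normalising the slashes in one place means a user-supplied base URL works with or without a trailing slash.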
Hi, thanks for this PR. But are all the fields necessary? I feel like it's almost an entirely different endpoint rather than an XTTS drop-in, especially if it's not even expecting a JSON payload? Let's follow up at erew123/alltalk_tts#88
use `autoplay=false` in AllTalk TTS request payload code, otherwise the generated audio will be played by AllTalk server-side, instead of being sent back to the browser
Setting the `autoplay` parameter to `false` in the AllTalk TTS generation request.
With this change the audio is now sent back to the browser and played by it, rather than being played server-side by the XTTS endpoint.
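As a rough illustration of the change described above, the sketch below builds a form-encoded (non-JSON) request body with `autoplay` set to `false`. Only `autoplay` and the non-JSON encoding come from this thread. The other field names (`text_input`, `character_voice_gen`, `language`, `output_file_name`) and the helper name are assumptions and should be checked against the AllTalk documentation.

```python
from urllib.parse import urlencode


def build_tts_form(text: str, voice: str, language: str, output_name: str) -> bytes:
    """Build an application/x-www-form-urlencoded body for a TTS request.

    Field names other than "autoplay" are assumed for illustration.
    """
    fields = {
        "text_input": text,
        "character_voice_gen": voice,
        "language": language,
        "output_file_name": output_name,
        # Ask the server not to play the audio itself, so the result is
        # returned to the caller (here, the browser) instead.
        "autoplay": "false",
    }
    return urlencode(fields).encode("utf-8")
```

The value is sent as the string `"false"` because form encoding has no boolean type.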
the filename supplied for the generated TTS audio must *not* include an extension, dashes, etc.
fixing the default audio filename used in the TTS request (it must not include an extension, dashes, etc.)
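One way to satisfy the filename constraint is to sanitise the name before it goes into the request. The sketch below is a hypothetical helper, not code from this PR. Since the thread only says "no extension, dashes, etc.", it makes the conservative assumption that only ASCII letters and digits are safe.

```python
import re
from pathlib import PurePosixPath


def safe_output_name(name: str, fallback: str = "ttsoutput") -> str:
    """Return a filename stem with the extension and punctuation removed.

    Assumption: only ASCII letters and digits are acceptable to the server.
    """
    stem = PurePosixPath(name).stem              # "my-clip.wav" -> "my-clip"
    cleaned = re.sub(r"[^A-Za-z0-9]", "", stem)  # drop dashes, dots, spaces, etc.
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or fallback
```

For example, `safe_output_name("my-clip.wav")` yields `myclip`, and a name made only of dashes falls back to the default.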
fixed `streaming` (needs to be set to `true` for a non-streaming request to work... @daswer123, why?)
fixed `streaming` (needs to be set to `true` for a non-streaming request to work... @daswer123, why?)
Hi, I am the author of the project https://github.com/daswer123/xtts-api-server. As I understand it, you were mistaken and meant @erew123? :)
Hi @daswer123, hi Danil, thanks for the heads up! Hope you are keeping well! :) @illtellyoulater Is this a question for me? I suspect it may be more of a question for LostRuins. I've only had a quick glance at the code you've written/changed, but I suspect that "streaming" in this case may have something to do with how Kobold is handling the audio return to play in the webpage.
I am refactoring this PR. I will probably split the implementation between AllTalk and XTTS, as they are too different, rather than try to fit both APIs together. The user will pick which one they wish to use.
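A split like the one proposed above could be sketched as a small dispatch on the user's chosen backend. This is a Python illustration only, not the Lite implementation. The XTTS-API-Server path `/tts_to_audio/`, its JSON field names, and the AllTalk field names other than `autoplay` are assumptions.

```python
import json
from enum import Enum
from urllib.parse import urlencode


class TtsBackend(Enum):
    XTTS_API_SERVER = "xtts"
    ALLTALK = "alltalk"


def build_request(backend: TtsBackend, text: str, voice: str, language: str):
    """Return (path, content_type, body) for the selected backend."""
    if backend is TtsBackend.ALLTALK:
        # AllTalk: form-encoded body (field names assumed).
        body = urlencode({
            "text_input": text,
            "character_voice_gen": voice,
            "language": language,
            "autoplay": "false",
        }).encode("utf-8")
        return "/api/tts-generate", "application/x-www-form-urlencoded", body
    if backend is TtsBackend.XTTS_API_SERVER:
        # XTTS-API-Server: JSON body (path and field names assumed).
        body = json.dumps({
            "text": text,
            "speaker_wav": voice,
            "language": language,
        }).encode("utf-8")
        return "/tts_to_audio/", "application/json", body
    raise ValueError(f"unsupported backend: {backend!r}")
```

Keeping each backend's payload in its own branch avoids forcing two different APIs into one shared shape.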
Hi @illtellyoulater, I have added tentative AllTalk support as a separate endpoint based on this PR. As I didn't manage to actually get AllTalk running on Colab, this has not been properly tested. It's now added as a separate option from the XTTS-API-Server, which functions the same as before. Could you select the AllTalk option in https://lite.koboldai.net to do a quick test and see if it works fine for you? Thanks! By the way, in future, all Kobold Lite development happens in the lite repo at https://github.com/LostRuins/lite.koboldai.net, so PRs should be directed there. cc: @erew123
AllTalk implementation is in, please test
I'm back from travelling (for now). Did you manage to test this, @illtellyoulater, or is there anything my help is needed on? Thanks
I did not manage to test this; however, it is merged based on the fields in the PR. If someone could test it, that would be good. If I can get it running on Colab, I'd be happy to test it.
Hi @LostRuins, hope you are well! I've downloaded a local copy of Kobold and tested it both in the settings page and in the main chat interface; both seem to be working. Just for reference, the message highlighted in grey (as below) is fine. That's normal when using the streaming method and adjusting the model to streaming only. So as a base configuration it seems absolutely fine for the streaming mode. If somewhere down the line people wanted to use AllTalk's narrator function, that's a different API call and a few extra things to check (depending on what Kobold may filter from the text it sends over). All in all though, it works and I can't see an issue! I'll have another shot at getting the Colab working at some point! All the best to you both @LostRuins @illtellyoulater Thanks
Oh that is awesome! Glad to know the integration worked perfectly :) thanks for testing
Does the audio file get decoded and played correctly?
As it's streaming, it's not actually generating a wav file as such, just an audio wav blob that it's firing over as quickly as possible. Taking a quick look at how it's interacting, it seems that Kobold is playing that back within the browser and handling the playback perfectly. It seems good to me! :)
Got caught up with something else, sorry, but I'm glad to see the progress! Will give it a try as soon as possible! 👍 Good job guys!