Feature Description
I'm not sure of the technical constraints here, and maybe this is impossible, but I'll show the use case that would be heavily improved and the bottleneck I ran into.
I'm using PlayHT AI audio and want to attach the audio data alongside the text. Latency is important, so I want to do everything at once, inside the stream.
You can see below how I'm hacking the audio into a base64 Buffer, then decoding it back to audio on the frontend client side, because data only supports JSON values (a sketch of that client-side decode is included after the API example).
Some may say to use blob storage. I did try writing to Vercel Blob instead and passing a URL, roughly as sketched below, but I found base64 was still faster.
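Roughly what the Vercel Blob variant looked like (a sketch only; the helper name and file naming are made up for illustration):

import { put } from "@vercel/blob";

// Upload the PlayHT audio once, then append only the public URL to StreamData
// instead of a base64 payload. `resp` is the PlayHT fetch Response and `data`
// is the experimental_StreamData instance from the API route below.
async function appendVoiceUrl(resp: Response, data: { append: (value: unknown) => void }) {
  const blob = await put(`voice-${Date.now()}.mp3`, await resp.arrayBuffer(), {
    access: "public",
    contentType: "audio/mpeg",
  });
  data.append({ voiceUrl: blob.url }); // plain JSON value, but costs an extra upload round trip
}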
Ideally there would be no conversions at all; being able to send a Blob or Buffer directly in data would be very cool!
Here is an example of my API:
import { OpenAIStream, StreamingTextResponse, experimental_StreamData } from "ai";

// `openai` (the OpenAI client), `prisma` (the Prisma client), and `voices`
// (the PlayHT voice list) are initialized elsewhere in the app.

export async function POST(req: Request) {
  // Extract the `messages` and persona name from the body of the request
  const { messages, personaName } = await req.json();

  // Request the OpenAI API for the response based on the prompt
  const aiResponse = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    stream: true,
    messages: messages,
  });

  const data = new experimental_StreamData();

  const persona = await prisma.persona.findFirst({
    where: { name: personaName },
  });

  const stream = OpenAIStream(aiResponse, {
    onFinal: async (completion) => {
      // Pick a PlayHT voice that matches the persona
      const voicesFiltered = voices.filter(
        (v) =>
          v.voice_engine === "PlayHT2.0" &&
          v.gender === persona?.gender &&
          v.accent === persona?.accent
      );

      const resp = await fetch("https://api.play.ht/api/v2/tts/stream", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          AUTHORIZATION: `${process.env.PLAYHT_SECRET_KEY}`,
          "X-USER-ID": process.env.PLAYHT_USER_ID!,
          accept: "audio/mpeg",
        },
        body: JSON.stringify({
          text: completion,
          voice:
            persona?.voiceId ??
            voicesFiltered[Math.floor(Math.random() * voicesFiltered.length)].id,
          output_format: "mp3",
          voice_engine: "PlayHT2.0-turbo",
        }),
      }).catch((err) => console.log("fetch error:", err));

      if (!resp) {
        // Still close StreamData on failure, or the response will never finish.
        data.close();
        return;
      }

      // hack here to get around `data` only accepting JSON values
      data.append({
        voiceData: Buffer.from(await resp.arrayBuffer()).toString("base64"),
      });

      // IMPORTANT! you must close StreamData manually or the response will never finish.
      data.close();
    },
    // IMPORTANT! until this is stable, you must explicitly opt in to supporting streamData.
    experimental_streamData: true,
  });

  // Respond with the stream
  return new StreamingTextResponse(stream, {}, data);
}
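For completeness, the client-side half of the hack decodes the base64 voiceData entry from the streamed data array back into playable audio, roughly like this (a sketch; the hook wiring around it is omitted):

// Turn the base64 `voiceData` value from the streamed `data` array back into audio.
function playVoiceData(voiceDataBase64: string) {
  // base64 -> raw bytes
  const bytes = Uint8Array.from(atob(voiceDataBase64), (c) => c.charCodeAt(0));
  // bytes -> Blob -> object URL -> <audio>
  const blob = new Blob([bytes], { type: "audio/mpeg" });
  const audio = new Audio(URL.createObjectURL(blob));
  void audio.play();
}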
Use Case
Streaming voice audio alongside text AI responses. There are probably plenty of other Buffer use cases people have as well: images, webcam streams, etc.
Additional context
No response