[Feature request] streamer callback for text-generation task #394

Closed
@seonglae

Description

Streamer
https://huggingface.co/docs/transformers/generation_strategies#streaming

Reason for request
Currently, generating token by token by repeatedly calling the pipeline with max_new_tokens: 1 takes much longer than a single generation call. Text generation takes noticeable time even for light models, so token streaming is a key feature for user experience. In my case, task-specific text generation with streaming could make transformers.js a low-cost option for AI app development.

Additional context
I'm not sure whether the TextStreamer class needs to be compatible with the Python transformers API. I wrote a use-case proposal in which TextStreamer extends TransformStream. AsyncIterable, AsyncGenerator, or the Streams API might also be usable.

Suggested streaming code

const streamer = new TextStreamer()
const pipe: TextGenerationPipeline = await pipeline(
  'text-generation',
  model,
  { quantized: true }
)
pipe(prompt, { streamer }) // not awaited: tokens stream while generation runs
const res = new Response(streamer.readable)
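A minimal sketch of what such a TextStreamer could look like, built on the standard TransformStream. This is a hypothetical design, not the transformers.js API: the put/end hook names and the identity-transform approach are assumptions. The idea is that the generation loop calls put(token) for each new token, while streamer.readable can be handed to a Response or consumed with for-await.

```javascript
// Hypothetical TextStreamer sketch (assumed API, not from transformers.js):
// an identity TransformStream whose writable side is fed by the generation
// loop and whose readable side is consumed by the caller.
class TextStreamer extends TransformStream {
  constructor() {
    super(); // identity transform: each token passes straight through
    this.writer = this.writable.getWriter();
  }

  // Assumed hook: the pipeline would call this for every generated token.
  put(token) {
    return this.writer.write(token);
  }

  // Assumed hook: called once generation finishes, closing the stream.
  end() {
    return this.writer.close();
  }
}

// Consumer side: aggregate streamed tokens as they arrive.
async function collect(readable) {
  let text = '';
  for await (const chunk of readable) {
    text += chunk;
  }
  return text;
}
```

Because TransformStream is a web standard (available in browsers and Node 18+), the same streamer object could back both a server `new Response(streamer.readable)` and an in-page incremental UI update.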

This is Vercel's approach:
https://github.com/vercel/ai/blob/main/packages/core/streams/ai-stream.ts
https://github.com/vercel-labs/ai-chatbot/blob/main/app/api/chat/route.ts
