Open
Description
Feature request
Add an option to stream output from a pipeline.
Motivation
Calling tokenizer.apply_chat_template, then doing the intermediate steps, then calling model.generate is pretty repetitive. I think it's time to integrate this with pipelines, and also time to add a streaming pipeline.
Your contribution
I can provide this resource as a reference: a PR I made with the requested feature, https://huggingface.co/google/gemma-1.1-2b-it/discussions/14.
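Another tip: don't use yield and return in the same function; put the streaming and non-streaming paths in separate functions. This is a Python behavior: any function that contains yield becomes a generator, so a value passed to return is not handed back to the caller directly.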
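To illustrate the yield/return point, here is a minimal pure-Python sketch. The function names (generate_mixed, _stream_tokens, generate) and the stream flag are hypothetical and only stand in for whatever a streaming pipeline would actually use; this is not taken from the linked PR.

```python
# Hypothetical names throughout; this only demonstrates the Python behavior.

def generate_mixed(tokens, stream=False):
    # Because `yield` appears in the body, calling this function ALWAYS
    # returns a generator, even when stream=False.
    if stream:
        for token in tokens:
            yield token
    else:
        # This value ends up in StopIteration.value instead of being
        # returned to the caller as a plain string.
        return "".join(tokens)


def _stream_tokens(tokens):
    # Generator-only helper: yields one token at a time.
    for token in tokens:
        yield token


def generate(tokens, stream=False):
    # Plain function with no `yield`, so `return` behaves normally.
    if stream:
        return _stream_tokens(tokens)
    return "".join(tokens)


if __name__ == "__main__":
    mixed = generate_mixed(["Hel", "lo"], stream=False)
    print(type(mixed).__name__)   # a generator object, not a str
    print(generate(["Hel", "lo"]))
    for piece in generate(["Hel", "lo"], stream=True):
        print(piece)
```

With the split version, callers that want the full text get a string, and callers that want streaming get an iterator, without one mode silently breaking the other.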
Sadly, I'm too busy lately to open a PR, but if I can find some time I'll try to help out.