Commit f160fef

Implement streaming argument for use()
1 parent 1185b7b commit f160fef

1 file changed: +24 -16 lines changed
README.md

Lines changed: 24 additions & 16 deletions
@@ -510,12 +510,12 @@ The latest versions of `replicate >= 1.0.8` include a new experimental `use()` f
 Some key differences to `replicate.run()`.
 
 1. You "import" the model using the `use()` syntax, after that you call the model like a function.
-2. The output type matches the model definition. i.e. if the model uses an iterator output will be an iterator.
-3. Files will be downloaded output as `Path` objects*.
+2. The output type matches the model definition.
+3. Baked in support for streaming for all models.
+4. File outputs will be represented as `PathLike` objects and downloaded to disk when used*.
 
 > [!NOTE]
-
-\* We've replaced the `FileOutput` implementation with `Path` objects. However to avoid unnecessary downloading of files until they are needed we've implemented a `PathProxy` class that will defer the download until the first time the object is used. If you need the underlying URL of the `Path` object you can use the `get_path_url(path: Path) -> str` helper.
+> \* We've replaced the `FileOutput` implementation with `Path` objects. However to avoid unnecessary downloading of files until they are needed we've implemented a `PathProxy` class that will defer the download until the first time the object is used. If you need the underlying URL of the `Path` object you can use the `get_path_url(path: Path) -> str` helper.
 
 ### Examples
 
@@ -534,22 +534,14 @@ for output in outputs:
     print(output) # Path(/tmp/output.webp)
 ```
 
-Models that output iterators will return iterators:
-
+Models that implement iterators will return the output of the completed run as a list unless explicitly streaming (see Streaming section below). Language models that define `x-cog-iterator-display: concatenate` will return strings:
 
 ```py
 claude = replicate.use("anthropic/claude-4-sonnet")
 
 output = claude(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")
 
-for token in output:
-    print(token) # "Here's a recipe"
-```
-
-You can call `str()` on a language model to get the full output when done rather than iterating over tokens:
-
-```py
-str(output) # "Here's a recipe to feed all of California (about 39 million people)! ..."
+print(output) # "Here's a recipe to feed all of California (about 39 million people)! ..."
 ```
 
 You can pass the results of one model directly into another:
@@ -579,6 +571,19 @@ prediction.logs() # get current logs (WIP)
 prediction.output() # get the output
 ```
 
+### Streaming
+
+Many models, particularly large language models (LLMs), will yield partial results as the model is running. To consume outputs from these models as they run you can pass the `streaming` argument to `use()`:
+
+```py
+claude = replicate.use("anthropic/claude-4-sonnet", streaming=True)
+
+output = claude(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")
+
+for chunk in output:
+    print(chunk) # "Here's a recipe ", "to feed all", " of California"
+```
+
 ### Downloading file outputs
 
 Output files are provided as Python [os.PathLike](https://docs.python.org/3.12/library/os.html#os.PathLike) objects. These are supported by most of the Python standard library like `open()` and `Path`, as well as third-party libraries like `pillow` and `ffmpeg-python`.
@@ -646,14 +651,14 @@ async def main():
 asyncio.run(main())
 ```
 
-If the model returns an iterator an `AsyncIterator` implementation will be used:
+When used in streaming mode then an `AsyncIterator` will be returned.
 
 ```py
 import asyncio
 import replicate
 
 async def main():
-    claude = replicate.use("anthropic/claude-3.5-haiku", use_async=True)
+    claude = replicate.use("anthropic/claude-3.5-haiku", streaming=True, use_async=True)
     output = await claude(prompt="say hello")
 
     # Stream the response as it comes in.
@@ -700,6 +705,9 @@ output1 = flux_dev() # will warn that `prompt` is missing
 output2 = flux_dev(prompt="str") # output2 will be typed as `str`
 ```
 
+> [!WARNING]
+> Currently the typing system doesn't correctly support the `streaming` flag for models that return lists or use iterators. We're working on improvements here.
+
 In future we hope to provide tooling to generate and provide these models as packages to make working with them easier. For now you may wish to create your own.
 
 ### TODO

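The README text in this diff describes three output shapes for models that yield partial results: a list of chunks once the run completes, a single string for models that define `x-cog-iterator-display: concatenate`, and an iterator when `streaming=True`. The following is a minimal, self-contained sketch of those semantics only. It does not call the `replicate` library and is not its implementation; `resolve_output`, its `concatenate` parameter, and `fake_model_run` are hypothetical names introduced for illustration.

```python
from typing import Iterable, Iterator, List, Union


def resolve_output(
    chunks: Iterable[str],
    *,
    streaming: bool = False,
    concatenate: bool = False,
) -> Union[Iterator[str], List[str], str]:
    """Hypothetical helper mirroring the output shapes described in the diff.

    Not part of the replicate library; it only illustrates the semantics.
    """
    if streaming:
        # Streaming mode: return a lazy iterator so chunks are consumed as they arrive.
        return iter(chunks)
    # Non-streaming mode: wait for the whole run and collect every chunk.
    collected = list(chunks)
    if concatenate:
        # Assumed equivalent of `x-cog-iterator-display: concatenate`: one string.
        return "".join(collected)
    return collected


def fake_model_run() -> Iterator[str]:
    """Stand-in for a model that yields partial results while running."""
    yield "Here's a recipe "
    yield "to feed all"
    yield " of California"


as_list = resolve_output(fake_model_run())
as_text = resolve_output(fake_model_run(), concatenate=True)
as_stream = resolve_output(fake_model_run(), streaming=True)
```

In this sketch `as_list` holds the three chunks, `as_text` is their concatenation, and `as_stream` yields the chunks one at a time when iterated, which matches the non-streaming and streaming behaviour the added README sections describe.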