README.md: 24 additions & 16 deletions
@@ -510,12 +510,12 @@ The latest versions of `replicate >= 1.0.8` include a new experimental `use()` f
 Some key differences to `replicate.run()`.

 1. You "import" the model using the `use()` syntax; after that, you call the model like a function.
-2. The output type matches the model definition. i.e. if the model uses an iterator output will be an iterator.
-3. Files will be downloaded output as `Path` objects*.
+2. The output type matches the model definition.
+3. Built-in support for streaming for all models.
+4. File outputs will be represented as `PathLike` objects and downloaded to disk when used*.

 > [!NOTE]
-
-\* We've replaced the `FileOutput` implementation with `Path` objects. However to avoid unnecessary downloading of files until they are needed we've implemented a `PathProxy` class that will defer the download until the first time the object is used. If you need the underlying URL of the `Path` object you can use the `get_path_url(path: Path) -> str` helper.
+> \* We've replaced the `FileOutput` implementation with `Path` objects. However, to avoid downloading files before they are needed, we've implemented a `PathProxy` class that defers the download until the first time the object is used. If you need the underlying URL of the `Path` object, you can use the `get_path_url(path: Path) -> str` helper.

 ### Examples

@@ -534,22 +534,14 @@ for output in outputs:
     print(output) # Path(/tmp/output.webp)
 ```

-Models that output iterators will return iterators:
-
+Models that implement iterators return the output of the completed run as a list unless streaming is explicitly enabled (see the Streaming section below). Language models that define `x-cog-iterator-display: concatenate` will return strings:

 ```py
 claude = replicate.use("anthropic/claude-4-sonnet")

 output = claude(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")

-for token in output:
-    print(token) # "Here's a recipe"
-```
-
-You can call `str()` on a language model to get the full output when done rather than iterating over tokens:
-
-```py
-str(output) # "Here's a recipe to feed all of California (about 39 million people)! ..."
+print(output) # "Here's a recipe to feed all of California (about 39 million people)! ..."
 ```

 You can pass the results of one model directly into another:
@@ -579,6 +571,19 @@ prediction.logs() # get current logs (WIP)
 prediction.output() # get the output
 ```

+### Streaming
+
+Many models, particularly large language models (LLMs), yield partial results while the model is running. To consume outputs from these models as they run, pass the `streaming` argument to `use()`:
+
+```py
+claude = replicate.use("anthropic/claude-4-sonnet", streaming=True)
+
+output = claude(prompt="Give me a recipe for tasty smashed avocado on sourdough toast that could feed all of California.")
+
+for chunk in output:
+    print(chunk) # "Here's a recipe ", "to feed all", " of California"
+```
+
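As a rough illustration of the difference between the two modes, the sketch below uses a stand-in generator instead of a real model. `fake_model` and `call` are hypothetical names invented for this example and are not part of the `replicate` API. The only assumption is the behavior described in this diff: without streaming, the chunks are collected (and concatenated for text models), while with streaming they are handed to the caller one at a time.

```python
from typing import Iterator, Union


def fake_model(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model that yields partial text."""
    for chunk in ["Here's a recipe ", "to feed all", " of California"]:
        yield chunk


def call(prompt: str, streaming: bool = False) -> Union[str, Iterator[str]]:
    """Return chunks lazily when streaming, otherwise the concatenated text."""
    chunks = fake_model(prompt)
    if streaming:
        return chunks  # the caller iterates over chunks as they are produced
    return "".join(chunks)  # exhaust the generator, then concatenate


full = call("recipe")                         # one complete string
parts = list(call("recipe", streaming=True))  # the individual chunks
```

Joining `parts` gives the same text as `full`; the two modes differ only in when the caller gets to see the data.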
 ### Downloading file outputs

 Output files are provided as Python [os.PathLike](https://docs.python.org/3.12/library/os.html#os.PathLike) objects. These are supported by most of the Python standard library, such as `open()` and `Path`, as well as third-party libraries such as `pillow` and `ffmpeg-python`.
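To make the deferred-download idea concrete, here is a minimal sketch of a path-like object that only materializes its file the first time something asks for its path. `LazyFile` is a hypothetical class written for this illustration. It is not the library's `PathProxy`, and it simulates the download by writing bytes it already holds to a temporary file.

```python
import os
import tempfile
from pathlib import Path


class LazyFile(os.PathLike):
    """Hypothetical proxy that defers a (simulated) download until first use."""

    def __init__(self, content: bytes) -> None:
        self._content = content  # stands in for a remote URL
        self._path = None        # set once the file has been written
        self.downloads = 0       # counts how often the "download" ran

    def __fspath__(self) -> str:
        # Called by open(), Path(), os.fspath() and other path consumers.
        if self._path is None:
            fd, name = tempfile.mkstemp(suffix=".bin")
            with os.fdopen(fd, "wb") as fh:
                fh.write(self._content)
            self._path = name
            self.downloads += 1
        return self._path


output = LazyFile(b"fake image bytes")
before = output.downloads           # nothing has been written yet

with open(output, "rb") as fh:      # first use triggers the "download"
    data = fh.read()

size = Path(output).stat().st_size  # later uses reuse the cached file
after = output.downloads
os.remove(output)                   # clean up the temporary file
```

Because the object implements `__fspath__`, standard-library functions that accept paths can consume it directly, which is the property the paragraph above relies on.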
@@ -646,14 +651,14 @@ async def main():
 asyncio.run(main())
 ```

-If the model returns an iterator an `AsyncIterator`implementation will be used:
+When used in streaming mode, an `AsyncIterator` will be returned.

 ```py
 import asyncio
 import replicate

 async def main():
-    claude = replicate.use("anthropic/claude-3.5-haiku", use_async=True)
+    claude = replicate.use("anthropic/claude-3.5-haiku", streaming=True, use_async=True)
     output = await claude(prompt="say hello")

     # Stream the response as it comes in.
@@ -700,6 +705,9 @@ output1 = flux_dev() # will warn that `prompt` is missing
 output2 = flux_dev(prompt="str") # output2 will be typed as `str`
 ```

+> [!WARNING]
+> Currently, the typing system doesn't correctly support the `streaming` flag for models that return lists or use iterators. We're working on improvements here.
+
 In the future, we hope to provide tooling to generate and provide these models as packages to make working with them easier. For now, you may wish to create your own.