docs: update README for sync API #386

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
199 changes: 129 additions & 70 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,21 @@

This is a Python client for [Replicate](https://replicate.com). It lets you run models from your Python code or Jupyter notebook, and do various other things on Replicate.

## Breaking Changes in 1.0.0

The 1.0.0 release contains breaking changes:

- The `replicate.run()` method now returns `FileOutput`s instead of URL strings by default for models that output files.
- `FileOutput` implements an iterable interface similar to `httpx.Response`, making it easier to work with files efficiently.

To revert to the previous behavior, you can opt out of `FileOutput` by passing `use_file_output=False`:

```python
output = replicate.run("acmecorp/acme-model", use_file_output=False)
```

In most cases, updating existing applications to call `output.url()` should resolve any issues.
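For example, code that previously indexed into a list of URL strings can switch to reading the URL from the returned object. A minimal before-and-after sketch, assuming a model that returns a single file:

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor painting of a fox"}
)

# Before 1.0.0, output was a plain URL string (or a list of them):
# image_url = output[0]

# With 1.0.0, read the URL from the FileOutput object instead:
image_url = output.url()
```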

> **👋** Check out an interactive version of this tutorial on [Google Colab](https://colab.research.google.com/drive/1K91q4p-OhL96FHBAVLsv9FlwFdu6Pn3c).
>
> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1K91q4p-OhL96FHBAVLsv9FlwFdu6Pn3c)
## Run a model

Create a new Python file and add the following code,
replacing the model identifier and input with your own:

```python
>>> import replicate
>>> output = replicate.run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "astronaut riding a rocket like a horse"}
    )

>>> output.url()  # Get the URL for the image
'https://replicate.delivery/...'

>>> # Save the file directly to disk
>>> with open("astronaut.png", "wb") as f:
...     f.write(output.read())

>>> # For very large files, you can stream the content
>>> with open("large_file.bin", "wb") as f:
...     for chunk in output:
...         f.write(chunk)
```

> [!NOTE]
> The `FileOutput` returned by `replicate.run()` for file outputs provides methods like `url()` and `read()`,
> and supports iteration for efficient handling of large files.

"an astronaut riding a horse"
```
## Async Usage

The Replicate client supports asynchronous operations. Here's how to use the async API (this example uses the third-party `aiofiles` package for asynchronous file I/O):

```python
import asyncio
import aiofiles
import replicate

async def save_file(output, filename):
    async with aiofiles.open(filename, 'wb') as f:
        await f.write(await output.aread())

async def stream_file(output, filename):
    async with aiofiles.open(filename, 'wb') as f:
        async for chunk in output:
            await f.write(chunk)

async def main():
    output = await replicate.async_run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "astronaut riding a rocket like a horse"}
    )

    await save_file(output, "astronaut1.png")
    await stream_file(output, "astronaut2.png")

asyncio.run(main())
```
## Run a model and stream its output

Replicate’s API supports server-sent event streams (SSEs) for language models.
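For example, here is a minimal sketch that prints tokens as a language model generates them, assuming the `meta/meta-llama-3-70b-instruct` model:

```python
import replicate

# Print each token as soon as the model emits it
for event in replicate.stream(
    "meta/meta-llama-3-70b-instruct",
    input={"prompt": "Write a haiku about ocean waves"},
):
    print(str(event), end="")
```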
For more information, see
["Streaming output"](https://replicate.com/docs/streaming)
in Replicate's docs.

## Run a model in the background

You can start a model and run it in the background using polling mode:

```python
>>> model = replicate.models.get("kvfrans/clipdraw")
>>> version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")
>>> prediction = replicate.predictions.create(
        version=version,
        input={"prompt": "Watercolor painting of an underwater submarine"},
        wait={"type": "poll"}  # Use polling instead of blocking
    )

>>> prediction
Prediction(...)

>>> prediction.reload()
>>> print(prediction.logs)
iteration: 30, render:loss: -1.3994140625

>>> prediction.status
'succeeded'

>>> output = prediction.output
>>> output.save("submarine.png")  # Save the output file
```
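
If you'd rather block until the prediction finishes instead of checking its status yourself, a minimal sketch using the prediction's `wait()` method:

```python
# Block until the prediction reaches a terminal state
prediction.wait()

print(prediction.status)  # e.g. 'succeeded'
```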

## Run a model in the background and get a webhook
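
You can ask Replicate to POST the completed prediction to a URL of your choosing. A minimal sketch, assuming the `webhook` and `webhook_events_filter` parameters of `replicate.predictions.create` and a hypothetical endpoint URL:

```python
import replicate

model = replicate.models.get("kvfrans/clipdraw")
version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")

# Replicate will POST the prediction to this URL when it completes
prediction = replicate.predictions.create(
    version=version,
    input={"prompt": "Watercolor painting of an underwater submarine"},
    webhook="https://example.com/your-webhook",  # hypothetical endpoint
    webhook_events_filter=["completed"],
)
```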

## Load output files

Model outputs that return files provide a `FileOutput` object with several methods for handling the data:

```python
import replicate
from PIL import Image
import io

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "wavy colorful abstract patterns, oceans"}
)

# Save directly to a file
output.save("output.png")

# Get the URL (may be a data URI for faster delivery)
url = output.url()

# Load into PIL Image
image_data = output.read()
image = Image.open(io.BytesIO(image_data))
```

## Stream file data

When working with file outputs, you can stream the data in chunks:

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "an astronaut riding a horse"}
)

# Stream the file data in chunks
for chunk in output:
    process_chunk(chunk)  # Process each chunk of binary data (process_chunk is a placeholder)

# Or stream directly to a file
with open("astronaut.png", "wb") as f:
    for chunk in output:
        f.write(chunk)
```

This is particularly useful when working with web frameworks:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

import replicate

app = FastAPI()

@app.get("/generate")
async def generate_image():
    # Use the async client so the request doesn't block the event loop
    output = await replicate.async_run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "an astronaut riding a horse"}
    )

    return StreamingResponse(
        output,
        media_type="image/png"
    )
```

## List models
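
A minimal sketch of listing publicly available models with `replicate.models.list()`:

```python
import replicate

# Fetch the first page of public models
models = replicate.models.list()

for model in models.results:
    print(f"{model.owner}/{model.name}")
```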
## Customize client behavior

The `replicate` package exports a default shared client.
This client is initialized with an API token
set by the `REPLICATE_API_TOKEN` environment variable.

You can create your own client instance to pass a different API token, add custom headers, or otherwise customize its behavior:

```python
import os
from replicate.client import Client

replicate = Client(
    api_token=os.environ["SOME_OTHER_REPLICATE_API_TOKEN"],
    headers={
        "User-Agent": "my-app/1.0"
    },

    # Control file output behavior
    use_file_output=True,  # Enable FileOutput objects (default: True)

    # Configure default wait behavior
    wait={
        "type": "block",    # Use blocking mode (default)
        "timeout": 60,      # Maximum time to hold the connection open
        "fallback": "poll"  # Fall back to polling if the timeout is reached
    }
)
```
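
Because this instance replaces the module-level `replicate` name, existing calls work unchanged. A short usage sketch with the custom client:

```python
output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor painting of a fox"}
)
print(output.url())
```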
