docs: update README for sync API #386

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
199 changes: 129 additions & 70 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,21 @@

This is a Python client for [Replicate](https://replicate.com). It lets you run models from your Python code or Jupyter notebook, and do various other things on Replicate.

## Breaking Changes in 1.0.0

The 1.0.0 release contains breaking changes:

- The `replicate.run()` method now returns `FileOutput`s instead of URL strings by default for models that output files.
- `FileOutput` implements an iterable interface similar to `httpx.Response`, making it easier to work with files efficiently.

To revert to the previous behavior, you can opt out of `FileOutput` by passing `use_file_output=False`:

```python
output = replicate.run("acmecorp/acme-model", use_file_output=False)
```

In most cases, updating existing applications to call `output.url()` should resolve any issues.
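For example, code that previously indexed into a list of URL strings can switch to reading the URL from the returned object. A minimal before-and-after sketch, assuming a model that returns a single file:

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor painting of a fox"}
)

# Before 1.0.0, output was a plain URL string (or a list of them):
# image_url = output[0]

# With 1.0.0, read the URL from the FileOutput object instead:
image_url = output.url()
```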

> **👋** Check out an interactive version of this tutorial on [Google Colab](https://colab.research.google.com/drive/1K91q4p-OhL96FHBAVLsv9FlwFdu6Pn3c).
>
> [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1K91q4p-OhL96FHBAVLsv9FlwFdu6Pn3c)
## Run a model

Create a new Python file and add the following code,
replacing the model identifier and input with your own:

```python
>>> import replicate
>>> output = replicate.run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "astronaut riding a rocket like a horse"}
    )

>>> output.url()  # Get the URL for the image
'https://replicate.delivery/...'

>>> # Save the file directly to disk
>>> with open("astronaut.png", "wb") as f:
...     f.write(output.read())

>>> # For very large files, you can stream the content
>>> with open("large_file.bin", "wb") as f:
...     for chunk in output:
...         f.write(chunk)
```

> [!NOTE]
> The `FileOutput` returned by `replicate.run()` for file outputs provides methods like `url()` and `read()`,
> and supports iteration for efficient handling of large files.

"an astronaut riding a horse"
```
## Async Usage

The Replicate client supports asynchronous operations. Here's how to use the async API (this example uses the third-party `aiofiles` package for asynchronous file I/O):

```python
import asyncio
import aiofiles
import replicate

async def save_file(output, filename):
    async with aiofiles.open(filename, 'wb') as f:
        await f.write(await output.aread())

async def stream_file(output, filename):
    async with aiofiles.open(filename, 'wb') as f:
        async for chunk in output:
            await f.write(chunk)

async def main():
    output = await replicate.async_run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "astronaut riding a rocket like a horse"}
    )

    await save_file(output, "astronaut1.png")
    await stream_file(output, "astronaut2.png")

asyncio.run(main())
```
## Run a model and stream its output

Replicate’s API supports server-sent event streams (SSEs) for language models.
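For example, here is a minimal sketch that prints tokens as a language model generates them, assuming the `meta/meta-llama-3-70b-instruct` model:

```python
import replicate

# Print each token as soon as the model emits it
for event in replicate.stream(
    "meta/meta-llama-3-70b-instruct",
    input={"prompt": "Write a haiku about ocean waves"},
):
    print(str(event), end="")
```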
For more information, see
["Streaming output"](https://replicate.com/docs/streaming)
in Replicate's docs.

## Run a model in the background

You can start a model and run it in the background using polling mode:

```python
>>> model = replicate.models.get("kvfrans/clipdraw")
>>> version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")
>>> prediction = replicate.predictions.create(
        version=version,
        input={"prompt": "Watercolor painting of an underwater submarine"},
        wait={"type": "poll"}  # Use polling instead of blocking
    )

>>> prediction
Prediction(...)

>>> prediction.reload()
>>> print(prediction.logs)
iteration: 30, render:loss: -1.3994140625

>>> prediction.status
'succeeded'

>>> output = prediction.output
>>> output.save("submarine.png")  # Save the output file
```
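
If you'd rather block until the prediction finishes instead of checking its status yourself, a minimal sketch using the prediction's `wait()` method:

```python
# Block until the prediction reaches a terminal state
prediction.wait()

print(prediction.status)  # e.g. 'succeeded'
```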

## Run a model in the background and get a webhook
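
You can ask Replicate to POST the completed prediction to a URL of your choosing. A minimal sketch, assuming the `webhook` and `webhook_events_filter` parameters of `replicate.predictions.create` and a hypothetical endpoint URL:

```python
import replicate

model = replicate.models.get("kvfrans/clipdraw")
version = model.versions.get("5797a99edc939ea0e9242d5e8c9cb3bc7d125b1eac21bda852e5cb79ede2cd9b")

# Replicate will POST the prediction to this URL when it completes
prediction = replicate.predictions.create(
    version=version,
    input={"prompt": "Watercolor painting of an underwater submarine"},
    webhook="https://example.com/your-webhook",  # hypothetical endpoint
    webhook_events_filter=["completed"],
)
```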

## Load output files

Model outputs that return files provide a `FileOutput` object with several methods for handling the data:

```python
import replicate
from PIL import Image
import io

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "wavy colorful abstract patterns, oceans"}
)

# Save directly to a file
output.save("output.png")

# Get the URL (may be a data URI for faster delivery)
url = output.url()

# Load into PIL Image
image_data = output.read()
image = Image.open(io.BytesIO(image_data))
```

## Stream file data

When working with file outputs, you can stream the data in chunks:

```python
import replicate

output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "an astronaut riding a horse"}
)

# Stream the file data in chunks
for chunk in output:
    process_chunk(chunk)  # Process each chunk of binary data (process_chunk is a placeholder)

# Or stream directly to a file
with open("astronaut.png", "wb") as f:
    for chunk in output:
        f.write(chunk)
```

This is particularly useful when working with web frameworks:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

import replicate

app = FastAPI()

@app.get("/generate")
async def generate_image():
    # Use the async client so the request doesn't block the event loop
    output = await replicate.async_run(
        "black-forest-labs/flux-schnell",
        input={"prompt": "an astronaut riding a horse"}
    )

    return StreamingResponse(
        output,
        media_type="image/png"
    )
```

## List models
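
A minimal sketch of listing publicly available models with `replicate.models.list()`:

```python
import replicate

# Fetch the first page of public models
models = replicate.models.list()

for model in models.results:
    print(f"{model.owner}/{model.name}")
```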
## Customize client behavior

The `replicate` package exports a default shared client.
This client is initialized with an API token
set by the `REPLICATE_API_TOKEN` environment variable.

You can create your own client instance to pass a different API token, add custom headers, or otherwise customize its behavior:

```python
import os
from replicate.client import Client

replicate = Client(
    api_token=os.environ["SOME_OTHER_REPLICATE_API_TOKEN"],
    headers={
        "User-Agent": "my-app/1.0"
    },

    # Control file output behavior
    use_file_output=True,  # Enable FileOutput objects (default: True)

    # Configure default wait behavior
    wait={
        "type": "block",    # Use blocking mode (default)
        "timeout": 60,      # Maximum time to hold the connection open
        "fallback": "poll"  # Fall back to polling if the timeout is reached
    }
)
```
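
Because this instance replaces the module-level `replicate` name, existing calls work unchanged. A short usage sketch with the custom client:

```python
output = replicate.run(
    "black-forest-labs/flux-schnell",
    input={"prompt": "a watercolor painting of a fox"}
)
print(output.url())
```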
