Running Docling as an API service.
- Learn how to configure the webserver
- Get to know all runtime options of the API
- Explore useful deployment examples
- And more
Note
Migration to the v1 API. Docling Serve now has a stable v1 API. Read more on the migration to v1.
Install the docling-serve package and run the server.
```shell
# Using the python package
pip install "docling-serve[ui]"
docling-serve run --enable-ui

# Using container images, e.g. with Podman
podman run -p 5001:5001 -e DOCLING_SERVE_ENABLE_UI=1 quay.io/docling-project/docling-serve
```

The server is available at:
- API http://127.0.0.1:5001
- API documentation http://127.0.0.1:5001/docs
- UI playground http://127.0.0.1:5001/ui
Try it out with a simple conversion:
```shell
curl -X 'POST' \
  'http://localhost:5001/v1/convert/source' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "sources": [{"kind": "http", "url": "https://arxiv.org/pdf/2501.17887"}]
  }'
```

The following container images are available for running Docling Serve with different hardware and PyTorch configurations:
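The same request can be issued from Python using only the standard library. The endpoint path and payload shape below are taken from the curl example above; the helper names are illustrative, and this is a minimal sketch rather than an official client:

```python
import json
import urllib.request


def build_payload(url: str) -> dict:
    # Same body as the curl example: one HTTP source to convert
    return {"sources": [{"kind": "http", "url": url}]}


def convert_source(base_url: str, url: str, timeout: float = 300.0) -> dict:
    # POST the payload to the v1 synchronous conversion endpoint
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/convert/source",
        data=json.dumps(build_payload(url)).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the server running locally, `convert_source("http://localhost:5001", "https://arxiv.org/pdf/2501.17887")` returns the parsed JSON response.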
| Image | Description | Architectures | Size |
|---|---|---|---|
| `ghcr.io/docling-project/docling-serve`<br>`quay.io/docling-project/docling-serve` | Base image with all packages installed from the official PyPI index. | linux/amd64, linux/arm64 | 4.4 GB (arm64)<br>8.7 GB (amd64) |
| `ghcr.io/docling-project/docling-serve-cpu`<br>`quay.io/docling-project/docling-serve-cpu` | CPU-only variant, using torch from the PyTorch CPU index. | linux/amd64, linux/arm64 | 4.4 GB |
| `ghcr.io/docling-project/docling-serve-cu128`<br>`quay.io/docling-project/docling-serve-cu128` | CUDA 12.8 build with torch from the cu128 index. | linux/amd64 | 11.4 GB |
| `ghcr.io/docling-project/docling-serve-cu130`<br>`quay.io/docling-project/docling-serve-cu130` | CUDA 13.0 build with torch from the cu130 index. | linux/amd64, linux/arm64 | TBD |
Important
CUDA Image Tagging Policy
CUDA-specific images (`-cu128`, `-cu130`) follow PyTorch's CUDA version support lifecycle and are tagged differently from the base images:

- Base images (`docling-serve`, `docling-serve-cpu`): tagged with `latest` and `main` for convenience
- CUDA images (`docling-serve-cu*`): only tagged with explicit versions (e.g., `1.12.0`) and `main`

Why? CUDA versions are deprecated over time as PyTorch adds support for newer CUDA releases. To avoid accidentally pulling a deprecated CUDA version, the CUDA images intentionally exclude the `latest` tag. Always use explicit version tags:
```shell
# ✅ Recommended: explicit version
docker pull quay.io/docling-project/docling-serve-cu130:1.12.0

# ❌ Not available for CUDA images
docker pull quay.io/docling-project/docling-serve-cu130:latest
```

An image for AMD ROCm 6.3 (`docling-serve-rocm`) is supported but not published due to its large size.
To build it locally:
```shell
git clone --branch main git@github.com:docling-project/docling-serve.git
cd docling-serve/
make docling-serve-rocm-image
```

For deployment using Docker Compose, see docs/deployment.md.
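As a starting point, a minimal Compose file might look like the sketch below. The image name, port 5001, and the `DOCLING_SERVE_ENABLE_UI` variable come from the Podman command earlier in this document; the service name is an assumption, and docs/deployment.md remains the authoritative configuration:

```yaml
services:
  docling-serve:
    image: quay.io/docling-project/docling-serve
    ports:
      - "5001:5001"
    environment:
      DOCLING_SERVE_ENABLE_UI: "1"
```

Running `docker compose up` with this file exposes the API and UI at the same addresses listed above.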
Coming soon: docling-serve-slim images will reduce the size by skipping the model weights download.
An easy-to-use UI is available at the /ui endpoint.
Please feel free to connect with us using the discussion section.
Please read Contributing to Docling Serve for details.
If you use Docling in your projects, please consider citing the following:
```bibtex
@techreport{Docling,
  author = {Docling Contributors},
  month = {1},
  title = {Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion},
  url = {https://arxiv.org/abs/2501.17887},
  eprint = {2501.17887},
  doi = {10.48550/arXiv.2501.17887},
  version = {2.0.0},
  year = {2025}
}
```

The Docling Serve codebase is under MIT license.
Docling has been brought to you by IBM.


