Running a distributed job processing documents with Docling.
Note: This is an unstable draft implementation which will quickly evolve.
Make sure your Ray cluster has docling-jobkit installed, then submit the job.
ray job submit --no-wait --working-dir . --runtime-env runtime_env.yml -- docling-ray-job
- Create a file runtime_env.yml:
    # Expected environment if a clean Ray image is used. Take into account that a Ray worker can time out before it finishes installing modules.
    pip:
      - docling-jobkit
- Submit the job using the custom runtime env:
    ray job submit --no-wait --runtime-env runtime_env.yml -- docling-ray-job
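The same submission can also be sketched with Ray's Python Job Submission SDK instead of the CLI. This is a minimal sketch, not part of docling-jobkit: the helper names `build_runtime_env` and `submit_docling_job` are hypothetical, and the dashboard address is assumed to be Ray's default local one (`http://127.0.0.1:8265`). Adjust both to your cluster.

```python
"""Sketch: submit docling-ray-job through Ray's Python Job Submission SDK."""


def build_runtime_env(packages=("docling-jobkit",)):
    """Return the dict equivalent of runtime_env.yml (a pip package list)."""
    return {"pip": list(packages)}


def submit_docling_job(address="http://127.0.0.1:8265"):
    """Submit the job and return its submission ID without waiting for it.

    `address` is an assumption: the default Ray dashboard address of a
    locally running cluster.
    """
    # Imported lazily so build_runtime_env() works without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    return client.submit_job(
        entrypoint="docling-ray-job",
        runtime_env=build_runtime_env(),
    )
```

Calling `submit_docling_job()` against a running cluster returns a submission ID, which can then be passed to `ray job status` or `ray job logs` to follow the job. As with the YAML file, installing packages at submission time may be slow on a clean image.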
More examples and customization are provided in docs/ray-job/.
Coming soon. Initial instruction from OpenShift AI docs.
Please feel free to connect with us using the discussion section of the main Docling repository.
Please read Contributing to Docling Serve for details.
If you use Docling in your projects, please consider citing the following:
@techreport{Docling,
author = {Deep Search Team},
month = {1},
title = {Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion},
url = {https://arxiv.org/abs/2501.17887},
eprint = {2501.17887},
doi = {10.48550/arXiv.2501.17887},
version = {2.0.0},
year = {2025}
}
The Docling Serve codebase is under MIT license.
Docling is hosted as a project in the LF AI & Data Foundation.
The project was started by the AI for Knowledge team at IBM Research Zurich.