feat(docker): download model into docker image #8

Open: wants to merge 2 commits into base: master
89 changes: 89 additions & 0 deletions .dockerignore
@@ -0,0 +1,89 @@
# Git
.git
.gitignore
.gitattributes
.github

# CI
.codeclimate.yml
.travis.yml
.taskcluster.yml

# Docker
docker-compose.yml
Dockerfile
.docker
.dockerignore

# Byte-compiled / optimized / DLL files
**/__pycache__/
**/*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.cache
nosetests.xml
coverage.xml

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Virtual environment
.env
.venv/
venv/

# PyCharm
.idea

# Python mode for VIM
.ropeproject
**/.ropeproject

# Vim swap files
**/*.swp

# VS Code
.vscode/
17 changes: 6 additions & 11 deletions Dockerfile
@@ -1,16 +1,8 @@
-FROM nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
+FROM pytorch/pytorch:2.3.1-cuda11.8-cudnn8-runtime

# Update package lists and install necessary packages
RUN apt-get update \
-&& apt-get install -y git python3 python3-pip libgl1 libglib2.0-0 curl \
-&& ln -s /usr/bin/python3 /usr/bin/python \
-&& curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash \
-&& apt-get install -y git-lfs \
-&& git lfs install \
-&& rm -rf /var/lib/apt/lists/*
-
-# Install PyTorch and related packages
-RUN pip3 install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
+&& apt-get install -y ffmpeg libsm6 libxext6

# Assuming you have a setup.py or requirements.txt in your project directory
COPY . /app
@@ -19,8 +11,11 @@ WORKDIR /app
# Install your Python package (assuming it has a setup.py)
RUN pip3 install --no-cache-dir -e .

+# Download models used by convert
+RUN python -c 'from marker.models import load_all_models; load_all_models()'
+
# Expose the desired port
EXPOSE 8000

# Run the server
-CMD ["python", "server.py", "--host", "0.0.0.0" ,"--port", "8000"]
+CMD ["uvicorn", "server:app", "--host", "0.0.0.0" ,"--port", "8000", "--workers", "4"]
3 changes: 1 addition & 2 deletions examples/invoke.py
@@ -1,6 +1,5 @@
import os
import requests
-from PIL import Image
import base64
import argparse

@@ -68,4 +67,4 @@ def convert_pdf_to_markdown_and_save(pdf_file_paths, output_folder, server_url):

"""
python invoke.py --server_url http://127.0.0.1:8000/convert --filename test1.pdf test2.pdf --output output
-"""
+"""
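
For reference, the command line in the usage string of examples/invoke.py could be parsed roughly as sketched below. Only the flag names (`--server_url`, `--filename`, `--output`) come from that usage string. The `build_parser` helper, the help texts, and the choice of required versus optional flags are assumptions for illustration, not the script's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the invoke.py usage string."""
    parser = argparse.ArgumentParser(description="Send PDFs to a conversion server.")
    # Full URL of the convert endpoint, e.g. http://127.0.0.1:8000/convert
    parser.add_argument("--server_url", required=True, help="URL of the convert endpoint")
    # One or more PDF paths, as in "--filename test1.pdf test2.pdf"
    parser.add_argument("--filename", nargs="+", required=True, help="PDF files to convert")
    # Folder that receives the results (making it optional is an assumption)
    parser.add_argument("--output", default="output", help="output folder")
    return parser


# Parse the same arguments that the usage string shows.
args = build_parser().parse_args(
    [
        "--server_url", "http://127.0.0.1:8000/convert",
        "--filename", "test1.pdf", "test2.pdf",
        "--output", "output",
    ]
)
print(args.server_url, args.filename, args.output)
```

`nargs="+"` is what lets `--filename` take several paths in one invocation, which matches the two-file example in the usage string.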