This repository was archived by the owner on Aug 7, 2025. It is now read-only.

Commit 98eac4e

Merge branch 'master' into fix/dali_batch_input
2 parents: 67fa87e + 255a047

35 files changed: +317 −52 lines

.github/workflows/docker-nightly-build.yml

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,9 +1,9 @@
 name: Push Docker Nightly

 on:
-  # run every day at 11:15am
+  # run every day at 1:15pm
   schedule:
-    - cron: '15 11 * * *'
+    - cron: '15 13 * * *'
 jobs:
   nightly:
     runs-on: ubuntu-20.04
```
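The schedule change above moves the nightly push from 11:15 to 13:15 UTC. A minimal, illustrative matcher (numbers and `*` only, which is far less than real cron or GitHub Actions supports) shows how the five cron fields map onto a timestamp:

```python
from datetime import datetime

def cron_matches(expr: str, when: datetime) -> bool:
    """Check a simple 5-field cron expression (numeric values and '*' only)."""
    minute, hour, dom, month, dow = expr.split()
    checks = [
        (minute, when.minute),
        (hour, when.hour),
        (dom, when.day),
        (month, when.month),
        # cron counts 0 = Sunday; Python's weekday() counts 0 = Monday
        (dow, (when.weekday() + 1) % 7),
    ]
    for field, value in checks:
        if field != "*" and int(field) != value:
            return False
    return True

# The updated nightly schedule '15 13 * * *' fires at 13:15 UTC:
assert cron_matches("15 13 * * *", datetime(2023, 5, 1, 13, 15))
assert not cron_matches("15 13 * * *", datetime(2023, 5, 1, 11, 15))
```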

(new workflow file)

Lines changed: 55 additions & 0 deletions

```yaml
name: Run Regression Tests on Docker

on:
  # run every day at 5:15am
  schedule:
    - cron: '15 5 * * *'

concurrency:
  group: ci-cpu-${{ github.workflow }}-${{ github.ref == 'refs/heads/master' && github.run_number || github.ref }}
  cancel-in-progress: true

jobs:
  docker-regression:
    strategy:
      fail-fast: false
      matrix:
        hardware: [ubuntu-20.04, [self-hosted, regression-test-gpu]]
    runs-on:
      - ${{ matrix.hardware }}
    steps:
      - name: Clean up previous run
        run: |
          echo "Cleaning up previous run"
          ls -la ./
          sudo rm -rf ./* || true
          sudo rm -rf ./.??* || true
          ls -la ./
          docker system prune -f
      - name: Checkout TorchServe
        uses: actions/checkout@v3
      - name: Branch name
        run: |
          echo $GITHUB_REF_NAME
      - name: Build CPU Docker Image
        if: contains(matrix.hardware, 'ubuntu')
        run: |
          cd docker
          ./build_image.sh -bt ci -n -b $GITHUB_REF_NAME -t pytorch/torchserve:ci
      - name: Build GPU Docker Image
        if: false == contains(matrix.hardware, 'ubuntu')
        run: |
          cd docker
          ./build_image.sh -g -cv cu117 -bt ci -n -b $GITHUB_REF_NAME -t pytorch/torchserve:ci
      - name: Torchserve GPU Regression Tests
        if: false == contains(matrix.hardware, 'ubuntu')
        run: |
          docker run --gpus all -v $GITHUB_WORKSPACE:/home/serve pytorch/torchserve:ci
      - name: Torchserve CPU Regression Tests
        if: contains(matrix.hardware, 'ubuntu')
        run: |
          docker run -v $GITHUB_WORKSPACE:/home/serve pytorch/torchserve:ci
      - name: Cleanup Docker Images
        if: success()
        run: |
          docker system prune -f && docker rmi pytorch/torchserve:ci
```
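The `concurrency.group` key above uses the common GitHub Actions `cond && x || y` expression idiom: master runs get a group keyed by `run_number` (unique per run, so they are never cancelled), while branch runs share a group keyed by `ref` (so a new push cancels the in-flight run). A small Python sketch (function name and inputs are illustrative) models that grouping:

```python
def concurrency_group(workflow: str, ref: str, run_number: int) -> str:
    # GitHub's `cond && x || y` expression behaves like `x if cond else y`
    # as long as x is truthy, which run_number always is.
    suffix = run_number if ref == "refs/heads/master" else ref
    return f"ci-cpu-{workflow}-{suffix}"

# Master runs are keyed by run number, branch runs by ref:
print(concurrency_group("docker-regression", "refs/heads/master", 42))
print(concurrency_group("docker-regression", "refs/heads/fix/dali_batch_input", 42))
```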

.github/workflows/regression_tests_gpu.yml

Lines changed: 4 additions & 5 deletions

```diff
@@ -20,11 +20,10 @@ jobs:
       - name: Clean up previous run
         run: |
           echo "Cleaning up previous run"
-          cd $RUNNER_WORKSPACE
-          pwd
-          cd ..
-          pwd
-          rm -rf _tool
+          ls -la ./
+          sudo rm -rf ./* || true
+          sudo rm -rf ./.??* || true
+          ls -la ./
       - name: Update git
         run: sudo add-apt-repository ppa:git-core/ppa -y && sudo apt-get update && sudo apt-get install git -y
       - name: Check git version
```
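The new cleanup step pairs two shell globs: `./*` skips dotfiles, and `./.??*` catches hidden entries at least three characters long, which conveniently can never match the special entries `.` and `..` (though a two-character name like `.x` escapes both). A sketch using `fnmatch` illustrates the selection; since `fnmatch` does not special-case leading dots the way shell globbing does, that rule is modeled explicitly:

```python
import fnmatch

entries = ["workspace", "file.txt", ".git", ".env", ".", ".."]

# Shell `*` never matches names starting with a dot; model that rule by hand.
plain = [e for e in entries if fnmatch.fnmatch(e, "*") and not e.startswith(".")]
# `.??*` requires a dot plus at least two more characters, so `.` and `..`
# can never match.
hidden = [e for e in entries if fnmatch.fnmatch(e, ".??*")]

print(plain)   # ['workspace', 'file.txt']
print(hidden)  # ['.git', '.env']
```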

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -20,9 +20,9 @@ Your contributions will fall into two categories:

     For GPU
     ```bash
-    python ts_scripts/install_dependencies.py --environment=dev --cuda=cu102
+    python ts_scripts/install_dependencies.py --environment=dev --cuda=cu118
     ```
-    > Supported cuda versions as cu117, cu116, cu113, cu111, cu102, cu101, cu92
+    > Supported cuda versions as cu118, cu117, cu116, cu113, cu111, cu102, cu101, cu92
 - Install `pre-commit` to your Git flow:
     ```bash
     pre-commit install
````

README.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -72,7 +72,7 @@ Refer to [torchserve docker](docker/README.md) for details.


 ## 🏆 Highlighted Examples
-* [🤗 HuggingFace Transformers](examples/Huggingface_Transformers) with a [Better Transformer Integration](examples/Huggingface_Transformers#Speed-up-inference-with-Better-Transformer)
+* [🤗 HuggingFace Transformers](examples/Huggingface_Transformers) with a [Better Transformer Integration/ Flash Attention & Xformer Memory Efficient ](examples/Huggingface_Transformers#Speed-up-inference-with-Better-Transformer)
 * [Model parallel inference](examples/Huggingface_Transformers#model-parallelism)
 * [MultiModal models with MMF](https://github.com/pytorch/serve/tree/master/examples/MMF-activity-recognition) combining text, audio and video
 * [Dual Neural Machine Translation](examples/Workflows/nmt_transformers_pipeline) for a complex workflow DAG
@@ -101,7 +101,7 @@ To learn more about how to contribute, see the contributor guide [here](https://
 * [Grokking Intel CPU PyTorch performance from first principles( Part 2): a TorchServe case study](https://pytorch.org/tutorials/intermediate/torchserve_with_ipex_2.html)
 * [Case Study: Amazon Ads Uses PyTorch and AWS Inferentia to Scale Models for Ads Processing](https://pytorch.org/blog/amazon-ads-case-study/)
 * [Optimize your inference jobs using dynamic batch inference with TorchServe on Amazon SageMaker](https://aws.amazon.com/blogs/machine-learning/optimize-your-inference-jobs-using-dynamic-batch-inference-with-torchserve-on-amazon-sagemaker/)
-* [Using AI to bring children's drawings to life](https://ai.facebook.com/blog/using-ai-to-bring-childrens-drawings-to-life/)
+* [Using AI to bring children's drawings to life](https://ai.meta.com/blog/using-ai-to-bring-childrens-drawings-to-life/)
 * [🎥 Model Serving in PyTorch](https://www.youtube.com/watch?v=2A17ZtycsPw)
 * [Evolution of Cresta's machine learning architecture: Migration to AWS and PyTorch](https://aws.amazon.com/blogs/machine-learning/evolution-of-crestas-machine-learning-architecture-migration-to-aws-and-pytorch/)
 * [🎥 Explain Like I’m 5: TorchServe](https://www.youtube.com/watch?v=NEdZbkfHQCk)
```

benchmarks/auto_benchmark.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -149,7 +149,7 @@ def install_torchserve(skip_ts_install, hw, ts_version):

     # install_dependencies.py
     if hw == "gpu":
-        cmd = "python ts_scripts/install_dependencies.py --environment dev --cuda cu117"
+        cmd = "python ts_scripts/install_dependencies.py --environment dev --cuda cu118"
     elif hw == "neuronx":
         cmd = "python ts_scripts/install_dependencies.py --environment dev --neuronx"
     else:
```

benchmarks/benchmark-ab.py

Lines changed: 4 additions & 1 deletion

```diff
@@ -659,7 +659,10 @@ def plot_line(fig, data, color="blue", title=None):
         title="Combined Graph",
     )
     fig5.grid()
-    plt.savefig("api-profile1.png", bbox_inches="tight")
+    plt.savefig(
+        f"{execution_params['report_location']}/benchmark/api-profile1.png",
+        bbox_inches="tight",
+    )


 def stop_torchserve():
```
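The `savefig` change above writes the plot under the configured report location rather than the current working directory. A sketch of that path construction (the `execution_params` value here is a made-up stand-in, and the directory must exist before `savefig` can write into it):

```python
from pathlib import Path

# Hypothetical stand-in for benchmark-ab.py's execution_params dict.
execution_params = {"report_location": "/tmp/ts_benchmark"}

out_dir = Path(execution_params["report_location"]) / "benchmark"
out_dir.mkdir(parents=True, exist_ok=True)  # savefig fails on a missing directory
out_file = out_dir / "api-profile1.png"
print(out_file)  # /tmp/ts_benchmark/benchmark/api-profile1.png
```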

docker/Dockerfile

Lines changed: 53 additions & 2 deletions

```diff
@@ -30,6 +30,7 @@ ARG PYTHON_VERSION=3.9
 FROM ${BASE_IMAGE} AS compile-image
 ARG BASE_IMAGE=ubuntu:rolling
 ARG PYTHON_VERSION
+ARG BUILD_NIGHTLY
 ENV PYTHONUNBUFFERED TRUE

 RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt \
@@ -82,10 +83,15 @@ RUN \
     fi

 # Make sure latest version of torchserve is uploaded before running this
-RUN python -m pip install --no-cache-dir torchserve torch-model-archiver torch-workflow-archiver
+RUN \
+    if echo "$BUILD_NIGHTLY" | grep -q "false"; then \
+        python -m pip install --no-cache-dir torchserve torch-model-archiver torch-workflow-archiver;\
+    else \
+        python -m pip install --no-cache-dir torchserve-nightly torch-model-archiver-nightly torch-workflow-archiver-nightly;\
+    fi

 # Final image for production
-FROM ${BASE_IMAGE} AS runtime-image
+FROM ${BASE_IMAGE} AS production-image
 # Re-state ARG PYTHON_VERSION to make it active in this build-stage (uses default define at the top)
 ARG PYTHON_VERSION
 ENV PYTHONUNBUFFERED TRUE
@@ -130,3 +136,48 @@ WORKDIR /home/model-server
 ENV TEMP=/home/model-server/tmp
 ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]
 CMD ["serve"]
+
+# Final image for docker regression
+FROM ${BASE_IMAGE} AS ci-image
+# Re-state ARG PYTHON_VERSION to make it active in this build-stage (uses default define at the top)
+ARG PYTHON_VERSION
+ARG BRANCH_NAME
+ENV PYTHONUNBUFFERED TRUE
+
+RUN --mount=type=cache,target=/var/cache/apt \
+    apt-get update && \
+    apt-get upgrade -y && \
+    apt-get install software-properties-common -y && \
+    add-apt-repository -y ppa:deadsnakes/ppa && \
+    apt remove python-pip python3-pip && \
+    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
+    python$PYTHON_VERSION \
+    python3-distutils \
+    python$PYTHON_VERSION-dev \
+    python$PYTHON_VERSION-venv \
+# using openjdk-17-jdk due to circular dependency(ca-certificates) bug in openjdk-17-jre-headless debian package
+# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009905
+    openjdk-17-jdk \
+    build-essential \
+    wget \
+    numactl \
+    nodejs \
+    npm \
+    zip \
+    unzip \
+    && npm install -g newman newman-reporter-htmlextra markdown-link-check \
+    && rm -rf /var/lib/apt/lists/* \
+    && cd /tmp
+
+
+COPY --from=compile-image /home/venv /home/venv
+
+ENV PATH="/home/venv/bin:$PATH"
+
+RUN python -m pip install --no-cache-dir -r https://raw.githubusercontent.com/pytorch/serve/$BRANCH_NAME/requirements/developer.txt
+
+RUN mkdir /home/serve
+ENV TS_RUN_IN_DOCKER True
+
+WORKDIR /home/serve
+CMD ["python", "test/regression_tests.py"]
```
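The release-vs-nightly switch in the `compile-image` stage hinges on `echo "$BUILD_NIGHTLY" | grep -q "false"`: the test succeeds only when the build arg contains the literal string `false`, so an unset or empty arg falls through to the nightly wheels. A Python model of that string test (package names are from the diff; the function itself is illustrative):

```python
def pick_packages(build_nightly: str) -> list[str]:
    # Mirrors `grep -q "false"`: a substring match, so "" or "true" both
    # select the nightly branch.
    if "false" in build_nightly:
        return ["torchserve", "torch-model-archiver", "torch-workflow-archiver"]
    return [
        "torchserve-nightly",
        "torch-model-archiver-nightly",
        "torch-workflow-archiver-nightly",
    ]

print(pick_packages("false")[0])  # torchserve
print(pick_packages("true")[0])   # torchserve-nightly
print(pick_packages("")[0])       # torchserve-nightly (unset arg defaults to nightly)
```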

docker/README.md

Lines changed: 4 additions & 3 deletions

````diff
@@ -28,16 +28,17 @@ cd serve/docker

 # Create TorchServe docker image

-Use `build_image.sh` script to build the docker images. The script builds the `production`, `dev` and `codebuild` docker images.
+Use `build_image.sh` script to build the docker images. The script builds the `production`, `dev` , `ci` and `codebuild` docker images.
 | Parameter | Description |
 |------|------|
 |-h, --help|Show script help|
 |-b, --branch_name|Specify a branch name to use. Default: master |
 |-g, --gpu|Build image with GPU based ubuntu base image|
-|-bt, --buildtype|Which type of docker image to build. Can be one of : production, dev, codebuild|
+|-bt, --buildtype|Which type of docker image to build. Can be one of : production, dev, ci, codebuild|
 |-t, --tag|Tag name for image. If not specified, script uses torchserve default tag names.|
 |-cv, --cudaversion| Specify to cuda version to use. Supported values `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`. Default `cu117`|
 |-ipex, --build-with-ipex| Specify to build with intel_extension_for_pytorch. If not specified, script builds without intel_extension_for_pytorch.|
+|-n, --nightly| Specify to build with TorchServe nightly.|
 |--codebuild| Set if you need [AWS CodeBuild](https://aws.amazon.com/codebuild/)|
 |-py, --pythonversion| Specify the python version to use. Supported values `3.8`, `3.9`, `3.10`. Default `3.9`|

@@ -52,7 +53,7 @@ Creates a docker image with publicly available `torchserve` and `torch-model-arc
 ./build_image.sh
 ```

-- To create a GPU based image with cuda 10.2. Options are `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`
+- To create a GPU based image with cuda 10.2. Options are `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`

 ```bash
 ./build_image.sh -g -cv cu102
```
````

docker/build_image.sh

Lines changed: 11 additions & 2 deletions

```diff
@@ -11,6 +11,7 @@ USE_CUSTOM_TAG=false
 CUDA_VERSION=""
 USE_LOCAL_SERVE_FOLDER=false
 BUILD_WITH_IPEX=false
+BUILD_NIGHTLY=false
 PYTHON_VERSION=3.9

 for arg in "$@"
@@ -27,6 +28,7 @@ do
           echo "-lf, --use-local-serve-folder specify this option for the benchmark image if the current 'serve' folder should be used during automated benchmarks"
           echo "-ipex, --build-with-ipex specify to build with intel_extension_for_pytorch"
           echo "-py, --pythonversion specify to python version to use: Possible values: 3.8 3.9 3.10"
+          echo "-n, --nightly specify to build with TorchServe nightly"
           exit 0
           ;;
         -b|--branch_name)
@@ -43,7 +45,7 @@ do
         -g|--gpu)
           MACHINE=gpu
           DOCKER_TAG="pytorch/torchserve:latest-gpu"
-          BASE_IMAGE="nvidia/cuda:11.7.1-base-ubuntu20.04"
+          BASE_IMAGE="nvidia/cuda:11.8.0-base-ubuntu20.04"
           CUDA_VERSION="cu117"
           shift
           ;;
@@ -66,6 +68,10 @@ do
           BUILD_WITH_IPEX=true
           shift
           ;;
+        -n|--nightly)
+          BUILD_NIGHTLY=true
+          shift
+          ;;
         -py|--pythonversion)
           PYTHON_VERSION="$2"
           if [[ $PYTHON_VERSION = 3.8 || $PYTHON_VERSION = 3.9 || $PYTHON_VERSION = 3.10 ]]; then
@@ -137,7 +143,10 @@ fi

 if [ "${BUILD_TYPE}" == "production" ]
 then
-  DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" -t "${DOCKER_TAG}" .
+  DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" --build-arg BUILD_NIGHTLY="${BUILD_NIGHTLY}" -t "${DOCKER_TAG}" --target production-image .
+elif [ "${BUILD_TYPE}" == "ci" ]
+then
+  DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" --build-arg BUILD_NIGHTLY="${BUILD_NIGHTLY}" --build-arg BRANCH_NAME="${BRANCH_NAME}" -t "${DOCKER_TAG}" --target ci-image .
 elif [ "${BUILD_TYPE}" == "benchmark" ]
 then
   DOCKER_BUILDKIT=1 docker build --pull --no-cache --file Dockerfile.benchmark --build-arg USE_LOCAL_SERVE_FOLDER=$USE_LOCAL_SERVE_FOLDER --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg BRANCH_NAME="${BRANCH_NAME}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg MACHINE_TYPE="${MACHINE}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" -t "${DOCKER_TAG}" .
```
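The new `-n` flag threads through the argument loop into the `docker build` command as a `--build-arg`, and the build type selects the Dockerfile stage via `--target`. A simplified Python model of that flag handling (the real script uses a shell `for`/`case` loop; this function and its output format are illustrative):

```python
def parse_build_flags(argv: list[str]) -> dict:
    """Simplified model of build_image.sh flag parsing: -n/--nightly flips a
    boolean, -bt/--buildtype consumes a value; unknown args are skipped."""
    opts = {"build_nightly": False, "build_type": "production"}
    i = 0
    while i < len(argv):
        if argv[i] in ("-n", "--nightly"):
            opts["build_nightly"] = True
            i += 1
        elif argv[i] in ("-bt", "--buildtype"):
            opts["build_type"] = argv[i + 1]
            i += 2
        else:
            i += 1
    return opts

# The CI workflow invokes `./build_image.sh -bt ci -n`:
opts = parse_build_flags(["-bt", "ci", "-n"])
print(
    f"--build-arg BUILD_NIGHTLY={str(opts['build_nightly']).lower()} "
    f"--target {opts['build_type']}-image"
)
```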
