Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom container composition script #3039

Merged
merged 48 commits into from
Jul 23, 2021
Merged

Custom container composition script #3039

merged 48 commits into from
Jul 23, 2021

Conversation

jbkyang-nvi
Copy link
Contributor

@jbkyang-nvi jbkyang-nvi commented Jun 23, 2021

Make compose.py so users can compose custom containers with different backends themselves.
Gitlab test PR: https://gitlab-master.nvidia.com/dl/dgx/tritonserver/-/merge_requests/327

@deadeyegoodwin deadeyegoodwin changed the title Kyang make compose.py Custom container composition script Jun 23, 2021
compose.py Outdated Show resolved Hide resolved
compose.py Show resolved Hide resolved
qa/L0_infer/infer_test.py Outdated Show resolved Hide resolved
compose.py Outdated
@@ -0,0 +1,305 @@
#!/usr/bin/env python3
# Copyright (c) 2020-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably need to remove the "(c)" so the formatter doesn't wrap line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't wrap line for # started comments. I've removed the (c) for consistency

compose.py Outdated
required=False,
help='Enable verbose output.')
parser.add_argument(
'--build-name',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--output-name the name of the generated container, default is 'tritonserver'

Note that there is no "building" happening so we should avoid using that terminology as it can be confusing with a source build performed by build.py.

compose.py Outdated
'Build name. Default is "tritonserver_build".'
)
parser.add_argument(
'--build-dir',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--work-dir we don't do any clones or builds so the description should say something about "generated dockerfiles are placed here". Should default to the current directory.

compose.py Outdated
FLAGS.upstream_container_version = TRITON_VERSION_MAP[
FLAGS.version][1]

dockerfile_name = 'Dockerfile.buildbase'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dockerfile.compose

compose.py Outdated
with open(os.path.join(ddir, dockerfile_name), "a") as dfile:
dfile.write(df)

### create build container
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove build terminology

docs/compose.md Outdated
nvcr.io/nvidia/tritonserver:<xx.yy>-py3-min
nvcr.io/nvidia/tritonserver:<xx.yy>-py3
```
to build the new custom container. The resulting container version will be `xx.yy` as it inherits from the pre-mentioned builds.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"new custom Triton Docker image".

I would drop "The resulting ..." sentence

docs/compose.md Outdated
```
to build the new custom container. The resulting container version will be `xx.yy` as it inherits from the pre-mentioned builds.

### Customizing backends and repo-agents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not create a new section here but just continue the previous paragraph...
"compose.py provides --backend and --repoagent options that allow you to specify which backends and repository agents you want to include in the custom image. The --enable-gpu flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent.

docs/compose.md Outdated

Example:
```
python3 compose.py --build-dir . --backend tensorflow1 --backend tensorflow2 --repoagent checksum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add --enable-gpu and remove --build-dir

docs/compose.md Outdated Show resolved Hide resolved
qa/L0_infer/infer_test.py Outdated Show resolved Hide resolved
compose.py Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
build.py Show resolved Hide resolved
build.py Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
build.py Outdated Show resolved Hide resolved
build.py Outdated
ENV NVIDIA_BUILD_ID {}
LABEL com.nvidia.build.id={}
LABEL com.nvidia.build.ref={}
'''.format(argmap['NVIDIA_BUILD_ID'], argmap['NVIDIA_BUILD_ID'],
argmap['NVIDIA_BUILD_REF'])

# Add feature labels for SageMaker endpoint
# Add feature labels for SageMaker endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

build.py Outdated


def dockerfile_add_installation_linux(argmap, backends, endpoints):
df='''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add description:
"""Common steps for production docker image, shared by build.py and compose.py"""

compose.py Outdated
import docker

### Global variables
TRITON_VERSION_MAP = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove, you should use this from build.py when needed

compose.py Show resolved Hide resolved
compose.py Show resolved Hide resolved
docs/compose.md Outdated
```
to create the new custom Triton Docker image.

`compose.py` provides `--container-version`, `--backend`, `--repoagent` and `--endpoint` options that allow you to specify which upstream container to use and which backends, repository agents and endpoints to include in the custom image. The `--enable-gpu` flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove --container-version

docs/compose.md Outdated
nvcr.io/nvidia/tritonserver:<xx.yy>-py3-min
nvcr.io/nvidia/tritonserver:<xx.yy>-py3
```
to create the new custom Triton Docker image.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

describe here how you need to "git clone" the release branch of the server repo that corresponds to the docker container release that you want to use to compose a new image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be easier to ask the user specify the upstream container version and if they don't specify, then fallback to branch specific version? This way, people don't have to clone the entire repo just for this script

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make --container-version optional. The potential problem is that the branch they run from (e.g. "main") may have changes in compose.py that are not valid for the release container they target. So the default should be to work using TRITON_VERSION but it is ok to have --container-version flag to override if the user knows what they are doing.

build.py Show resolved Hide resolved
compose.py Outdated Show resolved Hide resolved
build.py Outdated
dfile.write(df)


"""Common steps for production docker image, shared by build.py and compose.py"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In python function comments go after the def, so this should be after line 689.

build.py Outdated
@@ -742,9 +766,7 @@ def create_dockerfile_linux(ddir, dockerfile_name, argmap, backends, repoagents,
COPY --chown=1000:1000 --from=tritonserver_build /workspace/build/sagemaker/serve /usr/bin/.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no "tritonserver_build" when running from compose.py. So you likely need to move the sagemaker out of the common function and modify it as needed in compose.py.

compose.py Show resolved Hide resolved
compose.py Outdated
help=
'Generated dockerfiles are placed here. Default to current directory.')
parser.add_argument(
'--upstream-container-version',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just --container-version not upstream-container-version

docs/compose.md Outdated
following commands.
## Use the compose.py script

The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to help them know which branch to clone. They need to clone the release branch corresponding to the NGC release that they want to base their custom image on. For example they need to use r21.06 branch if they want to create an image based on the NGC 21.06 Triton release.

build.py Outdated
'2021.2.200', # ORT OpenVINO
'2021.2') # Standalone OpenVINO
'2.12.0dev': (
'21.06', # triton container
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

21.07dev

compose.py Outdated
'NVIDIA_BUILD_ID': p_build.stdout.rstrip(),
'TRITON_VERSION': version,
'TRITON_CONTAINER_VERSION': container_version,
'ENDPOINTS': f is not None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you name this "SAGEMAKER_ENDPOINT" to be more descriptive

compose.py Outdated
type=str,
required=False,
help=
'The version to use for the generated Docker image. If not specified the container version will be chosen automatically based on --version value.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no --version flag, so change desc to be "... will be chosen automatically based on the repository branch."

compose.py Outdated
'--dry-run',
action="store_true",
required=False,
help='Only create Dockerfile, does not create a container.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Only create Dockerfile.compose, do not build the Docker image"

docs/compose.md Outdated
cuDNN, etc. dependencies that are required to run Triton. The
\<xx.yy\>-py3 image contains the complete Triton with all options and
backends.
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.06](https://github.com/triton-inference-server/server/tree/r21.06) is needed to create a image based on the NGC 21.06 Triton release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is needed to create" -> "should be used to create"

docs/compose.md Outdated
customization currently available is limited. As a result the minimum
Triton still contains both HTTP/REST and GRPC endpoints; S3, GCS and
Azure Storage filesystem support; and the TensorRT backend.
`compose.py` provides `--backend`, `--repoagent` and `--endpoint` options that allow you to specify which backends, repository agents and endpoints to include in the custom image. The `--enable-gpu` flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove --endpoint

deadeyegoodwin
deadeyegoodwin previously approved these changes Jun 30, 2021
GuanLuo
GuanLuo previously approved these changes Jul 1, 2021
build.py Outdated Show resolved Hide resolved
docs/compose.md Outdated Show resolved Hide resolved
docs/compose.md Show resolved Hide resolved
qa/L0_infer/install_and_test.sh Outdated Show resolved Hide resolved
qa/L0_infer/install_and_test.sh Outdated Show resolved Hide resolved
CoderHam
CoderHam previously approved these changes Jul 22, 2021
deadeyegoodwin
deadeyegoodwin previously approved these changes Jul 22, 2021
CoderHam
CoderHam previously approved these changes Jul 22, 2021
@jbkyang-nvi jbkyang-nvi dismissed stale reviews from CoderHam and deadeyegoodwin via b0748af July 22, 2021 20:38
docs/compose.md Outdated
@@ -41,7 +41,7 @@ from source to get more exact customization.

## Use the compose.py script

The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.06](https://github.com/triton-inference-server/server/tree/r21.06) should be used to create a image based on the NGC 21.06 Triton release.
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.08](https://github.com/triton-inference-server/server/tree/r21.06) should be used to create a image based on the NGC 21.08 Triton release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arent the doc updates to reflect 21.08 does during the 21.08 codefreeze? These should be 21.07 for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not going to be merged in before the 21.07 release.. lets change it back to 21.06.

docs/compose.md Outdated
@@ -54,6 +54,7 @@ will provide a container `tritonserver` locally. You can access the container wi
$ docker run -it tritonserver:latest
```

Note: If `compose.py` is run on release versions older than r21.08, the resulting container will have DCGM version 2.2.8 installed. This may result in different GPU statistic reporting behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. This should be added during codefreeze. Additionally won't 21.07 have 2.2.3 with your latest changes?

dcgm_ver = re.search("DCGM_VERSION=([\S]{4,}) ", vars)
dcgm_version = ""
if dcgm_ver == None:
dcgm_version = "2.2.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this changed back to 2.2.3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the two versions that matter -- 21.06 and 21.07 will have 2.2.3, we should just use that for the older versions. 21.08 will have 2.2.8,

Copy link
Contributor Author

@jbkyang-nvi jbkyang-nvi Jul 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the buildbase is the only time we will need the patch anyway, so this should work (and additionally make the composed container have the same version of dcgm as 21.06 and 21.07).

@jbkyang-nvi jbkyang-nvi merged commit a3faff7 into main Jul 23, 2021
@jbkyang-nvi jbkyang-nvi deleted the kyang-make-container branch July 23, 2021 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants