-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom container composition script #3039
Conversation
compose.py
Outdated
@@ -0,0 +1,305 @@ | |||
#!/usr/bin/env python3 | |||
# Copyright (c) 2020-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to remove the "(c)" so the formatter doesn't wrap line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't wrap line for #
started comments. I've removed the (c)
for consistency
compose.py
Outdated
required=False, | ||
help='Enable verbose output.') | ||
parser.add_argument( | ||
'--build-name', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--output-name the name of the generated container, default is 'tritonserver'
Note that there is no "building" happening so we should avoid using that terminology as it can be confusing with a source build performed by build.py.
compose.py
Outdated
'Build name. Default is "tritonserver_build".' | ||
) | ||
parser.add_argument( | ||
'--build-dir', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--work-dir we don't do any clones or builds so the description should say something about "generated dockerfiles are placed here". Should default to the current directory.
compose.py
Outdated
FLAGS.upstream_container_version = TRITON_VERSION_MAP[ | ||
FLAGS.version][1] | ||
|
||
dockerfile_name = 'Dockerfile.buildbase' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dockerfile.compose
compose.py
Outdated
with open(os.path.join(ddir, dockerfile_name), "a") as dfile: | ||
dfile.write(df) | ||
|
||
### create build container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove build terminology
docs/compose.md
Outdated
nvcr.io/nvidia/tritonserver:<xx.yy>-py3-min | ||
nvcr.io/nvidia/tritonserver:<xx.yy>-py3 | ||
``` | ||
to build the new custom container. The resulting container version will be `xx.yy` as it inherits from the pre-mentioned builds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"new custom Triton Docker image".
I would drop "The resulting ..." sentence
docs/compose.md
Outdated
``` | ||
to build the new custom container. The resulting container version will be `xx.yy` as it inherits from the pre-mentioned builds. | ||
|
||
### Customizing backends and repo-agents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not create a new section here but just continue the previous paragraph...
"compose.py
provides --backend
and --repoagent
options that allow you to specify which backends and repository agents you want to include in the custom image. The --enable-gpu flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent.
docs/compose.md
Outdated
|
||
Example: | ||
``` | ||
python3 compose.py --build-dir . --backend tensorflow1 --backend tensorflow2 --repoagent checksum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add --enable-gpu and remove --build-dir
db0eb02
to
830816b
Compare
478f43e
to
5a73b4f
Compare
build.py
Outdated
ENV NVIDIA_BUILD_ID {} | ||
LABEL com.nvidia.build.id={} | ||
LABEL com.nvidia.build.ref={} | ||
'''.format(argmap['NVIDIA_BUILD_ID'], argmap['NVIDIA_BUILD_ID'], | ||
argmap['NVIDIA_BUILD_REF']) | ||
|
||
# Add feature labels for SageMaker endpoint | ||
# Add feature labels for SageMaker endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
build.py
Outdated
|
||
|
||
def dockerfile_add_installation_linux(argmap, backends, endpoints): | ||
df=''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add description:
"""Common steps for production docker image, shared by build.py and compose.py"""
compose.py
Outdated
import docker | ||
|
||
### Global variables | ||
TRITON_VERSION_MAP = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove, you should use this from build.py when needed
docs/compose.md
Outdated
``` | ||
to create the new custom Triton Docker image. | ||
|
||
`compose.py` provides `--container-version`, `--backend`, `--repoagent` and `--endpoint` options that allow you to specify which upstream container to use and which backends, repository agents and endpoints to include in the custom image. The `--enable-gpu` flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove --container-version
docs/compose.md
Outdated
nvcr.io/nvidia/tritonserver:<xx.yy>-py3-min | ||
nvcr.io/nvidia/tritonserver:<xx.yy>-py3 | ||
``` | ||
to create the new custom Triton Docker image. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
describe here how you need to "git clone" the release branch of the server repo that corresponds to the docker container release that you want to use to compose a new image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be easier to ask the user specify the upstream container version and if they don't specify, then fallback to branch specific version? This way, people don't have to clone the entire repo just for this script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make --container-version optional. The potential problem is that the branch they run from (e.g. "main") may have changes in compose.py that are not valid for the release container they target. So the default should be to work using TRITON_VERSION but it is ok to have --container-version flag to override if the user knows what they are doing.
build.py
Outdated
dfile.write(df) | ||
|
||
|
||
"""Common steps for production docker image, shared by build.py and compose.py""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In python function comments go after the def, so this should be after line 689.
build.py
Outdated
@@ -742,9 +766,7 @@ def create_dockerfile_linux(ddir, dockerfile_name, argmap, backends, repoagents, | |||
COPY --chown=1000:1000 --from=tritonserver_build /workspace/build/sagemaker/serve /usr/bin/. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no "tritonserver_build" when running from compose.py. So you likely need to move the sagemaker out of the common function and modify it as needed in compose.py.
compose.py
Outdated
help= | ||
'Generated dockerfiles are placed here. Default to current directory.') | ||
parser.add_argument( | ||
'--upstream-container-version', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just --container-version not upstream-container-version
docs/compose.md
Outdated
following commands. | ||
## Use the compose.py script | ||
|
||
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to help them know which branch to clone. They need to clone the release branch corresponding to the NGC release that they want to base their custom image on. For example they need to use r21.06 branch if they want to create an image based on the NGC 21.06 Triton release.
build.py
Outdated
'2021.2.200', # ORT OpenVINO | ||
'2021.2') # Standalone OpenVINO | ||
'2.12.0dev': ( | ||
'21.06', # triton container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
21.07dev
compose.py
Outdated
'NVIDIA_BUILD_ID': p_build.stdout.rstrip(), | ||
'TRITON_VERSION': version, | ||
'TRITON_CONTAINER_VERSION': container_version, | ||
'ENDPOINTS': f is not None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you name this "SAGEMAKER_ENDPOINT" to be more descriptive
compose.py
Outdated
type=str, | ||
required=False, | ||
help= | ||
'The version to use for the generated Docker image. If not specified the container version will be chosen automatically based on --version value.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no --version flag, so change desc to be "... will be chosen automatically based on the repository branch."
compose.py
Outdated
'--dry-run', | ||
action="store_true", | ||
required=False, | ||
help='Only create Dockerfile, does not create a container.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Only create Dockerfile.compose, do not build the Docker image"
docs/compose.md
Outdated
cuDNN, etc. dependencies that are required to run Triton. The | ||
\<xx.yy\>-py3 image contains the complete Triton with all options and | ||
backends. | ||
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.06](https://github.com/triton-inference-server/server/tree/r21.06) is needed to create a image based on the NGC 21.06 Triton release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is needed to create" -> "should be used to create"
docs/compose.md
Outdated
customization currently available is limited. As a result the minimum | ||
Triton still contains both HTTP/REST and GRPC endpoints; S3, GCS and | ||
Azure Storage filesystem support; and the TensorRT backend. | ||
`compose.py` provides `--backend`, `--repoagent` and `--endpoint` options that allow you to specify which backends, repository agents and endpoints to include in the custom image. The `--enable-gpu` flag indicates that you want to create an image that supports NVIDIA GPUs. For example, the following creates a new docker image that contains only the TensorFlow 1 and TensorFlow 2 backends and the checksum repository agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove --endpoint
2b73bef
to
e1eced4
Compare
b0748af
docs/compose.md
Outdated
@@ -41,7 +41,7 @@ from source to get more exact customization. | |||
|
|||
## Use the compose.py script | |||
|
|||
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.06](https://github.com/triton-inference-server/server/tree/r21.06) should be used to create a image based on the NGC 21.06 Triton release. | |||
The `compose.py` script can be found in the [server repository](https://github.com/triton-inference-server/server). Simply clone the repository and run `compose.py` to create a custom container. Note created container version will depend on the branch that was cloned. For example branch [r21.08](https://github.com/triton-inference-server/server/tree/r21.06) should be used to create a image based on the NGC 21.08 Triton release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arent the doc updates to reflect 21.08 does during the 21.08 codefreeze? These should be 21.07 for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not going to be merged in before the 21.07 release.. lets change it back to 21.06.
docs/compose.md
Outdated
@@ -54,6 +54,7 @@ will provide a container `tritonserver` locally. You can access the container wi | |||
$ docker run -it tritonserver:latest | |||
``` | |||
|
|||
Note: If `compose.py` is run on release versions older than r21.08, the resulting container will have DCGM version 2.2.8 installed. This may result in different GPU statistic reporting behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. This should be added during codefreeze. Additionally won't 21.07 have 2.2.3 with your latest changes?
dcgm_ver = re.search("DCGM_VERSION=([\S]{4,}) ", vars) | ||
dcgm_version = "" | ||
if dcgm_ver == None: | ||
dcgm_version = "2.2.3" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed back to 2.2.3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the two versions that matter -- 21.06 and 21.07 will have 2.2.3, we should just use that for the older versions. 21.08 will have 2.2.8,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the buildbase is the only time we will need the patch anyway, so this should work (and additionally make the composed container have the same version of dcgm as 21.06 and 21.07).
Make compose.py so users can compose custom containers with different backends themselves.
Gitlab test PR: https://gitlab-master.nvidia.com/dl/dgx/tritonserver/-/merge_requests/327