Custom container composition script #3039

Merged
merged 48 commits into from
Jul 23, 2021
41a2ee1
initial commit, able to specify backends
jbkyang-nvi Jun 16, 2021
5cf56a6
testing
jbkyang-nvi Jun 22, 2021
1f58666
testing
jbkyang-nvi Jun 22, 2021
5790f4f
another test
jbkyang-nvi Jun 23, 2021
d76bd6d
working version of unit test. Need to fix reliance on identity backend
jbkyang-nvi Jun 23, 2021
bdbce38
changed version for testing
jbkyang-nvi Jun 23, 2021
12c24f7
fixing some tests
jbkyang-nvi Jun 24, 2021
c1cdc3d
addressed comments
jbkyang-nvi Jun 24, 2021
e1461b4
fixing more things
jbkyang-nvi Jun 24, 2021
fb9fa2f
return infer test to original status
jbkyang-nvi Jun 24, 2021
21cd7b4
remove extra space
jbkyang-nvi Jun 24, 2021
4a4d49a
addressed comments
jbkyang-nvi Jun 25, 2021
ad0a15b
removed global variables for backend and repoagent
jbkyang-nvi Jun 25, 2021
a4734c9
addressed comments
jbkyang-nvi Jun 29, 2021
d04a225
addressed comments
jbkyang-nvi Jun 29, 2021
970264e
changed docs
jbkyang-nvi Jun 29, 2021
151924c
addressed comments
jbkyang-nvi Jun 30, 2021
f25aaf5
fix search pattern
jbkyang-nvi Jun 30, 2021
4a291ed
added dry run and updated docs
jbkyang-nvi Jun 30, 2021
e2b16d3
addressed comments
jbkyang-nvi Jun 30, 2021
d42f956
addressed comments
jbkyang-nvi Jul 1, 2021
080faea
addressed documentation comments
jbkyang-nvi Jul 2, 2021
b34a725
fix build.py
jbkyang-nvi Jul 8, 2021
048bfd1
add workdir before copy
jbkyang-nvi Jul 8, 2021
dad9be5
adjusted order so entrypoint is inserted correctly
jbkyang-nvi Jul 9, 2021
5382383
setting perms correctly
jbkyang-nvi Jul 9, 2021
60fbfb5
fix user permission issue
jbkyang-nvi Jul 9, 2021
2ac9e3d
addressed comments and removed extra instance of set path
jbkyang-nvi Jul 9, 2021
d2639f1
fix compose script
jbkyang-nvi Jul 9, 2021
f3cb329
fix issue with docker inspect
jbkyang-nvi Jul 11, 2021
a06784a
remove extra definition of
jbkyang-nvi Jul 12, 2021
3165e91
adding comments and error messages to find discrepancy between local …
jbkyang-nvi Jul 12, 2021
5508fd9
added error messages
jbkyang-nvi Jul 12, 2021
dbea483
remove extra env in dockerfile
jbkyang-nvi Jul 12, 2021
da3b716
adjusted build so more similar to old build.py
jbkyang-nvi Jul 13, 2021
344fb50
removed log message
jbkyang-nvi Jul 14, 2021
adf7e9c
add images flag so users can specify the image
jbkyang-nvi Jul 16, 2021
32da9db
remove wrong variable name
jbkyang-nvi Jul 16, 2021
3144fba
addressed comments
jbkyang-nvi Jul 20, 2021
279f9d4
addressed comments
jbkyang-nvi Jul 20, 2021
8d72b6d
changed function name and add comments
jbkyang-nvi Jul 20, 2021
e1eced4
rebased and reformatted
jbkyang-nvi Jul 21, 2021
46b3be4
fix build.py problem
jbkyang-nvi Jul 21, 2021
12b78b9
fix again
jbkyang-nvi Jul 22, 2021
da3c92e
add default DCGM for older container versions
jbkyang-nvi Jul 22, 2021
f3a25d2
change default version to 2.2.8
jbkyang-nvi Jul 22, 2021
b0748af
updated documentation
jbkyang-nvi Jul 22, 2021
ad8b523
fixed documentation
jbkyang-nvi Jul 22, 2021
addressed comments
jbkyang-nvi committed Jul 21, 2021
commit d04a2251c4a305c2dde273075413594f9a1a5175
52 changes: 24 additions & 28 deletions build.py
@@ -503,6 +503,19 @@ def fil_cmake_args(images):
return cargs


def get_container_versions(version, container_version,
upstream_container_version):
if container_version is None:
if version not in TRITON_VERSION_MAP:
fail('container version not known for {}'.format(version))
container_version = TRITON_VERSION_MAP[version][0]
if upstream_container_version is None:
if version not in TRITON_VERSION_MAP:
fail('upstream container version not known for {}'.format(version))
upstream_container_version = TRITON_VERSION_MAP[version][1]
return container_version, upstream_container_version
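
The lookup that get_container_versions performs can be sketched in isolation. This is a minimal illustration rather than the build.py code: the name resolve_versions and the contents of VERSION_MAP are assumptions made for the example, and it raises ValueError where build.py calls fail().

```python
# Hypothetical map: Triton version -> (triton container, upstream container).
VERSION_MAP = {
    '2.12.0dev': ('21.07dev', '21.06'),
}


def resolve_versions(version, container_version=None,
                     upstream_container_version=None):
    """Fill in whichever container version was not given explicitly."""
    if container_version is None:
        if version not in VERSION_MAP:
            raise ValueError(
                'container version not known for {}'.format(version))
        container_version = VERSION_MAP[version][0]
    if upstream_container_version is None:
        if version not in VERSION_MAP:
            raise ValueError(
                'upstream container version not known for {}'.format(version))
        upstream_container_version = VERSION_MAP[version][1]
    return container_version, upstream_container_version


# Both values come from the map when neither is given.
print(resolve_versions('2.12.0dev'))
# An explicit value is kept; only the missing one is looked up.
print(resolve_versions('2.12.0dev', upstream_container_version='21.05'))
```

Only None triggers the lookup. Passing an empty string for container_version, as compose.py does in this change, skips the map for that value.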


def create_dockerfile_buildbase(ddir, dockerfile_name, argmap, backends):
df = '''
ARG TRITON_VERSION={}
@@ -626,9 +639,6 @@ def create_dockerfile_linux(ddir, dockerfile_name, argmap, backends, repoagents,
#
# Multistage build.
#
ARG TRITON_VERSION={}
ARG TRITON_CONTAINER_VERSION={}

ARG BASE_IMAGE={}
ARG BUILD_IMAGE=tritonserver_build

@@ -642,18 +652,10 @@ def create_dockerfile_linux(ddir, dockerfile_name, argmap, backends, repoagents,
############################################################################
FROM ${{BASE_IMAGE}}

ARG TRITON_VERSION
ARG TRITON_CONTAINER_VERSION

ENV TRITON_SERVER_VERSION ${{TRITON_VERSION}}
ENV NVIDIA_TRITON_SERVER_VERSION ${{TRITON_CONTAINER_VERSION}}
ENV TRITON_SERVER_VERSION ${{TRITON_VERSION}}
ENV NVIDIA_TRITON_SERVER_VERSION ${{TRITON_CONTAINER_VERSION}}
LABEL com.nvidia.tritonserver.version="${{TRITON_SERVER_VERSION}}"

ENV PATH /opt/tritonserver/bin:${{PATH}}
'''.format(argmap['TRITON_VERSION'], argmap['TRITON_CONTAINER_VERSION'],
argmap['BASE_IMAGE'])
'''.format(argmap['BASE_IMAGE'])

df += dockerfile_add_installation_linux(argmap, backends, endpoints)

df += '''
WORKDIR /opt/tritonserver
@@ -681,15 +683,16 @@ def create_dockerfile_linux(ddir, dockerfile_name, argmap, backends, repoagents,
df += '''
COPY --chown=1000:1000 --from=tritonserver_build /tmp/tritonbuild/install/repoagents repoagents
'''
df += dockerfile_add_installation_linux(argmap, backends, endpoints)

mkdir(ddir)
with open(os.path.join(ddir, dockerfile_name), "w") as dfile:
dfile.write(df)


"""Common steps for production docker image, shared by build.py and compose.py"""
Review comment (Contributor): In Python, function comments (docstrings) go after the def, so this should be after line 689.
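
For illustration, here is a docstring placed as the reviewer describes, directly under the def line. The function name and body are placeholders chosen for the example, not the build.py code.

```python
def add_installation_steps(argmap):
    """Common steps for the production Docker image."""
    # Placeholder body: the real function assembles Dockerfile text.
    return 'ARG TRITON_VERSION={}\n'.format(argmap['TRITON_VERSION'])


# Placed first in the body, the string becomes the function's __doc__.
print(add_installation_steps.__doc__)
```

A string literal written above the def is only an expression statement at module level, so it is not attached to the function and help() will not show it.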



def dockerfile_add_installation_linux(argmap, backends, endpoints):
df='''
df = '''
ARG TRITON_VERSION={}
deadeyegoodwin marked this conversation as resolved.
ARG TRITON_CONTAINER_VERSION={}

@@ -761,7 +764,7 @@ def dockerfile_add_installation_linux(argmap, backends, endpoints):
'''.format(argmap['NVIDIA_BUILD_ID'], argmap['NVIDIA_BUILD_ID'],
argmap['NVIDIA_BUILD_REF'])

# Add feature labels for SageMaker endpoint
# Add feature labels for SageMaker endpoint
if 'sagemaker' in endpoints:
df += '''
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
@@ -1277,16 +1280,9 @@ def container_build(images, backends, repoagents, endpoints):
# For other versions use the TRITON_VERSION_MAP unless explicitly
# given.
if not FLAGS.no_container_build:
if FLAGS.container_version is None:
if FLAGS.version not in TRITON_VERSION_MAP:
fail('container version not known for {}'.format(FLAGS.version))
FLAGS.container_version = TRITON_VERSION_MAP[FLAGS.version][0]
if FLAGS.upstream_container_version is None:
if FLAGS.version not in TRITON_VERSION_MAP:
fail('upstream container version not known for {}'.format(
FLAGS.version))
FLAGS.upstream_container_version = TRITON_VERSION_MAP[
FLAGS.version][1]
FLAGS.container_version, FLAGS.upstream_container_version = get_container_versions(
FLAGS.version, FLAGS.container_version,
FLAGS.upstream_container_version)

log('container version {}'.format(FLAGS.container_version))
log('upstream container version {}'.format(
134 changes: 77 additions & 57 deletions compose.py
@@ -28,17 +28,10 @@
import os
import subprocess
import sys
import docker

### Global variables
TRITON_VERSION_MAP = {
'2.12.0dev': (
'21.07dev', # triton container
'21.06'), # upstream container
}

FLAGS = None


#### helper functions
def log(msg, force=False):
if force or not FLAGS.quiet:
@@ -56,12 +49,12 @@ def log_verbose(msg):
def fail(msg):
deadeyegoodwin marked this conversation as resolved.
print('error: {}'.format(msg), file=sys.stderr)
sys.exit(1)


def fail_if(p, msg):
if p:
fail(msg)


##### create base image for gpu
def start_gpu_dockerfile(ddir, argmap, dockerfile_name):
@@ -72,15 +65,8 @@ def start_gpu_dockerfile(ddir, argmap, dockerfile_name):
#
FROM nvcr.io/nvidia/tritonserver:{}-py3 as full
FROM nvcr.io/nvidia/tritonserver:{}-py3-min
ARG TRITON_VERSION={}
ARG TRITON_CONTAINER_VERSION={}
ENV TRITON_SERVER_VERSION ${{TRITON_VERSION}}
ENV NVIDIA_TRITON_SERVER_VERSION ${{TRITON_CONTAINER_VERSION}}
ENV TRITON_SERVER_VERSION ${{TRITON_VERSION}}
ENV NVIDIA_TRITON_SERVER_VERSION ${{TRITON_CONTAINER_VERSION}}
LABEL com.nvidia.tritonserver.version="${{TRITON_SERVER_VERSION}}"
ENV PATH /opt/tritonserver/bin:${{PATH}}
'''.format(argmap['TRITON_CONTAINER_VERSION'], argmap['TRITON_CONTAINER_VERSION'], argmap['TRITON_VERSION'], argmap['TRITON_CONTAINER_VERSION'])
'''.format(argmap['TRITON_CONTAINER_VERSION'],
argmap['TRITON_CONTAINER_VERSION'])

# Copy over files
df += '''
@@ -107,6 +93,7 @@ def start_gpu_dockerfile(ddir, argmap, dockerfile_name):
with open(os.path.join(ddir, dockerfile_name), "w") as dfile:
dfile.write(df)


### add additional backends needed
def add_requested_backends(ddir, dockerfile_name, backends):
df = "# Copying over backends \n"
@@ -127,33 +114,48 @@ def add_requested_repoagents(ddir, dockerfile_name, repoagents):
with open(os.path.join(ddir, dockerfile_name), "a") as dfile:
dfile.write(df)


def create_argmap(container_version):
upstreamDockerImage = 'nvcr.io/nvidia/tritonserver:{}-py3'.format(container_version)

baseRunArgs = [ 'docker', 'inspect', '-f']
p_version = subprocess.run(baseRunArgs + ['{{range $index, $value := .Config.Env}}{{$value}} {{end}}', upstreamDockerImage], capture_output=True, text=True)
upstreamDockerImage = 'nvcr.io/nvidia/tritonserver:{}-py3'.format(
container_version)

baseRunArgs = ['docker', 'inspect', '-f']
p_version = subprocess.run(baseRunArgs + [
'{{range $index, $value := .Config.Env}}{{$value}} {{end}}',
upstreamDockerImage
],
capture_output=True,
text=True)
vars = p_version.stdout
import re # parse all PATH environment variables
import re # parse all PATH environment variables
e = re.search("TRITON_SERVER_VERSION=([\S]{6,}) ", vars)
version = "" if e == None else e.group(1)
fail_if(p_version.returncode != 0 or len(version) == 0, 'docker inspect to find triton version failed')
p_sha = subprocess.run(baseRunArgs + ['{{ index .Config.Labels "com.nvidia.build.ref"}}', upstreamDockerImage], capture_output=True, text=True)
fail_if(p_sha.returncode != 0, 'docker inspect of upstream docker image build sha failed')
p_build = subprocess.run(baseRunArgs + ['{{ index .Config.Labels "com.nvidia.build.id"}}', upstreamDockerImage], capture_output=True, text=True)
fail_if(p_build.returncode != 0, 'docker inspect of upstream docker image build sha failed')

fail_if(p_version.returncode != 0 or len(version) == 0,
'docker inspect to find triton version failed')
p_sha = subprocess.run(baseRunArgs + [
'{{ index .Config.Labels "com.nvidia.build.ref"}}', upstreamDockerImage
],
capture_output=True,
text=True)
fail_if(p_sha.returncode != 0,
'docker inspect of upstream docker image build sha failed')
p_build = subprocess.run(baseRunArgs + [
'{{ index .Config.Labels "com.nvidia.build.id"}}', upstreamDockerImage
],
capture_output=True,
text=True)
fail_if(p_build.returncode != 0,
'docker inspect of upstream docker image build sha failed')

argmap = {
'NVIDIA_BUILD_REF':
p_sha.stdout.rstrip(),
'NVIDIA_BUILD_ID':
p_build.stdout.rstrip(),
'TRITON_VERSION':
version,
'TRITON_CONTAINER_VERSION':
container_version,
'NVIDIA_BUILD_REF': p_sha.stdout.rstrip(),
'NVIDIA_BUILD_ID': p_build.stdout.rstrip(),
'TRITON_VERSION': version,
'TRITON_CONTAINER_VERSION': container_version,
}
return argmap
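
The version extraction inside create_argmap can be tried without Docker by running the regular expression over a sample string. The sketch below is a variation, not the pattern used in compose.py: it anchors on whitespace so that NVIDIA_TRITON_SERVER_VERSION is not matched by accident, and it uses a raw string for the pattern. The function name and the sample environment dump are assumptions for the example.

```python
import re


def parse_triton_version(env_dump):
    """Return the TRITON_SERVER_VERSION value from a space-separated
    environment dump, or an empty string when it is absent."""
    # Require start-of-string or whitespace before the variable name so
    # that NVIDIA_TRITON_SERVER_VERSION=... does not match.
    m = re.search(r"(?:^|\s)TRITON_SERVER_VERSION=(\S+)", env_dump)
    return m.group(1) if m else ""


# Hypothetical output of the `docker inspect -f` template used above.
sample = ("PATH=/opt/tritonserver/bin "
          "NVIDIA_TRITON_SERVER_VERSION=21.07dev "
          "TRITON_SERVER_VERSION=2.12.0dev ")
print(parse_triton_version(sample))
```

An empty result plays the same role as the len(version) == 0 check in create_argmap: the caller can treat it as a failed inspection.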


# Install dependencies and run entrypoint script
def end_gpu_dockerfile(ddir, dockerfile_name, argmap, backends, endpoint):
import build
@@ -183,30 +185,28 @@ def build_docker_image(ddir, dockerfile_name, container_name):
action="store_true",
required=False,
help='Enable verbose output.')
parser.add_argument('--output-name',
type=str,
required=False,
help='Name for the generated Docker image. Default is "tritonserver".')
parser.add_argument(
'--output-name',
type=str,
required=False,
help='Name for the generated Docker image. Default is "tritonserver".')
parser.add_argument(
'--work-dir',
type=str,
required=False,
help=
'Generated dockerfiles are placed here. Default to current directory.'
)
'Generated dockerfiles are placed here. Default to current directory.')
parser.add_argument(
'--container-version',
'--upstream-container-version',
Review comment (Contributor): Just --container-version, not --upstream-container-version.

type=str,
required=True,
required=False,
help=
'The version to use for the generated Docker image. If not specified the container version will be chosen automatically based on --version value.'
Review comment (Contributor): There is no --version flag, so change the description to "... will be chosen automatically based on the repository branch."

)
parser.add_argument(
'--enable-gpu',
action="store_true",
required=False,
help='Generate a Triton image that supports GPU'
)
parser.add_argument('--enable-gpu',
action="store_true",
required=False,
help='Generate a Triton image that supports GPU')
parser.add_argument(
'--backend',
action='append',
@@ -218,14 +218,22 @@ def build_docker_image(ddir, dockerfile_name, container_name):
'--repoagent',
action='append',
required=False,
help='Include <repoagent-name> in the generated Docker image. The flag may be specified multiple times.')
help=
'Include <repoagent-name> in the generated Docker image. The flag may be specified multiple times.'
)
parser.add_argument(
deadeyegoodwin marked this conversation as resolved.
'--endpoint',
action='append',
required=False,
help='Include <endpoint-name> in the generated Docker image. The flag may be specified multiple times.')
help=
'Include <endpoint-name> in the generated Docker image. The flag may be specified multiple times.'
)
FLAGS = parser.parse_args()

fail_if(
not FLAGS.enable_gpu,
"Only GPU versions are supported right now. Add --enable-gpu to compose.py command"
)

if FLAGS.work_dir is None:
FLAGS.work_dir = "."
if FLAGS.output_name is None:
@@ -240,9 +248,21 @@ def build_docker_image(ddir, dockerfile_name, container_name):
if FLAGS.endpoint is None:
FLAGS.endpoint = []

deadeyegoodwin marked this conversation as resolved.
argmap = create_argmap(FLAGS.container_version)
if FLAGS.upstream_container_version is None:
# Read from TRITON_VERSION file in server repo to determine version
with open('TRITON_VERSION', "r") as vfile:
version = vfile.readline().strip()
import build
container_version, FLAGS.upstream_container_version = build.get_container_versions(
version, "", FLAGS.upstream_container_version)
log('version {}'.format(version))
log('upstream container version {}'.format(
FLAGS.upstream_container_version))
argmap = create_argmap(FLAGS.upstream_container_version)

start_gpu_dockerfile(FLAGS.work_dir, argmap, dockerfile_name)
deadeyegoodwin marked this conversation as resolved.
add_requested_backends(FLAGS.work_dir, dockerfile_name, FLAGS.backend)
add_requested_repoagents(FLAGS.work_dir, dockerfile_name, FLAGS.repoagent)
end_gpu_dockerfile(FLAGS.work_dir, dockerfile_name, argmap, FLAGS.backend, FLAGS.endpoint)
end_gpu_dockerfile(FLAGS.work_dir, dockerfile_name, argmap, FLAGS.backend,
FLAGS.endpoint)
build_docker_image(FLAGS.work_dir, dockerfile_name, FLAGS.output_name)
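
A minimal sketch of the flag as the review comments suggest it: an optional --container-version whose help text refers to the repository branch. The parser below is illustrative only; build_parser is a name chosen for the example, and the other compose.py flags are omitted.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--container-version',
        type=str,
        required=False,
        help='The version to use for the generated Docker image. If not '
        'specified the container version will be chosen automatically '
        'based on the repository branch.')
    parser.add_argument('--enable-gpu',
                        action='store_true',
                        required=False,
                        help='Generate a Triton image that supports GPU')
    return parser


# Omitted: the flag is None, which is the cue to fall back to a lookup.
flags = build_parser().parse_args(['--enable-gpu'])
print(flags.container_version, flags.enable_gpu)

# Given explicitly: the value is used as-is.
flags = build_parser().parse_args(['--container-version', '21.06'])
print(flags.container_version)
```

Leaving the default as None keeps "not specified" distinguishable from any real version string, which is what the None checks in the diff rely on.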