Update documentation to describe automatic configuration generation
David Goodwin committed Nov 20, 2018
1 parent d2e9c69 commit 0c96d51
Showing 2 changed files with 79 additions and 6 deletions.
58 changes: 54 additions & 4 deletions docs/model_configuration.rst
@@ -30,10 +30,14 @@
Model Configuration
===================

Each model in a :ref:`section-model-repository` must include a file
called config.pbtxt that contains the configuration information for
the model. The model configuration must be specified as
:doc:`ModelConfig <protobuf_api/model_config.proto>` protobuf.
Each model in a :ref:`section-model-repository` must include a model
configuration that provides required and optional information about
the model. Typically, this configuration is provided in a config.pbtxt
file specified as :doc:`ModelConfig <protobuf_api/model_config.proto>`
protobuf. In some cases, discussed in
:ref:`section-generated-model-configuration`, the model configuration
can be generated automatically by the inference server and so does not
need to be provided explicitly.

A minimal model configuration must specify :cpp:var:`name
<nvidia::inferenceserver::ModelConfig::name>`, :cpp:var:`platform
@@ -93,6 +97,52 @@ zero. If the above example specified a :cpp:var:`max_batch_size
inference server would expect to receive input tensors with shape **[
16 ]**, and would produce an output tensor with shape **[ 16 ]**.
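
For illustration only, a minimal configuration along these lines (the
model name, platform, and tensor names below are hypothetical, and the
max_batch_size and dims values are arbitrary) might look like::

  name: "mymodel"
  platform: "tensorrt_plan"
  max_batch_size: 8
  input [
    {
      name: "input0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]
  output [
    {
      name: "output0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]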

.. _section-generated-model-configuration:

Generated Model Configuration
-----------------------------

By default, the model configuration file containing the required
settings must be provided with each model. However, if the inference
server is started with the -\\-strict-model-config=false option, then
in some cases the required portions of the model configuration can be
generated automatically by the inference server. The required
portions of the model configuration are those settings shown in the
example minimal configuration above. Specifically:

* :ref:`TensorRT Plan <section-tensorrt-models>` models do not require
a model configuration file because the inference server can derive
all the required settings automatically.

* Some :ref:`TensorFlow SavedModel <section-tensorflow-models>` models
do not require a model configuration file. The models must specify
all inputs and outputs as fixed-size tensors (with an optional
initial batch dimension) for the model configuration to be generated
automatically. The easiest way to determine if a particular
SavedModel is supported is to try it with the inference server and
check the console log and :ref:`Status API <section-api-status>` to
see if the model loaded successfully.

When using -\\-strict-model-config=false, you can see the model
configuration that was generated for a model by using the :ref:`Status
API <section-api-status>`.
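
For example, assuming the inference server executable is trtserver,
the model repository is at /models, and the HTTP endpoint is listening
on its default port 8000, the generated configuration for a
hypothetical model named mymodel could be inspected with::

  $ trtserver --model-store=/models --strict-model-config=false
  $ curl localhost:8000/api/status/mymodel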

The inference server generates only the required portion of the model
configuration. You must still provide any optional portions of the
model configuration that you need, such as :cpp:var:`version_policy
<nvidia::inferenceserver::ModelConfig::version_policy>`,
:cpp:var:`optimization
<nvidia::inferenceserver::ModelConfig::optimization>`,
:cpp:var:`dynamic_batching
<nvidia::inferenceserver::ModelConfig::dynamic_batching>`,
:cpp:var:`instance_group
<nvidia::inferenceserver::ModelConfig::instance_group>`,
:cpp:var:`default_model_filename
<nvidia::inferenceserver::ModelConfig::default_model_filename>`,
:cpp:var:`cc_model_filenames
<nvidia::inferenceserver::ModelConfig::cc_model_filenames>`, and
:cpp:var:`tags <nvidia::inferenceserver::ModelConfig::tags>`.
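
As an illustrative sketch only (the values below are arbitrary and the
authoritative field definitions are in :doc:`ModelConfig
<protobuf_api/model_config.proto>`), optional settings might be added
to a configuration like this::

  version_policy: { latest { num_versions: 1 } }
  instance_group [
    {
      count: 2
      kind: KIND_GPU
    }
  ]
  dynamic_batching {
    preferred_batch_size: [ 4, 8 ]
    max_queue_delay_microseconds: 100
  }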

.. _section-version-policy:

Version Policy
27 changes: 25 additions & 2 deletions docs/model_repository.rst
@@ -63,13 +63,13 @@ An example of a typical model repository layout is shown below::
model.graphdef

Any number of models may be specified and the inference server will
attempt to load all models into CPU and GPU when the server
attempt to load all models into the CPU and GPU when the server
starts. The :ref:`Status API <section-api-status>` can be used to
determine if any models failed to load successfully. The server's
console log will also show the reason for any failures during startup.

The name of the model directory (model_0 and model_1 in the above
example) must match the name of the model specified in the required
example) must match the name of the model specified in the
:ref:`model configuration file <section-model-configuration>`,
config.pbtxt. The model name is used in the :ref:`client API
<section-client-api>` and :ref:`server API
@@ -178,6 +178,8 @@ configuration <section-model-configuration>` for a description of how to
specify different model definitions for different compute
capabilities.

.. _section-tensorrt-models:

TensorRT Models
^^^^^^^^^^^^^^^

@@ -197,6 +199,17 @@ like::
1/
model.plan

As described in :ref:`section-generated-model-configuration`, the
config.pbtxt file is optional for some models. In cases where it is
not required, the minimal model repository would look like::

models/
<model-name>/
1/
model.plan

.. _section-tensorflow-models:

TensorFlow Models
^^^^^^^^^^^^^^^^^

@@ -232,6 +245,16 @@ repository for a single TensorFlow SavedModel model would look like::
model.savedmodel/
<saved-model files>

As described in :ref:`section-generated-model-configuration`, the
config.pbtxt file is optional for some models. In cases where it is
not required, the minimal model repository would look like::

models/
<model-name>/
1/
model.savedmodel/
<saved-model files>

Caffe2 Models
^^^^^^^^^^^^^
