Update documentation to describe automatic configuration generation
David Goodwin committed Nov 20, 2018
1 parent d2e9c69 commit 0c96d51
Showing 2 changed files with 79 additions and 6 deletions.
58 changes: 54 additions & 4 deletions docs/model_configuration.rst
@@ -30,10 +30,14 @@
Model Configuration
===================

Each model in a :ref:`section-model-repository` must include a file
called config.pbtxt that contains the configuration information for
the model. The model configuration must be specified as
:doc:`ModelConfig <protobuf_api/model_config.proto>` protobuf.
Each model in a :ref:`section-model-repository` must include a model
configuration that provides required and optional information about
the model. Typically, this configuration is provided in a config.pbtxt
file specified as :doc:`ModelConfig <protobuf_api/model_config.proto>`
protobuf. In some cases, discussed in
:ref:`section-generated-model-configuration`, the model configuration
can be generated automatically by the inference server and so does not
need to be provided explicitly.

A minimal model configuration must specify :cpp:var:`name
<nvidia::inferenceserver::ModelConfig::name>`, :cpp:var:`platform
@@ -93,6 +97,52 @@ zero. If the above example specified a :cpp:var:`max_batch_size
inference server would expect to receive input tensors with shape **[
16 ]**, and would produce an output tensor with shape **[ 16 ]**.
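
For illustration only, a minimal configuration along these lines (the
model name, platform, and tensor names below are hypothetical, and the
max_batch_size and dims values are arbitrary) might look like::

  name: "mymodel"
  platform: "tensorrt_plan"
  max_batch_size: 8
  input [
    {
      name: "input0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]
  output [
    {
      name: "output0"
      data_type: TYPE_FP32
      dims: [ 16 ]
    }
  ]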

.. _section-generated-model-configuration:

Generated Model Configuration
-----------------------------

By default, the model configuration file containing the required
settings must be provided with each model. However, if the inference
server is started with the -\\-strict-model-config=false option, then
in some cases the required portions of the model configuration can be
generated automatically by the inference server. The required
portions of the model configuration are those settings shown in the
example minimal configuration above. Specifically:

* :ref:`TensorRT Plan <section-tensorrt-models>` models do not require
a model configuration file because the inference server can derive
all the required settings automatically.

* Some :ref:`TensorFlow SavedModel <section-tensorflow-models>` models
do not require a model configuration file. The models must specify
all inputs and outputs as fixed-size tensors (with an optional
initial batch dimension) for the model configuration to be generated
automatically. The easiest way to determine if a particular
SavedModel is supported is to try it with the inference server and
check the console log and :ref:`Status API <section-api-status>` to
see if the model loaded successfully.

When using -\\-strict-model-config=false, you can see the model
configuration that was generated for a model by using the :ref:`Status
API <section-api-status>`.
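
For example, assuming the inference server executable is trtserver,
the model repository is at /models, and the HTTP endpoint is listening
on its default port 8000, the generated configuration for a
hypothetical model named mymodel could be inspected with::

  $ trtserver --model-store=/models --strict-model-config=false
  $ curl localhost:8000/api/status/mymodel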

The inference server generates only the required portion of the model
configuration. You must still provide any optional portions of the
model configuration that you need, such as :cpp:var:`version_policy
<nvidia::inferenceserver::ModelConfig::version_policy>`,
:cpp:var:`optimization
<nvidia::inferenceserver::ModelConfig::optimization>`,
:cpp:var:`dynamic_batching
<nvidia::inferenceserver::ModelConfig::dynamic_batching>`,
:cpp:var:`instance_group
<nvidia::inferenceserver::ModelConfig::instance_group>`,
:cpp:var:`default_model_filename
<nvidia::inferenceserver::ModelConfig::default_model_filename>`,
:cpp:var:`cc_model_filenames
<nvidia::inferenceserver::ModelConfig::cc_model_filenames>`, and
:cpp:var:`tags <nvidia::inferenceserver::ModelConfig::tags>`.
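
As an illustrative sketch only (the values below are arbitrary and the
authoritative field definitions are in :doc:`ModelConfig
<protobuf_api/model_config.proto>`), optional settings might be added
to a configuration like this::

  version_policy: { latest { num_versions: 1 } }
  instance_group [
    {
      count: 2
      kind: KIND_GPU
    }
  ]
  dynamic_batching {
    preferred_batch_size: [ 4, 8 ]
    max_queue_delay_microseconds: 100
  }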

.. _section-version-policy:

Version Policy
27 changes: 25 additions & 2 deletions docs/model_repository.rst
@@ -63,13 +63,13 @@ An example of a typical model repository layout is shown below::
model.graphdef

Any number of models may be specified and the inference server will
attempt to load all models into CPU and GPU when the server
attempt to load all models into the CPU and GPU when the server
starts. The :ref:`Status API <section-api-status>` can be used to
determine if any models failed to load successfully. The server's
console log will also show the reason for any failures during startup.

The name of the model directory (model_0 and model_1 in the above
example) must match the name of the model specified in the required
example) must match the name of the model specified in the
:ref:`model configuration file <section-model-configuration>`,
config.pbtxt. The model name is used in the :ref:`client API
<section-client-api>` and :ref:`server API
@@ -178,6 +178,8 @@ configuration <section-model-configuration>` for a description of how to
specify different model definitions for different compute
capabilities.

.. _section-tensorrt-models:

TensorRT Models
^^^^^^^^^^^^^^^

@@ -197,6 +199,17 @@ like::
1/
model.plan

As described in :ref:`section-generated-model-configuration`, the
config.pbtxt file is optional for some models. In cases where it is
not required, the minimal model repository would look like::

models/
<model-name>/
1/
model.plan

.. _section-tensorflow-models:

TensorFlow Models
^^^^^^^^^^^^^^^^^

@@ -232,6 +245,16 @@ repository for a single TensorFlow SavedModel model would look like::
model.savedmodel/
<saved-model files>

As described in :ref:`section-generated-model-configuration`, the
config.pbtxt file is optional for some models. In cases where it is
not required, the minimal model repository would look like::

models/
<model-name>/
1/
model.savedmodel/
<saved-model files>

Caffe2 Models
^^^^^^^^^^^^^
