Add libtorch ragged test in L0_batcher (triton-inference-server#5522)
* Add libtorch ragged test in L0_batcher

* Update copyright

* Fix Pytorch config modification based on convention

* Mention PyTorch backend in ragged batching doc
GuanLuo authored Mar 20, 2023
1 parent 882ef7a commit 1340078
Showing 3 changed files with 19 additions and 3 deletions.
3 changes: 2 additions & 1 deletion docs/user_guide/ragged_batching.md
@@ -57,12 +57,13 @@
 How ragged inputs are processed in a batch of requests depends on the backend
 implementation. The backends, such as
 [ONNX Runtime backend](https://github.com/triton-inference-server/onnxruntime_backend),
 [TensorFlow backend](https://github.com/triton-inference-server/tensorflow_backend),
+[PyTorch backend](https://github.com/triton-inference-server/pytorch_backend),
 and [TensorRT backend](https://github.com/triton-inference-server/tensorrt_backend),
 require models to accept ragged inputs as 1-dimensional tensors.
 These backends concatenate the request inputs into the 1-dimensional tensor.
 
 Because the concatenated input doesn't track the start and end index for each
-request, the backends also require the model to have additional input(s),
+request, the backends often require the model to have additional input(s),
 [batch input](#batch-input), that describe various information about the batch
 formed.
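
For a concrete picture of what the updated paragraph describes, both the ragged input and its companion batch input are declared in the model's config.pbtxt. A minimal sketch, assuming a hypothetical 1-D input named INPUT0 and an accumulated-element-count batch input (the tensor names and data types here are illustrative, not part of this commit):

input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ -1 ]
    allow_ragged_batch: true
  }
]
batch_input [
  {
    kind: BATCH_ACCUMULATED_ELEMENT_COUNT
    target_name: "INDEX"
    data_type: TYPE_FP32
    source_input: "INPUT0"
  }
]

The accumulated element count lets the model recover each request's start and end offsets inside the concatenated 1-dimensional tensor.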

4 changes: 3 additions & 1 deletion qa/L0_batcher/batcher_test.py
@@ -1,4 +1,4 @@
-# Copyright 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -71,6 +71,8 @@
     _ragged_batch_supported_trials.append("plan")
 if "onnx" in _trials:
     _ragged_batch_supported_trials.append("onnx")
+if "libtorch" in _trials:
+    _ragged_batch_supported_trials.append("libtorch")
 
 _max_queue_delay_ms = 10000
 
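The added lines follow the file's existing pattern: each backend joins the ragged trials only when it appears in the trial list. A rough sketch of that pattern, assuming _trials is derived from the same BACKENDS environment variable that test.sh uses (the variable name and default value are assumptions, not shown in this diff):

import os

# Assumed: the trial list comes from the BACKENDS env var used by test.sh
BACKENDS = os.environ.get("BACKENDS", "graphdef savedmodel onnx libtorch plan")
_trials = BACKENDS.split()

# Only backends that support ragged batching join these trials
_ragged_batch_supported_trials = []
for backend in ("plan", "onnx", "libtorch"):
    if backend in _trials:
        _ragged_batch_supported_trials.append(backend)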
15 changes: 14 additions & 1 deletion qa/L0_batcher/test.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright 2018-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2018-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -242,6 +242,19 @@ if [[ $BACKENDS == *"onnx"* ]]; then
         dynamic_batching { preferred_batch_size: [ 2, 6 ], max_queue_delay_microseconds: 10000000 }" >> config.pbtxt)
 fi
 
+if [[ $BACKENDS == *"libtorch"* ]]; then
+    # Use nobatch model to match the ragged test requirement
+    cp -r $DATADIR/qa_identity_model_repository/libtorch_nobatch_zero_1_float32 var_models/libtorch_zero_1_float32 && \
+        (cd var_models/libtorch_zero_1_float32 && \
+            sed -i "s/nobatch_//" config.pbtxt && \
+            sed -i "s/^max_batch_size:.*/max_batch_size: 8/" config.pbtxt && \
+            sed -i "s/name: \"INPUT__0\"/name: \"INPUT__0\"\\nallow_ragged_batch: true/" config.pbtxt && \
+            echo "batch_output [{target_name: \"OUTPUT__0\" \
+                kind: BATCH_SCATTER_WITH_INPUT_SHAPE \
+                source_input: \"INPUT__0\" }] \
+                dynamic_batching { preferred_batch_size: [ 2, 6 ], max_queue_delay_microseconds: 10000000 }" >> config.pbtxt)
+fi
+
 # Need to launch the server for each test so that the model status is
 # reset (which is used to make sure the correct batch size was used
 # for execution). Test everything with fixed-tensor-size models and
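
After those sed edits and the appended stanza, var_models/libtorch_zero_1_float32/config.pbtxt should look roughly like the sketch below. The data_type and dims values are assumptions carried over from the copied identity model rather than shown in this diff:

# platform/backend fields from the copied model omitted; data types assumed
name: "libtorch_zero_1_float32"
max_batch_size: 8
input [
  {
    name: "INPUT__0"
    allow_ragged_batch: true
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
batch_output [
  {
    target_name: "OUTPUT__0"
    kind: BATCH_SCATTER_WITH_INPUT_SHAPE
    source_input: "INPUT__0"
  }
]
dynamic_batching {
  preferred_batch_size: [ 2, 6 ]
  max_queue_delay_microseconds: 10000000
}

BATCH_SCATTER_WITH_INPUT_SHAPE tells Triton to scatter the batched output back into per-request responses using the shape of INPUT__0, which is what makes a nobatch identity model usable in a ragged dynamic-batching test.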
