Update trace example in documentation (triton-inference-server#4951)
* Update trace example in  documentation

* Add brief description on new trace options
GuanLuo authored Oct 7, 2022
1 parent 5072e8a commit 76366af
Showing 1 changed file with 39 additions and 26 deletions.
65 changes: 39 additions & 26 deletions docs/user_guide/trace.md
@@ -33,15 +33,21 @@ individual inference requests. Tracing is enabled by command-line
arguments when running the tritonserver executable. For example,

```
$ tritonserver --trace-file=/tmp/trace.json --trace-rate=100 --trace-level=TIMESTAMPS ...
$ tritonserver --trace-file=/tmp/trace.json --trace-rate=100 --trace-level=TIMESTAMPS \
--log-frequency=50 --trace-count=100 ...
```

The --trace-file option indicates where the trace output should be
written. The --trace-rate option specifies the sampling rate. In
this example every 100-th inference request will be traced. The
--trace-level option indicates the level of trace detail that should
be collected. The --trace-level option may be specified multiple times to
trace multiple kinds of information. Use the --help option to get more information.
trace multiple kinds of information. The --log-frequency option specifies how often
the traces are written to file. In this example Triton will write to file after
every 50 traces collected. The --trace-count option specifies the remaining
number of traces to be collected. In this example Triton will stop tracing
requests after 100 traces are collected.
Use the --help option to get more information.

In addition to configuring trace settings via command-line arguments, the user may
modify the trace setting when Triton server
@@ -63,20 +69,21 @@ The trace output is a JSON file with the following schema.
"model_name": $string,
"model_version": $number,
"id": $number
"parent_id": $number,
"parent_id": $number
},
{
"id": $number,
"timestamps": [
{ "name" : $string, "ns" : $number },
...
{ "name" : $string, "ns" : $number }
]
},
{
"model_name": $string,
"model_version": $number,
"id": $number
"activity": $string,
"tensor":{
"name": $string,
"data": $string,
"shape": $string,
"dtype": $string
}
},
@@ -87,30 +94,37 @@ The trace output is a JSON file with the following schema.
Each trace is assigned an "id", which indicates the model name and
version of the inference request. If the trace is from a
model run as part of an ensemble, the "parent_id" will indicate the
"id" of the containing ensemble.
"id" of the containing ensemble.
For example:
```
[
{
"id": 1,
"model_name": "simple",
"model_version": 1
},
...
]
```

Each `TIMESTAMPS` trace will have one or more "timestamps" with
each timestamp having a name and the timestamp in nanoseconds ("ns").
For example:

```
[
{
"model_name": "simple",
"model_version": -1,
"id": 1,
"timestamps" : [
{ "name": "http recv start", "ns": 2259961222771924 },
{ "name": "http recv end", "ns": 2259961222820985 },
{ "name": "request handler start", "ns": 2259961223164078 },
{ "name": "queue start", "ns": 2259961223182400 },
{ "name": "compute start", "ns": 2259961223232405 },
{ "name": "compute end", "ns": 2259961230206777 },
{ "name": "request handler end", "ns": 2259961230211887 },
{ "name": "http send start", "ns": 2259961230529606 },
{ "name": "http send end", "ns": 2259961230543930 }
]
}
{"id": 1, "timestamps": [{ "name": "HTTP_RECV_START", "ns": 2356425054587444 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_RECV_END", "ns": 2356425054632308 }] },
{"id": 1, "timestamps": [{ "name": "REQUEST_START", "ns": 2356425054785863 }] },
{"id": 1, "timestamps": [{ "name": "QUEUE_START", "ns": 2356425054791517 }] },
{"id": 1, "timestamps": [{ "name": "INFER_RESPONSE_COMPLETE", "ns": 2356425057587919 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_START", "ns": 2356425054887198 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_INPUT_END", "ns": 2356425057152908 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_OUTPUT_START", "ns": 2356425057497763 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_END", "ns": 2356425057540989 }] },
{"id": 1, "timestamps": [{ "name": "REQUEST_END", "ns": 2356425057643164 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_SEND_START", "ns": 2356425057681578 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_SEND_END", "ns": 2356425057712991 }] }
]
```
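
As a reading aid for the updated `TIMESTAMPS` output above, here is a minimal Python sketch that merges the per-event entries by "id" and derives durations. The helper names `collect_timestamps` and `span_durations` are hypothetical, and pairing `X_START` with `X_END` by name is an assumption made for illustration, not something Triton defines. The sample values are copied from the example above.

```python
import json
from collections import defaultdict


def collect_timestamps(trace_entries):
    """Merge every "timestamps" entry sharing an "id" into {id: {name: ns}}."""
    merged = defaultdict(dict)
    for entry in trace_entries:
        # Entries without "timestamps" (e.g. model name/version) are skipped.
        for ts in entry.get("timestamps", []):
            merged[entry["id"]][ts["name"]] = ts["ns"]
    return dict(merged)


def span_durations(named_ns):
    """Return {prefix: END - START} in ns for each X_START with an X_END.

    Assumption: spans are paired purely by name suffix. Unpaired events
    such as QUEUE_START are ignored.
    """
    spans = {}
    for name, start_ns in named_ns.items():
        if name.endswith("_START"):
            prefix = name[: -len("_START")]
            end_ns = named_ns.get(prefix + "_END")
            if end_ns is not None:
                spans[prefix] = end_ns - start_ns
    return spans


# A subset of the example trace shown above.
SAMPLE = json.loads("""
[
  {"id": 1, "model_name": "simple", "model_version": 1},
  {"id": 1, "timestamps": [{"name": "HTTP_RECV_START", "ns": 2356425054587444}]},
  {"id": 1, "timestamps": [{"name": "HTTP_RECV_END", "ns": 2356425054632308}]},
  {"id": 1, "timestamps": [{"name": "QUEUE_START", "ns": 2356425054791517}]},
  {"id": 1, "timestamps": [{"name": "COMPUTE_START", "ns": 2356425054887198}]},
  {"id": 1, "timestamps": [{"name": "COMPUTE_END", "ns": 2356425057540989}]}
]
""")

by_id = collect_timestamps(SAMPLE)
durations = span_durations(by_id[1])
for span, ns in sorted(durations.items()):
    print(f"{span}: {ns / 1000:.1f} us")
```

In practice the list would be loaded from the file given to `--trace-file` (for instance with `json.load`) instead of the inline sample.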

@@ -122,13 +136,12 @@ including its "name", "data" and "dtype". For example:
```
[
{
"model_name": "simple",
"model_version": -1,
"id": 1,
"activity": "TENSOR_QUEUE_INPUT",
"tensor":{
"name": "input",
"data": "0.1,0.1,0.1,...",
"shape": "1,16",
"dtype": "FP32"
}
}