Update trace example in documentation #4951

Merged 2 commits on Oct 7, 2022
65 changes: 39 additions & 26 deletions docs/user_guide/trace.md
@@ -33,15 +33,21 @@ individual inference requests. Tracing is enable by command-line
arguments when running the tritonserver executable. For example,

```
$ tritonserver --trace-file=/tmp/trace.json --trace-rate=100 --trace-level=TIMESTAMPS ...
$ tritonserver --trace-file=/tmp/trace.json --trace-rate=100 --trace-level=TIMESTAMPS \
--log-frequency=50 --trace-count=100 ...
```

The --trace-file option indicates where the trace output should be
written. The --trace-rate option specifies the sampling rate. In
this example every 100-th inference request will be traced. The
--trace-level option indicates the level of trace detail that should
be collected. --trace-level option may be specified multiple times to
trace multiple informations. Use the --help option to get more information.
trace multiple informations. The --log-frequency option specifies the rate that
the traces are written to file. In this example Triton will log to file for
every 50 traces collected. The --trace-count option specifies the remaining
number of traces to be collected. In this example Triton will stop tracing more
requests after 100 traces are collected.
Use the --help option to get more information.
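
The interaction of the three numeric options can be sketched with a toy model. This is only an illustration of the description above, not Triton's implementation: the function name `simulate_tracing` is hypothetical, and it assumes that the request numbered `trace_rate`, `2 * trace_rate`, and so on is the one sampled.

```python
def simulate_tracing(num_requests, trace_rate, trace_count, log_frequency):
    """Toy model of --trace-rate, --trace-count and --log-frequency.

    Assumption: every trace_rate-th request (1-based) is sampled.
    This is not Triton code; it only mirrors the prose description.
    """
    traced = []    # request numbers that were sampled
    flushes = []   # traces collected so far at each write to file
    remaining = trace_count
    for request in range(1, num_requests + 1):
        if remaining == 0:
            break  # --trace-count exhausted: stop tracing
        if request % trace_rate != 0:
            continue  # not sampled under --trace-rate
        traced.append(request)
        remaining -= 1
        if len(traced) % log_frequency == 0:
            flushes.append(len(traced))  # --log-frequency reached: write out
    return traced, flushes


traced, flushes = simulate_tracing(
    num_requests=20000, trace_rate=100, trace_count=100, log_frequency=50
)
print(len(traced), traced[0], traced[-1], flushes)
# prints: 100 100 10000 [50, 100]
```

Under these assumptions, the example command line would trace requests 100, 200, ..., 10000, write to the trace file twice, and then stop tracing.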

In addition to configure trace settings in command line arguments, The user may
modify the trace setting when Triton server
@@ -63,20 +69,21 @@ The trace output is a JSON file with the following schema.
"model_name": $string,
"model_version": $number,
"id": $number
"parent_id": $number,
"parent_id": $number
},
{
"id": $number,
"timestamps": [
{ "name" : $string, "ns" : $number },
...
{ "name" : $string, "ns" : $number }
]
},
{
"model_name": $string,
"model_version": $number,
"id": $number
"activity": $string,
"tensor":{
"name": $string,
"data": $string,
"shape": $string,
"dtype": $string
}
},
@@ -87,30 +94,37 @@ The trace output is a JSON file with the following schema.
Each trace is assigned a "id", which indicates the model name and
version of the inference request. If the trace is from a
model run as part of an ensemble, the "parent_id" will indicate the
"id" of the containing ensemble.
"id" of the containing ensemble.
For example:
```
[
{
"id": 1,
"model_name": "simple",
"model_version": 1
},
...
]
```

Each `TIMESTAMPS` trace will have one or more "timestamps" with
each timestamp having a name and the timestamp in nanoseconds ("ns").
For example:

```
[
{
"model_name": "simple",
"model_version": -1,
"id": 1,
"timestamps" : [
{ "name": "http recv start", "ns": 2259961222771924 },
{ "name": "http recv end", "ns": 2259961222820985 },
{ "name": "request handler start", "ns": 2259961223164078 },
{ "name": "queue start", "ns": 2259961223182400 },
{ "name": "compute start", "ns": 2259961223232405 },
{ "name": "compute end", "ns": 2259961230206777 },
{ "name": "request handler end", "ns": 2259961230211887 },
{ "name": "http send start", "ns": 2259961230529606 },
{ "name": "http send end", "ns": 2259961230543930 }
]
}
{"id": 1, "timestamps": [{ "name": "HTTP_RECV_START", "ns": 2356425054587444 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_RECV_END", "ns": 2356425054632308 }] },
{"id": 1, "timestamps": [{ "name": "REQUEST_START", "ns": 2356425054785863 }] },
{"id": 1, "timestamps": [{ "name": "QUEUE_START", "ns": 2356425054791517 }] },
{"id": 1, "timestamps": [{ "name": "INFER_RESPONSE_COMPLETE", "ns": 2356425057587919 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_START", "ns": 2356425054887198 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_INPUT_END", "ns": 2356425057152908 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_OUTPUT_START", "ns": 2356425057497763 }] },
{"id": 1, "timestamps": [{ "name": "COMPUTE_END", "ns": 2356425057540989 }] },
{"id": 1, "timestamps": [{ "name": "REQUEST_END", "ns": 2356425057643164 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_SEND_START", "ns": 2356425057681578 }] },
{"id": 1, "timestamps": [{ "name": "HTTP_SEND_END", "ns": 2356425057712991 }] }
]
```
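
Because each record in the updated example carries a single timestamp, a consumer has to regroup records by "id" before computing durations. The following is a minimal sketch of that step. The helper names `collect_timestamps` and `span_ns` are hypothetical, and the sample is a subset of the records shown above; a real trace file may interleave several ids and contain other record types.

```python
import json
from collections import defaultdict

# Subset of the example records above, one timestamp per record.
SAMPLE = """
[
  {"id": 1, "timestamps": [{ "name": "HTTP_RECV_START", "ns": 2356425054587444 }] },
  {"id": 1, "timestamps": [{ "name": "COMPUTE_START", "ns": 2356425054887198 }] },
  {"id": 1, "timestamps": [{ "name": "COMPUTE_END", "ns": 2356425057540989 }] },
  {"id": 1, "timestamps": [{ "name": "HTTP_SEND_END", "ns": 2356425057712991 }] }
]
"""


def collect_timestamps(entries):
    """Merge timestamp records into {trace id: {timestamp name: ns}}."""
    merged = defaultdict(dict)
    for entry in entries:
        # Records without "timestamps" (e.g. model metadata) are skipped.
        for ts in entry.get("timestamps", []):
            merged[entry["id"]][ts["name"]] = ts["ns"]
    return dict(merged)


def span_ns(timestamps, start, end):
    """Nanoseconds elapsed between two named timestamps of one trace."""
    return timestamps[end] - timestamps[start]


by_id = collect_timestamps(json.loads(SAMPLE))
compute_ns = span_ns(by_id[1], "COMPUTE_START", "COMPUTE_END")
total_ns = span_ns(by_id[1], "HTTP_RECV_START", "HTTP_SEND_END")
print(compute_ns, total_ns)
# prints: 2653791 3125547
```

Keying on the timestamp name rather than on position matters here: in the example above, `INFER_RESPONSE_COMPLETE` is listed before `COMPUTE_START`, so the records are not necessarily in chronological order.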

@@ -122,13 +136,12 @@ including its "name", "data" and "dtype". For example:
```
[
{
"model_name": "simple",
"model_version": -1,
"id": 1,
"activity": "TENSOR_QUEUE_INPUT",
"tensor":{
"name": "input",
"data": "0.1,0.1,0.1,...",
"shape": "1,16",
"dtype": "FP32"
}
}