Inference Protocols and APIs

Clients can communicate with Triton using an HTTP/REST protocol, a GRPC protocol, or a C API.

HTTP/REST and GRPC Protocols

Triton exposes both HTTP/REST and GRPC endpoints based on the standard inference protocols proposed by the KFServing project. To fully enable all of its capabilities, Triton also implements a number of HTTP/REST and GRPC extensions to the KFServing inference protocol.

The HTTP/REST and GRPC protocols provide endpoints to check server and model health, metadata, and statistics. Additional endpoints allow model loading and unloading, and inferencing. See the KFServing and extension documentation for details.
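As a rough illustration of the HTTP/REST side, the sketch below builds the URLs and JSON body a client might use for health, metadata, and inference requests. It assumes the `v2/...` paths of the KFServing v2 inference protocol and a server listening on `localhost:8000`; the helper names (`health_url`, `metadata_url`, `infer_request`, `post_json`) and the model and tensor names are hypothetical and not part of Triton. Check the protocol documentation before relying on any path or field shown here.

```python
import json
import urllib.request

# Assumption: Triton's HTTP endpoint is reachable at this address.
BASE_URL = "http://localhost:8000"


def health_url(base_url, model_name=None):
    """Readiness URL for the server, or for one model if a name is given."""
    if model_name is None:
        return f"{base_url}/v2/health/ready"
    return f"{base_url}/v2/models/{model_name}/ready"


def metadata_url(base_url, model_name=None):
    """Metadata URL for the server, or for one model if a name is given."""
    if model_name is None:
        return f"{base_url}/v2"
    return f"{base_url}/v2/models/{model_name}"


def infer_request(base_url, model_name, inputs):
    """Build (url, body) for an inference request.

    `inputs` is a list of (name, shape, datatype, flat_data) tuples.
    """
    body = {
        "inputs": [
            {"name": name, "shape": list(shape), "datatype": datatype, "data": list(data)}
            for name, shape, datatype, data in inputs
        ]
    }
    url = f"{base_url}/v2/models/{model_name}/infer"
    return url, json.dumps(body).encode("utf-8")


def post_json(url, body, timeout=5.0):
    """Send a JSON POST and decode the JSON reply (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running and a model loaded, a call such as `post_json(*infer_request(BASE_URL, "my_model", [("INPUT0", [1, 2], "FP32", [1.0, 2.0])]))` would send the request; `my_model` and `INPUT0` are placeholders. Keeping request construction separate from the network call makes it possible to inspect the URL and body without a server.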

C API

The Triton Inference Server provides a backwards-compatible C API that allows Triton to be linked directly into a C/C++ application. The API is documented in tritonserver.h.

A simple example using the C API can be found in simple.cc. A more complicated example can be found in the source that implements the HTTP/REST and GRPC endpoints for Triton. These endpoints use the C API to communicate with the core of Triton. The primary source files for the endpoints are grpc_server.cc and http_server.cc.
