TensorFlow Serving is an open-source system designed by Google that bridges trained machine learning models and the applications that need to use them. It streamlines deploying and serving models in a production environment while maintaining efficiency and scalability.
A good way to get started using TensorFlow Serving with Intel® Extension for TensorFlow* is with Docker containers.
- Install Docker on Ubuntu 22.04
```
sudo apt install docker
sudo apt install docker.io
```
- Pull docker image
```
# For CPU
docker pull intel/intel-extension-for-tensorflow:serving-cpu

# For GPU
docker pull intel/intel-extension-for-tensorflow:serving-gpu
```
Tips:
- We recommend you put the source code of Intel® Extension for TensorFlow*, TensorFlow, and TensorFlow Serving in the same folder.
- Replace related paths with those on your machine.
Refer to Intel® Extension for TensorFlow* for C++ to build the Intel® Extension for TensorFlow* C++ library.
Note: When following this installation guide, you only need to build the Intel® Extension for TensorFlow* C++ library. You can ignore the other steps.
The generated `libitex_cpu_cc.so` or `libitex_gpu_cc.so` binary can be found in the `intel_extension_for_tensorflow/bazel-bin/itex/` directory.
- Patch TensorFlow
- Get TensorFlow with commit id specified by TensorFlow Serving: https://github.com/tensorflow/serving/blob/master/WORKSPACE#L28
```
# Exit the intel-extension-for-tensorflow source code folder
cd ..

# Clone TensorFlow
git clone https://github.com/tensorflow/tensorflow

# Check out the specific commit id from the TensorFlow Serving WORKSPACE file
cd tensorflow
git checkout xxxxx
```
- Add `alwayslink=1` for the `kernels_experimental` library in the local `tensorflow/tensorflow/c/BUILD` file:

```
tf_cuda_library(
    name = "kernels_experimental",
    srcs = ["kernels_experimental.cc"],
    hdrs = ["kernels_experimental.h"],
    copts = tf_copts(),
    visibility = ["//visibility:public"],
    deps = [
        ...
    ] + if_not_mobile([
        ...
    ]),
    alwayslink=1,  # add this line
)
```
- Patch TensorFlow Serving
- Get TensorFlow Serving source code
```
# Exit the tensorflow source code folder
cd ..
git clone https://github.com/tensorflow/serving
```
- Patch TensorFlow Serving
```
cd serving
git checkout r2.14
git apply ../intel-extension-for-tensorflow/third_party/tf_serving/serving_plugin.patch
```
- Build TensorFlow Serving
```
bazel build --copt="-Wno-error=stringop-truncation" --config=release //tensorflow_serving/model_servers:tensorflow_model_server
```
The generated `tensorflow_model_server` will be found in the `serving/bazel-bin/tensorflow_serving/model_servers/` directory.
Refer to the Intel® Extension for TensorFlow* Serving Docker Container Guide to build a Docker image from the Dockerfile.
- Train and export the TensorFlow model
```
cd serving
rm -rf /tmp/mnist
python tensorflow_serving/example/mnist_saved_model.py /tmp/mnist
```
Now let's take a look at the export directory. You should find a directory named `1` that contains a `saved_model.pb` file and a `variables` folder.

```
ls /tmp/mnist
1

ls /tmp/mnist/1
saved_model.pb  variables
```
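If you want to double-check what was exported before serving it, you can load the SavedModel back in Python and list its signatures. This is an optional sanity check, not part of the original guide; it is a minimal sketch that assumes TensorFlow is installed in the same environment.

```python
import tensorflow as tf

# Load the exported model version directory and list its serving signatures.
loaded = tf.saved_model.load("/tmp/mnist/1")
print("Available signatures:", list(loaded.signatures.keys()))

# Show the input/output structure of each signature.
for name, fn in loaded.signatures.items():
    print(name, fn.structured_input_signature, fn.structured_outputs)
```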
- Load the exported model with TensorFlow ModelServer plugged with Intel® Extension for TensorFlow*
- Use Docker from Docker Hub
```
# For CPU
docker run \
  -it \
  --rm \
  -p 8500:8500 \
  -e MODEL_NAME=mnist \
  -v /tmp/mnist:/models/mnist \
  intel/intel-extension-for-tensorflow:serving-cpu

# For GPU
docker run \
  -it \
  --rm \
  -p 8500:8500 \
  -e MODEL_NAME=mnist \
  -v /tmp/mnist:/models/mnist \
  --device /dev/dri/ \
  -v /dev/dri/by-path/:/dev/dri/by-path/ \
  intel/intel-extension-for-tensorflow:serving-gpu
```
You will see:
```
plugin library "/itex/bazel-bin/itex/libitex_cpu_cc.so" load successfully!
plugin library "/itex/bazel-bin/itex/libitex_gpu_cc.so" load successfully!
```
- Use `tensorflow_model_server` built from source
```
# cd to the tensorflow_model_server binary folder

# For CPU
./tensorflow_model_server \
  --port=8500 \
  --rest_api_port=8501 \
  --model_name=mnist \
  --model_base_path=/tmp/mnist \
  --tensorflow_plugins=path_to_libitex_cpu_cc.so

# For GPU
# source the oneAPI environment
source oneapi_install_path/compiler/latest/env/vars.sh
source oneapi_install_path/mkl/latest/env/vars.sh

./tensorflow_model_server \
  --port=8500 \
  --rest_api_port=8501 \
  --model_name=mnist \
  --model_base_path=/tmp/mnist \
  --tensorflow_plugins=path_to_libitex_gpu_cc.so
```
You will see:
```
plugin library "path_to_libitex_cpu_cc.so/libitex_cpu_cc.so" load successfully!
plugin library "path_to_libitex_gpu_cc.so/libitex_gpu_cc.so" load successfully!
```
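Because the server above was started with `--rest_api_port=8501`, you can also confirm that the model loaded by querying TensorFlow Serving's model status REST endpoint. The sketch below is an extra check, not part of the original guide; it assumes the `requests` package is installed and the server is reachable on localhost (it does not apply to the Docker Hub run above, which only maps port 8500).

```python
import requests

# Query the model status endpoint exposed by --rest_api_port=8501.
resp = requests.get("http://127.0.0.1:8501/v1/models/mnist")

# Expect a version entry whose state is "AVAILABLE" once the model has loaded.
print(resp.json())
```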
- Use Docker built from the Dockerfile
```
# cd to the intel-extension-for-tensorflow source code folder
cd docker/tensorflow-serving
export MODEL_NAME=mnist
export MODEL_DIR=/tmp/mnist
./run.sh [cpu/gpu]
```
You will see:
```
plugin library "/itex/itex-bazel-bin/bin/itex/libitex_cpu_cc.so" load successfully!
plugin library "/itex/itex-bazel-bin/bin/itex/libitex_gpu_cc.so" load successfully!
```
- Test the server
```
pip install tensorflow-serving-api
cd serving
python tensorflow_serving/example/mnist_client.py --num_tests=1000 --server=127.0.0.1:8500
```
You will see:
```
...
Inference error rate: xx.xx%
```
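The `mnist_client.py` script wraps a gRPC call to TensorFlow Serving's PredictionService. If you prefer to issue a single request yourself, a minimal client sketch is shown below. It is illustrative, not part of the original guide: the signature name `predict_images` and the tensor names `images`/`scores` follow the classic mnist example and may differ for your model (verify them with the SavedModel inspection step above); it also assumes `tensorflow`, `tensorflow-serving-api`, and `grpcio` are installed.

```python
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Connect to the gRPC endpoint exposed on port 8500.
channel = grpc.insecure_channel("127.0.0.1:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a Predict request for the "mnist" model.
request = predict_pb2.PredictRequest()
request.model_spec.name = "mnist"
request.model_spec.signature_name = "predict_images"  # assumption: classic mnist example signature

# Send one dummy 28x28 image flattened to 784 float32 values.
image = np.zeros((1, 784), dtype=np.float32)
request.inputs["images"].CopyFrom(tf.make_tensor_proto(image, shape=image.shape))

# Issue the RPC with a 10-second deadline and print the scores tensor.
response = stub.Predict(request, 10.0)
print(response.outputs["scores"])  # assumption: output tensor name from the mnist example
```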
Refer to TensorFlow Serving Guides to learn more about how to use TensorFlow Serving.