Summary: install CUDA first, then TF.
References: the TF 1.0 install doc, and the NVIDIA doc (up to step 3).


  • 64-bit Linux
  • Python 2.7 or 3.3+ (3.5 is used in this post)
  • NVIDIA CUDA 7.5 (8.0 if Pascal GPU)
  • NVIDIA cuDNN >v4.0 (v5.1 recommended)
  • NVIDIA GPU with compute capability >3.0


1. Manually download "cuDNN v5.1 Library for Linux" (the tarball extracted in step 3; the requirements above recommend v5.1).
2. Install CUDA on Ubuntu 16.04.2 with the bash auto-install; it can be chained (&& \) with the code below.
3. Install cuDNN and pip:

sudo apt-get install -y curl git tofrodos dos2unix libcupti-dev && \
sudo tar -xvf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local && \
sudo apt-get install -y python-pip python3-pip python-dev python3-dev && \
pip install --upgrade pip && \
pip3 install --upgrade pip

NVIDIA's old web page also asked for Java (this can be ignored).

4. Install TF
Method (1): install via pip in a Python virtualenv (recommended)
OBS: if you don't want to install the binary version via pip, see: Determine how to install

sudo apt-get install -y python-pip python-dev python-virtualenv && \
virtualenv --system-site-packages ~/tensorflowEnv && \
. ~/tensorflowEnv/bin/activate
pip3 install --upgrade tensorflow-gpu   # no sudo: install inside the activated virtualenv
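Before installing into the env, it can be worth confirming that the virtualenv interpreter is actually the active one. A minimal stdlib check (assumes only the activation step above):

```python
import sys

# When a virtualenv is active, sys.prefix points inside the env,
# while the base prefix still points at the system Python.
# (virtualenv on py2 sets sys.real_prefix; venv on py3 sets sys.base_prefix.)
base = getattr(sys, "real_prefix", getattr(sys, "base_prefix", sys.prefix))
in_venv = sys.prefix != base
print("interpreter:", sys.executable)
print("inside a virtualenv:", in_venv)
```

If `inside a virtualenv` prints False, the subsequent pip3 install would land in the system site-packages instead of ~/tensorflowEnv.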

Method (2): compile TF from source with Bazel (OBS: dated commands from NVIDIA)

echo "deb stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list && \
curl | sudo apt-key add - && \
sudo apt-get update && \
sudo apt-get install -y bazel && \
git clone && \
cd tensorflow  && \
git reset --hard 70de76e && \
dos2unix configure && \
./configure && \
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package && \
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg && \
sudo pip install /tmp/tensorflow_pkg/tensorflow-*.whl

OBS: method 2 causes a lot of problems because the files use Windows/DOS line endings (hence the dos2unix step above).
Method (3): docker. See TF doc.

5. Validate the installation

export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH && \
export CUDA_HOME=/usr/local/cuda-8.0 && \
export PATH=/usr/local/cuda-8.0/bin:$PATH && \
. ~/tensorflowEnv/bin/activate

( OBS: "/usr/local/cuda" is different from "/usr/local/cuda-8.0" )

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))  # should print b'Hello, TensorFlow!'


image classifier using inception
//TODO more

TF Serving & SavedModel



chmod +x
./ --user
export PATH="$PATH:$HOME/bin" # optional, usually already exists
sudo pip install grpcio || pip install --user grpcio 
sudo apt-get update && sudo apt-get install -y \
        build-essential \
        curl \
        libcurl3-dev \
        git \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python-dev \
        python-numpy \
        python-pip \
        software-properties-common \
        swig \
        zip
pip install tensorflow-serving-api
echo "deb [arch=amd64] stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list
curl | sudo apt-key add -
sudo apt-get update && sudo apt-get install tensorflow-model-server

q&a during installation:
Q: Bazel: "ERROR: The 'build' command is only supported from within a workspace."
A: Solution: run touch WORKSPACE in the source directory (ref).

training & saving

Use the script or its .ipynb to train & save the model in SavedModel format. (Info: the model contains two Keras Dense layers.)


See also "ML Model Serving".

The model's folder structure should be modified to ModelName/<version>/files. The param model_base_path must not include the <version> level, and <version> (the folder's name) must be a number (ref1 & ref2). For example, the WineQuality model below has two versions, 0 & 2; the newest version (the biggest number) is served by default.

$> tree WineQuality/
├── 0
│   └── WineQuality.bak
│       ├── saved_model.pb
│       └── variables
│           ├──
│           └── variables.index
└── 2
    ├── saved_model.pb
    └── variables
        └── variables.index

5 directories, 6 files
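The "newest version wins" rule can be sketched as follows (a minimal reimplementation of the server's version choice for illustration, not its actual code):

```python
import os

def pick_served_version(model_base_path):
    """Return the subdirectory with the largest numeric name,
    mirroring how the server picks which version to serve."""
    versions = [d for d in os.listdir(model_base_path)
                if d.isdigit() and os.path.isdir(os.path.join(model_base_path, d))]
    if not versions:
        raise ValueError("no numeric version folders under " + model_base_path)
    # Compare as integers so "10" beats "2"
    return max(versions, key=int)
```

For the WineQuality tree above this returns "2"; note that a non-numeric entry such as WineQuality.bak is ignored entirely, which is why the <version> folders must be plain numbers.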

serve via simple_tensorflow_serving

(Tested simple_tensorflow_serving v0.5.0 with TF v1.10.0 & cuda v9.0. See this gist for details.)
Recommended: it uses JSON requests and JSON responses. [ref]

pip2 install simple_tensorflow_serving
simple_tensorflow_serving --model_base_path=/path_to_model_such_as/fdp-tensorflow-python-examples/savedmodels/WineQuality/


  • Use pip2 to install, and make sure a Python 2 message is shown when the serving starts. Running simple_tensorflow_serving under Python 3 generates a series of errors, such as "TypeError: the JSON object must be str, not 'bytes'". (The response is either HTTP 500, or 200 with error info.)
  • --model_name=WineQuality is optional; model_name is "default" by default.

While the server is running in the terminal, open localhost:8500 to see its status in a GUI. Test JSON (as well as test clients) can be generated via: curl http://localhost:8500/v1/models/default/gen_json. In Insomnia, POST the JSON to http://localhost:8500.
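A minimal Python client for the JSON round trip might look like this. The payload shape below is an assumption for illustration only; generate the real one with the gen_json URL above. The block degrades to a printed error when no server is listening:

```python
import json
import urllib.request

# Hypothetical payload shape -- the authoritative one comes from
# curl http://localhost:8500/v1/models/default/gen_json
payload = {"model_name": "default",
           "data": {"features": [[1.0] * 11]}}
body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8500",
    data=body,  # presence of data makes this a POST
    headers={"Content-Type": "application/json"})
try:
    with urllib.request.urlopen(req, timeout=3) as resp:
        print(json.loads(resp.read().decode("utf-8")))
except OSError as e:
    # No server reachable; the request itself is still well-formed.
    print("server not reachable:", e)
```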

q&a for simple_tensorflow_serving:

Q/ERR: tf has no attribute 'Session'.
A: (seems to be a version incompatibility between Python, TF, etc.) Solution: pip3 install --upgrade --force-reinstall tensorflow-gpu

Q/ERR: ImportError: cannot open shared object file: No such file or directory
A: locate shows CUDA 9.1, which is too new for TF. A compact code solution (needs root/sudo) is in the ref; a reboot is needed afterwards.

serve via tensorflow_model_server

NOT recommended (my tests failed); please use simple_tensorflow_serving instead.
The gRPC-based method is not considered here.
TF has supported a RESTful API alongside gRPC since v1.8, via the --rest_api_port param.

tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=WineQuality --model_base_path=/path_to_model_such_as/fdp-tensorflow-python-examples/savedmodels/WineQuality/


consuming the api
General: POST JSON to http://localhost:8500/v1/models/<model_name>:<classify|regress|predict> (8500 is the --rest_api_port set above; 9000 is the gRPC port).
E.g. POST {"instances":[[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0],[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]]} to http://localhost:8500/v1/models/WineQuality:classify.

  • Tests failed due to "JSON Value not formatted correctly".
  • The 'v1' in the URL seems to be TF Serving's API version (definitely not the model's).
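The example POST above can be reproduced from plain Python (a sketch; it assumes the WineQuality server from the command above is listening on the REST port 8500, and degrades to a printed error when it is not):

```python
import json
import urllib.request

# Two rows of 11 features each, as in the example request above.
instances = [[1.0] * 11, [1.0] * 11]
body = json.dumps({"instances": instances}).encode("utf-8")
url = "http://localhost:8500/v1/models/WineQuality:classify"

req = urllib.request.Request(url, data=body,
                             headers={"Content-Type": "application/json"})
try:
    with urllib.request.urlopen(req, timeout=3) as resp:
        print(json.loads(resp.read().decode("utf-8")))
except OSError as e:
    print("request not sent:", e)
```

Sending a well-formed Content-Type header and a strict {"instances": [...]} body is the first thing to check when the server replies "JSON Value not formatted correctly".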