TensorFlow Engineering with CUDA GPU for Deep Learning
See also:
- TF practical part in “do deep learning”
- How to setup Docker and Nvidia-Docker 2.0 on Ubuntu 18.04
Install #
Summary: install CUDA first, then TF.
ref: TF 1.0 doc
ref: NVIDIA doc (follow it up to step 3)
requirements #
- 64-bit Linux
- Python 2.7 or 3.3+ (3.5 used in this blog)
- NVIDIA CUDA 7.5 (8.0 if Pascal GPU)
- NVIDIA cuDNN >v4.0 (v5.1 recommended)
- NVIDIA GPU with compute capability >3.0
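The non-CUDA items in the list above can be sanity-checked from Python. A minimal sketch (the CUDA/cuDNN versions and the GPU's compute capability still have to be verified separately, e.g. with nvcc --version and NVIDIA's compute-capability tables):

```python
# Sanity-check the OS/Python requirements listed above.
# CUDA, cuDNN and compute capability must be verified separately.
import platform
import struct
import sys

def check_prereqs():
    """Return a list of unmet requirements (an empty list means OK)."""
    problems = []
    if platform.system() != "Linux":
        problems.append("64-bit Linux required")
    if struct.calcsize("P") * 8 != 64:  # pointer size in bits
        problems.append("64-bit Python interpreter required")
    if not (sys.version_info[:2] == (2, 7) or sys.version_info[:2] >= (3, 3)):
        problems.append("Python 2.7 or 3.3+ required")
    return problems

if __name__ == "__main__":
    print(check_prereqs() or "requirements look OK")
```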
steps #
1. Manually download “cuDNN v6.0 Library for Linux”. (OBS: the code in step 3 extracts a v5.1 archive; adjust the filename to the version you downloaded.)
2. Auto-install CUDA on Ubuntu 16.04.2 via bash; this can be chained (&& \
) with the code below.
3. Install cuDNN and pip:
sudo apt-get install -y curl git tofrodos dos2unix libcupti-dev && \
sudo tar -xvf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local && \
sudo apt-get install -y python-pip python3-pip python-dev python3-dev && \
pip install --upgrade pip && \
pip3 install --upgrade pip
NVIDIA's old web page also asked for Java; this can be ignored.
4. Install TF
Method (1): install via pip inside a Python virtualenv (recommended)
OBS: if you don’t want to install the binary version via pip, see: Determine how to install
sudo apt-get install -y python-pip python-dev python-virtualenv && \
virtualenv --system-site-packages ~/tensorflowEnv && \
. ~/tensorflowEnv/bin/activate
pip3 install --upgrade tensorflow-gpu   # no sudo: install inside the virtualenv
Method (2): compile TF from source using Bazel (OBS: dated commands from NVIDIA)
echo "deb http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list && \
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add - && \
sudo apt-get update && \
sudo apt-get install -y bazel && \
git clone https://github.com/tensorflow/tensorflow && \
cd tensorflow && \
git reset --hard 70de76e && \
dos2unix configure && \
./configure && \
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package && \
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
OBS: method 2 causes a lot of problems because the files use Windows/DOS line endings (hence the dos2unix step above).
Method (3): docker. See TF doc.
5. Validate the installation
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH && \
export CUDA_HOME=/usr/local/cuda-8.0 && \
export PATH=/usr/local/cuda-8.0/bin:$PATH && \
. ~/tensorflowEnv/bin/activate && \
python3
(OBS: “/usr/local/cuda” and “/usr/local/cuda-8.0” are different paths.)
import tensorflow as tf
print(tf.__version__)
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
Usage #
TensorBoard
Image classifier using Inception
//TODO more
TF Serving with SavedModel #
install #
cd
wget https://github.com/bazelbuild/bazel/releases/download/0.5.4/bazel-0.5.4-installer-linux-x86_64.sh
chmod +x bazel-0.5.4-installer-linux-x86_64.sh
./bazel-0.5.4-installer-linux-x86_64.sh --user
export PATH="$PATH:$HOME/bin" # optional, usually already exists
sudo pip install grpcio || pip install --user grpcio
sudo apt-get update && sudo apt-get install -y \
build-essential \
curl \
libcurl3-dev \
git \
libfreetype6-dev \
libpng12-dev \
libzmq3-dev \
pkg-config \
python-dev \
python-numpy \
python-pip \
software-properties-common \
swig \
zip \
zlib1g-dev
pip install tensorflow-serving-api
echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install tensorflow-model-server
q&a during installation:
Q: Bazel “ERROR The ‘build’ command is only supported from within a workspace.”
A: create an empty WORKSPACE file to mark the workspace root: touch WORKSPACE
ref
training & saving #
Use WineQualityClassificationSavedModel.py or its .ipynb to train and save the model in SavedModel format. (Info: the model contains two Keras Dense layers; see WineQualityClassification.py.)
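What the export step presumably boils down to, as a hedged sketch (not the actual contents of WineQualityClassificationSavedModel.py; assumes TF 1.x, where tf.saved_model.simple_save is available, and an already-trained tf.keras model):

```python
def export_saved_model(model, export_dir):
    """Export a trained tf.keras model (TF 1.x) in SavedModel format.

    export_dir should already include the version number,
    e.g. "savedmodels/WineQuality/2" (see the folder layout below).
    The input/output names ("inputs", "quality") are illustrative.
    """
    import tensorflow as tf  # deferred import so the sketch parses without TF
    tf.saved_model.simple_save(
        tf.keras.backend.get_session(),  # session holding the trained weights
        export_dir,
        inputs={"inputs": model.input},
        outputs={"quality": model.output})
```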
serving #
See also “ML Model Serving”.
The model folder structure should be modified to ModelName/<version>/files. The param model_base_path should not contain the version number. The WineQuality model below has two versions, v0 & v2; the newest version (biggest number) will be served by default.
$> tree WineQuality/
WineQuality/
├── 0
│ └── WineQuality.bak
│ ├── saved_model.pb
│ └── variables
│ ├── variables.data-00000-of-00001
│ └── variables.index
└── 2
├── saved_model.pb
└── variables
├── variables.data-00000-of-00001
└── variables.index
5 directories, 6 files
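The version-selection rule above (“biggest number wins”) can be sketched as a small helper (hypothetical, for illustration only; TF Serving applies this rule internally):

```python
import os

def served_version(model_base_path):
    """Return the version TF Serving would serve by default:
    the numerically largest version subdirectory."""
    versions = [int(d) for d in os.listdir(model_base_path) if d.isdigit()]
    if not versions:
        raise ValueError("no numeric version directories under %s" % model_base_path)
    return max(versions)
```

For the WineQuality/ tree above this returns 2.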
serve via simple_tensorflow_serving #
(Tested simple_tensorflow_serving v0.5.0 with TF v1.10.0 & cuda v9.0. See this gist for details.)
Recommended: it uses JSON requests and JSON responses. [ref]
pip2 install simple_tensorflow_serving
simple_tensorflow_serving --model_base_path=/path_to_model_such_as/fdp-tensorflow-python-examples/savedmodels/WineQuality/
Note:
- Use pip2 to install, and make sure a py2 message appears when the serving starts. Running simple_tensorflow_serving under py3 will generate a series of errors, such as “TypeError: the JSON object must be str, not ‘bytes’”. (The response is either HTTP 500, or 200 with error info.)
- --model_name=WineQuality is optional; model_name="default" by default.
- While the server is running in a terminal, open localhost:8500 to see its status in a GUI.
- Test JSON, as well as test clients, can be generated by: curl http://localhost:8500/v1/models/default/gen_json
- For Insomnia, POST the JSON to http://localhost:8500
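A minimal Python client for the JSON API above (a sketch: the {"model_name": ..., "data": ...} payload shape follows simple_tensorflow_serving's README, but take the exact data keys for your model from the gen_json endpoint):

```python
import json
from urllib import request  # py3; under py2 use urllib2 instead

SERVER = "http://localhost:8500"  # simple_tensorflow_serving's default port

def build_payload(model_name, data):
    """Serialize a prediction request body."""
    return json.dumps({"model_name": model_name, "data": data}).encode("utf-8")

def post_json(url, body):
    """POST a JSON body and decode the JSON response."""
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    # "keys" is a placeholder input name; use gen_json to get the real one.
    body = build_payload("default", {"keys": [[1.0], [2.0]]})
    print(post_json(SERVER, body))
```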
q&a for simple_tensorflow_serving:
Q/ERR: tf has no attribute ‘Session’.
A: (seems to be a version incompatibility between py, tf, etc.) Solution: pip3 install --upgrade --force-reinstall tensorflow-gpu
Q/ERR: ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
A: locate libcublas.so shows CUDA 9.1, which is too new for TF. A compact code solution (w/ root/sudo) is given in ref. A reboot is needed.
serve via tensorflow_model_server #
NOT recommended (tests failed); use simple_tensorflow_serving instead.
The gRPC-based method is not considered here. TF Serving supports a RESTful API alongside gRPC since v1.8, via the param --rest_api_port.
tensorflow_model_server --port=9000 --rest_api_port=8500 --model_name=WineQuality --model_base_path=/path_to_model_such_as/fdp-tensorflow-python-examples/savedmodels/WineQuality/
Consuming the API:
General: POST JSON to http://localhost:8500/v1/models/<model_name>:<classify|regress|predict> (the RESTful API listens on --rest_api_port, i.e. 8500 in the command above, not on the gRPC --port).
E.g. POST {"instances":[[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0],[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]]}
to http://localhost:8500/v1/models/WineQuality:classify.
Note:
- Tests failed due to “JSON Value not formatted correctly”.
- ‘v1’ in the URL is the version of TF Serving’s REST API (definitely not the model’s).
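A plausible cause of the failed tests above: per TF Serving’s REST API docs, :predict expects {"instances": ...}, while :classify and :regress expect {"examples": ...} instead, so POSTing "instances" to :classify would be rejected. A small hypothetical helper that pairs each method with the right key:

```python
import json

def rest_request(model_name, data, method="predict", host="http://localhost:8500"):
    """Build (url, body) for a TF Serving REST call.

    predict takes {"instances": ...}; classify/regress take {"examples": ...}.
    """
    url = "{0}/v1/models/{1}:{2}".format(host, model_name, method)
    key = "instances" if method == "predict" else "examples"
    return url, json.dumps({key: data})
```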