Merged

Commits (47)
c6a7aab
tl update of OpenPose
boldjoel Aug 1, 2018
46af001
code style improvement
boldjoel Aug 14, 2018
875d92e
test
boldjoel Aug 14, 2018
fe1f047
yapf
zsdonghao Aug 14, 2018
177de90
comments
zsdonghao Aug 14, 2018
cb0499d
openpose example
boldjoel Aug 14, 2018
41a5b61
openpose data augmentation API
boldjoel Aug 14, 2018
889fb96
train function update
boldjoel Aug 15, 2018
10fb985
update non mask case
boldjoel Aug 15, 2018
61b9d08
add args.parser
boldjoel Aug 17, 2018
700b7af
add args.parser for log
boldjoel Aug 17, 2018
3a23960
add inference
boldjoel Aug 17, 2018
821d920
paf process added
boldjoel Aug 17, 2018
5b34c80
paf install instruction
boldjoel Aug 17, 2018
bb13e32
add inference
boldjoel Aug 17, 2018
a3206e6
update md
boldjoel Aug 17, 2018
e626f26
hao modifies
zsdonghao Aug 18, 2018
fe2b418
Merge branch 'master' into openpose
zsdonghao Aug 18, 2018
0a04f53
Merge branch 'openpose' of github.com:tensorlayer/tensorlayer into op…
zsdonghao Aug 18, 2018
9b004ce
hao modified
zsdonghao Aug 19, 2018
d717a31
remove useless files
zsdonghao Aug 19, 2018
c87faea
hao arranged files
zsdonghao Aug 19, 2018
72d3be7
trainable
zsdonghao Aug 19, 2018
5802fee
del
zsdonghao Aug 19, 2018
3f09019
release prepro for keypoints / vgg19 break
zsdonghao Aug 19, 2018
5d6be2f
writing inferencing
zsdonghao Aug 19, 2018
077cced
coco 2017
zsdonghao Aug 19, 2018
8c1f8a7
TODO distributed training
zsdonghao Aug 20, 2018
572070e
Merge branch 'master' into openpose
zsdonghao Aug 20, 2018
e662055
distributed code in same file
zsdonghao Aug 20, 2018
99dffbd
add int wout hout
zsdonghao Aug 20, 2018
61a15dc
agg
zsdonghao Aug 20, 2018
bc80fe9
train mode
zsdonghao Aug 20, 2018
b4e4370
add cpm between vgg and stage1
boldjoel Aug 22, 2018
2a19bef
used own defined 10 layers of VGG19
boldjoel Aug 22, 2018
dd79a5d
Merge branch 'master' into openpose
zsdonghao Aug 22, 2018
b2b8f62
Merge branch 'master' into openpose
zsdonghao Aug 23, 2018
f04a41f
remove generated files (#801)
lgarithm Aug 24, 2018
b5e2f58
add mode;
zsdonghao Aug 24, 2018
46163a1
Merge branch 'master' into openpose
zsdonghao Aug 25, 2018
df2f0ff
remove openpose from current repo
boldjoel Aug 25, 2018
e01f16a
Update CHANGELOG.md
boldjoel Aug 25, 2018
e33081a
add keypoint docs;
zsdonghao Aug 25, 2018
189ab95
yapf - prepro
zsdonghao Aug 25, 2018
28f8b83
fix yapf
zsdonghao Aug 25, 2018
fec7d3a
Merge branch 'master' into openpose
zsdonghao Aug 25, 2018
7adbd00
yapf
zsdonghao Aug 25, 2018
10 changes: 10 additions & 0 deletions example/openpose/.gitignore
@@ -0,0 +1,10 @@
models_*
coco/
cocoapi/
inference/*png
vis/
data/mscoco2014
data/mscoco2017
models/
tensorlayer/
cocoapi-master/
108 changes: 108 additions & 0 deletions example/openpose/README.md
@@ -0,0 +1,108 @@
# OpenPose using TensorFlow and TensorLayer

<p align="center">
    <img src="https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/media/dance_foot.gif?raw=true" width="360">
</p>

## 1. Motivation

OpenPose from CMU provides real-time 2D pose estimation, following ["Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields"](https://arxiv.org/pdf/1611.08050.pdf). However, the [training code](https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation) is based on Caffe and C++, which is hard to customize.
In practice, developers need to customize their training set and data augmentation methods according to their own requirements.
For this reason, we reimplemented this project with TensorLayer.

## 2. Project files

- `config.py` : to configure the dataset directories, training details, etc.
- `models.py` : to define the model structures; currently only the VGG19-based model is included
- `utils.py` : to extract data from the COCO dataset and compute the ground truth
- `train.py` : to train the model
- `visualize.py`: to draw the training results
- `inference` folder : scripts and helpers for inference (smoothing, peak finding, PAF post-processing)

## 3. Preparation


1. For data processing, the COCO APIs are used. Download the cocoapi repo (https://github.com/cocodataset/cocoapi), go into the `PythonAPI` folder and make:

```bash
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make

# before recompiling, clean the previous build:
rm -rf build
```

Alternatively, follow this:

```bash
git clone https://github.com/waleedka/coco
cd coco
python PythonAPI/setup.py build_ext install
```
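
To verify the build, `pycocotools` should now import cleanly; a minimal smoke test:

```python
# the COCO API entry point installed by the PythonAPI build
from pycocotools.coco import COCO
print(COCO)
```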

2. Build the C++ library for post-processing. See: https://github.com/ildoonet/tf-pose-estimation/tree/master/tf_pose/pafprocess

```bash
cd pafprocess
swig -python -c++ pafprocess.i && python3 setup.py build_ext --inplace

# before recompiling, clean the previous build:
rm -rf build
rm *.so
```
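
After building, the swig-generated `pafprocess` module should be importable from the build directory. A minimal sketch to check that the functions used by `inference.py` are present (the import path depends on where you ran the build):

```python
# run from the pafprocess build directory
import pafprocess  # swig-generated wrapper around the C++ post-processing

print(hasattr(pafprocess, 'process_paf'))     # part-association entry point
print(hasattr(pafprocess, 'get_num_humans'))  # result accessors
```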

## 4. Use pre-trained model

In this project, input images are RGB with pixel values in the range 0~1.

Run `xxx.py`; it will automatically download the default VGG19-based model from [here](https://github.com/tensorlayer/pretrained-models) and use it for inference.
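A minimal preprocessing sketch that matches this convention, mirroring what `inference.py` does (the image path is just an example):

```python
import tensorlayer as tl

im = tl.vis.read_image('data/test.jpeg')  # RGB image
im = tl.prepro.imresize(im, [368, 432])   # (height, width), both divisible by 16
im = im / 255.                            # scale pixel values to 0~1
```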
The performance of the pre-trained models is as follows:

| Model | Speed | AP | xxx |
|------------- |--------------- |--------------- |--------------- |
| VGG19 | xx | xx | xx |
| Residual Squeeze | xx | xx | xx |

- Speed is tested on XXX

## 5. Train a model
For your own training, please put the `.jpg` files into `coco_dataset/images/` and the `.json` annotation file into `coco_dataset/annotations/`.

Run `train.py`; it will automatically download the MSCOCO 2017 dataset into `dataset/coco17`.
The default model in `models.py` is based on VGG19, the same as in the original paper.
If you want to customize the model, simply change it in `models.py`;
`train.py` will then train the model to the end. The training details can be adjusted via `config.py`, as sketched below.
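
A minimal sketch of overriding the defaults defined in `config.py` before training (the values here are illustrative; the fields are the ones `config.py` defines):

```python
import os
from config import config

# training details
config.TRAIN.batch_size = 8
config.TRAIN.n_epoch = 80
config.TRAIN.base_lr = 4e-5

# point the loader at your own MSCOCO-format data
config.DATA.your_images_path = os.path.join('coco_dataset', 'images')
config.DATA.your_annos_path = os.path.join('coco_dataset', 'annotations', 'coco.json')
```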

## 6. Evaluate a model

Run `eval.py` for inference.

## 7. Speed up and deployment

For TensorRT float16 (half-float) inference, xxx

## 8. Customization
- Model : change `models.py`.
- Data augmentation : ....
- Train with your own data: ....
- 1) prepare your data following the MSCOCO format, you need to ...
- 2) concatenate the list of your own data JSON into ...
- Evaluate on your own testing set:
- 1) xx

## 9. Discussion

- [TensorLayer Issues 434](https://github.com/tensorlayer/tensorlayer/issues/434)
- [TensorLayer Issues 416](https://github.com/tensorlayer/tensorlayer/issues/416)



Paper's Model
--------------
Image : https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/tree/master/model/_trained_MPI
MPII : https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/model/_trained_MPI/pose_deploy.prototxt
COCO : https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/model/_trained_COCO/pose_deploy.prototxt (same architecture but more keypoints)
Visualize Caffe model : http://ethereon.github.io/netscope/#/editor
44 changes: 44 additions & 0 deletions example/openpose/config.py
@@ -0,0 +1,44 @@
import os
from easydict import EasyDict as edict

config = edict()

config.TRAIN = edict()
config.TRAIN.batch_size = 8
config.TRAIN.save_interval = 5000
config.TRAIN.log_interval = 1
config.TRAIN.n_epoch = 80
config.TRAIN.step_size = 136106 # decay the learning rate every this many steps
config.TRAIN.base_lr = 4e-5 # initial learning rate
config.TRAIN.gamma = 0.333 # learning-rate decay factor
config.TRAIN.weight_decay = 5e-4
config.TRAIN.train_mode = 'placeholder' # options: 'placeholder', 'dataset', 'distributed'

config.MODEL = edict()
config.MODEL.model_path = 'models' # save directory
config.MODEL.n_pos = 19 # number of keypoints
config.MODEL.hin = 368 # input size during training
config.MODEL.win = 368
config.MODEL.hout = int(config.MODEL.hin / 8) # output size during training (default 46)
config.MODEL.wout = int(config.MODEL.win / 8)
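
# note: the output maps are 8x smaller than the input because the network
# downsamples by a factor of 8, hence 368 / 8 = 46 by default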

if (config.MODEL.hin % 16 != 0) or (config.MODEL.win % 16 != 0):
    raise Exception("image size should be divisible by 16")

config.DATA = edict()
config.DATA.coco_version = '2017' # MSCOCO version 2014 or 2017
config.DATA.data_path = 'data'
config.DATA.your_images_path = os.path.join('data', 'your_data', 'images')
config.DATA.your_annos_path = os.path.join('data', 'your_data', 'coco.json')

config.LOG = edict()
config.LOG.vis_path = 'vis'

# config.VALID = edict()

# import json
# def log_config(filename, cfg):
#     with open(filename, 'w') as f:
#         f.write("================================================\n")
#         f.write(json.dumps(cfg, indent=4))
#         f.write("\n================================================\n")
92 changes: 92 additions & 0 deletions example/openpose/inference.py
@@ -0,0 +1,92 @@
import time, os
import numpy as np
from config import config
from models import model
import tensorflow as tf
import tensorlayer as tl
from utils import draw_intermedia_results

if __name__ == '__main__':
    n_pos = config.MODEL.n_pos
    model_path = config.MODEL.model_path
    h, w = 368, 432  # image size for inference; a smaller size speeds up inference
    if (h % 16 != 0) or (w % 16 != 0):
        raise Exception("image size should be divisible by 16")

    ## define model
    x = tf.placeholder(tf.float32, [None, h, w, 3], "image")
    _, _, _, net = model(x, n_pos, None, None, False, False)

    ## get output from network
    conf_tensor = tl.layers.get_layers_with_name(net, 'model/cpm/stage6/branch1/conf')[0]
    pafs_tensor = tl.layers.get_layers_with_name(net, 'model/cpm/stage6/branch2/pafs')[0]

    def get_peak(pafs_tensor):
        from inference.smoother import Smoother

        # smooth the maps with a Gaussian kernel, then keep only local maxima:
        # a pixel is a peak if it equals the max of its 3x3 neighbourhood
        smoother = Smoother({'data': pafs_tensor}, 25, 3.0)
        gaussian_heatMat = smoother.get_output()
        max_pooled_in_tensor = tf.nn.pool(gaussian_heatMat, window_shape=(3, 3), pooling_type='MAX', padding='SAME')
        tensor_peaks = tf.where(
            tf.equal(gaussian_heatMat, max_pooled_in_tensor), gaussian_heatMat, tf.zeros_like(gaussian_heatMat)
        )
        return tensor_peaks

    peak_tensor = get_peak(pafs_tensor)

    ## restore model parameters
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    # tl.files.load_and_assign_npz_dict(os.path.join(model_path, 'pose1.npz'), sess)

    ## get one example image with range 0~1
    im = tl.vis.read_image('data/test.jpeg')
    im = tl.prepro.imresize(im, [h, w])
    im = im / 255.  # input image 0~1

    ## inference
    # the 1st run takes extra time to warm up; uncomment the next line to exclude it from the timing
    # _, _ = sess.run([conf_tensor, pafs_tensor], feed_dict={x: [im]})
    st = time.time()
    conf, pafs, peak = sess.run([conf_tensor, pafs_tensor, peak_tensor], feed_dict={x: [im]})
    t = time.time() - st
    print("get maps took {}s i.e. {} FPS".format(t, 1. / t))
    # print(conf.shape, pafs.shape, peak.shape)

    ## get coordinate results from the maps, using conf and pafs from the network output plus the peaks
    # this part relies on the swig-built C++ post-processing (see README section 3)
    from inference.estimator import Human, BodyPart
    import pafprocess  # swig-generated module; the import path depends on where you built it

    def estimate_paf(peaks, heat_mat, paf_mat):
        pafprocess.process_paf(peaks, heat_mat, paf_mat)  # C++ part-association step

        humans = []
        for human_id in range(pafprocess.get_num_humans()):
            human = Human([])
            is_added = False

            for part_idx in range(18):
                c_idx = int(pafprocess.get_part_cid(human_id, part_idx))
                if c_idx < 0:
                    continue

                is_added = True
                # body-part coordinates are normalized to [0, 1] by the map size
                human.body_parts[part_idx] = BodyPart(
                    '%d-%d' % (human_id, part_idx), part_idx,
                    float(pafprocess.get_part_x(c_idx)) / heat_mat.shape[1],
                    float(pafprocess.get_part_y(c_idx)) / heat_mat.shape[0], pafprocess.get_part_score(c_idx)
                )

            if is_added:
                score = pafprocess.get_score(human_id)
                human.score = score
                humans.append(human)

        return humans

    humans = estimate_paf(peak[0], conf[0], pafs[0])
    print(humans)
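
    # a minimal sketch (assuming BodyPart exposes .x and .y, as set above):
    # recover pixel coordinates on the resized input image from the
    # normalized body-part positions
    for human in humans:
        for part_idx, part in human.body_parts.items():
            px, py = int(part.x * w), int(part.y * h)
            print("part %d at (%d, %d)" % (part_idx, px, py))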

    ## draw maps
    draw_intermedia_results([im], None, conf, None, pafs, None, 'inference')

    ## draw connection
1 change: 1 addition & 0 deletions example/openpose/inference/cmu_432x368_8.000000.json
@@ -0,0 +1 @@
[]
138 changes: 138 additions & 0 deletions example/openpose/inference/common.py
@@ -0,0 +1,138 @@
from enum import Enum

import tensorflow as tf
import cv2

regularizer_conv = 0.004
regularizer_dsconv = 0.0004
batchnorm_fused = True
activation_fn = tf.nn.relu


class CocoPart(Enum):
    Nose = 0
    Neck = 1
    RShoulder = 2
    RElbow = 3
    RWrist = 4
    LShoulder = 5
    LElbow = 6
    LWrist = 7
    RHip = 8
    RKnee = 9
    RAnkle = 10
    LHip = 11
    LKnee = 12
    LAnkle = 13
    REye = 14
    LEye = 15
    REar = 16
    LEar = 17
    Background = 18


class MPIIPart(Enum):
    RAnkle = 0
    RKnee = 1
    RHip = 2
    LHip = 3
    LKnee = 4
    LAnkle = 5
    RWrist = 6
    RElbow = 7
    RShoulder = 8
    LShoulder = 9
    LElbow = 10
    LWrist = 11
    Neck = 12
    Head = 13

    @staticmethod
    def from_coco(human):
        # t = {
        #     MPIIPart.RAnkle: CocoPart.RAnkle,
        #     MPIIPart.RKnee: CocoPart.RKnee,
        #     MPIIPart.RHip: CocoPart.RHip,
        #     MPIIPart.LHip: CocoPart.LHip,
        #     MPIIPart.LKnee: CocoPart.LKnee,
        #     MPIIPart.LAnkle: CocoPart.LAnkle,
        #     MPIIPart.RWrist: CocoPart.RWrist,
        #     MPIIPart.RElbow: CocoPart.RElbow,
        #     MPIIPart.RShoulder: CocoPart.RShoulder,
        #     MPIIPart.LShoulder: CocoPart.LShoulder,
        #     MPIIPart.LElbow: CocoPart.LElbow,
        #     MPIIPart.LWrist: CocoPart.LWrist,
        #     MPIIPart.Neck: CocoPart.Neck,
        #     MPIIPart.Nose: CocoPart.Nose,
        # }

        t = [
            (MPIIPart.Head, CocoPart.Nose),
            (MPIIPart.Neck, CocoPart.Neck),
            (MPIIPart.RShoulder, CocoPart.RShoulder),
            (MPIIPart.RElbow, CocoPart.RElbow),
            (MPIIPart.RWrist, CocoPart.RWrist),
            (MPIIPart.LShoulder, CocoPart.LShoulder),
            (MPIIPart.LElbow, CocoPart.LElbow),
            (MPIIPart.LWrist, CocoPart.LWrist),
            (MPIIPart.RHip, CocoPart.RHip),
            (MPIIPart.RKnee, CocoPart.RKnee),
            (MPIIPart.RAnkle, CocoPart.RAnkle),
            (MPIIPart.LHip, CocoPart.LHip),
            (MPIIPart.LKnee, CocoPart.LKnee),
            (MPIIPart.LAnkle, CocoPart.LAnkle),
        ]

        pose_2d_mpii = []
        visibility = []
        # for mpi, coco in t:
        for _, coco in t:
            if coco.value not in human.body_parts.keys():
                pose_2d_mpii.append((0, 0))
                visibility.append(False)
                continue
            pose_2d_mpii.append((human.body_parts[coco.value].x, human.body_parts[coco.value].y))
            visibility.append(True)
        return pose_2d_mpii, visibility


CocoPairs = [
    (1, 2), (1, 5), (2, 3), (3, 4), (5, 6), (6, 7), (1, 8), (8, 9), (9, 10), (1, 11), (11, 12), (12, 13), (1, 0),
    (0, 14), (14, 16), (0, 15), (15, 17), (2, 16), (5, 17)
]  # 19 limb connections, given as pairs of CocoPart indices
CocoPairsRender = CocoPairs[:-2]
# CocoPairsNetwork = [
#     (12, 13), (20, 21), (14, 15), (16, 17), (22, 23), (24, 25), (0, 1), (2, 3), (4, 5),
#     (6, 7), (8, 9), (10, 11), (28, 29), (30, 31), (34, 35), (32, 33), (36, 37), (18, 19), (26, 27)
# ]  # = 19

CocoColors = [
    [255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0], [0, 255, 85],
    [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255], [170, 0, 255], [255, 0, 255],
    [255, 0, 170], [255, 0, 85]
]


def read_imgfile(path, width=None, height=None):
    # NOTE: cv2.imread returns images in BGR channel order; convert with
    # cv2.cvtColor(img, cv2.COLOR_BGR2RGB) if the consumer expects RGB
    val_image = cv2.imread(path, cv2.IMREAD_COLOR)
    if width is not None and height is not None:
        val_image = cv2.resize(val_image, (width, height))
    return val_image


def get_sample_images(w, h):
    val_image = [
        read_imgfile('./images/p1.jpg', w, h),
        read_imgfile('./images/p2.jpg', w, h),
        read_imgfile('./images/p3.jpg', w, h),
        read_imgfile('./images/golf.jpg', w, h),
        read_imgfile('./images/hand1.jpg', w, h),
        read_imgfile('./images/hand2.jpg', w, h),
        read_imgfile('./images/apink1_crop.jpg', w, h),
        read_imgfile('./images/ski.jpg', w, h),
        read_imgfile('./images/apink2.jpg', w, h),
        read_imgfile('./images/apink3.jpg', w, h),
        read_imgfile('./images/handsup1.jpg', w, h),
        read_imgfile('./images/p3_dance.png', w, h),
    ]
    return val_image