first commit

vt-vl-lab · Aug 31, 2018 · dc95e33
commit dc95e33
Showing 56 changed files with 7,683 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1 @@
+*.pyc
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018 Virginia Tech Vision and Learning Lab
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
@@ -0,0 +1,129 @@
+# iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection 
+
+Official TensorFlow implementation for [iCAN: Instance-Centric Attention Network 
+for Human-Object Interaction Detection](https://www.dropbox.com/sh/7yx3slrg8x10zdu/AAB1PYH1M0IdEPeKhS9wZ7mba/0017.pdf?dl=1).
+
+See the [project page](https://gaochen315.github.io/iCAN/) for more details. Please contact Chen Gao (chengao@vt.edu) if you have any questions.
+
+<img src='misc/HOI.gif'>
+
+## Prerequisites
+
+This codebase was developed and tested with Python 2.7, TensorFlow 1.1.0 or 1.2.0, CUDA 8.0, and Ubuntu 16.04.
+
+
+## Installation
+1. Clone the repository. 
+    ```Shell
+    git clone https://github.com/vt-vl-lab/iCAN.git
+    ```
+2. Download the V-COCO and HICO-DET datasets. Set up the V-COCO and COCO APIs and the HICO-DET evaluation code.
+    ```Shell
+    chmod +x ./misc/download_dataset.sh 
+    ./misc/download_dataset.sh 
+    
+    # Assume you cloned the repository to `iCAN_DIR`.
+    # If you have already downloaded the V-COCO or HICO-DET dataset somewhere else, you can create a symlink instead:
+    # ln -s /path/to/your/v-coco/folder Data/
+    # ln -s /path/to/your/hico-det/folder Data/
+    ```
+
+## Evaluate V-COCO and HICO-DET detection results
+1. Download detection results
+    ```Shell
+    chmod +x ./misc/download_detection_results.sh 
+    ./misc/download_detection_results.sh
+    ```
+2. Evaluate V-COCO detection results using iCAN
+    ```Shell
+    python tools/Diagnose_VCOCO.py eval Results/300000_iCAN_ResNet50_VCOCO.pkl
+    ```
+3. Evaluate V-COCO detection results using iCAN (Early fusion)
+    ```Shell
+    python tools/Diagnose_VCOCO.py eval Results/300000_iCAN_ResNet50_VCOCO_Early.pkl
+    ```
+4. Evaluate HICO-DET detection results using iCAN
+    ```Shell
+    cd Data/ho-rcnn
+    matlab -r "Generate_detection; quit"
+    cd ../../
+    ```
+    Here we evaluate our best detection results under ```Results/HICO_DET/1800000_iCAN_ResNet50_HICO```. If you want to evaluate a different detection result, please specify the filename in ```Data/ho-rcnn/Generate_detection.m``` accordingly.
+
+## Error diagnosis on V-COCO
+1. Diagnose V-COCO detection results using iCAN
+    ```Shell
+    python tools/Diagnose_VCOCO.py diagnose Results/300000_iCAN_ResNet50_VCOCO.pkl
+    ```
+2. Diagnose V-COCO detection results using iCAN (Early fusion)
+    ```Shell
+    python tools/Diagnose_VCOCO.py diagnose Results/300000_iCAN_ResNet50_VCOCO_Early.pkl
+    ```
+
+## Training
+1. Download COCO pre-trained weights and training data
+    ```Shell
+    chmod +x ./misc/download_training_data.sh 
+    ./misc/download_training_data.sh
+    ```
+2. Train an iCAN on V-COCO
+    ```Shell
+    python tools/Train_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO --num_iteration 300000
+    ```
+3. Train an iCAN (Early fusion) on V-COCO
+    ```Shell
+    python tools/Train_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO_Early --num_iteration 300000
+    ```
+4. Train an iCAN on HICO-DET
+    ```Shell
+    python tools/Train_ResNet_HICO.py --num_iteration 1800000
+    ```
+
+## Testing
+1. Test an iCAN on V-COCO
+    ```Shell
+    python tools/Test_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO --num_iteration 300000
+    ```
+2. Test an iCAN (Early fusion) on V-COCO
+    ```Shell
+    python tools/Test_ResNet_VCOCO.py --model iCAN_ResNet50_VCOCO_Early --num_iteration 300000
+    ```
+3. Test an iCAN on HICO-DET
+    ```Shell
+    python tools/Test_ResNet_HICO.py --num_iteration 1800000
+    ```
+
+## Visualizing V-COCO detections
+Check ```tools/Visualization.ipynb``` to see how to visualize the detection results.
+
+## Demo/Test on your own images
+0. To get the best performance, we use [Detectron](https://github.com/facebookresearch/Detectron) as our object detector. For the purposes of a simple demo, we use [tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn) in this section instead.
+1. Clone and set up the tf-faster-rcnn repository.
+    ```Shell
+    cd $iCAN_DIR
+    chmod +x ./misc/setup_demo.sh 
+    ./misc/setup_demo.sh
+    ```
+2. Put your own images in the ```demo/``` folder.
+3. Detect all objects
+    ```Shell
+    # images are saved in $iCAN_DIR/demo/
+    python ../tf-faster-rcnn/tools/Object_Detector.py --img_dir demo/ --img_format png --Demo_RCNN demo/Object_Detection.pkl
+    ``` 
+4. Detect all HOIs
+    ```Shell
+    python tools/Demo.py --img_dir demo/ --Demo_RCNN demo/Object_Detection.pkl --HOI_Detection demo/HOI_Detection.pkl
+    ```
+5. Check ```tools/Demo.ipynb``` to visualize the detection results.
+
+## Citation
+If you find this code useful for your research, please consider citing the following paper:
+
+    @inproceedings{gao2018ican,
+    author    = {Gao, Chen and Zou, Yuliang and Huang, Jia-Bin}, 
+    title     = {iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection}, 
+    booktitle = {British Machine Vision Conference},
+    year      = {2018}
+    }
+
+## Acknowledgement
+The code is built upon [tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn). We thank [Jinwoo Choi](https://github.com/jinwchoi) for the code review.
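
Review note on the README: the evaluation and demo steps pass `.pkl` files between scripts, and the README states the code was tested with Python 2.7. If those pickles were written under Python 2 and contain NumPy arrays or byte strings (an assumption, the README does not describe their contents), loading them from Python 3 usually needs an explicit `encoding`. The helper name `load_result_pickle` below is hypothetical and not part of this repository; it is only a minimal sketch of that loading step.

```python
import pickle
import sys


def load_result_pickle(path):
    """Load a pickle file that may have been written by Python 2.

    Hypothetical helper, not part of the iCAN codebase.
    """
    with open(path, "rb") as f:
        if sys.version_info[0] >= 3:
            # 'latin1' lets Python 3 decode Python 2 `str` objects,
            # including the raw buffers of pickled NumPy arrays.
            return pickle.load(f, encoding="latin1")
        # Python 2's pickle.load takes no encoding argument.
        return pickle.load(f)
```

For example, `load_result_pickle("Results/300000_iCAN_ResNet50_VCOCO.pkl")` would return whatever object the test script dumped; inspect its type before relying on any particular layout.
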
diff --git a/demo/Djokovic_0001.png b/demo/Djokovic_0001.png
diff --git a/demo/Djokovic_0002.png b/demo/Djokovic_0002.png
diff --git a/demo/Djokovic_0003.png b/demo/Djokovic_0003.png
diff --git a/demo/Djokovic_0004.png b/demo/Djokovic_0004.png
diff --git a/demo/Djokovic_0005.png b/demo/Djokovic_0005.png
diff --git a/lib/models/__init__.py b/lib/models/__init__.py
diff --git a/lib/models/test_HICO.py b/lib/models/test_HICO.py
@@ -0,0 +1,94 @@
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+from ult.config import cfg
+from ult.timer import Timer
+from ult.ult import Get_next_sp
+
+import cv2
+import pickle
+import numpy as np
+import os
+import sys
+import glob
+import time
+# import ipdb  # debugging aid only; uncomment if needed
+
+import tensorflow as tf
+from tensorflow.python import pywrap_tensorflow
+
+def get_blob(image_id):
+    im_file  = cfg.DATA_DIR + '/' + 'hico_20160224_det/images/test2015/HICO_test2015_' + (str(image_id)).zfill(8) + '.jpg'
+    im       = cv2.imread(im_file)
+    im_orig  = im.astype(np.float32, copy=True)
+    im_orig -= cfg.PIXEL_MEANS
+    im_shape = im_orig.shape
+    im_orig  = im_orig.reshape(1, im_shape[0], im_shape[1], 3)
+    return im_orig, im_shape
+
+def im_detect(sess, net, image_id, Test_RCNN, object_thres, human_thres, detection):
+
+    # save image information
+    This_image = []
+
+    im_orig, im_shape = get_blob(image_id)
+
+    blobs = {}
+    blobs['H_num']       = 1
+
+    for Human_out in Test_RCNN[image_id]:
+        if (np.max(Human_out[5]) > human_thres) and (Human_out[1] == 'Human'): # This is a valid human
+
+            blobs['H_boxes'] = np.array([0, Human_out[2][0],  Human_out[2][1],  Human_out[2][2],  Human_out[2][3]]).reshape(1,5)
+
+            for Object in Test_RCNN[image_id]:
+                if (np.max(Object[5]) > object_thres) and not (np.all(Object[2] == Human_out[2])): # This is a valid object
+
+                    blobs['O_boxes'] = np.array([0, Object[2][0],  Object[2][1],  Object[2][2],  Object[2][3]]).reshape(1,5)
+                    blobs['sp']      = Get_next_sp(Human_out[2], Object[2]).reshape(1, 64, 64, 2)
+
+
+                    prediction_HO  = net.test_image_HO(sess, im_orig, blobs)
+
+                    temp = []
+                    temp.append(Human_out[2])           # Human box
+                    temp.append(Object[2])              # Object box
+                    temp.append(Object[4])              # Object class
+                    temp.append(prediction_HO[0][0])     # Score
+                    temp.append(Human_out[5])           # Human score
+                    temp.append(Object[5])              # Object score
+                    This_image.append(temp)
+
+    detection[image_id] = This_image
+
+
+def test_net(sess, net, Test_RCNN, output_dir, object_thres, human_thres):
+
+
+    np.random.seed(cfg.RNG_SEED)
+    detection = {}
+    count = 0
+
+    # timers
+    _t = {'im_detect' : Timer(), 'misc' : Timer()}
+
+    for line in glob.iglob(cfg.DATA_DIR + '/' + 'hico_20160224_det/images/test2015/*.jpg'):
+
+        _t['im_detect'].tic()
+
+        image_id   = int(line[-9:-4])
+
+        im_detect(sess, net, image_id, Test_RCNN, object_thres, human_thres, detection)
+
+        _t['im_detect'].toc()
+
+        print('im_detect: {:d}/{:d} {:.3f}s'.format(count + 1, 9658, _t['im_detect'].average_time))
+        count += 1
+
+    # NOTE: despite its name, output_dir is used as the path of the output pickle file
+    with open(output_dir, "wb") as f:
+        pickle.dump(detection, f)
+
+
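
Review note on `im_detect`: the nested loops pair every confident human detection with every other confident detection, which may itself be another human, and skip only the case where both boxes are identical. The sketch below isolates that selection logic without TensorFlow so it can be checked on toy data. The function name `candidate_pairs` is hypothetical. The record layout (index 1 = class name, 2 = box, 4 = class id, 5 = score) is inferred from how `im_detect` indexes `Test_RCNN`. Indices 0 and 3 are unused placeholders here, and the class id 38 and all boxes are made-up sample values.

```python
import numpy as np


def candidate_pairs(detections, human_thres, object_thres):
    """Yield (human, object) pairs using the same filters as im_detect."""
    for human in detections:
        # A valid human is labeled 'Human' and scores above human_thres.
        if human[1] != 'Human' or np.max(human[5]) <= human_thres:
            continue
        for obj in detections:
            # Any detection above object_thres counts as an object,
            # including other humans.
            if np.max(obj[5]) <= object_thres:
                continue
            # Never pair a box with itself.
            if np.all(obj[2] == human[2]):
                continue
            yield human, obj


# Toy detections: [placeholder, name, box, placeholder, class id, score]
dets = [
    [0, 'Human',  np.array([10., 10., 50., 90.]), None, 1,  0.95],
    [0, 'Human',  np.array([60., 12., 95., 88.]), None, 1,  0.40],
    [0, 'Object', np.array([40., 50., 70., 80.]), None, 38, 0.80],
]
pairs = list(candidate_pairs(dets, human_thres=0.8, object_thres=0.3))
print(len(pairs))
```

With these thresholds only the first human qualifies as a human, while the second human still qualifies as an object because its score exceeds `object_thres`. In `im_detect`, each such pair then costs one `net.test_image_HO` call.
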