Below are links to download the three gesture datasets from the paper, "Hand and Body Association in Crowded Environments for Human-Robot Interaction", published at the 2013 International Conference on Robotics and Automation (ICRA)[1]. Robot Operating System (ROS)[2] was used to generate each dataset as a compressed rosbag. The custom ROS message used to encode the ground truth is provided in the "custom message definitions" file.
The datasets were recorded using a Microsoft Xbox Kinect and the "openni_launch" ROS package[3]. Each frame consists of a 640x480 Bayered RGB image with a registered 640x480 depth image. To play back the registered RGB-D data, set the "load_driver" ROS parameter to false and run the "openni_launch" package, as described in [3].

Each Bayered RGB frame has an associated ground truth message. To read these messages, place the msg folder, contained in the "custom message definitions" file, in the root of your ROS package. Each ground truth message contains the 2D location of a gesturing person (if present), the 2D location of their hand, the gesture they are performing, and the current gesture phase. Locations are pixel coordinates referenced from the top-left of the image, so the "z" coordinate of ground truth hands and bodies is always 0.

As described in the paper, four gestures are used: a wave to grab the robot's attention, a subtle push to have the robot leave, a subtle "follow me" motion, and a raised hand indicating that the robot should stop. These gestures were performed by multiple people from a range of backgrounds.
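For reference, the sketch below shows one way to read the frames and ground truth with rospy during playback. The bag filename, topic names, and the ground truth message package, type, and field names are assumptions for illustration only; check "rosbag info" on each bag and the msg folder in the "custom message definitions" file for the actual names.

    #!/usr/bin/env python
    # Minimal playback sketch. Run in separate terminals first:
    #   roslaunch openni_launch openni.launch load_driver:=false
    #   rosbag play --clock first_dataset.bag   # bag filename is an assumption
    import rospy
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge

    # Hypothetical import; replace with the actual package and message name
    # from the "custom message definitions" file:
    # from gesture_msgs.msg import GroundTruth

    bridge = CvBridge()

    def rgb_callback(msg):
        # cv_bridge can convert (debayer) the raw frame to a color encoding.
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        rospy.loginfo("RGB frame %dx%d", msg.width, msg.height)

    def ground_truth_callback(msg):
        # Field names here are guesses; locations are pixel coordinates
        # from the top-left of the image, with z always 0.
        rospy.loginfo("gesture=%s phase=%s body=(%d, %d) hand=(%d, %d)",
                      msg.gesture, msg.phase,
                      msg.body.x, msg.body.y, msg.hand.x, msg.hand.y)

    if __name__ == "__main__":
        rospy.init_node("gesture_dataset_reader")
        rospy.Subscriber("/camera/rgb/image_raw", Image, rgb_callback)
        # Uncomment once the custom message is built in your workspace:
        # rospy.Subscriber("/ground_truth", GroundTruth, ground_truth_callback)
        rospy.spin()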
[1]: 2013 International Conference on Robotics and Automation
[2]: Robot Operating System
[3]: openni_launch
Hand and Body Association in Crowded Environments for Human-Robot Interaction
Video Showing Clips from Paper Results
First Dataset
Second Dataset
Third Dataset
Custom Message Definitions