We have created a dataset for 6DOF pose estimation featuring a glue gun and its usage for applying glue on top of wooden/cardboard shape primitives. The glueing is performed by multiple users, along different trajectories and with variable glueing styles. The image sequences are fully annotated with depth masks, the 6DOF pose of the tool, the coordinates of the end effector, etc. The spatial information about the glue gun was collected from an HTC Vive controller mounted on top of the tool.
The dataset is divided into four sub-datasets according to their main focus. Each sub-dataset can be downloaded and used independently, but they can also be combined to provide enough variability for training and testing. The sub-datasets are the following:
Download the dataset from Google Drive: Download
IMITROB dataset documentation (same as below): PDF
The complete list of ROS topics available:
/camera/depth/camera_info
/camera/depth/image_rect
/camera/rgb/camera_info
/camera/rgb/image_rect_color/compressed
/controller1
/tf
/tf_static
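As a minimal sketch (not part of the dataset tooling), the recorded topics can be read with the Python rosbag API; the bag filename below is a placeholder, and the compressed RGB images are decoded with OpenCV:

import cv2
import numpy as np
import rosbag

BAG_PATH = "example.bag"  # placeholder; substitute any bag from the dataset
RGB_TOPIC = "/camera/rgb/image_rect_color/compressed"

with rosbag.Bag(BAG_PATH) as bag:
    for _, msg, stamp in bag.read_messages(topics=[RGB_TOPIC]):
        # sensor_msgs/CompressedImage carries a JPEG/PNG byte buffer.
        frame = cv2.imdecode(np.frombuffer(msg.data, dtype=np.uint8),
                             cv2.IMREAD_COLOR)
        print(stamp.to_sec(), frame.shape)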
The complete list of ROS topics available:
/XTION2/depth/camera_info
/XTION2/depth/image
/XTION2/depth/image_rect
/XTION2/rgb/camera_info
/XTION2/rgb/image_rect_color/compressed
/XTION3/XTION3/depth/camera_info
/XTION3/XTION3/rgb/camera_info
/XTION3/XTION3/rgb/image_rect_color/compressed
/XTION3/camera/depth/image_rect
/controller1
/end_point
/tf
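The 6DOF pose of the tool is presumably available through the /tf transforms; the sketch below simply dumps all recorded transforms, since the exact frame name of the glue gun is not specified here:

import rosbag

BAG_PATH = "example.bag"  # placeholder filename

with rosbag.Bag(BAG_PATH) as bag:
    for _, msg, stamp in bag.read_messages(topics=["/tf"]):
        # tf2_msgs/TFMessage holds a list of geometry_msgs/TransformStamped.
        for tf in msg.transforms:
            t, q = tf.transform.translation, tf.transform.rotation
            # Inspect child_frame_id to find the frame of the tracked tool.
            print(stamp.to_sec(), tf.child_frame_id,
                  (t.x, t.y, t.z), (q.x, q.y, q.z, q.w))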
There is only one user in this dataset, who performs glueing on multiple shapes made of cardboard or wood. The variability covers 5 different glueing styles, 3 different trajectories of glue application, and 6 different shapes (see the pictures for a better idea). The dataset thus consists of 5 x 3 x 6 = 90 ROS bags, each containing one unique combination. The scene is recorded by a single Asus XTION camera mounted above.
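As a small illustration of this structure, the 90 combinations can be enumerated as below; the labels are placeholders, not the dataset's actual naming scheme:

from itertools import product

# Placeholder labels; the dataset's own naming of styles, trajectories,
# and shapes may differ.
styles = [f"style{i}" for i in range(1, 6)]        # 5 glueing styles
trajectories = [f"traj{i}" for i in range(1, 4)]   # 3 application trajectories
shapes = [f"shape{i}" for i in range(1, 7)]        # 6 shape primitives

combinations = list(product(styles, trajectories, shapes))
print(len(combinations))  # 5 * 3 * 6 = 90, one bag per combination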
The variability of the glueing styles, shapes, and trajectories is as follows:
The complete list of ROS topics available:
/XTION3/camera/depth/camera_info
/XTION3/camera/depth/image_rect
/XTION3/camera/rgb/camera_info
/XTION3/camera/rgb/image_rect_color/compressed
/controller1
/end_point
/tf
/vive/joy0
...
/vive/joy15
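Because the message types of topics such as /end_point and the /vive/joy* topics are not documented here, a quick way to inspect what any bag contains is sketched below:

import rosbag

BAG_PATH = "example.bag"  # placeholder filename

with rosbag.Bag(BAG_PATH) as bag:
    info = bag.get_type_and_topic_info()
    for topic, meta in sorted(info.topics.items()):
        # meta reports the message type, message count, and frequency.
        print(f"{topic}: {meta.msg_type}, {meta.message_count} messages")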
Since the overall size of the dataset is very large (more than 150 GB), we provide several processed bag files as examples, along with code to generate the same annotations for any bag files of your choice. The annotations include RGB images, depth images, the 6DOF pose, and pickle files with bounding-box annotations that can be used for training.
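Since the exact schema of the pickle annotations is not specified here, a minimal sketch for loading and inspecting one (the filename is hypothetical) could look like this:

import pickle

# Hypothetical filename; substitute one of the provided annotation files.
with open("annotations.pkl", "rb") as f:
    annotations = pickle.load(f)

# Inspect the structure (e.g. bounding boxes, 6DOF pose) before writing
# a training loader around it.
print(type(annotations))
if isinstance(annotations, dict):
    print(list(annotations.keys()))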
This dataset was created for the research paper Specifying Dual-Arm Robot Planning Problems Through Natural Language and Demonstration. It includes four different showcases in which a human demonstrator uses the glue gun to perform a glueing task for furniture assembly. Since most of the bag files (except for the sentence-wise bag files) contain the 6DOF pose of the glue gun as well as RGBD images, they can be used for additional training or testing of motion tracking. The showcases also come with natural language commands, which may serve as an additional source of information.
See the Planning section for more information.