

Output from the RGB camera (left), preprocessed depth (center) and a set of labels (right) for the image.

The labeled dataset is a subset of the Raw Dataset. It is comprised of pairs of RGB and Depth frames that have been synchronized and annotated with dense labels for every image. In addition to the projected depth maps, we have included a set of preprocessed depth maps whose missing values have been filled in using the colorization scheme of Levin et al. Unlike the Raw dataset, the labeled dataset is provided as a Matlab .mat file containing the following variables:

accelData – Nx4 matrix of accelerometer values indicating when each frame was taken. The columns contain the roll, yaw, pitch and tilt angle of the device.

depths – HxWxN matrix of in-painted depth maps, where H and W are the height and width, respectively, and N is the number of images. The values of the depth elements are in meters.

images – HxWx3xN matrix of RGB images, where H and W are the height and width, respectively, and N is the number of images.

instances – HxWxN matrix of instance maps. Use get_instance_masks.m in the Toolbox to recover masks for each object instance in a scene.

labels – HxWxN matrix of object label masks, where H and W are the height and width, respectively, and N is the number of images. The labels range from 1..C, where C is the total number of classes. If a pixel's label value is 0, then that pixel is 'unlabeled'.

names – Cx1 cell array of the English names of each class.

namesToIds – map from English label names to class IDs (with C key-value pairs).

rawDepths – HxWxN matrix of raw depth maps, where H and W are the height and width, respectively, and N is the number of images. These depth maps capture the depth images after they have been projected onto the RGB image plane but before the missing depth values have been filled in.
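The labels/instances layout and the 0 = 'unlabeled' convention described above can be sketched in a few lines of numpy. This is only an illustrative analogue of the kind of per-instance masks that the Toolbox's get_instance_masks.m recovers, not its actual MATLAB implementation; the function name and the tiny synthetic frame are assumptions made for the example.

```python
import numpy as np

def instance_masks(labels, instances):
    """Recover one boolean mask per (class, instance) pair from a single
    frame's label map and instance map. Illustrative numpy analogue of the
    Toolbox's get_instance_masks.m (names here are hypothetical, not the
    Toolbox API). Pixels with label 0 are 'unlabeled' and are skipped."""
    masks = {}
    for cls in np.unique(labels):
        if cls == 0:  # 0 means 'unlabeled'
            continue
        # Instance numbers restart per class, so key masks by (class, instance).
        for inst in np.unique(instances[labels == cls]):
            masks[(int(cls), int(inst))] = (labels == cls) & (instances == inst)
    return masks

# Tiny synthetic frame: two cups (class 1, instances 1 and 2) and a
# table (class 2, instance 1), with a few unlabeled (0) pixels.
labels = np.array([[1, 1, 0],
                   [1, 2, 2],
                   [0, 2, 2]])
instances = np.array([[1, 2, 0],
                      [2, 1, 1],
                      [0, 1, 1]])

masks = instance_masks(labels, instances)
print(sorted(masks))  # [(1, 1), (1, 2), (2, 1)]
```

Each returned mask is an HxW boolean array selecting exactly one object instance, which is the form most segmentation pipelines expect.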
The dataset has three components:

Labeled: A subset of the video data accompanied by dense multi-class labels. This data has also been preprocessed to fill in missing depth values. Each object is labeled with a class and an instance number (cup1, cup2, cup3, etc.).

Raw: The raw RGB, depth and accelerometer data as provided by the Kinect.

Toolbox: Useful functions for manipulating the data and labels.

The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. It contains 1449 densely labeled pairs of aligned RGB and depth images.

Samples of the RGB image, the raw depth image, and the class labels from the dataset.

If you use the dataset, please cite the following work: Indoor Segmentation and Support Inference from RGBD Images.
