
Multiple Pose Human Body Database (LSP/MPII-MPHB)


LSP/MPII-MPHB Database [1]

Overview:

Multi-pose human body detection has many important practical applications. In human gesture estimation, for example, one often needs to detect the position of the human body first to provide a reference location for other body parts, such as the head, hands, and feet. One of the most studied related topics is pedestrian detection. Pedestrian detection, however, is mainly concerned with the human body in upright positions, whereas people in general are not always upright: they can be bending, sitting, lying, or in other poses (see Fig. 1 for examples of the human body in different poses). This makes it necessary to detect the human body under arbitrary poses.

Fig. 1. Illustration of human body under different poses. (Images from MPII Human Pose [3] with the bounding boxes annotated in this work.)

 

For the study of object detection, many public datasets are available to researchers, such as Pascal VOC 2007, VOC 2010, ILSVRC 2010, and ILSVRC 2012. However, these treat the human being as only one among many object categories rather than focusing on it. In fact, no large-scale dataset tailored to the task of human body detection currently exists. Fortunately, many datasets have been built for pose estimation, such as FLIC, LSP [2], and MPII Human Pose [3]. Although these datasets contain many annotations of the locations of different body parts, they all lack bounding box annotations for the whole human body, and such bounding boxes, which are needed for model training and performance evaluation, cannot be reliably derived from the available body part annotations. Therefore, these datasets cannot be used directly for human body detection. To this end, we annotate a new dataset named LSP/MPII-MPHB (Multiple Poses Human Body) for human body detection by selecting over 26K challenging images from LSP and MPII Human Pose and annotating human body bounding boxes on each of the selected images.

The resulting dataset contains 26,675 images and 29,732 human bodies. Each image contains at least one human body, and some contain multiple people. Among these images, 2,000 are from LSP and 24,675 are from MPII Human Pose. We compute the size ratio of each ground-truth bounding box to the whole image and plot the frequency histogram, as shown in Fig. 2. One can see that for almost 70% of the ground-truth boxes the size ratio is less than 10%, indicating that detecting human beings in the MPHB dataset is challenging.

Fig. 2. Distribution of the size ratio of human beings in the MPHB dataset.
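The size-ratio statistic behind Fig. 2 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each box is given as (x_min, y_min, x_max, y_max) in pixels, which may differ from the actual layout of the MPHB-label files, and the function names and sample boxes are hypothetical.

```python
def size_ratio(box, image_width, image_height):
    """Area of a box divided by the area of the whole image.

    Assumes box = (x_min, y_min, x_max, y_max) in pixels (an assumption,
    not necessarily the format used in the MPHB-label files).
    """
    x_min, y_min, x_max, y_max = box
    box_area = max(0.0, x_max - x_min) * max(0.0, y_max - y_min)
    return box_area / float(image_width * image_height)


def ratio_histogram(ratios, n_bins=10):
    """Fraction of boxes whose size ratio falls into each of n_bins equal bins on [0, 1]."""
    if not ratios:
        raise ValueError("ratios must not be empty")
    counts = [0] * n_bins
    for r in ratios:
        # A ratio of exactly 1.0 is placed in the last bin.
        index = min(int(r * n_bins), n_bins - 1)
        counts[index] += 1
    return [c / len(ratios) for c in counts]


# Made-up boxes in a 640x480 image, for illustration only.
image_width, image_height = 640, 480
boxes = [(100, 100, 200, 300), (0, 0, 320, 480), (10, 10, 50, 90)]
ratios = [size_ratio(b, image_width, image_height) for b in boxes]
histogram = ratio_histogram(ratios, n_bins=10)
print("fraction of boxes with size ratio below 10%:", histogram[0])
```

With ten bins, the first histogram entry corresponds to the "size ratio less than 10%" figure quoted above.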

 

Human bodies in LSP/MPII-MPHB appear in various poses. They can be roughly divided into six categories: bent, kneeling, lying, partial (listed as "Occlusion" in Table 1), sitting, and upright. Fig. 3 illustrates each of these. Table 1 gives the criteria we use to partition the images and the number of images belonging to each category. The table shows that the pose categories are unbalanced: the predominant category is "bent", while "kneeling" and "lying" are relatively rare.

Fig. 3. Example images for the six human body pose categories.

Pose Type    Description                                                          Number of images
Bent         at least one body part is bent (e.g., stooping or horse stance)      10,229
Kneeling     at least one of the two knees touching something                     1,053
Lying        sleeping or swimming, in a horizontal position and on the back       1,123
Occlusion    only part of the body                                                4,040
Sitting      buttocks touching something                                          5,739
Upright      standing without any bend                                            4,492

Table 1: Description and number of images for each of the six human body pose categories.
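To make the imbalance concrete, the share of each category can be computed from the counts in Table 1. The snippet below is only an illustrative sketch: the dictionary and function names are hypothetical, and the shares are taken relative to the sum of the listed counts rather than to the reported image total.

```python
# Image counts per pose category, copied from Table 1.
pose_counts = {
    "Bent": 10229,
    "Kneeling": 1053,
    "Lying": 1123,
    "Occlusion": 4040,
    "Sitting": 5739,
    "Upright": 4492,
}


def category_shares(counts):
    """Return each category's fraction of the summed counts."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


shares = category_shares(pose_counts)
# Print categories from most to least frequent.
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<10}{share:6.1%}")
```

Shares like these can be used, for instance, to weight categories when reporting per-pose results.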

Evaluation Protocols:

We design an evaluation protocol by dividing these images into three partitions. The numbers of images (human bodies) are 8,385 (9,732), 8,110 (8,233), and 10,180 (11,767) for the training, validation, and test sets, respectively. For performance evaluation, a detector outputs a list of bounding boxes with associated confidence scores (ranks); detections are assigned to ground-truth objects and judged to be true or false positives by measuring bounding box overlap. Following common practice in this field (cf. Pascal VOC [4] and ILSVRC [5]), at test time we calculate the IoU (Intersection-over-Union) between each detection and the ground truth in each test image, and detections whose IoU with the ground truth is more than 0.5 are considered positive. The final performance is summarized using the average precision (AP) metric.
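The protocol above can be sketched in a few lines. This is not the evaluation code distributed with the dataset. It assumes boxes of the form (x_min, y_min, x_max, y_max), at most one detection matched to each ground-truth box, and AP computed as the area under the monotone precision-recall envelope (one common Pascal VOC-style variant; the text does not say which variant is used). All names and the toy data are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class.

    detections:   list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    # Visit detections from highest to lowest confidence and mark hits.
    hits = []
    for image_id, _score, box in sorted(detections, key=lambda d: -d[1]):
        gt_boxes = ground_truth.get(image_id, [])
        overlaps = [iou(box, gt) for gt in gt_boxes]
        best = max(range(len(gt_boxes)), key=overlaps.__getitem__, default=None)
        if (best is not None and overlaps[best] > iou_threshold
                and not matched[image_id][best]):
            matched[image_id][best] = True
            hits.append(1)   # true positive
        else:
            hits.append(0)   # false positive (low overlap or duplicate)

    # Precision and recall after each ranked detection.
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / total_gt)

    # Make precision non-increasing in recall, then integrate.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


# Toy example with made-up boxes: two images, one person each.
ground_truth = {"a": [(0, 0, 10, 10)], "b": [(0, 0, 10, 10)]}
detections = [
    ("a", 0.9, (0, 0, 10, 10)),    # exact match
    ("b", 0.8, (20, 20, 30, 30)),  # no overlap
    ("b", 0.7, (0, 0, 10, 8)),     # IoU 0.8 with the ground truth
]
ap = average_precision(detections, ground_truth)
print("AP:", ap)
```

The strict comparison `> iou_threshold` mirrors the "more than 0.5" wording above; other toolkits may use `>=`.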

 

Download

MPHB-label(.mat): The bounding box annotations and the source of each human body in the LSP/MPII-MPHB dataset (1.11MB). A .txt version (555KB) is also available: MPHB-label(.txt)

MPHB-image: All images in the LSP/MPII-MPHB dataset (2.38GB). For convenience, we host a copy of all the annotated images for download, but note that these images are collected from the LSP and MPII datasets.

CODE(.zip)

PoseLabel: The pose type annotation of each image in LSP/MPII-MPHB.

 

 

Other Materials Related to [1] 

LSP: Leeds Sports Pose Dataset

MPII: MPII Human Pose Dataset


References:

[1] Y. Cai and X. Tan, "Weakly supervised human body detection under arbitrary poses," in International Conference on Image Processing. IEEE, 2016.

[2] S. Johnson and M. Everingham, "Clustered pose and nonlinear appearance models for human pose estimation," in BMVC, vol. 2, no. 4, 2010, p. 5.

[3] M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, "2D human pose estimation: New benchmark and state of the art analysis," in Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014, pp. 3686-3693.

[4] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, "The Pascal visual object classes (VOC) challenge," International Journal of Computer Vision, vol. 88, no. 2, pp. 303-338, 2010.

[5] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein et al., "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.

 


Copyright and disclaimer:

Copyright 2016, Xiaoyang Tan

The dataset is provided to researchers for research purposes only and not for any commercial use. Please do not release the data or redistribute this link to anyone else without our permission. Contact {x.tan}@nuaa.edu.cn if you have any questions.

If you use this dataset, please cite it as:

Y. Cai and X. Tan, "Weakly supervised human body detection under arbitrary poses," in International Conference on Image Processing. IEEE, 2016.
