How to Analyze the COCO Pose Estimation Dataset with Python

2021-01-16 09:50

    # Iterate over all images
    for img_id, img_fname, w, h, meta in get_meta(coco):
        images_data.append({
            'image_id': int(img_id),
            'path': img_fname,
            'width': int(w),
            'height': int(h)
        })

        # Iterate over all annotation records (one per person) of this image
        for m in meta:
            persons_data.append({
                'image_id': m['image_id'],
                'is_crowd': m['iscrowd'],
                'bbox': m['bbox'],
                'area': m['area'],
                'num_keypoints': m['num_keypoints'],
                'keypoints': m['keypoints'],
            })

    # Create the dataframe with image paths
    images_df = pd.DataFrame(images_data)
    images_df.set_index('image_id', inplace=True)

    # Create the dataframe with person-related data
    persons_df = pd.DataFrame(persons_data)
    persons_df.set_index('image_id', inplace=True)
    return images_df, persons_df
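We use the get_meta function to build two dataframes: one for image paths and one for person metadata. An image may contain several people, so this is a one-to-many relationship. In the next step we merge the two tables and combine the training and validation sets. The article describes the merge as a left join; note that pd.merge performs an inner join unless how='left' is passed, so only images that have person records are kept. We also add a new column, source, set to 0 for the training set and 1 for the validation set. This information is necessary because we need to know which folder to search for an image. As you know, the images live in two folders: train2017/ and val2017/

images_df, persons_df = convert_to_df(train_coco)
train_coco_df = pd.merge(images_df, persons_df, right_index=True, left_index=True)
train_coco_df['source'] = 0  # 0 = training set (train2017/)

images_df, persons_df = convert_to_df(val_coco)
val_coco_df = pd.merge(images_df, persons_df, right_index=True, left_index=True)
val_coco_df['source'] = 1  # 1 = validation set (val2017/)

coco_df = pd.concat([train_coco_df, val_coco_df], ignore_index=True)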
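To see what the index-on-index merge and the source flag do, here is a minimal sketch on toy data. The frames, file names, and the helper merge_split are made up for illustration and are not part of the article's code; the point is only that one image row fans out into one row per person, and that an image without person records disappears under the default inner join.

```python
import pandas as pd


def merge_split(images_df, persons_df, source):
    """Hypothetical helper: join images with persons on the image_id index."""
    merged = pd.merge(images_df, persons_df, left_index=True, right_index=True)
    merged['source'] = source  # 0 = train, 1 = val, as in the article
    return merged


# Toy stand-ins for the output of convert_to_df (assumed shapes, not real COCO data)
train_images = pd.DataFrame(
    {'image_id': [1, 2], 'path': ['a.jpg', 'b.jpg']}).set_index('image_id')
train_persons = pd.DataFrame(
    {'image_id': [1, 1], 'num_keypoints': [17, 0]}).set_index('image_id')
val_images = pd.DataFrame(
    {'image_id': [3], 'path': ['c.jpg']}).set_index('image_id')
val_persons = pd.DataFrame(
    {'image_id': [3], 'num_keypoints': [5]}).set_index('image_id')

toy_df = pd.concat(
    [merge_split(train_images, train_persons, 0),
     merge_split(val_images, val_persons, 1)],
    ignore_index=True)

# a.jpg appears twice (two people); b.jpg has no person rows and is dropped
print(toy_df)
```

If images without any person record should be kept, passing how='left' to pd.merge would retain them with NaN in the person columns.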
Finally, we have a single dataframe that represents the whole COCO dataset.

How many people are in an image

Now we can run the first analysis. The COCO dataset contains images with several people, and we want to know how many images contain only one person. The code is as follows:

# Count annotations
annotated_persons_df = coco_df[coco_df['is_crowd'] == 0]
crowd_df = coco_df[coco_df['is_crowd'] == 1]
print("Number of people in total: " + str(len(annotated_persons_df)))
print("Number of crowd annotations: " + str(len(crowd_df)))

# Number of annotated (non-crowd) people per image path
persons_in_img_df = pd.DataFrame({
    'cnt': annotated_persons_df['path'].value_counts()
})
persons_in_img_df.reset_index(level=0, inplace=True)
# Older pandas versions name the reset column 'index'; newer ones already call it 'path'
persons_in_img_df.rename(columns={'index': 'path'}, inplace=True)

# Group by cnt to get, for each number of annotated people, the number of images
persons_in_img_df = persons_in_img_df.groupby(['cnt']).count()

# Extract arrays
x_occurences = persons_in_img_df.index.values
y_images = persons_in_img_df['path'].values

# Plot
plt.bar(x_occurences, y_images)
plt.title('People on a single image')
plt.xticks(x_occurences, x_occurences)
plt.xlabel('Number of people in a single image')
plt.ylabel('Number of images')
plt.show()
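Resulting chart:

As you can see, most COCO images contain a single person. But there are also quite a few images with 13 people. Let's look at a few examples:

Well, there is even one image with 19 annotations (non-crowd):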

Shouldn't the top area of this image be labeled as a crowd? Yes, it should, but instead we have multiple bounding boxes without any keypoints! Such annotations should be treated like crowds, which means they should be masked out. In this image only the 3 boxes in the middle have any keypoints. Let's refine the query to get statistics on images containing people with and without keypoints, along with the total number of people with and without keypoints:

annotated_persons_nokp_df = coco_df[(coco_df['is_crowd'] == 0) & (coco_df['num_keypoints'] == 0)]
annotated_persons_kp_df = coco_df[(coco_df['is_crowd'] == 0) & (coco_df['num_keypoints'] > 0)]
print("Number of people (with keypoints) in total: " +
      str(len(annotated_persons_kp_df)))
print("Number of people without any keypoints in total: " +
      str(len(annotated_persons_nokp_df)))

# Number of people with keypoints per (path, source) pair
persons_in_img_kp_df = pd.DataFrame({
    'cnt': annotated_persons_kp_df[['path', 'source']].value_counts()
})
persons_in_img_kp_df.reset_index(level=[0, 1], inplace=True)
persons_in_img_cnt_df = persons_in_img_kp_df.groupby(['cnt']).count()
x_occurences_kp = persons_in_img_cnt_df.index.values
y_images_kp = persons_in_img_cnt_df['path'].values

# Side-by-side bars; x_occurences and y_images come from the previous analysis,
# which counted every non-crowd annotation, with or without keypoints
f = plt.figure(figsize=(14, 8))
width = 0.4
plt.bar(x_occurences_kp, y_images_kp, width=width, label='with keypoints')
plt.bar(x_occurences + width, y_images, width=width, label='no keypoints')
plt.title('People on a single image')
plt.xticks(x_occurences + width / 2, x_occurences)
plt.xlabel('Number of people in a single image')
plt.ylabel('Number of images')
plt.legend(loc='best')
plt.show()
Now we can see that the difference is obvious.
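The effect of the keypoint filter can be checked on a small hand-made frame. The sketch below is an illustration on invented data, and it uses groupby(...).size() followed by value_counts() rather than the article's value_counts/groupby/count chain; the helper name images_per_person_count is hypothetical. It computes, for each "people per image" value, how many images have that many people, once for all non-crowd annotations and once for those with at least one keypoint.

```python
import pandas as pd

# Invented annotations: one row per person, same column names as coco_df
toy_df = pd.DataFrame({
    'path': ['a.jpg', 'a.jpg', 'a.jpg', 'b.jpg', 'c.jpg', 'c.jpg'],
    'is_crowd': [0, 0, 1, 0, 0, 0],
    'num_keypoints': [12, 0, 0, 17, 0, 0],
})


def images_per_person_count(df):
    """Map 'number of people in one image' to 'number of such images'."""
    per_image = df.groupby('path').size()          # people per image
    return per_image.value_counts().sort_index()   # images per people-count


non_crowd = toy_df[toy_df['is_crowd'] == 0]
with_kp = non_crowd[non_crowd['num_keypoints'] > 0]

all_hist = images_per_person_count(non_crowd)
kp_hist = images_per_person_count(with_kp)

print(all_hist)  # counts based on every non-crowd annotation
print(kp_hist)   # counts based only on people with keypoints
```

In this toy frame c.jpg vanishes from the keypoint histogram entirely, and a.jpg moves from the two-person bin to the one-person bin, which is the same kind of shift the two bar series in the chart show.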

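The official COCO page states that 250,000 people have keypoints, yet we find only 156,165 such instances. They should probably drop the words "with keypoints".

Adding extra columns

Once COCO has been converted into a pandas dataframe, it is easy to add extra columns computed from the existing ones. I think it is best to extract all keypoint coordinates into separate columns; in addition, we can add a column with a scale factor. Scale information about a person's bounding box is particularly useful: for example, we may want to discard all people that are too small, or perform an upscaling operation. To achieve this we use a transformer object from the Python library sklearn. In general, sklearn transformers are powerful tools for cleaning, reducing, expanding, and generating feature representations in data science models. We will use only a small part of the API. The code is as follows:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin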
class AttributesAdder(BaseEstimator, TransformerMixin):
    def __init__(self, num_keypoints, w_ix, h_ix, bbox_ix, kp_ix):
        """
        :param num_keypoints: number of keypoints
        :param w_ix: index of the column holding the image width
        :param h_ix: index of the column holding the image height
        :param bbox_ix: index of the column holding the bounding box data
        :param kp_ix: index of the column holding the keypoint data
        """
        self.num_keypoints = num_keypoints
        self.w_ix = w_ix
        self.h_ix = h_ix
        self.bbox_ix = bbox_ix
        self.kp_ix = kp_ix

    def fit(self, X, y=None):
        # Nothing to learn; the transformer is stateless
        return self

    def transform(self, X):

        # Retrieve the specific columns

        w = X[:, self.w_ix]
        h = X[:, self.h_ix]
        bbox = np.array(X[:, self.bbox_ix].tolist())  # to matrix
        keypoints = np.array(X[:, self.kp_ix].tolist())  # to matrix