This article was updated in 2020
//www.greatytc.com/p/2206db894b28
Tools
Darknet-YOLO:https://pjreddie.com/darknet/yolo/
labelImg:https://github.com/tzutalin/labelImg
Create the directories
Under darknet/scripts, create the following directory layout:
├── VOCdevkit
│   └── VOC2007
│       ├── Annotations
│       │   ├── 0a0a0b1a-7c39d841.xml
│       │   └── lena.xml
│       ├── ImageSets
│       │   ├── Layout
│       │   ├── Main
│       │   │   ├── test.txt
│       │   │   ├── train.txt
│       │   │   └── val.txt
│       │   └── Segmentation
│       ├── JPEGImages
│       │   ├── 0a0a0b1a-7c39d841.jpg
│       │   └── lena.jpg
│       └── labels
│           └── 0a0a0b1a-7c39d841.txt
└── voc_label.py
Here, JPEGImages holds the training and test images, and Annotations holds the VOC-format XML annotations, for example:
<annotation>
<folder>JPEGImages</folder>
<filename>0a0a0b1a-7c39d841.jpg</filename>
<path>/home/dew/CV2018/yolo/darknet/scripts/VOCdevkit/VOC2007/JPEGImages/0a0a0b1a-7c39d841.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>1280</width>
<height>720</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>557</xmin>
<ymin>275</ymin>
<xmax>688</xmax>
<ymax>398</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>160</xmin>
<ymin>297</ymin>
<xmax>252</xmax>
<ymax>373</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>392</xmin>
<ymin>298</ymin>
<xmax>459</xmax>
<ymax>353</ymax>
</bndbox>
</object>
<object>
<name>car</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>492</xmin>
<ymin>304</ymin>
<xmax>523</xmax>
<ymax>345</ymax>
</bndbox>
</object>
</annotation>
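To see what a converter needs from such a file, here is a minimal sketch that reads the image size and the bounding boxes with Python's standard xml.etree.ElementTree. The function name read_voc_boxes and the shortened SAMPLE string are made up for illustration and are not part of darknet.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return (width, height, boxes) parsed from VOC-style XML text.

    Each box is (name, xmin, xmax, ymin, ymax) in pixels.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        boxes.append((
            obj.find("name").text,
            int(bnd.find("xmin").text),
            int(bnd.find("xmax").text),
            int(bnd.find("ymin").text),
            int(bnd.find("ymax").text),
        ))
    return width, height, boxes


# Shortened, hypothetical sample holding only the fields the parser reads.
SAMPLE = """<annotation>
<size><width>1280</width><height>720</height><depth>3</depth></size>
<object><name>car</name>
<bndbox><xmin>557</xmin><ymin>275</ymin><xmax>688</xmax><ymax>398</ymax></bndbox>
</object>
</annotation>"""

if __name__ == "__main__":
    print(read_voc_boxes(SAMPLE))
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring.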
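The txt files under Main list the image names (without file extension) that belong to the test, training, and validation sets, one name per line, for example: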
0a0a0b1a-7c39d841
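These list files can be written by hand or generated. The sketch below shows one possible way to split image names into train/val/test and write the three files. The helper names, the 10%/10% ratios, and the fixed seed are assumptions for illustration, not something the article or darknet prescribes.

```python
import os
import random


def split_names(names, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle names deterministically and split them into train/val/test."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_ratio)
    n_test = int(len(names) * test_ratio)
    return {
        "val": names[:n_val],
        "test": names[n_val:n_val + n_test],
        "train": names[n_val + n_test:],
    }


def write_image_sets(splits, main_dir):
    """Write train.txt, val.txt and test.txt, one image name per line."""
    for split, stems in splits.items():
        with open(os.path.join(main_dir, split + ".txt"), "w") as f:
            f.writelines(stem + "\n" for stem in stems)


def build_image_sets(voc_root):
    """Collect .jpg names from JPEGImages and write the lists to ImageSets/Main."""
    jpeg_dir = os.path.join(voc_root, "JPEGImages")
    stems = [os.path.splitext(name)[0]
             for name in os.listdir(jpeg_dir)
             if name.lower().endswith(".jpg")]
    write_image_sets(split_names(stems),
                     os.path.join(voc_root, "ImageSets", "Main"))
```

Calling build_image_sets("VOCdevkit/VOC2007") from darknet/scripts would fill ImageSets/Main, assuming the directories already exist.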
Convert the annotation format
Edit voc_label.py. For example, if there is only one class, car:
sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
classes = ["car"]
'''
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
'''
Run scripts/voc_label.py:
python ./voc_label.py
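This generates a set of files in the directory (such as the 2007_train.txt image list that cfg/voc.data refers to) and converts the VOC-format annotations into YOLO-format txt annotations with normalized coordinates. See /darknet/scripts/VOCdevkit/VOC2007/labels/0a0a0b1a-7c39d841.txt: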
0 0.485546875 0.465972222222 0.10234375 0.170833333333
0 0.16015625 0.463888888889 0.071875 0.105555555556
0 0.331640625 0.450694444444 0.05234375 0.0763888888889
0 0.395703125 0.449305555556 0.02421875 0.0569444444444
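Each label line has the form class_index x_center y_center width height, with all four values divided by the image width or height. The sketch below shows the arithmetic for the first car box in the XML example (1280x720 image). The function name is made up, and the -1 offset on the center is an assumption, chosen because it reproduces the numbers in the label file above.

```python
def voc_box_to_yolo(size, box):
    """Convert a VOC pixel box to normalized YOLO (x, y, w, h).

    size: (image_width, image_height)
    box:  (xmin, xmax, ymin, ymax) in pixels
    """
    img_w, img_h = size
    xmin, xmax, ymin, ymax = box
    # Assumption: the center is shifted by one pixel, which matches the
    # label values shown in the article.
    x_center = (xmin + xmax) / 2.0 - 1
    y_center = (ymin + ymax) / 2.0 - 1
    return (
        x_center / img_w,
        y_center / img_h,
        (xmax - xmin) / img_w,
        (ymax - ymin) / img_h,
    )


if __name__ == "__main__":
    classes = ["car"]
    values = voc_box_to_yolo((1280, 720), (557, 688, 275, 398))
    # Class index first, then the four normalized values.
    print(classes.index("car"), " ".join(str(v) for v in values))
```

For the first box, x_center works out to (557 + 688) / 2 - 1 = 621.5 pixels, and 621.5 / 1280 = 0.485546875, the first coordinate of the first label line.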
Edit cfg/voc.data
classes= 1
train = /home/dew/Desktop/CV2018/yolo/darknet/scripts/2007_train.txt
valid = /home/dew/Desktop/CV2018/yolo/darknet/scripts/2007_val.txt
names = data/voc.names
backup = backup
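Edit cfg/yolov3-voc.cfg
Find each [yolo] section and the [convolutional] section immediately before it (3 places in total)
Change
classes = number of annotated classes (in each [yolo] section)
filters=3*(classes+1+4) (in the [convolutional] section directly before each [yolo] section)
random=0 (in each [yolo] section; use 1 if GPU memory is sufficient, 0 if not)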
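As a quick check of the filters formula, the following sketch computes the value for a given number of classes. The function name and the anchors_per_scale parameter are illustrative. The 3 in the formula is the number of boxes predicted per scale, and the 1+4 covers the objectness score and the four box coordinates.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """filters = boxes per scale * (classes + 1 objectness + 4 box coordinates)."""
    return anchors_per_scale * (num_classes + 1 + 4)


if __name__ == "__main__":
    for n in (1, 20):
        print(n, yolo_filters(n))  # prints "1 18", then "20 75"
```

So the single-class car setup uses filters=18, while the 20-class VOC list corresponds to filters=75.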
Edit data/voc.names
After backing it up, replace its contents with the class names of the training set, one per line
Download the pretrained weights file (convolutional layers only) and train
wget https://pjreddie.com/media/files/darknet53.conv.74
./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74
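Log explanation
Region xx: index of the yolo layer in the cfg file;
Avg IOU: average intersection over union between the predicted boxes and the annotated boxes in the current iteration; higher is better, ideal value 1;
Class: classification accuracy for the annotated objects; higher is better, ideal value 1;
Obj: higher is better, ideal value 1;
No Obj: lower is better;
.5R: recall with IOU=0.5 as the threshold; recall = detected positive samples / actual positive samples;
.75R: recall with IOU=0.75 as the threshold;
count: number of positive samples.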
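To make the IOU and recall terms concrete, here is a small sketch that computes the average IOU and the recall at both thresholds for a list of (predicted box, annotated box) pairs. It illustrates the definitions only and is not darknet's own code. The function names and the assumption that each annotated box is paired with exactly one prediction are made up for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def summarize(pairs):
    """pairs: list of (predicted_box, annotated_box), one per annotated object."""
    ious = [iou(pred, truth) for pred, truth in pairs]
    count = len(ious)
    if count == 0:
        return {"avg_iou": 0.0, "recall_50": 0.0, "recall_75": 0.0, "count": 0}
    return {
        "avg_iou": sum(ious) / count,
        # Recall = annotated objects matched above the threshold / all annotated objects.
        "recall_50": sum(i > 0.5 for i in ious) / count,
        "recall_75": sum(i > 0.75 for i in ious) / count,
        "count": count,
    }


if __name__ == "__main__":
    pairs = [
        ((0, 0, 2, 2), (0, 0, 2, 2)),  # perfect overlap
        ((0, 0, 2, 2), (1, 0, 3, 2)),  # half-shifted box
    ]
    print(summarize(pairs))
```

A perfect prediction gives IOU 1, which is why 1 is the ideal value for Avg IOU and for both recall figures.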