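`Rockchip provides an official multi-class MobileNet SSD demo for the RK1808, so it is relatively easy to build a custom detector on top of that algorithm. Here I train a head detection model using a publicly available head detection dataset.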
Caffe implementation of MobileNet-SSD (open-source repository):
https://github.com/chuanqi305/MobileNet-SSD.git
Original SSD (open-source repository):
https://github.com/weiliu89/caffe.git
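1. Download both repositories above, and first make sure the original SSD Caffe version builds correctly.
2. Copy the MobileNet-SSD project into the examples directory of the original SSD.
3. The head detection dataset used is hollywoodHead.
4. Convert the dataset into VOC format.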
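As a rough illustration of what "VOC format" means for step 4, the sketch below builds a minimal VOC-style XML annotation for one image. The function name `make_voc_annotation`, the example file name, and the choice of fields are assumptions for illustration; it is not the conversion script actually used for this dataset.

```python
import xml.etree.ElementTree as ET


def make_voc_annotation(filename, width, height, boxes, label="head"):
    """Return a minimal VOC-style annotation as an XML string.

    boxes is a list of (xmin, ymin, xmax, ymax) tuples in pixel coordinates.
    The helper name and the field selection are illustrative assumptions.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"

    for xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(int(value))

    return ET.tostring(root, encoding="unicode")


# Hypothetical frame with two head boxes.
xml_text = make_voc_annotation("frame_0001.jpeg", 640, 360,
                               [(10, 20, 110, 140), (200, 30, 260, 100)])
```

Each image gets one such XML file, with one `object` element per annotated head.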
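5. Following the LMDB creation procedure of the original SSD, convert the hollywoodHead data into two LMDB databases, train and test, so they can be loaded easily during training.
6. Use the following commands to link the LMDB data you created into the directory that MobileNet-SSD expects: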
- ln -s PATH_TO_YOUR_TRAIN_LMDB trainval_lmdb
- ln -s PATH_TO_YOUR_TEST_LMDB test_lmdb
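If you would rather create the links from a script, the two `ln -s` commands above can be approximated with `os.symlink`. This is a minimal sketch; the helper name `link_lmdb` is an assumption, and unlike plain `ln -s` it replaces an existing symlink at the target path.

```python
import os


def link_lmdb(source_dir, link_path):
    """Create (or refresh) a symlink at link_path pointing to source_dir.

    Hypothetical helper: it refuses to overwrite anything that is not
    already a symlink, so real data is never deleted by accident.
    """
    source_dir = os.path.abspath(source_dir)
    if not os.path.isdir(source_dir):
        raise FileNotFoundError(source_dir + " is not a directory")
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link
    elif os.path.exists(link_path):
        raise FileExistsError(link_path + " exists and is not a symlink")
    os.symlink(source_dir, link_path)
    return link_path
```

Usage would be `link_lmdb(PATH_TO_YOUR_TRAIN_LMDB, "trainval_lmdb")` and `link_lmdb(PATH_TO_YOUR_TEST_LMDB, "test_lmdb")`, run from the MobileNet-SSD directory.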
7. Create the label file in the MobileNet-SSD directory. My label file is as follows:
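For a two-class detector (background plus `head`, matching the `CLASSES` tuple in the test script), a label map in the style of the original SSD repository could be generated with a sketch like the one below. The helper name `make_labelmap` and the `none_of_the_above` name for label 0 follow the SSD VOC example and are assumptions here, not a copy of the label file used for this model.

```python
def make_labelmap(class_names):
    """Return SSD-style label map text for the given class names.

    Index 0 is treated as the background class, as in the SSD VOC example.
    """
    items = []
    for label, name in enumerate(class_names):
        internal = "none_of_the_above" if label == 0 else name
        items.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (internal, label, name)
        )
    return "".join(items)


labelmap_text = make_labelmap(("background", "head"))
```

Writing `labelmap_text` to a `.prototxt` file gives one `item` block per class, with label 0 reserved for the background.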
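9. Run gen_model.sh in that directory to generate the prototxt files used for training.
10. Download the pretrained model from:
https://drive.google.com/file/d/0B3gersZ2cHIxVFI1Rjd5aDgwOG8/view
11. Run the training script train.sh. After 120000 iterations, I obtained the following head detection model:
12. Run the test script that uses a webcam. The script is as follows: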
- import os
- import sys
- import cv2
- import numpy as np
- # Uncomment and adjust if pycaffe is not already on PYTHONPATH:
- # caffe_root = '/media/isp/DATA/workspace/opencvLab/caffe/'
- # sys.path.insert(0, caffe_root + 'python')
- import caffe
-
- net_file = './example/MobileNetSSD_deploy.prototxt'
- caffe_model = './snapshot/mobilenet_iter_120000.caffemodel'
- test_dir = "images"
-
- # Stop early if either model file is missing.
- if not os.path.exists(caffe_model):
-     print(caffe_model + " does not exist")
-     sys.exit(1)
- if not os.path.exists(net_file):
-     print(net_file + " does not exist")
-     sys.exit(1)
- net = caffe.Net(net_file, caffe_model, caffe.TEST)
-
- CLASSES = ('background',
-            'head')
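- def preprocess(src):
-     # Resize to the 300x300 network input and scale pixels to roughly [-1, 1].
-     img = cv2.resize(src, (300, 300))
-     img = img - 127.5
-     img = img * 0.007843
-     return img
-
- def postprocess(img, out):
-     # Each detection row holds the class label at index 1, the confidence at
-     # index 2, and a normalized (xmin, ymin, xmax, ymax) box at indices 3..6.
-     h = img.shape[0]
-     w = img.shape[1]
-     box = out['detection_out'][0, 0, :, 3:7] * np.array([w, h, w, h])
-     cls = out['detection_out'][0, 0, :, 1]
-     conf = out['detection_out'][0, 0, :, 2]
-     return (box.astype(np.int32), conf, cls)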
- def detect(imgSRC):
-     img = preprocess(imgSRC)
-
-     # Convert HWC float image to the CHW layout expected by the network.
-     img = img.astype(np.float32)
-     img = img.transpose((2, 0, 1))
-     net.blobs['data'].data[...] = img
-     out = net.forward()
-     box, conf, cls = postprocess(imgSRC, out)
-     for i in range(len(box)):
-         # Only draw detections above the confidence threshold.
-         if conf[i] > 0.4:
-             # Plain ints, since some OpenCV versions reject numpy scalars as points.
-             p1 = (int(box[i][0]), int(box[i][1]))
-             p2 = (int(box[i][2]), int(box[i][3]))
-             cv2.rectangle(imgSRC, p1, p2, (0, 255, 0))
-             p3 = (max(p1[0], 15), max(p1[1], 15))
-             title = "%s:%.2f" % (CLASSES[int(cls[i])], conf[i])
-             cv2.putText(imgSRC, title, p3, cv2.FONT_ITALIC, 0.6, (0, 255, 0), 1)
-     cv2.imshow("SSD", imgSRC)
-
-     k = cv2.waitKey(1) & 0xff
-     # Return False if ESC was pressed so the caller can stop.
-     if k == 27:
-         return False
-     return True
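- cap = cv2.VideoCapture(0)
- while True:
-     ret, frame = cap.read()
-     # Stop when no frame could be read or detect() reports that ESC was pressed.
-     if not ret or not detect(frame):
-         break
- cap.release()
- cv2.destroyAllWindows()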
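To make the arithmetic in `preprocess` and `postprocess` easier to check without Caffe or a camera, here is a small standalone sketch in plain Python. It assumes the usual SSD `detection_out` row layout `[image_id, label, confidence, xmin, ymin, xmax, ymax]` with normalized coordinates; the function names `normalize_pixel` and `decode_detections` and the sample rows are made up for illustration.

```python
def normalize_pixel(value):
    """Map an 8-bit pixel value to roughly [-1, 1], as preprocess() does."""
    return (value - 127.5) * 0.007843


def decode_detections(rows, width, height, threshold=0.4):
    """Scale normalized boxes to pixels and keep confident detections.

    rows: iterable of [image_id, label, confidence, xmin, ymin, xmax, ymax].
    Returns a list of (label, confidence, (x1, y1, x2, y2)) tuples.
    """
    results = []
    for image_id, label, conf, xmin, ymin, xmax, ymax in rows:
        if conf <= threshold:
            continue  # same cut-off as the `conf[i] > 0.4` check in detect()
        box = (int(xmin * width), int(ymin * height),
               int(xmax * width), int(ymax * height))
        results.append((int(label), conf, box))
    return results


# Hypothetical detections on a 640x480 frame: one confident, one weak.
sample_rows = [
    (0, 1, 0.9, 0.25, 0.5, 0.75, 1.0),
    (0, 1, 0.2, 0.0, 0.0, 0.1, 0.1),
]
kept = decode_detections(sample_rows, 640, 480)
```

With these sample rows only the first detection survives the threshold, and its box scales to pixel coordinates `(160, 240, 480, 480)`.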
This is my trained model:
Test results:
`