
YOLOv5 Environment Setup and Exporting to ONNX for Deployment

Setting up a yolov5 development environment on Windows 10, training a simple model, and deploying the model without a deep learning environment

A beginner's notes on setting up a yolo development environment

Software Installation

Install the graphics driver

Install the graphics driver that matches your hardware; if you don't have a dedicated GPU, you can skip this step.

Install Anaconda

Because deep learning environments have particularly messy and complicated dependencies, it is much more convenient to manage them with Anaconda. It effectively carves out a space on your machine dedicated to deep learning: no matter how much you tinker with it, your existing development environment is untouched, which avoids a lot of unnecessary trouble.
Open the Anaconda website at https://www.anaconda.com/products/distribution#Downloads, download the appropriate installer, and install it.
Note: the development environment will take up a lot of disk space later on. Avoid installing on the C: drive and reserve at least 50 GB.

Install PyCharm

The best IDE for Python. I started with VSCode, but it was a hassle to configure; PyCharm is much simpler. Nothing special here, just download it from the official site and install.

Environment Configuration

Configure PyTorch

Install PyTorch

After installing Anaconda, open Anaconda Prompt or Anaconda Powershell Prompt. The Powershell variant supports some extra Linux-style commands on top of Anaconda Prompt and is a bit more convenient, but either works.
List the existing environments with:

conda env list
(base) PS C:\Users\Kepler> conda env list
# conda environments:
#
base                  *  D:\Anaconda3

(base) PS C:\Users\Kepler>

By default there is only the base environment. Enter the command below to create a new environment named pytorch; when asked whether to install, enter y.

conda create -n pytorch python=3.8

After installation there should be two environments:

(base) PS C:\Users\Kepler> conda env list
# conda environments:
#
base                  *  D:\Anaconda3
pytorch                  D:\Anaconda3\envs\pytorch

(base) PS C:\Users\Kepler>

Switch to the new environment with:

conda activate pytorch

Switch to mirrors to speed up installation

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --set show_channel_urls yes
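You can verify that the mirrors were registered with:

conda config --show channels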

With the preparation done, open the PyTorch website and copy the install command as shown in the figure below. Don't copy the official command in full: leave off the trailing -c pytorch channel option, otherwise conda downloads from the official channel instead of the mirrors and it is very slow.

conda install pytorch torchvision torchaudio cudatoolkit=11.3

Wait for the installation to finish. It may fail here; retrying a few times usually works.

Verify CUDA and cuDNN

Open PyCharm, create a new project, and run the following code:

import torch
print(torch.cuda.is_available())
print(torch.backends.cudnn.is_available())
print(torch.version.cuda)
print(torch.backends.cudnn.version())

If the following is printed, CUDA and cuDNN were installed successfully:

True
True
11.3
8200

Configure YOLOv5

Clone YOLOv5

Download or clone a copy of the code from the yolov5 repository and open the project in PyCharm:

data:

This mainly holds hyperparameter configuration files (these yaml files configure the paths of the training, test, and validation sets, plus the number of detection classes and their names), along with some officially provided test images. To train on your own dataset you need to modify the relevant yaml file, but don't put the dataset itself under this path; place it in a directory at the same level as the yolov5 project instead.

models:

This mainly contains the network-construction configuration files and functions, covering the project's four variants: s, m, l, and x. As the names suggest, they differ in size; across the variants detection speed goes from fast to slow while accuracy goes from low to high, so speed and accuracy trade off against each other. To train your own dataset, modify the corresponding yaml file here.

utils:

Utility functions, including the loss functions, the metrics functions, the plotting functions, and so on.

weights:

Where the trained weight files are placed.

detect.py:

Runs object detection with trained weights; supports images, videos, and webcams.

train.py:

The script for training on your own dataset.

test.py:

The script for evaluating training results.

requirements.txt:

A text file listing the versions of the dependency packages the yolov5 project needs; you can use it to install the matching versions.

Set the Python interpreter

Set the project's Python interpreter to the pytorch environment created earlier in Anaconda; this is done from the selector in PyCharm's bottom-right corner.

Install the dependencies

Open a terminal in the yolov5 root directory and install with:

pip install -r requirements.txt

Prepare the Dataset

Scrape images

Use the following code to scrape images from Baidu Images, adjusting keyword and max_download_images as needed.

import os
import re
from typing import List, Tuple
from urllib.parse import quote

import requests

# Search keyword; change it to whatever you want, just like searching in Baidu Images
keyword = '人'

# Maximum number of images to download
max_download_images = 30

url_init_first = 'https://image.baidu.com/search/flip?tn=baiduimage&word='
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_3) AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/88.0.4324.192 Safari/537.36'
}


def get_page_urls(page_url: str, headers: dict) -> Tuple[List[str], str]:
    """Collect the image URLs on one result page plus the URL of the next page."""
    if not page_url:
        return [], ''
    try:
        html = requests.get(page_url, headers=headers)
        html.encoding = 'utf-8'
        html = html.text
    except IOError as e:
        print(e)
        return [], ''
    pic_urls = re.findall('"objURL":"(.*?)",', html, re.S)
    # "下一页" is the "next page" link in the Baidu result page
    next_page_url = re.findall(re.compile(r'<a href="(.*)" class="n">下一页</a>'), html, flags=0)
    next_page_url = 'http://image.baidu.com' + next_page_url[0] if next_page_url else ''
    return pic_urls, next_page_url


def down_pic(pic_urls: List[str], max_download_images: int) -> None:
    pic_urls = pic_urls[:max_download_images]
    for i, pic_url in enumerate(pic_urls):
        try:
            pic = requests.get(pic_url, timeout=15)
            image_output_path = './images/' + keyword + str(i + 1) + '.jpg'
            with open(image_output_path, 'wb') as f:
                f.write(pic.content)
            print('Downloaded image %s: %s' % (str(i + 1), str(pic_url)))
        except IOError as e:
            print('Failed to download image %s: %s' % (str(i + 1), str(pic_url)))
            print(e)
            continue


if __name__ == '__main__':
    url_init = url_init_first + quote(keyword, safe='/')
    all_pic_urls = []
    page_urls, next_page_url = get_page_urls(url_init, headers)
    all_pic_urls.extend(page_urls)

    page_count = 0
    if not os.path.exists('./images'):
        os.mkdir('./images')

    while 1:
        page_urls, next_page_url = get_page_urls(next_page_url, headers)
        page_count += 1
        print('Collecting image links on result page %s' % str(page_count))
        if next_page_url == '' and page_urls == []:
            print('Reached the last page; %s pages in total' % page_count)
            break
        all_pic_urls.extend(page_urls)
        if len(all_pic_urls) >= max_download_images:
            print('Reached the configured maximum of %s downloads' % max_download_images)
            break

    down_pic(list(set(all_pic_urls)), max_download_images)

Label the dataset

Use labelimg to annotate the dataset.

Install labelimg

pip install labelimg -i https://pypi.tuna.tsinghua.edu.cn/simple

Prepare the raw data

Create the following directory structure:

  1. Create the directory structure and files as shown above
  2. Copy the data prepared in the previous step into the source_image/source_image/VOC/Images directory
  3. Write the training classes into predefined_classes.txt, one class per line, e.g. person and dog (see the example after this list)
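For this person/dog project, predefined_classes.txt would contain:

person
dog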

  1. Copy the following conversion code into converter.py
import xml.etree.ElementTree as ET
import os
import random
from shutil import copyfile

classes = ["person", "dog"]
# classes = ["ball"]

TRAIN_RATIO = 80  # percentage of images assigned to the training set


def clear_hidden_files(path):
    dir_list = os.listdir(path)
    for i in dir_list:
        abspath = os.path.join(os.path.abspath(path), i)
        if os.path.isfile(abspath):
            if i.startswith("._"):
                os.remove(abspath)
        else:
            clear_hidden_files(abspath)


def convert(size, box):
    # VOC box (xmin, xmax, ymin, ymax) -> YOLO normalized (x_center, y_center, w, h)
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    in_file = open('./source_image/VOC/Annotations/%s.xml' % image_id)
    out_file = open('./source_image/VOC/YOLOLabels/%s.txt' % image_id, 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    in_file.close()
    out_file.close()


wd = os.getcwd()
data_base_dir = os.path.join(wd, "source_image/")
if not os.path.isdir(data_base_dir):
    os.mkdir(data_base_dir)
work_space_dir = os.path.join(data_base_dir, "VOC/")
if not os.path.isdir(work_space_dir):
    os.mkdir(work_space_dir)
annotation_dir = os.path.join(work_space_dir, "Annotations/")
if not os.path.isdir(annotation_dir):
    os.mkdir(annotation_dir)
clear_hidden_files(annotation_dir)
image_dir = os.path.join(work_space_dir, "Images/")
if not os.path.isdir(image_dir):
    os.mkdir(image_dir)
clear_hidden_files(image_dir)
yolo_labels_dir = os.path.join(work_space_dir, "YOLOLabels/")
if not os.path.isdir(yolo_labels_dir):
    os.mkdir(yolo_labels_dir)
clear_hidden_files(yolo_labels_dir)
yolov5_images_dir = os.path.join(data_base_dir, "images/")
if not os.path.isdir(yolov5_images_dir):
    os.mkdir(yolov5_images_dir)
clear_hidden_files(yolov5_images_dir)
yolov5_labels_dir = os.path.join(data_base_dir, "labels/")
if not os.path.isdir(yolov5_labels_dir):
    os.mkdir(yolov5_labels_dir)
clear_hidden_files(yolov5_labels_dir)
yolov5_images_train_dir = os.path.join(yolov5_images_dir, "train/")
if not os.path.isdir(yolov5_images_train_dir):
    os.mkdir(yolov5_images_train_dir)
clear_hidden_files(yolov5_images_train_dir)
yolov5_images_test_dir = os.path.join(yolov5_images_dir, "val/")
if not os.path.isdir(yolov5_images_test_dir):
    os.mkdir(yolov5_images_test_dir)
clear_hidden_files(yolov5_images_test_dir)
yolov5_labels_train_dir = os.path.join(yolov5_labels_dir, "train/")
if not os.path.isdir(yolov5_labels_train_dir):
    os.mkdir(yolov5_labels_train_dir)
clear_hidden_files(yolov5_labels_train_dir)
yolov5_labels_test_dir = os.path.join(yolov5_labels_dir, "val/")
if not os.path.isdir(yolov5_labels_test_dir):
    os.mkdir(yolov5_labels_test_dir)
clear_hidden_files(yolov5_labels_test_dir)

train_file = open(os.path.join(wd, "yolov5_train.txt"), 'w')
test_file = open(os.path.join(wd, "yolov5_val.txt"), 'w')
train_file.close()
test_file.close()
train_file = open(os.path.join(wd, "yolov5_train.txt"), 'a')
test_file = open(os.path.join(wd, "yolov5_val.txt"), 'a')
list_imgs = os.listdir(image_dir)  # list image files
for i in range(0, len(list_imgs)):
    path = os.path.join(image_dir, list_imgs[i])
    if os.path.isfile(path):
        image_path = image_dir + list_imgs[i]
        voc_path = list_imgs[i]
        (nameWithoutExtention, extention) = os.path.splitext(os.path.basename(image_path))
        annotation_name = nameWithoutExtention + '.xml'
        annotation_path = os.path.join(annotation_dir, annotation_name)
        label_name = nameWithoutExtention + '.txt'
        label_path = os.path.join(yolo_labels_dir, label_name)
        prob = random.randint(1, 100)  # random draw decides the train/val split
        print("Probability: %d" % prob)
        if prob < TRAIN_RATIO:  # training set
            if os.path.exists(annotation_path):
                train_file.write(image_path + '\n')
                convert_annotation(nameWithoutExtention)  # convert label
                copyfile(image_path, yolov5_images_train_dir + voc_path)
                copyfile(label_path, yolov5_labels_train_dir + label_name)
        else:  # validation set
            if os.path.exists(annotation_path):
                test_file.write(image_path + '\n')
                convert_annotation(nameWithoutExtention)  # convert label
                copyfile(image_path, yolov5_images_test_dir + voc_path)
                copyfile(label_path, yolov5_labels_test_dir + label_name)
train_file.close()
test_file.close()

Label with labelimg

Open a terminal in the source_image/source_image/VOC/ directory and start labelimg with:

labelimg Images predefined_classes.txt

Make sure to set the annotation format correctly (PascalVOC here, since the converter expects xml files).

Convert the label format

Run converter.py in the source_image/source_image/ directory to convert the VOC-format xml labels into YOLO-format txt labels and to split the data into training and validation sets.
After conversion, the output files will be populated with data.
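Each line of a generated txt label describes one object: the class index followed by the box center and size, all normalized to the image dimensions, which is exactly what the convert() function above computes. For example, a dog whose box is centered at (0.5, 0.4) and spans 20% x 30% of the image would be:

1 0.5 0.4 0.2 0.3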

Train the Model

Get the pretrained weights

To shorten training time and reach better accuracy, we usually initialize the network from pretrained weights. The 5.0 release of yolov5 provides several sets of pretrained weights, and you can pick the one that fits your needs. The figure below lists their names and sizes; as you would expect, the larger the pretrained weights, the higher the accuracy tends to be, but the slower the detection. The weights can be downloaded from the yolov5 repository's releases page; this walkthrough uses yolov5s.pt.

Modify the configuration files

Copy the dataset

Copy the dataset prepared earlier into the yolov5 root directory.

Modify voc.yaml

Find voc.yaml in the data directory, make a copy, and rename the copy, preferably after your project so later steps are easier. I renamed mine person-dog.yaml; this project detects people and dogs.

Edit person-dog.yaml as shown in the figure (adapt it to your own setup); a sketch follows.
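For reference, a minimal person-dog.yaml might look like the following. This is a sketch: the exact keys should match the voc.yaml template of the yolov5 version you cloned, and the paths assume the dataset layout produced by converter.py was copied into the yolov5 root.

# train and val data paths (relative to the yolov5 root)
train: ./source_image/images/train/
val: ./source_image/images/val/

# number of classes
nc: 2

# class names
names: ['person', 'dog']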

Modify the model configuration file

Find yolov5s.yaml in the models directory, make a copy, and rename the copy, again preferably after your project. I renamed mine person-dog.yaml as well.

Edit it as shown in the figure; only the nc value needs to change (set it to your own class count), for example:
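For this two-class project, only the class count at the top of the copied yolov5s.yaml changes; everything else (depth_multiple, width_multiple, anchors, backbone, head) stays as generated:

nc: 2  # number of classes (80 in the original yolov5s.yaml)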

Modify the training parameters

Open train.py and modify the following parameters:

parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='initial weights path')  # pretrained weights
parser.add_argument('--cfg', type=str, default='models/person-dog.yaml', help='model.yaml path')  # model config
parser.add_argument('--data', type=str, default='data/person-dog.yaml', help='data.yaml path')  # dataset config
parser.add_argument('--epochs', type=int, default=300)  # number of training epochs
parser.add_argument('--batch-size', type=int, default=8, help='total batch size for all GPUs')
parser.add_argument('--workers', type=int, default=8, help='maximum number of dataloader workers')

After the changes, run train.py to start training.
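Alternatively, leave the defaults untouched and pass the same settings as command-line flags; these are exactly the argparse options shown above:

python train.py --weights weights/yolov5s.pt --cfg models/person-dog.yaml --data data/person-dog.yaml --epochs 300 --batch-size 8 --workers 8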

If virtual memory blows up, change num_workers to 0 in datasets.py under the utils directory (or pass --workers 0, since train.py exposes it as an argument).

TensorBoard

Run:

tensorboard --logdir=runs/train

Click the URL it generates to open the training metrics in a browser.

Inference and Validation

After training ends, a runs folder is created in the yolo root directory. The weight files are under runs/train/exp/weights/: best.pt is the best-performing checkpoint and last.pt is the checkpoint from the final epoch.
Open detect.py in the root directory and adjust its settings to run an inference test.

Pass in the path of the weight file, i.e. the result you just trained:

parser.add_argument('--weights', nargs='+', type=str, default='runs/train/exp/weights/best.pt', help='model.pt path(s)') 

Pass in the path of the image to test; change it to '0' to open the webcam instead:

parser.add_argument('--source', type=str, default='person.jpg', help='source') 
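As with training, both settings can also be given on the command line instead of editing the defaults:

python detect.py --weights runs/train/exp/weights/best.pt --source person.jpg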

Embedded Deployment Basics

For embedded deployment, to simplify the dependencies the pt file is converted to an onnx file for inference; then installing opencv alone is enough to run the model.

Converting between pt, pth, and onnx

Code to convert pt to pth:

import torch
import pickle
import argparse
from collections import OrderedDict

if __name__ == '__main__':
    device = 'cuda' if torch.cuda.is_available() else 'cpu'

    parser = argparse.ArgumentParser()
    parser.add_argument('--source', default='best')
    args = parser.parse_args()

    # Load the training checkpoint and pull out the bare model
    modelfile = args.source + '.pt'
    utl_model = torch.load(modelfile, map_location=device)
    utl_param = utl_model['model'].model
    torch.save(utl_param.state_dict(), args.source + '.pth')
    own_state = utl_param.state_dict()
    print(len(own_state))

    # Also dump the parameters as numpy arrays for torch-free consumers
    numpy_param = OrderedDict()
    for name in own_state:
        numpy_param[name] = own_state[name].data.cpu().numpy()
    print(len(numpy_param))
    with open(args.source + '_numpy_param.pkl', 'wb') as fw:
        pickle.dump(numpy_param, fw)
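Assuming the script above is saved as pt2pth.py (the file name is arbitrary) next to the weight file, running it produces best.pth plus a best_numpy_param.pkl containing the weights as numpy arrays:

python pt2pth.py --source best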

Code to convert pth to onnx:

See the project directory.
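Note that the yolov5 repository itself also ships an ONNX export script (models/export.py in the 5.0 release, export.py at the repository root in later versions), which is usually the simpler route; roughly:

pip install onnx
python models/export.py --weights runs/train/exp/weights/best.pt --img 640 --batch 1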

Run ONNX inference with OpenCV

import cv2
import argparse
import numpy as np


class yolov5:
    def __init__(self, yolo_type, confThreshold=0.5, nmsThreshold=0.5, objThreshold=0.5):
        self.classes = ['person', 'dog']
        self.colors = [np.random.randint(0, 255, size=3).tolist() for _ in range(len(self.classes))]
        num_classes = len(self.classes)
        anchors = [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]]
        self.nl = len(anchors)
        self.na = len(anchors[0]) // 2  # anchors are flat (w, h) pairs, 3 per detection layer
        self.no = num_classes + 5
        self.grid = [np.zeros(1)] * self.nl
        self.stride = np.array([8., 16., 32.])
        self.anchor_grid = np.asarray(anchors, dtype=np.float32).reshape(self.nl, 1, -1, 1, 1, 2)

        self.net = cv2.dnn.readNet(yolo_type + '.onnx')
        self.confThreshold = confThreshold
        self.nmsThreshold = nmsThreshold
        self.objThreshold = objThreshold

    def _make_grid(self, nx=20, ny=20):
        xv, yv = np.meshgrid(np.arange(ny), np.arange(nx))
        return np.stack((xv, yv), 2).reshape((1, 1, ny, nx, 2)).astype(np.float32)

    def postprocess(self, frame, outs):
        frameHeight = frame.shape[0]
        frameWidth = frame.shape[1]
        ratioh, ratiow = frameHeight / 640, frameWidth / 640
        # Scan through all the bounding boxes output from the network and keep only the
        # ones with high confidence scores. Assign the box's class label as the class with the highest score.
        classIds = []
        confidences = []
        boxes = []
        for out in outs:
            for detection in out:
                scores = detection[5:]
                classId = np.argmax(scores)
                confidence = scores[classId]
                if confidence > self.confThreshold and detection[4] > self.objThreshold:
                    center_x = int(detection[0] * ratiow)
                    center_y = int(detection[1] * ratioh)
                    width = int(detection[2] * ratiow)
                    height = int(detection[3] * ratioh)
                    left = int(center_x - width / 2)
                    top = int(center_y - height / 2)
                    classIds.append(classId)
                    confidences.append(float(confidence))
                    boxes.append([left, top, width, height])

        # Perform non maximum suppression to eliminate redundant overlapping boxes with
        # lower confidences.
        indices = cv2.dnn.NMSBoxes(boxes, confidences, self.confThreshold, self.nmsThreshold)
        for i in indices:
            j = int(i) if np.ndim(i) == 0 else int(i[0])  # OpenCV < 4.5.4 returns nested indices
            box = boxes[j]
            left = box[0]
            top = box[1]
            width = box[2]
            height = box[3]
            frame = self.drawPred(frame, classIds[j], confidences[j], left, top, left + width, top + height)
        return frame

    def drawPred(self, frame, classId, conf, left, top, right, bottom):
        # Draw a bounding box.
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=4)

        label = '%.2f' % conf
        label = '%s:%s' % (self.classes[classId], label)

        # Display the label at the top of the bounding box
        labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
        top = max(top, labelSize[1])
        cv2.putText(frame, label, (left, top - 10), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), thickness=2)
        return frame

    def detect(self, srcimg):
        blob = cv2.dnn.blobFromImage(srcimg, 1 / 255.0, (640, 640), [0, 0, 0], swapRB=True, crop=False)
        # Sets the input to the network
        self.net.setInput(blob)

        # Runs the forward pass to get output of the output layers
        outs = self.net.forward(self.net.getUnconnectedOutLayersNames())

        z = []  # inference output
        for i in range(self.nl):
            bs, ns, ny, nx = outs[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            out = outs[i].reshape(bs, self.na, self.no, ny, nx).transpose(0, 1, 3, 4, 2)
            if self.grid[i].shape[2:4] != out.shape[2:4]:
                self.grid[i] = self._make_grid(nx, ny)

            y = 1 / (1 + np.exp(-out))  # sigmoid
            # Strictly only x, y, w, h need the sigmoid, but applying it to everything barely
            # changes the result: sigmoid is monotonically increasing, so it preserves the
            # ordering of the class confidences and therefore does not affect the NMS afterwards.
            # Inspecting the raw class confidences with a breakpoint shows they are negative
            # logits, so the sigmoid is needed to pull the probabilities back into [0, 1].
            y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * int(self.stride[i])  # xy
            y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
            z.append(y.reshape(bs, -1, self.no))
        z = np.concatenate(z, axis=1)
        return z


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--source", default='0', type=str, help="image path or camera index")
    parser.add_argument('--net_type', default='best', choices=['best', 'yolov5s', 'yolov5l', 'yolov5m', 'yolov5x'])
    parser.add_argument('--confThreshold', default=0.5, type=float, help='class confidence')
    parser.add_argument('--nmsThreshold', default=0.5, type=float, help='nms iou thresh')
    parser.add_argument('--objThreshold', default=0.5, type=float, help='object confidence')
    args = parser.parse_args()

    yolonet = yolov5(args.net_type, confThreshold=args.confThreshold,
                     nmsThreshold=args.nmsThreshold, objThreshold=args.objThreshold)
    if not args.source.isdigit():
        # Single image
        frame = cv2.imread(args.source)
        dets = yolonet.detect(frame)
        frame = yolonet.postprocess(frame, dets)
        cv2.imshow('result', frame)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
    else:
        # Camera stream (the network is loaded once, not once per frame)
        cap = cv2.VideoCapture(int(args.source))
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            dets = yolonet.detect(frame)
            frame = yolonet.postprocess(frame, dets)
            cv2.imshow('result', frame)
            c = cv2.waitKey(1) & 0xFF
            if c == 27 or c == ord('q'):
                break
        cap.release()
        cv2.destroyAllWindows()
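Assuming the script is saved as opencv_onnx.py (a hypothetical name) next to best.onnx, run it on an image or on camera 0 like this:

python opencv_onnx.py --net_type best --source person.jpg
python opencv_onnx.py --net_type best --source 0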

References:

  1. https://blog.csdn.net/didiaopao?type=blog (main reference; an extremely detailed step-by-step yolov5 guide)
  2. https://blog.csdn.net/nihate/article/details/112731327 (OpenCV inference)
  3. https://github.com/ultralytics/yolov5