HiEuler-Pico-OpenEuler Yolov8模型训练和转换——数据集制作（二）_专栏

HiEuler-Pico-OpenEuler Yolov8模型训练和转换——yolov8环境搭建（一）

继上一篇

文章目录

2、数据集制作
2.4 图像预处理
- 2.4.1 PS批量处理图像
- 2.4.2 opencv批量处理图像
2.5 数据集标注
2.6 转换输出yolo（txt）格式数据集

2、数据集制作

以下数据集制作均是在windows下进行。

2.1 数据集格式介绍

yolov8的数据集格式是yolo(txt)格式，整个数据集文件夹先是分为图片和标签文件夹，分别存储图片和标签文件，而图片和标签文件夹下均再继续分为训练集、验证集和测试集，训练时训练集和验证集是必须的，测试集后续预测才需要。

数据集文件目录格式如下，images存放图片，labels存放标签文件，其中两者的子目录：train为训练集，val为验证集，test为测试集；两者的子目录下存放的图片和标签文件除扩展名外名称是一样，是相互对应的。

参考链接：[YOLOv8] - YOLO数据集格式介绍和案例

2.2 拍摄视频

使用手机采集视频，视频分辨率和帧率越高越好，数据集图片的清晰度直接影响了模型的最终效果，可以环绕目标物体缓慢移动，录制视频中动作要缓慢并且清晰无遮挡；一般手机录制视频比例不能为1:1，所以录制时需要注意录制目标需要在一个大致为1:1的范围内。

拍摄完成后将视频通过USB-TypeC线拷贝至windows继续后面步骤，由于微信对于大于25M视频会压缩处理再发送，对视频传输的质量有影响，所以如若用微信等聊天软件传输，请将视频压缩打包后再传输

2.3 ffmpeg视频抽帧

2.3.1 ffmpeg安装

windows版本ffmpeg下载链接：Builds - CODEX FFMPEG @ gyan.dev

安装完成后解压，右键【此电脑】点击【属性】，点击【高级系统设置】，进入【环境变量】，添加系统变量

添加变量

win+r输出cmd打开终端，查看是否安装成功

ffmpeg -version
1

ffmpeg安装参考链接：【最新】windows电脑FFmpeg安装教程手把手详解

2.3.2 抽帧生成图片

使用ffmpeg对视频抽帧生成大量图片，注意输出的图片质量，建议输出png无损格式图片，win+r输出cmd打开终端执行以下命令

#参数-vf "fps=15"一秒抽15张图片，-q:v 1为输出图片质量，1为最高，<video_name>为视频名称，<pic_anme>为输出图片名称
ffmpeg -i <video_name>.mp4 -q:v 1 -vf "fps=15" <pic_anme>_%06d.png
1
2

2.4 图像预处理

此小节将上述抽帧生成的图像进行训练前的图像预处理

CV610要求yolov8输入图片大小为640:640，所以这里的图像预处理主要为裁剪图片和修改图片大小，而且事先进行图像预处理可减小数据集大小，方便数据集上传至服务器上，并且极大的缩短训练时间。

PS批量处理相对于opencv批量处理生成的图片质量好一点，时间充足的话推荐PS批量处理，时间有限可选择opencv批量处理

以下图像批量处理请注意图片裁剪的位置应大致包含拍摄时的目标

2.4.1 PS批量处理图像

需要安装PS软件，PS软件需自行寻找安装包安装

打开图片，点击窗口选择动作，点击录制动作

填写动作名称后开始录制，先对图片进行1:1裁剪，再调整图像像素大小为640:640，注意修改裁剪图片的位置，批处理时它是固定的

使用脚本工具批量处理图片

2.4.2 opencv批量处理图像

请检查是否已安装python和opencv-python

创建一个image_processing.py文件，添加以下内容

import os
import cv2
import cv2 as cv
from numpy import *
import numpy as np

def image_processing(open_dir,save_dir):
    conter = 0
    for filename in os.listdir(open_dir):
        # 获取得到文件后缀
        split_file = os.path.splitext(filename)
        new_filename= split_file[0]+'.jpg'
        img_dir = os.path.join(open_dir,filename)
        save_img_dir = os.path.join(save_dir,new_filename)

        #图片处理
        img = cv2.imread(img_dir)  
        height, width = img.shape[:2]

        #裁剪位置，这里裁剪的是图片中心，根据情况而定
        y0 = int(height/2-width/2)
        y1 = int(height/2+width/2)
        x0 = int(0)
        x1 = int(width)
        img = img[y0:y1, x0:x1]  # 裁剪坐标为[y0:y1, x0:x1]

        #调整图片大小，第二个参数
        img = cv2.resize(img,(640,640),interpolation = cv2.INTER_AREA)

        cv2.imwrite(save_img_dir, img,[cv2.IMWRITE_JPEG_QUALITY, 100])
        conter = conter + 1
        print("Image_processing Successful!",conter)
    print("Image_processing over!")


if __name__=="__main__":
    #图片输入路径
    open_dir="./111"
    #图片输出路径
    save_dir="./222"
    image_processing(open_dir,save_dir)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
<

注意根据图片修改裁剪图片的位置，以及图片像素大小

win+r输入cmd打开终端，到image_processing.py文件所在目录下执行

2.5 数据集标注

yolov8数据集采用yolo(txt)格式，采用vott工具标注输出VOC数据集再使用脚本将VOC格式变为yolo格式。

使用labelme可参考链接：基于YOLO目标算法实战

2.5.1 安装vott工具

vott下载链接：VoTT download | SourceForge.net

vott介绍以及标注操作讲解链接：VoTT:Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos. - GitCode

2.5.2 vott创建项目

vott格式设置

2.5.3 标注

2.5.4 保存输出VOC数据集

项目文件和导出的数据集在设置的输出路径中

2.6 转换输出yolo（txt）格式数据集

这一步需使用python脚本，请检查是否已安装python

创建一个voc_to_yolo.py文件，添加以下内容：

import os
import xml.etree.ElementTree as ET
import random
from shutil import copyfile
import argparse

def get_normalized_box(box, image_width, image_height):
    xmin = float(box.find("xmin").text) / image_width
    ymin = float(box.find("ymin").text) / image_height
    xmax = float(box.find("xmax").text) / image_width
    ymax = float(box.find("ymax").text) / image_height

    return ((xmin + xmax) / 2, (ymin + ymax) / 2, xmax - xmin, ymax - ymin)

count_zero_dimensions = 0
files_with_zero_dimensions = []
train_images_count = 0
val_images_count = 0
test_images_count = 0
total_images_count = 0
original_images_total_count = 0

def convert_xml_to_txt(xml_path, out_path, split, class_mapping):
    global count_zero_dimensions, files_with_zero_dimensions, train_images_count, \
        val_images_count, test_images_count, total_images_count, original_images_total_count

    if not os.path.exists(out_path):
        os.makedirs(out_path)

    filename = os.path.splitext(os.path.basename(xml_path))[0]
    txt_file = open(os.path.join(out_path, filename + ".txt"), "w")
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    if width == 0 or height == 0:
        count_zero_dimensions += 1
        files_with_zero_dimensions.append(filename)
        print(f"Warning: {filename} has zero width or height. Excluding this data.")
        return

    for obj in root.iter("object"):
        name = obj.find("name").text
        index = class_mapping.get(name)
        if index is not None:
            box = get_normalized_box(obj.find("bndbox"), width, height)
            txt_file.write("%s %f %f %f %f\n" % (index, *box))
        else:
            print(f"Warning: Unknown class '{name}' in {xml_path}")

    txt_file.close()
    print(f"{xml_path} converted for {split}")

    if split == "train":
        train_images_count += 1
    elif split == "val":
        val_images_count += 1
    elif split == "test":
        test_images_count += 1

    total_images_count += 1

def split_dataset(original_images_folder, annotations_folder, out_path, class_mapping,
                  train_ratio=0.8, val_ratio=0.1, test_ratio=0.1):
    global original_images_total_count, count_zero_dimensions, files_with_zero_dimensions, \
        train_images_count, val_images_count, test_images_count, total_images_count

    file_list = os.listdir(annotations_folder)
    random.shuffle(file_list)

    train_split = int(len(file_list) * train_ratio)
    val_split = int(len(file_list) * (train_ratio + val_ratio))

    train_files = file_list[:train_split]
    val_files = file_list[train_split:val_split]
    test_files = file_list[val_split:]

    # Create output folders
    os.makedirs(os.path.join(out_path, "images", "train"), exist_ok=True)
    os.makedirs(os.path.join(out_path, "images", "val"), exist_ok=True)
    os.makedirs(os.path.join(out_path, "images", "test"), exist_ok=True)
    os.makedirs(os.path.join(out_path, "labels", "train"), exist_ok=True)
    os.makedirs(os.path.join(out_path, "labels", "val"), exist_ok=True)
    os.makedirs(os.path.join(out_path, "labels", "test"), exist_ok=True)

    for file in file_list:
        annotation_path = os.path.join(annotations_folder, file)
        original_images_total_count += 1

        # Only convert and copy when width and height are both non-zero
        root = ET.parse(annotation_path).getroot()
        size = root.find("size")
        width = int(size.find("width").text)
        height = int(size.find("height").text)

        if width == 0 or height == 0:
            count_zero_dimensions += 1
            files_with_zero_dimensions.append(file)
            print(f"Warning: {file} contains zero width or height...Excluding this data")
            continue

        for split, files in [("train", train_files), ("val", val_files), ("test", test_files)]:
            if file in files:
                output_folder_images = os.path.join(out_path, "images", split)
                output_folder_labels = os.path.join(out_path, "labels", split)
                convert_xml_to_txt(annotation_path, output_folder_labels, split, class_mapping)
                copyfile(os.path.join(original_images_folder, file.replace(".xml", ".jpg")),
                         os.path.join(output_folder_images, os.path.basename(file.replace(".xml", ".jpg"))))

    print(f'Total occurrences of zero width or height: {count_zero_dimensions}')
    print(f'Files with zero width or height: {tuple(files_with_zero_dimensions)}')
    print(f'Total number of images in the original dataset: {original_images_total_count}')
    print(f'Total number of images in the dataset after excluding zero width or height: {total_images_count}')
    print(f'Number of images in the training set: {train_images_count}')
    print(f'Number of images in the validation set: {val_images_count}')
    print(f'Number of images in the test set: {test_images_count}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Process some images and annotations.')
    # 把标签和代表标签的数字以字典形式对应好
    parser.add_argument('--class_mapping', type=dict, default={'test':0},
                        help='Mapping of class names to indices')
    # default改为只存放图片的地址
    parser.add_argument('--original_images_folder', type=str,default='./voc/JPEGImages',
                        help='Path to the folder containing original images')
    # default改为只存放xml的地址
    parser.add_argument('--annotations_folder', type=str,default='./voc/Annotations',
                        help='Path to the folder containing annotations')
    # 指定一个路径存放转化后的训练集，验证集，测试集，最后一级地址可以不用建文件夹会自动生成
    parser.add_argument('--out_path', type=str,default='./yolo',
                        help='Output path for processed images and labels')
    # 指定训练集，验证集，测试集所占比例
    parser.add_argument('--train_ratio', type=float, default=0.8, help='Ratio of images for training set')
    parser.add_argument('--val_ratio', type=float, default=0.1, help='Ratio of images for validation set')
    parser.add_argument('--test_ratio', type=float, default=0.1, help='Ratio of images for test set')

    args = parser.parse_args()

    original_images_folder = args.original_images_folder
    annotations_folder = args.annotations_folder
    out_path = args.out_path
    class_mapping = args.class_mapping

    split_dataset(original_images_folder, annotations_folder, out_path, class_mapping,
                  train_ratio=args.train_ratio, val_ratio=args.val_ratio, test_ratio=args.test_ratio)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
<