[Detailed notes] Getting YOLOv5 running on the RV1126

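In the previous post, the ONNX model was converted to an RKNN model:

Converting a YOLOv5 ONNX model to an RKNN model

This post covers running the RKNN model on the RV1126 development board.

1. Run inference with the RKNN model on the PC to evaluate its accuracy

This step runs in the Docker environment described in the post linked above.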

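2. Precompile the model

After the first step you will notice that running the RKNN model is slow, so the model should be precompiled (precompilation mainly shortens the time the board needs to load and initialize the model). Precompilation goes through the EASY EAI Nano board's environment, so the adb connection between the board and Ubuntu must stay stable.

However, the Ubuntu host and the Docker container compete for the adb device, so stop the adb server on the Ubuntu host and install the adb package inside the Docker container with apt-get.

Stop the adb server on the Ubuntu host: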

adb kill-server

Install the adb package inside the Docker container:

apt-get install adb

adb start-server

adb devices
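
Before precompiling, it can help to confirm from a script that the board is actually visible to adb inside the container. The sketch below is only an illustration: the helper names `parse_adb_devices` and `board_is_connected` are hypothetical, and it assumes the usual `adb devices` output, where each ready device appears as a serial number followed by the state `device`.

```python
import subprocess


def parse_adb_devices(output):
    """Return the serial numbers of devices reported in the 'device' state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Header and daemon status lines never have 'device' as their second field
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def board_is_connected():
    """Run `adb devices` and report whether at least one device is ready."""
    result = subprocess.run(["adb", "devices"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.returncode == 0 and len(parse_adb_devices(result.stdout)) > 0
```

Calling `board_is_connected()` before `python precompile_rknn.py` gives an early, readable failure instead of an error from the RKNN runtime.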

Run the precompile_rknn.py script to precompile the model:

python precompile_rknn.py

The result is shown in the figure below; the precompiled model yolov5_coco_rv1126_pre.rknn is generated.

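3. Build the inference code

Set up the development (cross-compilation) environment:
Netdisk link: Baidu Netdisk

Extraction code: i1ii

Remember to edit the following in run.sh:

VOL_SRC=<the directory where you stored the files above>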

--user=root

Then run the run script on the Ubuntu PC to enter the EASY-EAI build environment:

cd ~/develop_environment
./run.sh

This maps the Ubuntu host into the Docker container, and the Docker container onto the development board.

Build the inference code:
Baidu Netdisk link (extraction code: 1jfb).

After extracting the archive, build it inside the cross-compilation environment:

./build.sh

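Note:

* The dependency libraries are deployed on the board, so the adb connection must stay up throughout cross-compilation.

Copy the build output to the development board

In the cross-compilation environment, copy the build output and the precompiled model to /mnt/userdata, which is mapped to the development board: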

cp yolov5_detect_demo_release/ /mnt/userdata/ -rf

Press Ctrl+Shift+T to open a new terminal tab, then run adb shell to enter the board's runtime environment:

adb shell

On the board, change to the directory where the demo was uploaded:

cd /userdata/yolov5_detect_demo_release/

Before running the demo, copy the test image and the precompiled model file into this directory.

Run the demo with:

./yolov5_detect_demo

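The result is shown in the figure below; the algorithm takes about 50 ms to run:

Exit the board environment and pull the result image back to the PC: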

exit
adb pull /userdata/yolov5_detect_demo_release/result.jpg .

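Compared with the ONNX model's results, the RKNN model is less sensitive to distant cars and inconspicuous people, but for nearby objects its detections are good and match the ONNX model.

The code is listed below.

PC-side inference code for the RKNN model (Python):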
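
The comparison above is visual. One way to put a number on it is to match the two models' detections by class and IoU and count how many reference boxes the other model also found. The sketch below is an illustration under assumptions: the function names, the `(class_id, x1, y1, x2, y2)` tuple layout, and the 0.5 IoU threshold are all choices made here, not part of the original workflow.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(reference, candidate, iou_thresh=0.5):
    """Return (matched, missed) counts for the reference detections.

    Each detection is (class_id, x1, y1, x2, y2). A reference box is matched
    when an unused candidate of the same class overlaps it with
    IoU >= iou_thresh; each candidate is used at most once.
    """
    unused = list(candidate)
    matched = 0
    for ref in reference:
        best, best_iou = None, iou_thresh
        for cand in unused:
            if cand[0] != ref[0]:
                continue
            score = iou(ref[1:], cand[1:])
            if score >= best_iou:
                best, best_iou = cand, score
        if best is not None:
            unused.remove(best)
            matched += 1
    return matched, len(reference) - matched
```

With the ONNX detections as `reference` and the RKNN detections as `candidate`, the second returned value counts the objects the quantized model missed.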


import os
import urllib
import traceback
import time
import sys
import numpy as np
import cv2
import random
from rknn.api import RKNN


RKNN_MODEL = 'yolov5_coco_rv1126.rknn'
IMG_PATH = './test.jpg'
DATASET = './dataset.txt'


BOX_THRESH = 0.25
NMS_THRESH = 0.6
IMG_SIZE = 640


CLASSES = ("person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat", "traffic light",
           "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant",
           "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite",
           "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife",
           "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa",
           "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave",
           "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush")



def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def xywh2xyxy(x):
    # Convert [x, y, w, h] to [x1, y1, x2, y2]
    y = np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y

def process(input, mask, anchors):

    anchors = [anchors[i] for i in mask]
    grid_h, grid_w = map(int, input.shape[0:2])

    box_confidence = sigmoid(input[..., 4])
    box_confidence = np.expand_dims(box_confidence, axis=-1)

    box_class_probs = sigmoid(input[..., 5:])

    box_xy = sigmoid(input[..., :2])*2 - 0.5

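    # Build the (grid_h, grid_w) column/row index grids; tiling by the other
    # dimension keeps this correct for non-square feature maps too
    col = np.tile(np.arange(0, grid_w), grid_h).reshape(-1, grid_w)
    row = np.tile(np.arange(0, grid_h).reshape(-1, 1), grid_w)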
    col = col.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    row = row.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    grid = np.concatenate((col, row), axis=-1)
    box_xy += grid
    box_xy *= int(IMG_SIZE/grid_h)

    box_wh = pow(sigmoid(input[..., 2:4])*2, 2)
    box_wh = box_wh * anchors

    box = np.concatenate((box_xy, box_wh), axis=-1)

    return box, box_confidence, box_class_probs

def filter_boxes(boxes, box_confidences, box_class_probs):
    """Filter boxes with box threshold. It's a bit different with origin yolov5 post process!

    # Arguments
        boxes: ndarray, boxes of objects.
        box_confidences: ndarray, confidences of objects.
        box_class_probs: ndarray, class_probs of objects.

    # Returns
        boxes: ndarray, filtered boxes.
        classes: ndarray, classes for boxes.
        scores: ndarray, scores for boxes.
    """
    box_scores = box_confidences * box_class_probs
    box_classes = np.argmax(box_class_probs, axis=-1)
    box_class_scores = np.max(box_scores, axis=-1)
    pos = np.where(box_confidences[...,0] >= BOX_THRESH)


    boxes = boxes[pos]
    classes = box_classes[pos]
    scores = box_class_scores[pos]

    return boxes, classes, scores

def nms_boxes(boxes, scores):
    """Suppress non-maximal boxes.

    # Arguments
        boxes: ndarray, boxes of objects.
        scores: ndarray, scores of objects.

    # Returns
        keep: ndarray, index of effective boxes.
    """
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]

    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])

        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1

        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep


def yolov5_post_process(input_data):
    masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
              [59, 119], [116, 90], [156, 198], [373, 326]]

    boxes, classes, scores = [], [], []
    for input,mask in zip(input_data, masks):
        b, c, s = process(input, mask, anchors)
        b, c, s = filter_boxes(b, c, s)
        boxes.append(b)
        classes.append(c)
        scores.append(s)

    boxes = np.concatenate(boxes)
    boxes = xywh2xyxy(boxes)
    classes = np.concatenate(classes)
    scores = np.concatenate(scores)

    nboxes, nclasses, nscores = [], [], []
    for c in set(classes):
        inds = np.where(classes == c)
        b = boxes[inds]
        c = classes[inds]
        s = scores[inds]

        keep = nms_boxes(b, s)

        nboxes.append(b[keep])
        nclasses.append(c[keep])
        nscores.append(s[keep])

    if not nclasses and not nscores:
        return None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)

    return boxes, classes, scores

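def scale_coords(x1, y1, x2, y2, dst_width, dst_height):
    # Map a box from the IMG_SIZE x IMG_SIZE letterboxed input back to the original image
    gain = IMG_SIZE / max(dst_width, dst_height)

    if dst_width > dst_height:
        # Landscape image: letterbox padded the top and bottom
        resized_height = dst_height * gain
        height_pading = (IMG_SIZE - resized_height)/2
        print("height_pading:", height_pading)
        y1 = (y1 - height_pading)
        y2 = (y2 - height_pading)
    else:
        # Portrait or square image: letterbox padded the left and right
        # (previously gain stayed 0 here and the division below failed)
        resized_width = dst_width * gain
        width_pading = (IMG_SIZE - resized_width)/2
        print("width_pading:", width_pading)
        x1 = (x1 - width_pading)
        x2 = (x2 - width_pading)

    print("gain:", gain)
    dst_x1 = int(x1 / gain)
    dst_y1 = int(y1 / gain)
    dst_x2 = int(x2 / gain)
    dst_y2 = int(y2 / gain)

    return dst_x1, dst_y1, dst_x2, dst_y2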

def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)

def draw(image, boxes, scores, classes):
    """Draw the boxes on the image.
    # Argument:
        image: original image.
        boxes: ndarray, boxes of objects.
        classes: ndarray, classes of objects.
        scores: ndarray, scores of objects.
        all_classes: all classes name.
    """
    for box, score, cl in zip(boxes, scores, classes):

        x1, y1, x2, y2 = box
        print('class: {}, score: {}'.format(CLASSES[cl], score))
        print('box coordinate x1,y1,x2,y2: [{}, {}, {}, {}]'.format(x1, y1, x2, y2))
        x1 = int(x1)
        y1 = int(y1)
        x2 = int(x2)
        y2 = int(y2)

        dst_x1, dst_y1, dst_x2, dst_y2 = scale_coords(x1, y1, x2, y2, image.shape[1], image.shape[0])
        #print("img.cols:", image.cols)

        plot_one_box((dst_x1, dst_y1, dst_x2, dst_y2), image, label='{0} {1:.2f}'.format(CLASSES[cl], score))


        '''
        cv2.rectangle(image, (dst_x1, dst_y1), (dst_x2, dst_y2), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (dst_x1, dst_y1 - 6),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.6, (0, 0, 255), 2)
        '''


def letterbox(im, new_shape=(640, 640), color=(0, 0, 0)):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)


if __name__ == '__main__':

    # Create RKNN object
    rknn = RKNN()

    print('--> Loading model')
    ret = rknn.load_rknn(RKNN_MODEL)
    if ret != 0:
        print('load rknn model failed')
        exit(ret)
    print('done')

    # init runtime environment
    print('--> Init runtime environment')
    ret = rknn.init_runtime()
    # ret = rknn.init_runtime('rv1126', device_id='1126')
    if ret != 0:
        print('Init runtime environment failed')
        exit(ret)
    print('done')

    # Set inputs
    img = cv2.imread(IMG_PATH)
    letter_img, ratio, (dw, dh) = letterbox(img, new_shape=(IMG_SIZE, IMG_SIZE))
    letter_img = cv2.cvtColor(letter_img, cv2.COLOR_BGR2RGB)


    # Inference
    print('--> Running model')
    outputs = rknn.inference(inputs=[letter_img])

    print('--> inference done')

    # post process
    input0_data = outputs[0]
    input1_data = outputs[1]
    input2_data = outputs[2]

    input0_data = input0_data.reshape([3,-1]+list(input0_data.shape[-2:]))
    input1_data = input1_data.reshape([3,-1]+list(input1_data.shape[-2:]))
    input2_data = input2_data.reshape([3,-1]+list(input2_data.shape[-2:]))

    input_data = list()
    input_data.append(np.transpose(input0_data, (2, 3, 0, 1)))
    input_data.append(np.transpose(input1_data, (2, 3, 0, 1)))
    input_data.append(np.transpose(input2_data, (2, 3, 0, 1)))

    print('--> transpose done')

    boxes, classes, scores = yolov5_post_process(input_data)

    print('--> get result done')

    # img is still in OpenCV's BGR order, so draw on it and show it directly
    if boxes is not None:
        draw(img, boxes, scores, classes)

    cv2.imwrite('./result.jpg', img)
    cv2.imshow("post process result", img)
    cv2.waitKeyEx(0)

    rknn.release()
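
The script maps boxes from the 640×640 letterboxed input back to the source image. As a worked illustration of that mapping, here is a minimal sketch that handles both landscape and portrait images; the function names `letterbox_params` and `to_original` are hypothetical and not part of the script above.

```python
def letterbox_params(src_w, src_h, dst=640):
    """Return (gain, pad_left, pad_top) for fitting a src image into a dst x dst square."""
    gain = min(dst / src_w, dst / src_h)
    new_w, new_h = int(round(src_w * gain)), int(round(src_h * gain))
    return gain, (dst - new_w) / 2, (dst - new_h) / 2


def to_original(box, src_w, src_h, dst=640):
    """Map (x1, y1, x2, y2) from letterboxed coordinates back to the source image."""
    gain, pad_left, pad_top = letterbox_params(src_w, src_h, dst)
    x1, y1, x2, y2 = box
    # Undo the padding first, then undo the scaling
    return (int((x1 - pad_left) / gain), int((y1 - pad_top) / gain),
            int((x2 - pad_left) / gain), int((y2 - pad_top) / gain))
```

For a 1280×720 image the gain is 0.5 and 140 pixels are padded above and below, so a box at (100, 200, 300, 400) in model coordinates lands at (200, 120, 600, 520) in the original image.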

RKNN model precompilation code (Python):

import sys
import random
import os
import argparse

from rknn.api import RKNN


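def precompile_file(fi, fo, target):
    print("precompile {} to {}".format(fi, fo))
    src_rknn_model_path = fi
    dst_rknn_model_path = fo

    rknn = RKNN(verbose=True)
    # Each call returns 0 on success; skip the remaining steps after a failure
    ret = rknn.load_rknn(src_rknn_model_path)
    if ret == 0:
        # rknn2precompile=True needs the board connected over adb
        ret = rknn.init_runtime(rknn2precompile=True, target=target)
    if ret == 0:
        ret = rknn.export_rknn_precompile_model(export_path=dst_rknn_model_path)
    rknn.release()
    return ret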


def precompile_dir(d, out_dir, target):
    """
    precompile every model in the directory <d> into <out_dir>
    """
    file_list = os.listdir(d)
    file_count = len(file_list)
    for i in range(file_count):
        f = os.path.join(d, file_list[i])
        target_file_name = file_list[i]
        neof = os.path.join(out_dir, target_file_name)
        precompile_file(f, neof, target)
        print('Progress:%d/%d' % (i + 1, file_count))
    print('Directory <%s> has been precompiled.' % (d))


if __name__ == '__main__':

    precompile_file('./yolov5_coco_rv1126.rknn','./yolov5_coco_rv1126_pre.rknn', 'rv1126')

RKNN model C++ inference code: yolov5_detect.h

#ifndef _YOLOV5_DETECT_H_
#define _YOLOV5_DETECT_H_

#include "yolov5_detect_postprocess.h"
#include "rknn_api.h"
#include <opencv2/opencv.hpp>




/* 
 * Initialize the COCO detector
 * ctx: pointer to the rknn_context handle to initialize
 * path: input parameter, path to the model file
 */
int coco_detect_init(rknn_context *ctx, const char * path);


/* 
 * Run COCO detection
 * ctx: input parameter, rknn_context handle
 * input_image: input parameter, image data (cv::Mat is the OpenCV type)
 * detect_result_group: output parameter, receives the detected boxes
 */
int coco_detect_run(rknn_context ctx, cv::Mat input_image, coco_detect_result_group_t *detect_result_group);


/* 
 * Release the COCO detector
 * ctx: input parameter, rknn_context handle
 */
int coco_detect_release(rknn_context ctx);




#endif

yolov5_detect.cpp

#include <iostream>
#include <fstream>
#include <vector>
#include <cstdint>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <unistd.h>
#include <time.h>
#include <string.h>
#include <fcntl.h>
#include <opencv2/opencv.hpp>
#include "yolov5_detect.h"
#include "rknn_api.h"

#include <sys/time.h>

using namespace std;
using namespace cv;


//unsigned char *model;
//detection* dets;

static void printRKNNTensor(rknn_tensor_attr *attr)
{
    printf("index=%d name=%s n_dims=%d dims=[%d %d %d %d] n_elems=%d size=%d "
           "fmt=%d type=%d qnt_type=%d fl=%d zp=%d scale=%f\n",
           attr->index, attr->name, attr->n_dims, attr->dims[3], attr->dims[2],
           attr->dims[1], attr->dims[0], attr->n_elems, attr->size, 0, attr->type,
           attr->qnt_type, attr->fl, attr->zp, attr->scale);
}


// Resize and pad the image so that it matches the model's input size
static int letter_box(cv::Mat input_image, cv::Mat *output_image, int model_input_size)
{

    // Compute the scale ratio
    int input_width, input_height;

    input_width = input_image.cols;
    input_height = input_image.rows;
    float ratio;
    ratio = min((float)model_input_size / input_width, (float)model_input_size / input_height); // take the smaller ratio so the whole image fits


    // Width and height after scaling
    int new_width, new_height;
    new_width = round(ratio * input_width );
    new_height = round(ratio * input_height);


    // Compute the padding
    // After scaling, at least one side already matches the model input; the other side is padded.
    // When the padding amount is odd, the extra pixel goes to the bottom or right.
    int height_padding = model_input_size - new_height;
    int width_padding = model_input_size - new_width;
    int top = height_padding / 2;
    int bottom = height_padding - top;
    int left = width_padding / 2;
    int right = width_padding - left;


    // Scale the image
    cv::Mat resize_img;
    cv::resize(input_image, resize_img, cv::Size(new_width, new_height));

    // Pad with black borders
    cv::copyMakeBorder(resize_img, *output_image, top, bottom, left, right, cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));

    return 0;
}


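// The model is stored as a binary file; load the whole file into memory
int coco_detect_init(rknn_context *ctx, const char * path)
{
    int ret;

    // Load model
    FILE *fp = fopen(path, "rb"); // open the model file at the given path
    if(fp == NULL)
    {
        printf("fopen %s fail!\n", path);
        return -1;
    }
    fseek(fp, 0, SEEK_END);   // SEEK_END is the end of the file; move the file pointer there
    long model_len = ftell(fp);   // the file pointer's offset is the file length
    if(model_len <= 0)
    {
        printf("ftell %s fail!\n", path);
        fclose(fp);
        return -1;
    }
    unsigned char *model_data = (unsigned char*)malloc(model_len); // buffer as large as the model file
    if(model_data == NULL)
    {
        printf("malloc fail!\n");
        fclose(fp);
        return -1;
    }

    fseek(fp, 0, SEEK_SET);   // SEEK_SET is the start of the file; rewind the file pointer
    if((size_t)model_len != fread(model_data, 1, model_len, fp)) // read the whole file; a short read is an error
    {
        printf("fread %s fail!\n", path);
        free(model_data); // release the buffer
        fclose(fp);       // and close the file on the error path too
        return -1;
    }
    fclose(fp); // close the file

    //init
    ret = rknn_init(ctx, model_data, model_len, RKNN_FLAG_PRIOR_MEDIUM); // initialize the RKNN model context
    free(model_data); // the buffer is no longer needed, whether or not rknn_init succeeded
    if(ret < 0)
    {
        printf("rknn_init fail! ret=%d\n", ret);
        return -1;
    }

    return 0;
}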


// Map the box coordinates back onto the original image
static int scale_coords(coco_detect_result_group_t *detect_result_group, int img_width, int img_height, int model_size)
{
    for (int i = 0; i < detect_result_group->count; i++)
    {
        coco_detect_result_t *det_result = &(detect_result_group->results[i]);


        int x1 = det_result->box.left;
        int y1 = det_result->box.top;
        int x2 = det_result->box.right;
        int y2 = det_result->box.bottom;


        if( img_width >= img_height )
        {
            int image_max_len = img_width;
            float gain;
            gain = (float)model_size / image_max_len;
            int resized_height = img_height * gain;
            int height_pading = (model_size - resized_height)/2;
            y1 = (y1 - height_pading);
            y2 = (y2 - height_pading);
            x1 = int(x1 / gain);
            y1 = int(y1 / gain);
            x2 = int(x2 / gain);
            y2 = int(y2 / gain);

            det_result->box.left = x1;
            det_result->box.top = y1;
            det_result->box.right = x2;
            det_result->box.bottom = y2;
        }
        else
        {
            int image_max_len = img_height;
            float gain;
            gain = (float)model_size / image_max_len;
            int resized_width = img_width * gain;
            int width_pading = (model_size - resized_width)/2;
            x1 = (x1 - width_pading);
            x2 = (x2 - width_pading);
            x1 = int(x1 / gain);
            y1 = int(y1 / gain);
            x2 = int(x2 / gain);
            y2 = int(y2 / gain);

            det_result->box.left = x1;
            det_result->box.top = y1;
            det_result->box.right = x2;
            det_result->box.bottom = y2;    
        }

    }

    return 0;
}


int coco_detect_run(rknn_context ctx, cv::Mat input_image, coco_detect_result_group_t *detect_result_group)
{
    int img_width = 0;
    int img_height = 0;
    int img_channel = 0;

    size_t actual_size = 0;
    const float vis_threshold = 0.1;
    const float nms_threshold = 0.5;
    const float conf_threshold = 0.2;
    int ret;

    img_width = input_image.cols;
    img_height = input_image.rows;


    // Query the SDK version
    rknn_sdk_version version;
    ret = rknn_query(ctx, RKNN_QUERY_SDK_VERSION, &version,
                     sizeof(rknn_sdk_version));
    if (ret < 0)
    {
        printf("rknn_query error ret=%d\n", ret);
        return -1;
    }
    /*
    printf("sdk version: %s driver version: %s\n", version.api_version,
           version.drv_version);
    */


    // Query and store the input and output tensor attributes
    rknn_input_output_num io_num; // receives the number of input and output tensors
    ret = rknn_query(ctx, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));  // query the number of model inputs and outputs
    if (ret < 0)
    {
        printf("rknn_query error ret=%d\n", ret);
        return -1;
    }
    /*
    printf("model input num: %d, output num: %d\n", io_num.n_input,
           io_num.n_output);
    */

    rknn_tensor_attr input_attrs[io_num.n_input]; // array holding the attributes of each input tensor
    memset(input_attrs, 0, sizeof(input_attrs)); // zero the whole array
    for (int i = 0; i < io_num.n_input; i++)
    {
        input_attrs[i].index = i;
        ret = rknn_query(ctx, RKNN_QUERY_INPUT_ATTR, &(input_attrs[i]),
                         sizeof(rknn_tensor_attr)); // query the attributes of input tensor i into the struct:
                                     // index, data format (fmt), data type (type), and dimensions (dims)
        if (ret < 0)
        {
            printf("rknn_query error ret=%d\n", ret);
            return -1;
        }
        //printRKNNTensor(&(input_attrs[i]));
    }

    rknn_tensor_attr output_attrs[io_num.n_output];
    memset(output_attrs, 0, sizeof(output_attrs));
    for (int i = 0; i < io_num.n_output; i++)
    {
        output_attrs[i].index = i;
        ret = rknn_query(ctx, RKNN_QUERY_OUTPUT_ATTR, &(output_attrs[i]), 
                         sizeof(rknn_tensor_attr));
        if (ret < 0)
        {
            printf("rknn_query error ret=%d\n", ret);
            return -1;
        }
        //printRKNNTensor(&(output_attrs[i]));
    }


    // Read the input height and width from the input tensor attributes
    int input_channel = 3;
    int input_width = 0;
    int input_height = 0;
    if (input_attrs[0].fmt == RKNN_TENSOR_NCHW) // check the input tensor layout
    {
        //printf("model is NCHW input fmt\n");
        input_width = input_attrs[0].dims[0];
        input_height = input_attrs[0].dims[1];
    }
    else
    {
        //printf("model is NHWC input fmt\n");
        input_width = input_attrs[0].dims[1];
        input_height = input_attrs[0].dims[2];
    }

    /*
    printf("model input height=%d, width=%d, channel=%d\n", height, width,
           channel);
    */

    /* Init input tensor */
    rknn_input inputs[1];
    memset(inputs, 0, sizeof(inputs));
    inputs[0].index = 0;
    inputs[0].type = RKNN_TENSOR_UINT8;
    inputs[0].size = input_width * input_height * input_channel;
    inputs[0].fmt = RKNN_TENSOR_NHWC;
    inputs[0].pass_through = 0;

    /* Init output tensor */
    rknn_output outputs[io_num.n_output];
    memset(outputs, 0, sizeof(outputs));

    for (int i = 0; i < io_num.n_output; i++)
    {
        outputs[i].want_float = 0; // keep the raw quantized output; do not convert it to float
    }

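    // Letterbox the input image to the input size the model expects
    cv::Mat letter_image;
    letter_box(input_image, &letter_image, input_width);
    inputs[0].buf = letter_image.data; // hand the preprocessed image data to inputs[0].buf

    // Run inference and fetch the model outputs
    ret = rknn_inputs_set(ctx, io_num.n_input, inputs); // bind the input tensors to ctx (the RKNN model context)
    if (ret < 0)
    {
        printf("rknn_inputs_set error ret=%d\n", ret);
        return -1;
    }
    ret = rknn_run(ctx, NULL); // run the forward pass on the bound inputs
    if (ret < 0)
    {
        printf("rknn_run error ret=%d\n", ret);
        return -1;
    }
    ret = rknn_outputs_get(ctx, io_num.n_output, outputs, NULL); // fetch the results; io_num.n_output is the number of output tensors, outputs is the output array
    if (ret < 0)
    {
        printf("rknn_outputs_get error ret=%d\n", ret);
        return -1;
    }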

    // Post process

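    // Collect the scale and zero point of each output tensor for post-processing.
    // Both come from model quantization and define how floats map onto the integer range:
    // the scale is the float step represented by one integer step,
    // and the zero point is the integer value that represents 0.0.
    std::vector<float> out_scales; // scale of each output tensor
    std::vector<uint8_t> out_zps; // zero point of each output tensor
    for (int i = 0; i < io_num.n_output; ++i)
    {
        out_scales.push_back(output_attrs[i].scale); // scale of output tensor i
        out_zps.push_back(output_attrs[i].zp); // zero point of output tensor i
    }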


    // Post-process the quantized outputs
    yolov5_post_process_u8((uint8_t *)outputs[0].buf, (uint8_t *)outputs[1].buf, (uint8_t *)outputs[2].buf, input_height, input_width,
                       conf_threshold, nms_threshold, out_zps, out_scales, detect_result_group);


    /*
    yolov5_post_process_fp((float *)outputs[0].buf, (float *)outputs[1].buf, (float *)outputs[2].buf, input_height, input_width,
                        conf_threshold, nms_threshold, &detect_result_group);
    */

    // Release the model output buffers
    rknn_outputs_release(ctx, io_num.n_output, outputs);

    // Scale the post-processed boxes back to the original image size
    scale_coords(detect_result_group, img_width, img_height, input_width);

    return 0;
}

int coco_detect_release(rknn_context ctx)
{
    rknn_destroy(ctx);
    return 0;
}
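
The comments in coco_detect_run describe the scale and zero point used by the quantized outputs. As a small worked illustration of that affine mapping, the sketch below shows both directions in Python. The function names are hypothetical; `quantize_f32` follows the same formula as `qnt_f32_to_affine` in yolov5_detect_postprocess.cpp, and `dequantize_u8` is simply its algebraic inverse.

```python
def dequantize_u8(q, zp, scale):
    """Map a quantized uint8 value back to a float: (q - zp) * scale."""
    return (float(q) - float(zp)) * scale


def quantize_f32(value, zp, scale):
    """Map a float to uint8: value / scale + zp, clipped to 0..255 and truncated."""
    q = value / scale + zp
    return int(min(255.0, max(0.0, q)))
```

With zp = 128 and scale = 0.25, the stored byte 130 represents (130 - 128) × 0.25 = 0.5, and values far outside the representable range saturate at 0 or 255.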

yolov5_detect_postprocess.h

#ifndef _YOLOV5_DETECT_POSTPROCESS_H_
#define _YOLOV5_DETECT_POSTPROCESS_H_

#include <stdint.h>
#include <vector>

#define COCO_NAME_MAX_SIZE 16
#define COCO_NUMB_MAX_SIZE 200
#define COCO_CLASS_NUM     80
#define COCO_PROP_BOX_SIZE     (5+COCO_CLASS_NUM)

typedef struct _COCO_BOX_RECT
{
    int left;
    int right;
    int top;
    int bottom;
} COCO_BOX_RECT;

typedef struct __coco_detect_result_t
{
    char name[COCO_NAME_MAX_SIZE];
    int class_index;
    COCO_BOX_RECT box;
    float prop;
} coco_detect_result_t;

typedef struct _detect_result_group_t
{
    int id;
    int count;
    coco_detect_result_t results[COCO_NUMB_MAX_SIZE];
} coco_detect_result_group_t;

int yolov5_post_process_u8(uint8_t *input0, uint8_t *input1, uint8_t *input2, int model_in_h, int model_in_w,
                 float conf_threshold, float nms_threshold,
                 std::vector<uint8_t> &qnt_zps, std::vector<float> &qnt_scales,
                 coco_detect_result_group_t *group);

int yolov5_post_process_fp(float *input0, float *input1, float *input2, int model_in_h, int model_in_w,
                 float conf_threshold, float nms_threshold, 
                 coco_detect_result_group_t *group);

#endif //_YOLOV5_DETECT_POSTPROCESS_H_
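
COCO_PROP_BOX_SIZE above is 5 + 80: for every anchor at every grid cell the model emits four box terms, one objectness term, and 80 class terms. The sketch below decodes one such vector with the same formulas the Python `process` function uses. It is an illustration only; the name `decode_cell` and its argument layout are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_cell(raw, col, row, stride, anchor):
    """Decode one raw (tx, ty, tw, th, obj, cls...) vector.

    Returns (cx, cy, w, h, objectness, class_id, class_prob) in model-input pixels.
    """
    cx = (sigmoid(raw[0]) * 2.0 - 0.5 + col) * stride
    cy = (sigmoid(raw[1]) * 2.0 - 0.5 + row) * stride
    w = (sigmoid(raw[2]) * 2.0) ** 2 * anchor[0]
    h = (sigmoid(raw[3]) * 2.0) ** 2 * anchor[1]
    obj = sigmoid(raw[4])
    class_scores = [sigmoid(v) for v in raw[5:]]
    class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
    return cx, cy, w, h, obj, class_id, class_scores[class_id]
```

For an all-zero vector at column 3, row 4 of the stride-8 map with anchor (10, 13), every sigmoid is 0.5, so the box centre is ((1.0 - 0.5 + 3) × 8, (1.0 - 0.5 + 4) × 8) = (28, 36) and the size equals the anchor, 10 × 13.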

yolov5_detect_postprocess.cpp

// Copyright (c) 2021 by Rockchip Electronics Co., Ltd. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include <sys/time.h>
#include <vector>
#include "yolov5_detect_postprocess.h"
#include <stdint.h>


static char labels[COCO_CLASS_NUM][30] = {"person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat", "traffic light",
           "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant",
           "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite",
           "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife",
           "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa",
           "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave",
           "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush"};

const int anchor0[6] = {10, 13, 16, 30, 33, 23};
const int anchor1[6] = {30, 61, 62, 45, 59, 119};
const int anchor2[6] = {116, 90, 156, 198, 373, 326};


// Clamp the float val to the range [min, max]
inline static int clamp(float val, int min, int max)
{
    return val > min ? (val < max ? val : max) : min;
}


// Compute the overlap (IoU) of two rectangles given by their corner coordinates.
// The min/max formulation is symmetric, so the two boxes need no particular ordering.
static float CalculateOverlap(float xmin0, float ymin0, float xmax0, float ymax0, float xmin1, float ymin1, float xmax1, float ymax1)
{
    float w = fmax(0.f, fmin(xmax0, xmax1) - fmax(xmin0, xmin1) + 1.0); // width of the overlap
    float h = fmax(0.f, fmin(ymax0, ymax1) - fmax(ymin0, ymin1) + 1.0); // height of the overlap
    float i = w * h; // area of the overlap, i.e. the intersection
    float u = (xmax0 - xmin0 + 1.0) * (ymax0 - ymin0 + 1.0) + (xmax1 - xmin1 + 1.0) * (ymax1 - ymin1 + 1.0) - i; // area of the union
    return u <= 0.f ? 0.f : (i / u); // intersection over union
}

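// Non-maximum suppression: walk the boxes in the order given by `order` and mark as -1
// every later box whose IoU with a kept box exceeds threshold.
// outputLocations stores each box as (xmin, ymin, w, h).
static int nms(int validCount, std::vector<float> &outputLocations, std::vector<int> &order, float threshold)
{
    for (int i = 0; i < validCount; ++i) // iterate over all boxes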
    {
        if (order[i] == -1)
        {
            continue;
        }
        int n = order[i]; // index of this box in outputLocations
        for (int j = i + 1; j < validCount; ++j) // compare against the remaining boxes, starting at i + 1
        {
            int m = order[j];
            if (m == -1)
            {
                continue;
            }

            // Current box
            float xmin0 = outputLocations[n * 4 + 0]; // xmin
            float ymin0 = outputLocations[n * 4 + 1]; // ymin
            float xmax0 = outputLocations[n * 4 + 0] + outputLocations[n * 4 + 2]; // xmin + w = xmax
            float ymax0 = outputLocations[n * 4 + 1] + outputLocations[n * 4 + 3]; // ymin + h = ymax

            // Box being compared against it
            float xmin1 = outputLocations[m * 4 + 0];
            float ymin1 = outputLocations[m * 4 + 1];
            float xmax1 = outputLocations[m * 4 + 0] + outputLocations[m * 4 + 2];
            float ymax1 = outputLocations[m * 4 + 1] + outputLocations[m * 4 + 3];

            float iou = CalculateOverlap(xmin0, ymin0, xmax0, ymax0, xmin1, ymin1, xmax1, ymax1); // 计算交并比

            if (iou > threshold) // 交并比大于阈值，即认为是同一个目标的检测框
            {
                order[j] = -1;
            }
        }
    }
    return 0;
}

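// Quicksort in descending order; `indices` is permuted together with `input`
static int quick_sort_indice_inverse(
    std::vector<float> &input, // values to sort
    int left, // left and right bounds (inclusive) of the range to sort
    int right,
    std::vector<int> &indices) // index vector kept in step with input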
{
    float key;
    int key_index;
    int low = left;
    int high = right;
    if (left < right)
    {
        key_index = indices[left];
        key = input[left];
        while (low < high)
        {
            while (low < high && input[high] <= key)
            {
                high--;
            }
            input[low] = input[high];
            indices[low] = indices[high];
            while (low < high && input[low] >= key)
            {
                low++;
            }
            input[high] = input[low];
            indices[high] = indices[low];
        }
        input[low] = key;
        indices[low] = key_index;
        quick_sort_indice_inverse(input, left, low - 1, indices);
        quick_sort_indice_inverse(input, low + 1, right, indices);
    }
    return low;
}

static float sigmoid(float x)
{
    return 1.0 / (1.0 + expf(-x));
}

static float unsigmoid(float y)
{
    return -1.0 * logf((1.0 / y) - 1.0);
}

inline static int32_t __clip(float val, float min, float max)
{
    float f = val <= min ? min : (val >= max ? max : val);
    return f;
}

// Quantize a float32 value to uint8_t using the affine parameters zero point (zp) and scale
static uint8_t qnt_f32_to_affine(float f32, uint8_t zp, float scale)
{
    float dst_val = (f32 / scale) + zp;
    uint8_t res = (uint8_t)__clip(dst_val, 0, 255); // clip the result to 0~255 and convert it to uint8_t
    return res;
}

// Dequantize a uint8_t value (qnt) back to float32
static float deqnt_affine_to_f32(uint8_t qnt, uint8_t zp, float scale)
{
    return ((float)qnt - (float)zp) * scale;
}


// Extract boxes from one output tensor, decode and filter them, and append the results to the output vectors
static int process_u8(uint8_t *input, int *anchor, int grid_h, int grid_w, int height, int width, int stride,
                   std::vector<float> &boxes, std::vector<float> &boxScores, std::vector<int> &classId,
                   float threshold, uint8_t zp, float scale)
{

    int validCount = 0;
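    int grid_len = grid_h * grid_w; // number of cells in one grid plane (height * width)
    float thres = unsigmoid(threshold); // confidence threshold mapped back to a raw (pre-sigmoid) value
    uint8_t thres_u8 = qnt_f32_to_affine(thres, zp, scale); // that threshold quantized to uint8_t
    for (int a = 0; a < 3; a++) // three anchor boxes per grid cell
    {
        for (int i = 0; i < grid_h; i++) // i: grid row index (y direction)
        {
            for (int j = 0; j < grid_w; j++) // j: grid column index (x direction)
            {
                uint8_t box_confidence = input[(COCO_PROP_BOX_SIZE * a + 4) * grid_len + i * grid_w + j];
                // COCO_PROP_BOX_SIZE is a constant: the number of attributes per box, presumably 85 here.
                // The index arithmetic treats the output as planar data: every attribute of every anchor
                // occupies its own plane of grid_len values. (COCO_PROP_BOX_SIZE * a + 4) selects the
                // confidence plane of anchor a (the +4 skips the four box values), and i * grid_w + j
                // selects the cell inside that plane.
                // This layout differs from the ONNX post-processing, which is why the line is confusing at first.

                if (box_confidence >= thres_u8) // confidence is at or above the threshold
                {
                    int offset = (COCO_PROP_BOX_SIZE * a) * grid_len + i * grid_w + j; // offset of this anchor's first attribute for this cell
                    uint8_t *in_ptr = input + offset; // pointer to that attribute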

                    // Decode the box centre (x, y) and size (w, h)
                    float box_x = sigmoid(deqnt_affine_to_f32(*in_ptr, zp, scale)) * 2.0 - 0.5;
                    float box_y = sigmoid(deqnt_affine_to_f32(in_ptr[grid_len], zp, scale)) * 2.0 - 0.5;
                    float box_w = sigmoid(deqnt_affine_to_f32(in_ptr[2 * grid_len], zp, scale)) * 2.0;
                    float box_h = sigmoid(deqnt_affine_to_f32(in_ptr[3 * grid_len], zp, scale)) * 2.0;

                    // Add the cell position and multiply by the stride to scale x and y
                    box_x = (box_x + j) * (float)stride;
                    box_y = (box_y + i) * (float)stride;

                    // Square the size terms and scale them by the anchor width and height
                    box_w = box_w * box_w * (float)anchor[a * 2];
                    box_h = box_h * box_h * (float)anchor[a * 2 + 1];

                    // Convert the centre to the top-left corner and store (x, y, w, h) in boxes
                    box_x -= (box_w / 2.0);
                    box_y -= (box_h / 2.0);
                    boxes.push_back(box_x);
                    boxes.push_back(box_y);
                    boxes.push_back(box_w);
                    boxes.push_back(box_h);

                    // Find the highest class score and the corresponding class ID
                    uint8_t maxClassProbs = in_ptr[5 * grid_len];
                    int maxClassId = 0;
                    for (int k = 1; k < COCO_CLASS_NUM; ++k)
                    {
                        uint8_t prob = in_ptr[(5 + k) * grid_len];
                        if (prob > maxClassProbs)
                        {
                            maxClassId = k;
                            maxClassProbs = prob;
                        }
                    }

                    // Dequantize box_confidence and the best class score, then apply sigmoid to get float probabilities
                    float box_conf_f32 = sigmoid(deqnt_affine_to_f32(box_confidence, zp, scale));
                    float class_prob_f32 = sigmoid(deqnt_affine_to_f32(maxClassProbs, zp, scale));

                    // Compute the box score (confidence * class probability) and store it in boxScores
                    boxScores.push_back(box_conf_f32* class_prob_f32);

                    // Store the ID of the best class in classId
                    classId.push_back(maxClassId);

                    // Increment the number of valid boxes
                    validCount++;
                }
            }
        }
    }
    return validCount;
}

static int process_fp(float *input, int *anchor, int grid_h, int grid_w, int height, int width, int stride,
                   std::vector<float> &boxes, std::vector<float> &boxScores, std::vector<int> &classId,
                   float threshold)
{

    int validCount = 0;
    int grid_len = grid_h * grid_w;
    float thres_sigmoid = unsigmoid(threshold);
    for (int a = 0; a < 3; a++)
    {
        for (int i = 0; i < grid_h; i++)
        {
            for (int j = 0; j < grid_w; j++)
            {
                float box_confidence = input[(COCO_PROP_BOX_SIZE * a + 4) * grid_len + i * grid_w + j];
                if (box_confidence >= thres_sigmoid)
                {
                    int offset = (COCO_PROP_BOX_SIZE * a) * grid_len + i * grid_w + j;
                    float *in_ptr = input + offset;
                    float box_x = sigmoid(*in_ptr) * 2.0 - 0.5;
                    float box_y = sigmoid(in_ptr[grid_len]) * 2.0 - 0.5;
                    float box_w = sigmoid(in_ptr[2 * grid_len]) * 2.0;
                    float box_h = sigmoid(in_ptr[3 * grid_len]) * 2.0;
                    box_x = (box_x + j) * (float)stride;
                    box_y = (box_y + i) * (float)stride;
                    box_w = box_w * box_w * (float)anchor[a * 2];
                    box_h = box_h * box_h * (float)anchor[a * 2 + 1];
                    box_x -= (box_w / 2.0);
                    box_y -= (box_h / 2.0);
                    boxes.push_back(box_x);
                    boxes.push_back(box_y);
                    boxes.push_back(box_w);
                    boxes.push_back(box_h);

                    float maxClassProbs = in_ptr[5 * grid_len];
                    int maxClassId = 0;
                    for (int k = 1; k < COCO_CLASS_NUM; ++k)
                    {
                        float prob = in_ptr[(5 + k) * grid_len];
                        if (prob > maxClassProbs)
                        {
                            maxClassId = k;
                            maxClassProbs = prob;
                        }
                    }
                    float box_conf_f32 = sigmoid(box_confidence);
                    float class_prob_f32 = sigmoid(maxClassProbs);
                    boxScores.push_back(box_conf_f32* class_prob_f32);
                    classId.push_back(maxClassId);
                    validCount++;
                }
            }
        }
    }
    return validCount;
}

int yolov5_post_process_u8(uint8_t *input0, uint8_t *input1, uint8_t *input2, int model_in_h, int model_in_w,
                 float conf_threshold, float nms_threshold,
                 std::vector<uint8_t> &qnt_zps, std::vector<float> &qnt_scales,
                 coco_detect_result_group_t *group)
{
    static int init = -1;
    if (init == -1)
    {
    /*
        int ret = 0;
        ret = loadLabelName(LABEL_NALE_TXT_PATH, labels);
        if (ret < 0)
        {
            return -1;
        }
    */
        init = 0;
    }

    // Reset the output structure
    memset(group, 0, sizeof(coco_detect_result_group_t));

    // Containers for the decoded results
    std::vector<float> filterBoxes;
    std::vector<float> boxesScore;
    std::vector<int> classId;

    // Stride and grid size for the first input
    int stride0 = 8;
    int grid_h0 = model_in_h / stride0;
    int grid_w0 = model_in_w / stride0;
    int validCount0 = 0;

    // Process the first input: extract bounding boxes and class information
    validCount0 = process_u8(input0, (int *)anchor0, grid_h0, grid_w0, model_in_h, model_in_w,
                          stride0, filterBoxes, boxesScore, classId, conf_threshold, qnt_zps[0], qnt_scales[0]);


    // Stride and grid size for the second input
    int stride1 = 16;
    int grid_h1 = model_in_h / stride1;
    int grid_w1 = model_in_w / stride1;
    int validCount1 = 0;

    // Process the second input: extract bounding boxes and class information
    validCount1 = process_u8(input1, (int *)anchor1, grid_h1, grid_w1, model_in_h, model_in_w,
                          stride1, filterBoxes, boxesScore, classId, conf_threshold, qnt_zps[1], qnt_scales[1]);

    // Stride and grid size for the third input
    int stride2 = 32;
    int grid_h2 = model_in_h / stride2;
    int grid_w2 = model_in_w / stride2;
    int validCount2 = 0;

    // Process the third input: extract bounding boxes and class information
    validCount2 = process_u8(input2, (int *)anchor2, grid_h2, grid_w2, model_in_h, model_in_w,
                          stride2, filterBoxes, boxesScore, classId, conf_threshold, qnt_zps[2], qnt_scales[2]);

    // Total number of valid boxes
    int validCount = validCount0 + validCount1 + validCount2;

    // No object detected
    if (validCount <= 0)
    {
        return 0;
    }

    // Build the index array
    std::vector<int> indexArray;
    for (int i = 0; i < validCount; ++i)
    {
        indexArray.push_back(i);
    }


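    // Sort the scores in descending order and permute the index array accordingly
    quick_sort_indice_inverse(boxesScore, 0, validCount - 1, indexArray);

    // Run non-maximum suppression to remove redundant boxes
    nms(validCount, filterBoxes, indexArray, nms_threshold);

    int last_count = 0;
    group->count = 0;
    /* Collect the valid detection boxes */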
    for (int i = 0; i < validCount; ++i)
    {

        if (indexArray[i] == -1 || boxesScore[i] < conf_threshold || last_count >= COCO_NUMB_MAX_SIZE)
        {
            continue;
        }
        int n = indexArray[i];

        // Compute the corner coordinates of each box
        float x1 = filterBoxes[n * 4 + 0];
        float y1 = filterBoxes[n * 4 + 1];
        float x2 = x1 + filterBoxes[n * 4 + 2];
        float y2 = y1 + filterBoxes[n * 4 + 3];
        int id = classId[n];

    /*
        group->results[last_count].box.left = (int)((clamp(x1, 0, model_in_w) - w_offset) / resize_scale);
        group->results[last_count].box.top = (int)((clamp(y1, 0, model_in_h) - h_offset) / resize_scale);
        group->results[last_count].box.right = (int)((clamp(x2, 0, model_in_w) - w_offset) / resize_scale);
        group->results[last_count].box.bottom = (int)((clamp(y2, 0, model_in_h)  - h_offset) / resize_scale);
    */

        // Fill in the box coordinates, score and class information for this detection
        group->results[last_count].box.left = (int) clamp(x1, 0, model_in_w);
        group->results[last_count].box.top = (int) clamp(y1, 0, model_in_h);
        group->results[last_count].box.right = (int) clamp(x2, 0, model_in_w);
        group->results[last_count].box.bottom = (int) clamp(y2, 0, model_in_h);

        group->results[last_count].prop = boxesScore[i];
        group->results[last_count].class_index = id;
        char *label = labels[id];
        strncpy(group->results[last_count].name, label, COCO_NAME_MAX_SIZE);

        // printf("result %2d: (%4d, %4d, %4d, %4d), %s\n", i, group->results[last_count].box.left, group->results[last_count].box.top,
        //        group->results[last_count].box.right, group->results[last_count].box.bottom, label);
        last_count++;
    }

    // Record the number of detection results
    group->count = last_count;

    return 0;
}


int yolov5_post_process_fp(float *input0, float *input1, float *input2, int model_in_h, int model_in_w,
                 float conf_threshold, float nms_threshold,
                 coco_detect_result_group_t *group)
{
    static int init = -1;
    if (init == -1)
    {
    /*
        int ret = 0;
        ret = loadLabelName(LABEL_NALE_TXT_PATH, labels);
        if (ret < 0)
        {
            return -1;
        }
    */

        init = 0;
    }
    memset(group, 0, sizeof(coco_detect_result_group_t));

    std::vector<float> filterBoxes;
    std::vector<float> boxesScore;
    std::vector<int> classId;
    int stride0 = 8;
    int grid_h0 = model_in_h / stride0;
    int grid_w0 = model_in_w / stride0;
    int validCount0 = 0;
    validCount0 = process_fp(input0, (int *)anchor0, grid_h0, grid_w0, model_in_h, model_in_w,
                          stride0, filterBoxes, boxesScore, classId, conf_threshold);

    int stride1 = 16;
    int grid_h1 = model_in_h / stride1;
    int grid_w1 = model_in_w / stride1;
    int validCount1 = 0;
    validCount1 = process_fp(input1, (int *)anchor1, grid_h1, grid_w1, model_in_h, model_in_w,
                          stride1, filterBoxes, boxesScore, classId, conf_threshold);

    int stride2 = 32;
    int grid_h2 = model_in_h / stride2;
    int grid_w2 = model_in_w / stride2;
    int validCount2 = 0;
    validCount2 = process_fp(input2, (int *)anchor2, grid_h2, grid_w2, model_in_h, model_in_w,
                          stride2, filterBoxes, boxesScore, classId, conf_threshold);

    int validCount = validCount0 + validCount1 + validCount2;
    // no object detected
    if (validCount <= 0)
    {
        return 0;
    }

    std::vector<int> indexArray;
    for (int i = 0; i < validCount; ++i)
    {
        indexArray.push_back(i);
    }

    quick_sort_indice_inverse(boxesScore, 0, validCount - 1, indexArray);

    nms(validCount, filterBoxes, indexArray, nms_threshold);

    int last_count = 0;
    group->count = 0;
    /* collect the valid detection boxes */
    for (int i = 0; i < validCount; ++i)
    {

        if (indexArray[i] == -1 || boxesScore[i] < conf_threshold || last_count >= COCO_NUMB_MAX_SIZE)
        {
            continue;
        }
        int n = indexArray[i];

        float x1 = filterBoxes[n * 4 + 0];
        float y1 = filterBoxes[n * 4 + 1];
        float x2 = x1 + filterBoxes[n * 4 + 2];
        float y2 = y1 + filterBoxes[n * 4 + 3];
        int id = classId[n];

    /*
        group->results[last_count].box.left = (int)((clamp(x1, 0, model_in_w) - w_offset) / resize_scale);
        group->results[last_count].box.top = (int)((clamp(y1, 0, model_in_h) - h_offset) / resize_scale);
        group->results[last_count].box.right = (int)((clamp(x2, 0, model_in_w) - w_offset) / resize_scale);
        group->results[last_count].box.bottom = (int)((clamp(y2, 0, model_in_h)  - h_offset) / resize_scale);
    */
        group->results[last_count].box.left = (int) clamp(x1, 0, model_in_w);
        group->results[last_count].box.top = (int) clamp(y1, 0, model_in_h);
        group->results[last_count].box.right = (int) clamp(x2, 0, model_in_w);
        group->results[last_count].box.bottom = (int) clamp(y2, 0, model_in_h);

        group->results[last_count].prop = boxesScore[i];
        group->results[last_count].class_index = id;
        char *label = labels[id];
        strncpy(group->results[last_count].name, label, COCO_NAME_MAX_SIZE);

        // printf("result %2d: (%4d, %4d, %4d, %4d), %s\n", i, group->results[last_count].box.left, group->results[last_count].box.top,
        //        group->results[last_count].box.right, group->results[last_count].box.bottom, label);
        last_count++;
    }
    group->count = last_count;

    return 0;
}
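
Two details of `process_u8` are easy to misread: how the flat index picks a value out of the output tensor, and why the confidence threshold can be compared against raw `uint8_t` values. The sketch below restates both in isolation. It is not part of the demo: `kPropBoxSize`, `flat_index`, `quantize`, `dequantize` and `threshold_never_misses` are names made up for illustration, and 85 attributes per box is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed for this sketch: 4 box values + 1 confidence + 80 class scores.
constexpr int kPropBoxSize = 85;

// Flat index of attribute `attr` (0..84) of anchor `a` at grid row `i`, column `j`,
// for a planar layout where every (anchor, attribute) pair owns one grid_h * grid_w plane.
inline int flat_index(int a, int attr, int i, int j, int grid_h, int grid_w)
{
    const int plane = kPropBoxSize * a + attr;
    return plane * (grid_h * grid_w) + i * grid_w + j;
}

inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }
inline float unsigmoid(float y) { return -std::log(1.0f / y - 1.0f); }

// Same arithmetic as qnt_f32_to_affine: scale, shift, clip to [0, 255], truncate.
inline uint8_t quantize(float f, uint8_t zp, float scale)
{
    float v = f / scale + static_cast<float>(zp);
    v = v < 0.0f ? 0.0f : (v > 255.0f ? 255.0f : v);
    return static_cast<uint8_t>(v);
}

// Same arithmetic as deqnt_affine_to_f32.
inline float dequantize(uint8_t q, uint8_t zp, float scale)
{
    return (static_cast<float>(q) - static_cast<float>(zp)) * scale;
}

// Returns true if every quantized value whose probability reaches `conf`
// also passes the integer test `q >= thres_u8` used in process_u8.
inline bool threshold_never_misses(float conf, uint8_t zp, float scale)
{
    const uint8_t thres_u8 = quantize(unsigmoid(conf), zp, scale);
    for (int q = 0; q < 256; ++q)
    {
        const float p = sigmoid(dequantize(static_cast<uint8_t>(q), zp, scale));
        if (p >= conf && q < thres_u8)
        {
            return false;
        }
    }
    return true;
}
```

For an 80 x 80 grid, `flat_index(1, 4, 2, 3, 80, 80)` addresses the confidence of anchor 1 at row 2, column 3, and adding `grid_len` to an index moves to the next attribute of the same cell, which is what `in_ptr[k * grid_len]` does in the listing. Because sigmoid is monotonic, comparing raw quantized values against a quantized threshold avoids a dequantize-plus-sigmoid call for every cell. The quantized threshold is truncated, so the integer test can admit a value slightly below the float threshold; the later `boxesScore[i] < conf_threshold` check filters on the final score again.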

CMakeLists.txt

cmake_minimum_required(VERSION 2.8.4)

STRING(REGEX REPLACE ".*/(.*)" "\\1" CURRENT_FOLDER ${CMAKE_CURRENT_SOURCE_DIR} )
MESSAGE("current project: " ${CURRENT_FOLDER})

set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_CROSSCOMPILING TRUE)

cmake_host_system_information(RESULT arch_value QUERY OS_PLATFORM)

if(NOT "${arch_value}" STREQUAL "armv7l")
   include ($ENV{HOME}/configs/cross.cmake)
endif()

project(yolov5_detect_demo)

## Algorithm header files
set(sdk_inc include/)

## Algorithm source files
file(GLOB file_source lib/*.cpp *.cpp)
set(source ${file_source})

find_package(OpenCV REQUIRED)

add_executable(yolov5_detect_demo ${source})
target_include_directories(yolov5_detect_demo PUBLIC ${sdk_inc} ${OpenCV_INCLUDE_DIRS})
target_link_libraries(yolov5_detect_demo pthread rknn_api ${OpenCV_LIBS})
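
Going back to the post-processing listing: its commented-out lines subtract `w_offset` / `h_offset` and divide by `resize_scale` to map boxes from model-input coordinates back to the original image. The sketch below shows one way such values could be derived, assuming the image is letterboxed (resized with its aspect ratio preserved and padded symmetrically) before inference. The source does not say how the demo preprocesses the image, so the `Letterbox` struct and both helper functions are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical container for the three values named in the commented-out code.
struct Letterbox
{
    float resize_scale; // factor applied to the original image
    float w_offset;     // padding on the left, in model-input pixels
    float h_offset;     // padding on the top, in model-input pixels
};

// Derive the letterbox parameters for fitting img_w x img_h into model_w x model_h.
inline Letterbox make_letterbox(int img_w, int img_h, int model_w, int model_h)
{
    Letterbox lb;
    lb.resize_scale = std::min(static_cast<float>(model_w) / img_w,
                               static_cast<float>(model_h) / img_h);
    lb.w_offset = (model_w - img_w * lb.resize_scale) / 2.0f;
    lb.h_offset = (model_h - img_h * lb.resize_scale) / 2.0f;
    return lb;
}

// Map one coordinate: clamp to the model input, remove the padding, undo the scaling.
inline int to_image_coord(float v, int model_size, float offset, float resize_scale)
{
    const float clamped = std::min(std::max(v, 0.0f), static_cast<float>(model_size));
    return static_cast<int>((clamped - offset) / resize_scale);
}
```

With a 1280 x 720 image and a 640 x 640 model input, `make_letterbox` gives a scale of 0.5 with 140 pixels of padding above and below, so a model-space `y` of 500 maps to image row 720. A coordinate that falls inside the padding maps outside the image, so the result would still need clamping to the image bounds.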


技术小宅
