20230314 ax

Founder

2024-06-03 11:45:03

AX Platform Model Conversion: Guide & Examples

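This example shows how to compile, quantize, and run an algorithm model on Axera chips.
The relevant files are stored at the following locations:
SDK: stored on the company WeDrive: baiduwangpan
AX quantization user guide: neuwizard_doc_release.zip
Official AX samples: GitHub - AXERA-TECH/ax-samples: Samples code for world class Artificial Intelligence SoCs for computer vision applications. The repository also links to Baidu Cloud downloads of some sample models.
Local GitLab test samples: baiduwangpan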

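I. Using the model quantization tool
Newer SDK versions quantize models with PTQ (post-training quantization); older SDK versions use QAT (quantization-aware training). This section covers the PTQ workflow.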
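
To make the idea behind PTQ concrete: calibration data is run through the float model, the observed value range of each tensor is recorded, and a scale and zero point are derived from that range. The sketch below shows this for a single tensor with asymmetric uint8 quantization in plain Python. It is a generic illustration only, not the algorithm neuwizard uses, and the function names are hypothetical.

```python
def calibrate(samples):
    """Derive (scale, zero_point) for uint8 from calibration values.

    `samples` is a list of lists of floats, standing in for the values
    one tensor takes over the calibration images.
    """
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    lo, hi = min(lo, 0.0), max(hi, 0.0)      # the range must contain 0
    scale = (hi - lo) / 255.0 or 1.0         # avoid a zero scale for all-zero data
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to a uint8 code, clamping to [0, 255]."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map a uint8 code back to an approximate float."""
    return (q - zero_point) * scale


scale, zero_point = calibrate([[-1.0, 0.5], [0.2, 1.0]])
for value in (-1.0, 0.0, 0.3, 1.0):
    code = quantize(value, scale, zero_point)
    print(value, code, dequantize(code, scale, zero_point))
```

The round-trip error stays within about one quantization step, which is why the calibration set should cover the value ranges seen in deployment.
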
Use the docker image provided by AX. Unpack the toolchain package and run install.sh to import it:

$ tar -xvf axera_neuwizard_v0.5.34.2.tar.gz  # the toolchain package is sent to users by internal staff
$ cd axera_neuwizard_v0.5.34.2
$ ls .
axera_neuwizard_0.5.34.2.tgz  install.sh  VERSION
$ sudo ./install.sh

List the imported image:

$ sudo docker image ls
# Output:
REPOSITORY                               TAG           IMAGE ID       CREATED         SIZE
axera/neuwizard                          0.5.34.2      2e325090273c   7 weeks ago     2.87GB

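Start a container; -v maps a host directory into the container. Use the image tag reported by docker image ls (the command below was run with tag 0.6.0.8, which differs from the 0.5.34.2 listing above):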

docker run -it --name ax_tran2 -v /home/liangsa:/home/liangsa --gpus all --shm-size 64G axera/neuwizard:0.6.0.8 /bin/bash

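If the board has not been flashed with the latest SDK, one file has to be replaced:
libax_run_joint.so: overwrite the copy on the board with the one from the SDK. Use find to locate it on the board.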

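II. Quantizing a yolov5 model
1. Convert the .pt file to an ONNX file (use the training project that matches the model; versions may differ)
a. In the yolov5 project, change the forward function of the Detect class in models/yolo.py to the following: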

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
        return x[0], x[1], x[2]

b. Then, in the export_onnx function of export.py, change the output_names argument of the torch.onnx.export call:

Change output_names=['output'] to output_names=['output1', 'output2', 'output3']

c. Export with the --simplify option:

python export.py --weights model_file/yolov5s.pt --simplify --include onnx
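
With the modified Detect.forward, the exported model returns the three raw convolution feature maps instead of one decoded tensor. The sketch below computes the shapes these outputs should have, which can be checked against the exported ONNX file in a model viewer. It assumes a standard three-head yolov5 model with 3 anchors per scale and strides 8, 16, and 32; the function name is hypothetical.

```python
def raw_head_shapes(num_classes, height=640, width=640,
                    anchors_per_scale=3, strides=(8, 16, 32)):
    """Expected NCHW shapes of the raw detection-head outputs.

    Each anchor predicts x, y, w, h, objectness plus one score per class,
    so every head has anchors_per_scale * (num_classes + 5) channels.
    """
    channels = anchors_per_scale * (num_classes + 5)
    return [(1, channels, height // s, width // s) for s in strides]


# Single-class model (for example the fish detector) at the default 640x640 input.
for name, shape in zip(("output1", "output2", "output3"), raw_head_shapes(1)):
    print(name, shape)
```
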

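2. Prepare 1000 calibration images and pack them into a tar archive (sample them from the training set and cover as many classes and scenes as possible)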

tar cf fish_images.tar fish_images
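
If the training set is large, the sampling and packing can be scripted. The following is a minimal sketch using only the Python standard library; the function name, the .jpg-only pattern, and the fixed random seed are assumptions for illustration. Files are stored under a top-level folder named after the archive, mirroring what the tar command above produces.

```python
import random
import tarfile
from pathlib import Path


def pack_calibration_images(src_dir, tar_path, count=1000, seed=0):
    """Randomly pick up to `count` .jpg files under src_dir and pack them.

    Returns the number of images written. Files with the same name in
    different subfolders would collide inside the archive, so this assumes
    file names are unique.
    """
    images = sorted(Path(src_dir).rglob("*.jpg"))
    random.Random(seed).shuffle(images)      # reproducible random sample
    chosen = images[:count]
    top = Path(tar_path).stem                # e.g. "fish_images"
    with tarfile.open(str(tar_path), "w") as tar:
        for img in chosen:
            tar.add(str(img), arcname=f"{top}/{img.name}")
    return len(chosen)
```

A call such as pack_calibration_images("train/images", "fish_images.tar") would then replace the manual copy-and-tar step; the source folder name here is only an example.
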

3. Write the configuration file yolov5FishingDetectv3.prototxt

# Basic configuration: input and output
input_type: INPUT_TYPE_ONNX
output_type: OUTPUT_TYPE_JOINT

# Target hardware platform
target_hardware: TARGET_HARDWARE_AX630

# CPU backend selection; AXE is used by default
cpu_backend_settings {
  onnx_setting {
    mode: DISABLED
  }
  axe_setting {
    mode: ENABLED
    axe_param {
      optimize_slim_model: true
    }
  }
}

# Color space of the input images the ONNX model was trained with
src_input_tensors {
  color_space: TENSOR_COLOR_SPACE_RGB
}

# Color space of the input images after conversion to the on-board joint model
dst_input_tensors {
  color_space: TENSOR_COLOR_SPACE_RGB
}

# neuwizard tool configuration
neuwizard_conf {
  operator_conf {
    input_conf_items {
      attributes {
        input_modifications {
          affine_preprocess {
            slope: 1
            slope_divisor: 255
            bias: 0
          }
        }
      }
    }
  }
  dataset_conf_calibration {
    path: "/home/liangsa/mount_axion/toolchain_new/try_fish1/fish_images.tar"  # tar archive of dataset images, used to calibrate the model during compilation
    type: DATASET_TYPE_TAR         # dataset type: tar archive
    size: 256                      # number of images actually used for calibration during compilation
  }
  dataset_conf_error_measurement {
    path: "/home/liangsa/mount_axion/toolchain_new/try_fish1/fish_images.tar"  # used for the accuracy comparison during compilation
    type: DATASET_TYPE_TAR
    size: 4                        # number of images actually used for the accuracy comparison
    batch_size: 1
  }
}

dst_output_tensors {
  tensor_layout:NHWC    # output tensor layout; note that for yolov5 the output order differs from that of the ONNX model
}

# pulsar compiler configuration
pulsar_conf {
  virtual_npu: VIRTUAL_NPU_MODE_221  # how the NPU is allocated to the model
  batch_size: 1
  debug : false
}
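
The affine_preprocess block in the configuration above folds the input normalization into the compiled model. The sketch below shows the arithmetic under the assumption that the operation is y = x * slope / slope_divisor + bias, which is how the field names read; check the neuwizard documentation for the exact semantics. With slope 1, slope_divisor 255, and bias 0, a uint8 pixel is mapped into [0, 1], matching the division by 255 that yolov5 applies to its input images.

```python
def affine_preprocess(pixel, slope=1, slope_divisor=255, bias=0):
    """Assumed semantics of affine_preprocess: y = x * slope / slope_divisor + bias."""
    return pixel * slope / slope_divisor + bias


for pixel in (0, 128, 255):
    print(pixel, affine_preprocess(pixel))
```
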

4. Conversion command
The conversion produces the joint file and the output_config.prototxt configuration file (used by pulsar run for the accuracy comparison).

pulsar build --input yolov5FishingDetectv3.onnx --output yolov5FishingDetectv3.joint --config yolov5FishingDetectv3.prototxt --output_config output_config.prototxt

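III. Model accuracy comparison
The comparison evaluates the difference between the ONNX model and the joint model. Currently only one image can be compared at a time.
--config ----- the configuration file generated by the build in the previous step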

pulsar run yolov5FishingDetectv3.onnx yolov5FishingDetectv3.joint --input JPG/20210415_104301_ch3.jpg --config output_config.prototxt --output_gt gt

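Check the cosine-sim value. It is the cosine similarity between the ONNX and joint outputs, so 0.9991 means the two results differ by 0.0009; the closer to 1, the better.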
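
For reference, cosine similarity between two flattened output tensors can be computed as follows. This is a plain-Python sketch of the metric itself, not the code pulsar run executes, and the sample values are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


onnx_out = [0.10, 0.80, 0.05, 0.30]    # made-up float outputs
joint_out = [0.11, 0.79, 0.05, 0.31]   # made-up quantized-model outputs
print(round(cosine_similarity(onnx_out, joint_out), 4))
```

Because the metric depends only on direction, it is insensitive to a uniform scaling of the outputs, so it is worth also checking a few detections visually.
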

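IV. Model compilation (ask the back-end team to arrange a server/box that can be used for compiling and running)
Local GitLab test samples:

(i) Build the following files in examples: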
ax_yolov5s_steps_custom.cc
ax_yolov5s_steps_custom_batch.cc
ax_classification_steps_faceattr.cc
ax_track_demo.cc

(ii) Cross-compilation commands

$ mkdir build
$ cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/aarch64-linux-gnu.toolchain.cmake -DBSP_MSP_DIR=${AX630_SDK_XXX}/msp/out/ -DAXERA_TARGET_CHIP=ax630a ..
$ make -j8

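(iii) Using the camera/box
The tested demos are in the ax_run_demo folder.
1. Image detection (the configuration is described in detail in the next section)
① Single-image test
ax_yolov5s_custom tests a single image; adding the --repeat argument also lets you measure model latency.
--model ------- joint model
--image ------- image file
--config ------- configuration file; its fields are described under the batch image test
--size ------- h,w; defaults to 640,640. Change it when the model input size is different
--repeat ------- number of times to run the model; useful for measuring on-board latency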

./ax_yolov5s_custom --model ax_trans_AllObjectDetect_/yolov5AllObjectDetect.joint --image ax_trans_AllObjectDetect_/JPG/1666772661_2544_NM_YT.jpg --config ax_trans_AllObjectDetect_/AllObjectDetect.config.json

The generated result folder contains the images with the detection boxes drawn on them.

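② Batch image test
ax_yolov5s_custom_batch runs detection on a batch of images and outputs the images with detection boxes drawn on them plus a result file.
a. Prepare the images
Prepare file.txt containing the paths of the target images, one image file per line, for example:
ax_trans_AllObjectDetect_/JPG/1666772661_2544_NM_YT.jpg
ax_trans_AllObjectDetect_/JPG/1666773681_34632_NM_YT.jpg
You can generate file.txt with: ls -R ax_trans_AllObjectDetect_/JPG/*.jpg >file.txt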
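
Where a shell glob is inconvenient (for example when preparing the list on a workstation before copying it to the board), the same list can be written with a few lines of Python. This is a sketch with a hypothetical function name; it writes one path per line, sorted by name.

```python
from pathlib import Path


def write_image_list(image_dir, list_path="file.txt", pattern="*.jpg"):
    """Write the paths of all images matching `pattern` in image_dir, one per line."""
    paths = sorted(str(p) for p in Path(image_dir).glob(pattern))
    Path(list_path).write_text("".join(p + "\n" for p in paths))
    return paths
```

The paths are written relative to image_dir as given, so pass the same relative folder the demo will see on the board.
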

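b. Prepare the configuration file AllObjectDetect.config.json
prob_threshhold ------- confidence threshold for filtering detections
nms_threshhold ------- NMS filtering threshold
anchors ------- the anchors used by yolov5
class_names ------- the class names output by the model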
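
To show how the two thresholds interact, here is a minimal sketch of confidence filtering followed by greedy per-class NMS. It assumes the usual yolov5-style post-processing, where nms_threshhold is an IoU threshold; the C++ sample may differ in details such as class handling or comparison operators, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, prob_threshhold=0.35, nms_threshhold=0.45):
    """dets: list of (score, class_id, box). Returns the detections that survive."""
    # 1. Drop low-confidence detections, then process from highest score down.
    dets = sorted((d for d in dets if d[0] >= prob_threshhold),
                  key=lambda d: d[0], reverse=True)
    kept = []
    for det in dets:
        # 2. Keep a box only if it does not overlap too much with an
        #    already kept box of the same class.
        if all(det[1] != k[1] or iou(det[2], k[2]) <= nms_threshhold for k in kept):
            kept.append(det)
    return kept


detections = [
    (0.90, 0, (0, 0, 10, 10)),
    (0.80, 0, (1, 1, 11, 11)),     # heavy overlap with the first box
    (0.70, 0, (50, 50, 60, 60)),
    (0.20, 0, (0, 0, 10, 10)),     # below prob_threshhold
]
print(filter_detections(detections))
```

Raising prob_threshhold removes weak detections; lowering nms_threshhold suppresses more overlapping boxes.
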

c. Run
--model ------- joint model
--images_list ------- image list file
--config ------- configuration file
--size ------- h,w; defaults to 640,640. Change it when the model input size is different

./ax_yolov5s_custom_batch --model ax_trans_AllObjectDetect_/yolov5AllObjectDetect.joint --images_list ax_trans_AllObjectDetect_/file.txt --config ax_trans_AllObjectDetect_/AllObjectDetect.config.json
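The generated result folder contains the images with the detection boxes drawn on them.
The generated result.txt is a JSON-format file. For each processed image it records the image name and, for each detection box, its index, class, and top-left and bottom-right (x, y) coordinates.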

2. Image attribute classification
ax_classification_steps_faceattr performs face attribute classification and prints the classification result.
./ax_classification_steps_faceattr --model ax_trans_FaceAttr_/FaceAttr.joint --image ax_trans_FaceAttr_/JPG/2.jpg

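3. Object tracking
ax_track_demo combines all-object detection with bytetrack tracking.
The same image is fed in a loop to simulate a video stream, because reading a video file raised an error.
Run ./ax_track_demo; the results are saved in ./result.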

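V. On-board image testing
This section describes the single-image and batch-image test programs for yolov5s.
1. Get the anchors stored in the yolov5 model (make sure the values obtained are correct; how they are stored can differ between versions)
Methods 1 and 2 exist because different yolov5 versions store the anchors differently.
Method 1: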

import torch

# Run from the yolov5 project root so the pickled model classes can be imported.
weights = '/home/liangsa/project/wisdom_site/yolov5-zl/model_file/yolov5FishingDetectv3.pt'
model = torch.load(str(weights[0] if isinstance(weights, list) else weights), map_location='cpu')
model2 = model['ema' if model.get('ema') else 'model'].float().fuse().model.state_dict()
# The first value printed is anchors; the second is anchor_grid (anchors multiplied by the strides 8, 16, 32)
for k, v in model2.items():
    if 'anchor' in k:
        print(v.numpy().flatten().tolist())

Method 2:

import sys

import torch

sys.path.append("/home/liangsa/project/wisdom_site/yolov5-master")
weights = '/home/liangsa/project/ax_program/yolov5-master/model_file/Hardhat_Vest/best.pt'
model = torch.load(str(weights[0] if isinstance(weights, list) else weights), map_location='cpu')
# model1 = model['model'].state_dict()
# model2 = model['model'].model.state_dict()
model3 = model['ema' if model.get('ema') else 'model']  # EMA (exponential moving average) weights
model4 = model3.float().fuse().model.state_dict()
anchor_list = []
# v6.0 stores only one anchor tensor
for k, v in model4.items():
    if 'anchor' in k:
        # print(k)
        # print(v)
        print(v.numpy().flatten().tolist())
        anchor_list.append(v.numpy().flatten().tolist())
if len(anchor_list) == 1:
    # Only the grid-unit anchors are stored: scale each group of 6 values by its stride
    anchor_out = []
    for i, an in enumerate(anchor_list[0]):
        if i < 6:
            anchor_out.append(an * 8)
        elif i >= 6 and i < 12:
            anchor_out.append(an * 16)
        else:
            anchor_out.append(an * 32)
    print(anchor_out)
print(model['model'].yaml['anchors'])  # the values stored in the yaml are not necessarily the final ones

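Use the second printed value (the anchors in pixels) as the anchors value in the configuration file. With Method 2 on a version that stores only one tensor, that is the scaled anchor_out list.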
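
The relationship between the two printed lists is just a multiplication by the stride of each detection scale. The sketch below reproduces it without torch, assuming three scales with strides 8, 16, and 32 and the same number of anchor values per scale; the function name is hypothetical, and the sample input is the stock yolov5s anchors expressed in grid units.

```python
def anchors_to_pixels(anchors, strides=(8, 16, 32)):
    """Scale grid-unit anchors (flattened w,h pairs) to pixels, one stride per scale."""
    per_scale = len(anchors) // len(strides)
    return [a * strides[i // per_scale] for i, a in enumerate(anchors)]


grid_anchors = [
    1.25, 1.625, 2.0, 3.75, 4.125, 2.875,            # stride 8
    1.875, 3.8125, 3.875, 2.8125, 3.6875, 7.4375,    # stride 16
    3.625, 2.8125, 4.875, 6.1875, 11.65625, 10.1875, # stride 32
]
print(anchors_to_pixels(grid_anchors))
```
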
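2. Single-image test
ax_yolov5s_custom tests a single image; adding the --repeat argument also lets you measure model latency.
--model ------- joint model
--image ------ image file
--config ------- configuration file; its fields are described under the batch image test
--size ------- h,w; defaults to 640,640. Change it when the model input image size is different
--repeat ------- number of times to run the model; useful for measuring on-board latency
The generated result folder contains the images with the detection boxes drawn on them.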

./ax_yolov5s_custom --model yolov5FishingDetectv3.joint --image JPG2/20210415_111546_ch3.jpg --config fish.config.json

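3. Batch image test

ax_yolov5s_custom_batch runs detection on a batch of images and outputs the images with detection boxes drawn on them plus a result file.
a. Prepare the images
Prepare file.txt containing the paths of the target images, one image file per line, for example: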

./JPG/1.jpg
./JPG/2.jpg
./JPG/3.jpg

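You can generate file.txt with: ls -R ./JPG/*.jpg >file.txt
b. Prepare the configuration file fish.config.json
prob_threshhold ------- confidence threshold for filtering detections
nms_threshhold ------- NMS filtering threshold
anchors ------- the anchors used by yolov5
class_names ------- the class names output by the model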

{
    "post_params": {
        "prob_threshhold": 0.35,
        "nms_threshhold": 0.45,
        "anchors": [25.015625, 51.40625, 56.1875, 26.359375, 61.03125, 57.5625, 124.625, 43.15625, 59.6875, 110.0625, 139.5, 75.5625, 113.375, 173.75, 270.25, 89.9375, 268.5, 245.875],
        "class_names": ["fish"]
    }
}
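
Since the anchors are pasted in by hand, it can help to build the file from a script that checks the basics first. The sketch below produces the same structure as the JSON above; the function name is hypothetical, and the check for exactly 18 anchor values assumes a standard three-scale yolov5 model with three anchors per scale.

```python
import json


def make_config(anchors, class_names, prob_threshhold=0.35, nms_threshhold=0.45):
    """Build the post-processing config dict, validating the hand-copied values."""
    if len(anchors) != 18:
        raise ValueError("expected 18 anchor values (3 scales x 3 anchors x width/height)")
    if not class_names:
        raise ValueError("class_names must not be empty")
    return {
        "post_params": {
            "prob_threshhold": prob_threshhold,
            "nms_threshhold": nms_threshhold,
            "anchors": list(anchors),
            "class_names": list(class_names),
        }
    }


config = make_config(
    anchors=[25.015625, 51.40625, 56.1875, 26.359375, 61.03125, 57.5625,
             124.625, 43.15625, 59.6875, 110.0625, 139.5, 75.5625,
             113.375, 173.75, 270.25, 89.9375, 268.5, 245.875],
    class_names=["fish"],
)
print(json.dumps(config, indent=4))
```

Redirecting the printed output to fish.config.json gives a file equivalent to the one shown above.
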

c. Run
--model ------- joint model
--images_list ------ image list file
--config ------- configuration file
--size ------- h,w; defaults to 640,640. Change it when the model input image size is different

./ax_yolov5s_custom_batch --model yolov5FishingDetectv3.joint --images_list file.txt --config fish.config.json

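The generated result folder contains the images with the detection boxes drawn on them.
The generated result.txt is a JSON-format file. For each processed image it records the image name and, for each detection box, its index, class, and top-left and bottom-right (x, y) coordinates.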

VI. Available build/test environments
As of 2023-03-03
Build environment:
ssh lhz@10.0.10.110 (password: lhz)

docker run -it --name liangsa_ax_test -p 8010:8010 -v /data2/workspace/lhz:/home/lhz --shm-size 32G liuhz_ubuntu18.04:v1.1 /bin/bash

Enter the samples folder:

mkdir build

cd build

cmake -DCMAKE_TOOLCHAIN_FILE=../toolchains/aarch64-linux-gnu.toolchain.cmake -DBSP_MSP_DIR=/home/lhz/AX630A_SDK_V1.50.0_20220328180857_NO1097/msp/out/ -DAXERA_TARGET_CHIP=ax630a ..

make -j8

Test/runtime environment:

Copy the executables generated in the build environment into this environment and run them there.
