BM1684 Hands-On Notes (2025)


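1. Environment setup

For work I needed to use Sophgo's BM1684 chip. These notes record the hands-on process, following the official manual.

Official reference: BMNNSDK2 Getting Started Manual

Link: https://sophgo-doc.gitbook.io/bmnnsdk2-bm1684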

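1.1 Server environment

Reproducing the SDK workflow requires a development environment. I used a shared company server, accessed over SSH, with the following steps: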

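  1. Request an account on the designated server.
  2. Log in to the server with an SSH client. Since I could not install software on my PC, I used the cmd terminal built into Windows 10 (this requires access to the internal network). Common tools such as MobaXterm or SecureCRT also work.
  3. Download the required artifacts, namely the latest SDK package and the docker image, for example with wget:
    # docker image
    wget https://sophon-file.sophon.cn/sophon-prod-s3/drive/22/03/19/13/bmnnsdk2-bm1684-ubuntu-docker-py37.zip
    # SDK
    wget https://sophon-file.sophon.cn/sophon-prod-s3/drive/22/05/31/11/bmnnsdk2_bm1684_v2.7.0_patched.zip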


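The two downloaded archives are bmnnsdk2-bm1684-ubuntu-docker-py37.zip and bmnnsdk2_bm1684_v2.7.0_patched.zip.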

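1.2 SDK environment

With the steps above, the required artifacts are downloaded. First unzip the full SDK package. After unzipping, you can verify the MD5 checksums to make sure the files have not been corrupted or tampered with, which avoids unnecessary trouble later. The commands are as follows: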

(base) xxx@bitmain-SYS-4028GR-TR2:~$ unzip bmnnsdk2_bm1684_v2.7.0_patched.zip
Archive: bmnnsdk2_bm1684_v2.7.0_patched.zip
creating: bmnnsdk2_bm1684_v2.7.0_patched/
inflating: bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2.MD5
inflating: __MACOSX/bmnnsdk2_bm1684_v2.7.0_patched/._bmnnsdk2.MD5
inflating: bmnnsdk2_bm1684_v2.7.0_patched/release_version.txt
inflating: __MACOSX/bmnnsdk2_bm1684_v2.7.0_patched/._release_version.txt
inflating: bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0.tar.gz
inflating: __MACOSX/bmnnsdk2_bm1684_v2.7.0_patched/._bmnnsdk2-bm1684_v2.7.0.tar.gz
(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched$ cat bmnnsdk2.MD5
6ae7d9b5a8564eb66f4fc2d39f ./bmnnsdk2-bm1684_v2.7.0.tar.gz
bf2c9e43bc8f ./release_version.txt
(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched$ md5sum ./*
6ae7d9b5a8564eb66f4fc2d39f ./bmnnsdk2-bm1684_v2.7.0.tar.gz
7719bf8cd5d5de8388ebcddda6f2c4be ./bmnnsdk2.MD5
bf2c9e43bc8f ./release_version.txt
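
The check above compares the manifest against the md5sum output by eye. A minimal Python sketch of the same comparison follows. It assumes the manifest uses the "<md5> <relative path>" layout shown in bmnnsdk2.MD5. The function names are illustrative and not part of the SDK.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Check every '<md5> <relative path>' line of a manifest file.

    Returns a dict mapping each listed path to True (digest matches)
    or False (file missing or digest differs). Paths are resolved
    relative to the directory that contains the manifest.
    """
    manifest_path = Path(manifest_path)
    results = {}
    for line in manifest_path.read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, rel_path = parts[0].lower(), parts[1].strip()
        target = manifest_path.parent / rel_path
        results[rel_path] = target.is_file() and md5_of_file(target) == expected
    return results
```

Calling verify_manifest("bmnnsdk2.MD5") inside the unzipped directory would return one entry per listed file. On the shell, md5sum -c bmnnsdk2.MD5 should perform an equivalent check if the manifest was produced by md5sum.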

Next, extract the actual SDK tarball:

(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched$ tar -zxvf bmnnsdk2-bm1684_v2.7.0.tar.gz
bmnnsdk2-bm1684_v2.7.0/
bmnnsdk2-bm1684_v2.7.0/release_version.txt
......

At this point the SDK package is ready.

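1.3 Docker environment

After the steps above, we are logged in to the server and have downloaded the artifacts. To reproduce the SDK workflow conveniently, I use the official docker image directly instead of building my own.

The docker image is ubuntu-docker-py37. First unzip the docker archive. After unzipping, the MD5 checksums can again be verified to make sure the files have not been corrupted or tampered with. The commands are as follows: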

(base) xxx@bitmain-SYS-4028GR-TR2:~$ unzip bmnnsdk2-bm1684-ubuntu-docker-py37.zip
Archive: bmnnsdk2-bm1684-ubuntu-docker-py37.zip
creating: bmnnsdk2-bm1684-ubuntu-docker-py37/
extracting: bmnnsdk2-bm1684-ubuntu-docker-py37/bmnnsdk2-bm1684-ubuntu.docker
extracting: bmnnsdk2-bm1684-ubuntu-docker-py37/bmnnsdk2.MD5
extracting: bmnnsdk2-bm1684-ubuntu-docker-py37/Dockerfile.bm1684
extracting: bmnnsdk2-bm1684-ubuntu-docker-py37/release_version.txt
(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2-bm1684-ubuntu-docker-py37$ cat bmnnsdk2.MD5
cf91eb0ff60f28e368bba1c357d2e7e5 ./Dockerfile.bm1684
c181ce60245b4fe07596d8a ./release_version.txt
105a4d5d13a41d97353fd2dab88b4802 ./bmnnsdk2-bm1684-ubuntu.docker
(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2-bm1684-ubuntu-docker-py37$ md5sum ./*
105a4d5d13a41d97353fd2dab88b4802 ./bmnnsdk2-bm1684-ubuntu.docker
7b1fdecee114e6d2d82c21286e9b1a39 ./bmnnsdk2.MD5
cf91eb0ff60f28e368bba1c357d2e7e5 ./Dockerfile.bm1684
c181ce60245b4fe07596d8a ./release_version.txt

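According to the official documentation, the SDK package includes a script for running docker, docker_run_bmnnsdk.sh. Because this is a shared server, the script has most likely been run many times already, so containers with the default name already exist. To make my container easy to identify, I edited the script to rename the container. The modified part of the script is: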

if [ -c "/dev/bm-sophon0" ]; then
  for dev in $(ls /dev/bm-sophon*); do
    mount_options+="--device="$dev:$dev" "
  done
  CMD="docker run \
    --name ubuntu16.0-py37-wnb \
    --network=host \
    --workdir=/workspace \
    --privileged=true \
    ${mount_options} \
    --device=/dev/bmdev-ctl:/dev/bmdev-ctl \
    -v /dev/shm --tmpfs /dev/shm:exec \
    -v $WORKSPACE:/workspace \
    -v /dev:/dev \
    -v /etc/localtime:/etc/localtime \
    -e LOCAL_USER_ID=`id -u` \
    -it $REPO/$IMAGE:$TAG \
    bash
  "
else
  CMD="docker run \
    --name ubuntu16.0-py37-wnb \
    --network=host \
    --workdir=/workspace \
    --privileged=true \
    -v $WORKSPACE:/workspace \
    -v /dev/shm --tmpfs /dev/shm:exec \
    -v /etc/localtime:/etc/localtime \
    -e LOCAL_USER_ID=`id -u` \
    -it $REPO/$IMAGE:$TAG \
    bash
  "
fi
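
The script's logic is: pass through every /dev/bm-sophon* device if any exist, otherwise start the container without device mounts. A rough Python sketch of that logic is below, which may help when adapting the container name or mounts. The function names and parameters are hypothetical. Only the docker options themselves are taken from the script.

```python
import glob


def discover_devices(pattern="/dev/bm-sophon*"):
    """Return the device nodes matching the pattern, sorted by name."""
    return sorted(glob.glob(pattern))


def build_docker_run_args(name, image, workspace, devices, uid):
    """Assemble a `docker run` argument list mirroring the shell script.

    devices: list of device node paths; an empty list selects the
    branch of the script that mounts no devices.
    """
    args = [
        "docker", "run",
        "--name", name,
        "--network=host",
        "--workdir=/workspace",
        "--privileged=true",
    ]
    if devices:
        # One --device flag per accelerator node, plus the control node.
        args += [f"--device={dev}:{dev}" for dev in devices]
        args.append("--device=/dev/bmdev-ctl:/dev/bmdev-ctl")
    args += [
        "-v", "/dev/shm", "--tmpfs", "/dev/shm:exec",
        "-v", f"{workspace}:/workspace",
    ]
    if devices:
        args += ["-v", "/dev:/dev"]
    args += [
        "-v", "/etc/localtime:/etc/localtime",
        "-e", f"LOCAL_USER_ID={uid}",
        "-it", image, "bash",
    ]
    return args
```

For example, build_docker_run_args("ubuntu16.0-py37-wnb", "bmnnsdk2-bm1684/dev:ubuntu16.04", "/path/to/sdk", discover_devices(), os.getuid()) (with os imported) yields a list that could be handed to subprocess.run. Keeping the arguments as a list avoids the quoting pitfalls of building one long command string.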

Next, create the container with the official script. Once the container is created, the script drops you into it by default:

(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0$ ./docker_run_bmnnsdk.sh
/mnt/sdb2/xxx/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0
/mnt/sdb2/xxx/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0
bmnnsdk2-bm1684/dev:ubuntu16.04
docker run --name ubuntu16.0-py37-wnb --network=host --workdir=/workspace --privileged=true --device=/dev/bm-sophon0:/dev/bm-sophon0 --device=/dev/bm-sophon1:/dev/bm-sophon1 --device=/dev/bm-sophon2:/dev/bm-sophon2 --device=/dev/bm-sophon3:/dev/bm-sophon3 --device=/dev/bm-sophon4:/dev/bm-sophon4 --device=/dev/bm-sophon5:/dev/bm-sophon5 --device=/dev/bm-sophon6:/dev/bm-sophon6 --device=/dev/bm-sophon7:/dev/bm-sophon7 --device=/dev/bm-sophon8:/dev/bm-sophon8 --device=/dev/bmdev-ctl:/dev/bmdev-ctl -v /dev/shm --tmpfs /dev/shm:exec -v /mnt/sdb2/xxx/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0:/workspace -v /dev:/dev -v /etc/localtime:/etc/localtime -e LOCAL_USER_ID=1032 -it bmnnsdk2-bm1684/dev:ubuntu16.04 bash
root@bitmain-SYS-4028GR-TR2:/workspace#

Note:

A container started this way stops when you exit its shell. To reuse it, start it again and attach to it with the following commands:

(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0$ docker start ubuntu16.0-py37-wnb
ubuntu16.0-py37-wnb
(base) xxx@bitmain-SYS-4028GR-TR2:~/bmnnsdk2_bm1684_v2.7.0_patched/bmnnsdk2-bm1684_v2.7.0$ docker exec -it ubuntu16.0-py37-wnb bash
root@bitmain-SYS-4028GR-TR2:/workspace#

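The basic environment is now set up.

2. Reproducing the examples

With the environment above in place, the next step is to reproduce the examples shipped in the SDK. The directory layout is: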

# examples directory layout
.
|-- Resnet_classify
|-- RetinaFace
|-- SSD_object
|-- YOLOX_object
|-- YOLOv3_object
|-- YOLOv5_object
|-- calibration
|-- centernet
|-- multimedia
|-- nntc
|-- okkernel
`-- sail

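Before reproducing the examples, the libraries required by the SDK must be installed inside the docker container and the environment variables must be set, as follows: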

root@bitmain-SYS-4028GR-TR2:/workspace/scripts# ./install_lib.sh nntc
linux is Ubuntu16.04.5LTS\n\l
bmnetc and bmlang USING_CXX11_ABI=1
Install lib done !
root@bitmain-SYS-4028GR-TR2:/workspace/scripts# source envsetup_pcie.sh
/workspace/scripts /workspace/scripts
......
Successfully installed Flask-2.1.2 brotli-1.0.9 click-8.1.3 dash-2.5.1 dash-bootstrap-components-1.2.0 dash-core-components-2.0.0 dash-cytoscape-0.3.0 dash-draggable-0.1.2 dash-html-components-2.0.0 dash-split-pane-1.0.0 dash-table-5.0.0 flask-compress-1.12 ipykernel-5.3.4 itsdangerous-2.1.2 jsonschema-3.2.0 ufw-1.0.0 ufwio-0.9.0
root@bitmain-SYS-4028GR-TR2:/workspace/scripts# source envsetup_cmodel.sh
/workspace/scripts /workspace/scripts
......
Installing collected packages: ufw
Successfully installed ufw-1.0.0

2.1 SSD_object (Caffe)

2.1.1 Model migration

First, download the original Caffe model and create the symbolic links, using the script in the model directory:

root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/model# ./download_ssd_model.sh
Downloading models_VGGNet_VOC0712_SSD_300x300.tar.gz...
......
All done!

  • Generating the fp32 bmodel: convert the original model into an fp32 bmodel suited to the Sophgo TPU:
root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/model# ./gen_bmodel.sh
/workspace/examples/SSD_object/model
......
Success: combined to [out/fp32_ssd300.bmodel].
# generated model files
./out/
|-- fp32_ssd300.bmodel
|-- ssd300
`-- ssd300_4batch

  • Generating the int8 bmodel: convert the original model into an int8 bmodel. The model is first converted to an fp32 umodel (the model format used by UFramework), that intermediate model is then used to produce an int8 umodel, and finally the int8 bmodel is generated:
root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/model# ./gen_umodel_int8bmodel.sh
/workspace/examples/SSD_object/model
/workspace/examples/SSD_object/model
/workspace/examples/SSD_object/model
......
Success: combined to [out/int8_ssd300.bmodel].
combine bmodel ok
/workspace/examples/SSD_object/model

At this point a new directory, out, has been generated under the model directory, with the following structure:

.
|-- fp32_ssd300.bmodel
|-- int8_ssd300.bmodel
|-- ssd300
`-- ssd300_4batch
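
To give a feel for what the fp32 to int8 conversion trades off, here is a minimal sketch of symmetric per-tensor quantization in plain Python. This is a generic textbook scheme chosen for illustration. It is an assumption and not the algorithm the SDK's quantization tools use, which pick their thresholds from calibration data.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the int8 range.

    Returns (quantized integers in [-127, 127], scale), where each
    original value is approximated by quantized * scale.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero (or empty) tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map int8 codes back to approximate float values."""
    return [q * scale for q in quantized]
```

Each value is stored in 8 bits instead of 32, at the cost of a rounding error of at most half the scale per element. This is why an accuracy check after conversion matters.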

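2.1.2 Accuracy regression

In the previous section the original Caffe model was compiled into fp32 and int8 bmodels. Here the bundled accuracy check tool is used to run an accuracy regression on the model.

The regression relies on the input and output data generated during model migration. The command is: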

root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/model# bmrt_test --context_dir=./out/ssd300
[BMRT][deal_with_options:1412] INFO:Loop num: 1
......
[BMRT][bmrt_test:1043] INFO:+++ The network[VGG_VOC0712_SSD_300x300_deploy] stage[0] cmp success +++
[BMRT][bmrt_test:1063] INFO:load input time(s): 0.031876
[BMRT][bmrt_test:1064] INFO:calculate time(s): 0.037262
[BMRT][bmrt_test:1065] INFO:get output time(s): 0.000046
[BMRT][bmrt_test:1066] INFO:compare time(s): 0.006667
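
Conceptually, a regression like this runs the model on stored inputs and compares the result with stored reference outputs. The sketch below shows such an element-wise comparison in Python. The tolerances and the function name are assumptions for illustration and are not the thresholds bmrt_test applies.

```python
import math


def compare_outputs(actual, reference, rel_tol=1e-3, abs_tol=1e-5):
    """Return the indices where actual and reference differ beyond tolerance.

    An empty list means every element matched within the given
    relative or absolute tolerance.
    """
    if len(actual) != len(reference):
        raise ValueError("actual and reference must have the same length")
    return [
        i
        for i, (a, r) in enumerate(zip(actual, reference))
        if not math.isclose(a, r, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

With this helper, a run would count as passing when the returned list is empty. A tolerance is needed because floating-point results on different hardware rarely match bit for bit.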

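2.1.3 Algorithm migration

The main work in this part is implementing the model's pre- and post-processing logic with the software interfaces provided by the SDK. Here I compile and test the C++ code already adapted in the example. An excerpt of the source is shown below. The main change is that format conversion, scaling, and similar operations are replaced by their SDK implementations: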

// resize && split by bmcv
for (size_t i = 0; i < input.size(); i++) {
  LOG_TS(ts_, "ssd pre-process-vpp")
  bmcv_image_vpp_convert (bm_handle_, 1, input[i], &resize_bmcv_[i], &crop_rect_);
  LOG_TS(ts_, "ssd pre-process-vpp")
}
// do linear transform
LOG_TS(ts_, "ssd pre-process-linear_tranform")
bmcv_image_convert_to (bm_handle_, input.size(), linear_trans_param_, resize_bmcv_, linear_trans_bmcv_);
LOG_TS(ts_, "ssd pre-process-linear_tranform")
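
The "linear transform" step in the excerpt applies a per-channel scale and offset to every pixel, which is how mean subtraction and normalization are usually expressed. A plain-Python sketch of that arithmetic is below. The function is illustrative only, and the mean values in the usage note are an assumption (commonly used with Caffe SSD models), not values read from the example source.

```python
def linear_transform(pixels, alpha, beta):
    """Apply y = alpha[c] * x + beta[c] to channel c of every pixel.

    pixels: iterable of per-pixel channel tuples, e.g. (b, g, r).
    alpha, beta: one scale and one offset per channel.
    """
    return [
        tuple(a * x + b for x, a, b in zip(pixel, alpha, beta))
        for pixel in pixels
    ]
```

For instance, linear_transform(pixels, (1.0, 1.0, 1.0), (-104.0, -117.0, -123.0)) subtracts a per-channel mean from each pixel. On the BM1684 the same arithmetic is done in hardware through the SDK call, so it does not run on the host CPU.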

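Next, compile the source. The dependencies and toolchain were already set up in the Environment setup chapter, so the code can be compiled directly. Compilation produces PCIe and ARM executables in the current directory: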

root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/cpp_cv_bmcv_bmrt# make -f Makefile.pcie
root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/cpp_cv_bmcv_bmrt# make -f Makefile.arm
# build artifacts
|-- ssd300_cv_bmcv_bmrt.arm
`-- ssd300_cv_bmcv_bmrt.pcie

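Because the BM1684 is attached via PCIe in this docker environment (this can be confirmed with lspci), ssd300_cv_bmcv_bmrt.pcie can be run directly. Doing so produced the following error: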

root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/cpp_cv_bmcv_bmrt# ./ssd300_cv_bmcv_bmrt.pcie image /workspace/res/image/vehicle_1.jpg ../model/out/fp32_ssd300.bmodel 1 0
./ssd300_cv_bmcv_bmrt.pcie: error while loading shared libraries: libavcodec.so.58: cannot open shared object file: No such file or directory
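
An error like this usually means the directory holding the library is not on the loader's search path. One quick way to check is to look for the file in each directory listed in LD_LIBRARY_PATH, as in the hedged sketch below. The helper name is made up for illustration. The real loader also consults other locations (such as the ldconfig cache), so an empty result is only a hint.

```python
import os


def find_shared_library(lib_name, search_path=None):
    """Return the directories on a path list that contain lib_name.

    search_path: a string of directories joined by os.pathsep
    (':' on Linux); defaults to the LD_LIBRARY_PATH environment variable.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits
```

Calling find_shared_library("libavcodec.so.58") inside the container before and after sourcing an environment script shows whether that script added the needed directory. Running ldd on the executable gives the loader's own view.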

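Investigation showed that the environment configuration step must select PCIe or SoC mode according to the actual setup. After reconfiguring for PCIe mode and rerunning, the demo executes normally: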

root@bitmain-SYS-4028GR-TR2:/workspace/examples/SSD_object/cpp_cv_bmcv_bmrt# ./ssd300_cv_bmcv_bmrt.pcie image /workspace/res/image/vehicle_1.jpg ../model/out/fp32_ssd300.bmodel 1 0
[/home/jenkins/workspace/all_in_one_sa5/daily_build/bmetc/sa5/middleware-soc/bm_opencv/modules/core/src/cv_bmcpu.cpp:49->InternalBMCpuRegister]total 9 devices need to enable on-chip CPU. It may need serveral minutes for loading, please be patient....
......
[ ssd overall] loops: 1 avg:  us
[ read image] loops: 1 avg:  us
[ attach input] loops: 1 avg: 2291 us
[ detection] loops: 1 avg: 86327 us
[ ssd pre-process] loops: 1 avg: 48232 us
[ ssd pre-process-vpp] loops: 1 avg: 1300 us
[ssd pre-process-linear_tranform] loops: 1 avg: 46928 us
[ ssd inference] loops: 1 avg: 37930 us
[ ssd post-process] loops: 1 avg: 161 us
[/home/jenkins/workspace/all_in_one_sa5/daily_build/bmetc/sa5/middleware-soc/bm_opencv/modules/core/src/cv_bmcpu.cpp:113->~InternalBMCpuRegister]deconstructor function is called

2.2 VQ-VAE (TensorFlow)

This follows the examples/nntc/bmnett example in the official SDK directly. Just run the model conversion script:

root@bitmain-SYS-4028GR-TR2:/workspace/examples/nntc/bmnett# ./bmnett_build_bmodel.sh
Namespace(check_ops=True, cmp=True, const_names=None, descs=None, dyn=False, enable_profile=False, input_folder='', input_names=('P
......
BMLIB Send Quit Message
Compiling succeeded.
# output directory
./output/
`-- vqvae
    |-- compilation.bmodel
    |-- input_ref_data.dat
    |-- io_info.dat
    `-- output_ref_data.dat

2.3 LeNet (MXNet)

This follows the examples/nntc/bmnetm example in the official SDK directly. Just run the model conversion script:

root@bitmain-SYS-4028GR-TR2:/workspace/examples/nntc/bmnetm# ./bmnetm_build_bmodel.sh
args: Namespace(cmp=None, debug=0, dyn=False, enable_profile=False, input_data='', input_names='data', list_ops=False, log_dir='',
......
I0712 11:56:00. 1480 bmcompiler_bmodel.cpp:154] [BMCompiler:I] save_tensor output name [softmax_output]
BMLIB Send Quit Message
# output directory
./output/
`-- lenet
    |-- compilation.bmodel
    |-- input_ref_data.dat
    |-- io_info.dat
    `-- output_ref_data.dat

2.4 Anchors (PyTorch)

This follows the examples/nntc/bmnetp example in the official SDK directly. Just run the model conversion script:

root@bitmain-SYS-4028GR-TR2:/workspace/examples/nntc/bmnetp# ./bmnetp_build_bmodel.sh
Namespace(cmp=True, desc=None, descs=None, dyn=False, enable_profile=False, input_structure=None, log_dir
......
BMLIB Send Quit Message
Compiling succeeded.
# output directory
./output/
`-- anchors
    |-- compilation.bmodel
    |-- input_ref_data.dat
    |-- io_info.dat
    `-- output_ref_data.dat

2.5 Yolov3-tiny (Darknet)

This follows the examples/nntc/bmnetd example in the official SDK directly. Just run the model conversion script:

root@bitmain-SYS-4028GR-TR2:/workspace/examples/nntc/bmnetd# ./bmnetd_build_bmodel.sh
......
* Store bmodel of BMCompiler...
============================================================
BMLIB Send Quit Message
# output directory
./output/
`-- anchors
    |-- compilation.bmodel
    |-- input_ref_data.dat
    |-- io_info.dat
    `-- output_ref_data.dat

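2.6 ONNX & Paddle

Models from other deep learning frameworks can generally be converted to ONNX format. The official examples do not include a concrete demo for this.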

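3. Hands-on practice

3.1 Model migration

To reduce computation and improve inference performance, the model generally needs to be converted to INT8. The steps are described below.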

 

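3.1.1 Preparing the quantization dataset

Following examples/calibration/create_lmdb_demo in the official SDK, first download the dataset. The coco128 dataset is used here. If the download script cannot be run, add execute permission with chmod, since the official package does not set it: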

root@bitmain-SYS-4028GR-TR2:/workspace/examples/calibration/create_lmdb_demo# chmod +x download_coco128.sh
root@bitmain-SYS-4028GR-TR2:/workspace/examples/calibration/create_lmdb_demo# ./download_coco128.sh
......
inflating: coco128/README.txt

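Next, build the lmdb database file. The later calibration step needs the dataset in this format. Set the paths according to where the images actually are, because the path argument given in the official instructions is wrong: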

root@bitmain-SYS-4028GR-TR2:/workspace/examples/calibration/create_lmdb_demo# python3 convert_imageset.py --imageset_rootfolder=./coco128/images/train2017 --imageset_lmdbfolder=./coco128 --resize_height=256 --resize_width=256 --shuffle=True --bgr2rgb=False --gray=False
reading image /workspace/examples/calibration/create_lmdb_demo/coco128/images/train2017/000000000634.jpg
......
reading image /workspace/examples/calibration/create_lmdb_demo/coco128/images/train2017/000000000359.jpg
original shape: (332, 500, 3)
cv_imge after resize (256, 256, 3)
# directory structure
coco128/
|-- LICENSE
|-- README.txt
|-- data.mdb    // the generated database file
|-- images
`-- labels
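
Since a wrong image path is the easy mistake here, it can help to confirm that the root folder really contains images before running the conversion. The small Python sketch below does that. The helper name and the list of accepted suffixes are assumptions for illustration. They are not taken from convert_imageset.py.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to match the actual dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(root):
    """Return all image files under root (recursively), sorted by path.

    Raises FileNotFoundError if root is not an existing directory,
    which is the typical symptom of a wrong --imageset_rootfolder.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"image folder not found: {root}")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

For the dataset above, len(list_images("./coco128/images/train2017")) would be expected to match the number of "reading image" lines printed by the conversion script.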
