YOLOv7 TensorRT Model Acceleration and Deployment in Practice
Posted in: Vision Algorithms and Deep Learning (model training, acceleration, deployment)
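0. Linux environment setup
This project implements end-to-end GPU acceleration of models with TensorRT and CUDA C++, and supports Windows 10 and Linux. Models updated as of 2023: YOLOv8, YOLOv7, YOLOv6, YOLOv5, YOLOv4, YOLOv3, YOLOX, YOLOR, pphumanseg, u2net, EfficientDet.
The Windows 10 tutorial is still in preparation; follow the repository for updates: https://github.com/FeiYull/TensorRT-Alpha
For the environment setup, see Chapter 2 (Ubuntu 18.04 environment configuration) of my step-by-step tutorial "yolov8 tensorrt模型加速部署【实战】" (YOLOv8 TensorRT Model Acceleration and Deployment in Practice).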
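Before continuing, it can help to confirm that the command-line tools used later in this tutorial (git, python, cmake, make) are on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper written for this illustration, not part of TensorRT-Alpha, and it does not check the CUDA or TensorRT installation itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not part of TensorRT-Alpha):
# prints each named command that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools invoked by the commands in this tutorial; no output means all were found.
missing_tools git python cmake make
```

Any name printed here needs to be installed before the steps below will work.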
1. Export ONNX
Download the ONNX files directly from cloud storage (weiyun or Google Drive), or export them manually:
git clone https://github.com/WongKinYiu/yolov7
# enter the cloned repository before checking out the pinned commit
cd yolov7
git checkout 072f76c72c641c7a1ee482e39f604f6f8ef7ee92
# 640
python export.py --weights yolov7-tiny.pt --dynamic --grid
python export.py --weights yolov7.pt --dynamic --grid
python export.py --weights yolov7x.pt --dynamic --grid
# 1280
python export.py --weights yolov7-w6.pt --dynamic --grid --img-size 1280
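After exporting, a quick check that all four ONNX files exist can save a failed build later. The sketch below assumes each export writes a file named after its weights (for example yolov7-tiny.onnx), which matches the file names used in the build step; `missing_onnx` is a hypothetical helper, not part of the yolov7 repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption for illustration): lists which of the four
# expected ONNX files are missing from a directory (default: current directory).
missing_onnx() {
  local dir=${1:-.} name
  for name in yolov7-tiny yolov7 yolov7x yolov7-w6; do
    [ -f "$dir/$name.onnx" ] || echo "$name.onnx"
  done
}

# No output means every expected file is present.
missing_onnx .
```

If you only exported some of the models, the remaining names in the output can simply be ignored.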
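2. Compile the ONNX model
Compile the ONNX files into TensorRT engines with trtexec, the official TensorRT tool.
# Place your ONNX files in this directory: tensorrt-alpha/data/yolov7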
cd tensorrt-alpha/data/yolov7
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/TensorRT-8.4.2.4/lib
# 640
../../../../TensorRT-8.4.2.4/bin/trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
../../../../TensorRT-8.4.2.4/bin/trtexec --onnx=yolov7.onnx --saveEngine=yolov7.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
../../../../TensorRT-8.4.2.4/bin/trtexec --onnx=yolov7x.onnx --saveEngine=yolov7x.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:4x3x640x640 --maxShapes=images:8x3x640x640
# 1280
../../../../TensorRT-8.4.2.4/bin/trtexec --onnx=yolov7-w6.onnx --saveEngine=yolov7-w6.trt --buildOnly --minShapes=images:1x3x1280x1280 --optShapes=images:4x3x1280x1280 --maxShapes=images:8x3x1280x1280
# Note: if running the yolov7-w6 model fails with
#   Error Code 1: Cuda Runtime (an illegal memory access was encountered)
#   "bool context = m_context->executeV2((void**)bindings)" returns false
# lower the batch_size.
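The four trtexec commands above differ only in the model name and the input size, so they can be generated rather than typed by hand. The sketch below only prints the commands for review and does not run them; `shape_flags`, `build_cmd`, and the `TRTEXEC` variable are hypothetical names introduced for this illustration, while the flags themselves are the ones already used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption for illustration): prints the dynamic-shape
# flags for a square input named "images".
# Usage: shape_flags SIZE MIN_BATCH OPT_BATCH MAX_BATCH
shape_flags() {
  local size=$1 min=$2 opt=$3 max=$4
  printf -- '--minShapes=images:%sx3x%sx%s --optShapes=images:%sx3x%sx%s --maxShapes=images:%sx3x%sx%s\n' \
    "$min" "$size" "$size" "$opt" "$size" "$size" "$max" "$size" "$size"
}

# Hypothetical helper: prints (does not execute) the trtexec command for one model.
# TRTEXEC defaults to the relative path used in this tutorial.
build_cmd() {
  local name=$1 size=$2
  local trtexec=${TRTEXEC:-../../../../TensorRT-8.4.2.4/bin/trtexec}
  echo "$trtexec --onnx=$name.onnx --saveEngine=$name.trt --buildOnly $(shape_flags "$size" 1 4 8)"
}

# Print the four build commands from this section.
for m in yolov7-tiny yolov7 yolov7x; do build_cmd "$m" 640; done
build_cmd yolov7-w6 1280
```

To lower the batch range for yolov7-w6, as the note above suggests, change the `1 4 8` arguments passed to `shape_flags` to smaller values.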
4. Run
git clone https://github.com/FeiYull/tensorrt-alpha
cd tensorrt-alpha/yolov7
mkdir build
cd build
cmake ..
make -j10
# Note: output images (dstImage) are saved to tensorrt-alpha/yolov7/build by default
## 640
# infer image
./app_yolov7 --model=../../data/yolov7/yolov7-tiny.trt --size=640 --batch_size=1 --img=../../data/6406401.jpg --show --savePath
./app_yolov7 --model=../../data/yolov7/yolov7-w6.trt --size=1280 --batch_size=1 --img=../../data/6406401.jpg --show --savePath
# infer video
./app_yolov7 --model=../../data/yolov7/yolov7-tiny.trt --size=640 --batch_size=8 --video=../../data/people.mp4 --show --savePath=../
# infer camera
./app_yolov7 --model=../../data/yolov7/yolov7-tiny.trt --size=640 --batch_size=4 --cam_id=0 --show
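The engines in section 2 were built with a batch range of 1 to 8 (--minShapes and --maxShapes), and TensorRT does not accept input shapes outside an engine's optimization profile, so --batch_size should stay within that range. The sketch below is a pre-flight check you could run before launching the app; `batch_in_profile` is a hypothetical helper, not part of app_yolov7, and how the app itself reports an out-of-range batch size is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (assumption for illustration, not part of
# app_yolov7): succeeds only if the batch size is an integer within the
# engine's profile range. Defaults 1..8 mirror the shapes used in section 2.
batch_in_profile() {
  local batch=$1 min=${2:-1} max=${3:-8}
  case $batch in
    ''|*[!0-9]*) echo "batch_size must be a positive integer: '$batch'" >&2; return 2 ;;
  esac
  if [ "$batch" -lt "$min" ] || [ "$batch" -gt "$max" ]; then
    echo "batch_size=$batch is outside the engine profile [$min, $max]" >&2
    return 1
  fi
}

batch_in_profile 8 && echo "batch_size=8 ok"
batch_in_profile 16 || echo "batch_size=16 rejected"
```

If you rebuilt an engine with a different batch range, pass the new minimum and maximum as the second and third arguments.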
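In the comparison below, the right side shows the result of this implementation and the left side shows the official YOLOv7 result. Here is a video test:
yolov7-tiny: Official (left) vs. Ours (right)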