Model Optimization and Pruning of Poacher-detecting YOLOv5
Optimizing a YOLOv5 model for the NVIDIA Jetson Nano to increase inference speed and reduce memory footprint, focusing on inference speed rather than absolute mAP.
This is an account of the 10-week collaboration of the AI for Wildlife Challenge 3 engineers, who worked on a poacher-detecting model running on the edge hardware of an autonomous drone. The group was split into 4 teams: Model Optimization, CI/CD, Hardware and Autonomous Flight. In this article, you'll learn about the results of the Model Optimization team.
The Model Optimization team’s goal for the challenge was to optimize the YOLOv5 model for the NVIDIA Jetson Nano, a small computer for AI IoT applications, to i) increase the inference speed and ii) reduce memory footprint. Our focus was mainly on the inference speed, not the absolute mAP per se.
We started by reviewing literature and our preliminary research resulted in 4 potential paths to explore:
Convert the YOLOv5 model to TensorRT format, which could unlock further optimization
Look at PyTorch-native optimizations
Apply Ultralytics’ optimization / pruning techniques: review the YOLOv5 repo to find out exactly what it does in terms of optimization / pruning during training and conversion (e.g. mixed-precision training) compared with the native PyTorch optimization toolkit
Review ONNX conversion and optimizations: convert the trained YOLO model to ONNX and use the ONNX optimization toolkit
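To make the pruning idea in the list above concrete, here is a minimal sketch of unstructured magnitude pruning in plain Python. It is an illustration only: the article does not say which pruning method the team applied, and the function names `magnitude_prune` and `sparsity` as well as the example weights are made up for this sketch. Conceptually, this is the kind of operation that PyTorch's `torch.nn.utils.prune.l1_unstructured` performs on a layer's weight tensor.

```python
from typing import List


def magnitude_prune(weights: List[float], amount: float) -> List[float]:
    """Zero out the `amount` fraction of weights with the smallest magnitude."""
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0 and 1")
    n_prune = int(round(amount * len(weights)))
    if n_prune == 0:
        return list(weights)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


def sparsity(weights: List[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical weights of a single layer (not taken from the real model).
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.4, -0.1, 0.07, -0.9]
pruned = magnitude_prune(layer, amount=0.3)
print(pruned)            # the three smallest-magnitude weights become 0.0
print(sparsity(pruned))  # 0.3
```

Note that zeroing weights this way does not by itself make a dense GPU kernel faster; the benefit usually depends on a runtime that can exploit the sparsity.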
We experimented with optimizing sparse models with ONNX. For CPU inference, the DeepSparse Engine produced a speedup. However, it was still much slower than native PyTorch running on the GPU.
We ran experiments with NeuralMagic and Nebullvm, two recipe-based optimization libraries. The former did not improve the results significantly; the latter proved laborious to set up.
“At the beginning of the challenge, I felt I did not belong or not as skilled or knowledgeable as the other members. I could barely understand the jargon and how I would be of value to the team or the challenge. However, through engagement and asking questions (everyone was friendly and helpful), I quickly understood that FruitPunchAI challenges are about learning, impact and networks. I came to understand it is a platform to enhance my DS / ML skills and our society with AI. By the end of the challenge, I had gained confidence and an appreciation of how not knowing is an opportunity to learn. It is also encouraging that our contributions will be helping the rangers.” - Sabelo Mcebo Makhanya, Model Optimization Team
We also tried converting the YOLOv5 models from the previous challenge to a TensorRT engine with INT8 calibration for the Jetson Nano 4GB. It failed: TensorRT engines turned out to be hardware-specific, so one cannot build an INT8 engine on one device and run inference with it on the Jetson Nano. However, we could build and run an FP16 engine on the Nano itself.
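For readers unfamiliar with why INT8 needs a calibration step while FP16 does not, the following sketch shows the basic idea of symmetric INT8 quantization in plain Python. It is a deliberately simplified assumption: the scale is taken from the largest magnitude seen in some calibration data, whereas TensorRT's own calibrators are more sophisticated, and the sketch says nothing about how engines are built or why they are tied to a specific device. All function names and numbers are hypothetical.

```python
from typing import List, Sequence


def calibrate_scale(samples: Sequence[Sequence[float]]) -> float:
    """Derive a symmetric INT8 scale from the largest magnitude in the calibration data."""
    max_abs = max(abs(x) for batch in samples for x in batch)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values: Sequence[float], scale: float) -> List[int]:
    """Map floats to INT8 codes, clamping values outside the calibrated range."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(codes: Sequence[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical activations observed while running calibration images.
calibration_batches = [[10.0, -63.5], [3.0, 20.25]]
scale = calibrate_scale(calibration_batches)

activations = [1.0, -63.5, 100.0, 0.2]
codes = quantize(activations, scale)
print(scale)                     # 0.5
print(codes)                     # [2, -127, 127, 0]
print(dequantize(codes, scale))  # [1.0, -63.5, 63.5, 0.0]
```

The value 100.0 saturates at the calibrated maximum and 0.2 collapses to zero, which is why INT8 accuracy depends on representative calibration data, while FP16 simply stores each value at reduced floating-point precision.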
Our results showed that YOLOv5 Small (YOLOv5s) with a 640 x 640 input size in FP16 mode was the most sensible choice for the current dataset on a desktop GPU.
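Comparisons like this rest on measuring inference latency consistently. The sketch below shows one common way to do that in plain Python: run a few warm-up iterations, then time repeated calls and report mean latency, median latency and frames per second. It is not the team's benchmarking code; `benchmark` and `fake_infer` are hypothetical names, and `fake_infer` merely sleeps in place of a real model call.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], warmup: int = 5, runs: int = 50) -> Dict[str, float]:
    """Time a zero-argument inference callable; report latency in ms and FPS."""
    # Warm-up calls are excluded from timing because the first iterations
    # often include one-off initialisation costs.
    for _ in range(warmup):
        infer()
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        # For asynchronous GPU work, the callable itself should wait for the
        # device to finish before returning (an assumption about the caller).
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    mean_ms = statistics.mean(latencies_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def fake_infer() -> None:
    """Hypothetical stand-in for a model forward pass."""
    time.sleep(0.002)


stats = benchmark(fake_infer, warmup=2, runs=10)
print(stats)
```

Running the same harness over each model size, input resolution and precision gives numbers that can be compared side by side with the accuracy metrics.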
Final Jetson Nano Results
The results showed that we don’t have to use the higher power mode. We can use the lower power mode without sacrificing performance or accuracy.
We concluded that YOLOv5s was the better choice given its accuracy and inference speed. An input image size of 640 x 640 is suitable for the current dataset(s). FP16 precision was the go-to option, since it boosted inference speed without losing accuracy. On the Jetson Nano, TensorRT performed best of all the optimizations we tried. It also fits well with CI/CD automation of the codebase.
As for the next steps, it would be worthwhile to explore different hardware accelerators. On-drone tests should come in handy to establish a baseline and define SMART goals for further model optimization. And with the YOLOv5 architecture constantly being updated, there’s always potential to explore structured pruning.
Model Optimization Team
Sahil Chachra, Manu Chauhan, Jaka Cikač, Sabelo Makhanya, Sinan Robillard
AI for Wildlife 3 Engineers:
Model Optimization - Sahil Chachra, Manu Chauhan, Jaka Cikač, Sabelo Makhanya, Sinan Robillard
CI/CD - Aisha Kala, Mayur Ranchod, Sabelo Makhanya, Rowanne Trapmann, Ethan Kraus, Barbra Apilli, Samantha Biegel, Adrian Azoitei