NPU bad detection with Yolov5 - i.MX8MP

1,412 Views
simoberny
Contributor II

Hi, 

I've been struggling for some time to get NPU detection working in a C++ program. The same code on the CPU gets good results, but with the VX delegate the detections are completely wrong. The code runs smoothly and inference timing looks good (yolov5s model with 448x448 input, ~70 ms).

Right now I'm trying with Yolov5 (uint8 quantized), but I have tried different pre-trained models and see the same behavior: good detection on CPU and random detection on NPU.

To obtain the model I used the export from yolov5 repo: 

 python export.py --weights yolov5s.pt  --imgsz 448 --include tflite --int8

I've also tried TFLite Hub models like SSD and MobileNet that have already been converted to uint8.
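One thing worth ruling out with uint8 models is how the raw output is turned back into real values before the boxes are decoded. The following is a minimal sketch, assuming the standard TFLite affine quantization scheme, real = scale * (q - zero_point). The `dequantize` helper and the `EXAMPLE_SCALE` and `EXAMPLE_ZERO_POINT` values are hypothetical illustrations; the real parameters would come from the model's output tensor details, not from this thread.

```python
def dequantize(raw_values, scale, zero_point):
    """Map raw uint8 tensor values back to real numbers.

    Uses the TFLite affine scheme: real = scale * (q - zero_point).
    """
    return [scale * (q - zero_point) for q in raw_values]


# Hypothetical quantization parameters, for illustration only.
EXAMPLE_SCALE = 0.5
EXAMPLE_ZERO_POINT = 3

raw = [3, 5, 255]
print(dequantize(raw, EXAMPLE_SCALE, EXAMPLE_ZERO_POINT))
```

If the CPU and NPU paths apply different (or no) dequantization, the decoded values will differ even when the raw tensors agree, so it may be useful to compare both the raw and the dequantized outputs.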

 

Attached are the piece of code I am using for inference and the converted yolov5n model.

What could be the cause?

 

Thanks,

Best regards

0 Kudos
1 Solution
1,283 Views
Bio_TICFSL
NXP TechSupport

Hi,

 

You have to move to at least the 5.15.71 BSP.

Regards

21 Replies
377 Views
bb4567
Contributor I

We are working on a project using a Yolov5 model. As others have experienced, the model runs fine on the CPU, with valid bounding boxes and results that match what is expected, but we are having issues getting it to work with the NPU. We processed the model as described in the Object Detection using YOLOv5 document mentioned in this discussion. The closest I've gotten is by converting the saved_model.pb flavor (the frozen graph). But if I follow the documentation literally, I get the error about the missing header and the failure to build the shader, and at the end of the run we don't get any results. I'm trying to zero in on where the problem lies. We are currently using BSP version 5.15.52, but I see mention above that a minimum of 5.15.71 is needed. Could someone provide some guidance on how to resolve this issue? Note that we are using C++. Thanks.
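Since several posts in this thread hinge on which BSP version is installed, a small check can make the comparison explicit. This is a minimal sketch, assuming the BSP release string starts with the kernel version (as the release names quoted in this thread suggest, e.g. 5.15.71_2.2.0). The helper names `parse_kernel_version` and `meets_minimum` are hypothetical, and the default minimum of 5.15.71 is simply the value NXP support names in this thread, not an independently verified requirement.

```python
import re


def parse_kernel_version(release):
    """Return the leading (major, minor, patch) of a kernel/BSP release string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(release, minimum=(5, 15, 71)):
    """True if the release is at least the given (major, minor, patch)."""
    return parse_kernel_version(release) >= minimum


# Release strings quoted in this thread.
for release in ("5.15.52-2.1.0", "5.15.71_2.2.0"):
    print(release, meets_minimum(release))
```

On the target, the string passed in could be the output of `uname -r`. Comparing tuples of integers avoids the pitfalls of comparing version strings lexically.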

0 Kudos
1,189 Views
sams4
Contributor I

@simoberny : I am facing the same issue. For me the detections show up correctly when printed on the console; the issue is with the bounding box coordinates. The coordinates of detected objects are random, and some are negative as well. Can you or NXP support help on this matter?
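Random or negative coordinates like these can be flagged automatically before drawing anything. The following is a minimal sketch, assuming each box is given as normalized (center x, center y, width, height); that layout is an assumption about the model output, and the helper names `xywh_to_corners` and `is_plausible` are hypothetical.

```python
def xywh_to_corners(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return (x1, y1, x2, y2)


def is_plausible(corners, img_w, img_h, margin=0.05):
    """True if the box has positive area and lies (almost) inside the image."""
    x1, y1, x2, y2 = corners
    pad_x, pad_y = margin * img_w, margin * img_h
    inside = (-pad_x <= x1 and x2 <= img_w + pad_x
              and -pad_y <= y1 and y2 <= img_h + pad_y)
    return inside and x2 > x1 and y2 > y1


# Made-up boxes for a 448x448 input: one sane, one with a negative center.
good = xywh_to_corners((0.5, 0.5, 0.25, 0.5), 448, 448)
bad = xywh_to_corners((-0.5, 0.25, 0.5, 0.5), 448, 448)
print(good, is_plausible(good, 448, 448))
print(bad, is_plausible(bad, 448, 448))
```

Counting how many boxes fail such a check on the CPU run versus the NPU run gives a quick, quantitative way to describe the problem when reporting it.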

Thanks!!!

0 Kudos
1,154 Views
simoberny
Contributor II

Which version of BSP? 

So you have correct labels and predictions, but wrong bounding boxes? 

In my case, everything seems wrong. The results seem totally random. 

 

The Variscite customer helpdesk says that the model should be rebuilt to be NPU-compatible. They sent me an optimized small MobileNet SSD model and the detections are perfect. But it is actually a pain in the ass to train.

At least I know for sure that the problem is entirely related to the model itself. For now, I'll use the CPU with a smaller YOLO model, in the hope of finding a way to use it with the NPU.

 

Bests

 

0 Kudos
1,048 Views
sams4
Contributor I

@simoberny : I upgraded the BSP to version 5.15.32_2.0.0 and it worked.

Detections and bounding boxes started appearing.

Thanks!!!

0 Kudos
1,385 Views
Bio_TICFSL
NXP TechSupport

Hello,

Attached you will find some benchmarks for the VX delegate on the i.MX8M Plus, along with an app note on object detection.

Hope this helps

862 Views
hy982530
Contributor I

Hello, @Bio_TICFSL 

My version of BSP is 5.15.71.

I ran yolov5s-32fp-256.tflite on the NPU following your guide. My program is in Python, and the .tflite model gives correct results on the CPU. On the NPU, however, I get correct labels and predictions but wrong bounding boxes.

In addition, some errors appear when I use the VX delegate. Can you tell me what's wrong?

[Attached screenshot of the error messages: hy982530_0-1679544743981.png]

Thank you, 

Best regards 

 

0 Kudos
815 Views
hy982530
Contributor I

I added cl_viv_vx_ext.h in version 5.15.71, and the error message in the picture disappeared.

However, I got the same result: correct labels and predictions, but wrong bounding boxes.
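To pin down a symptom like "correct scores but wrong boxes", it can help to compare the raw CPU and NPU outputs column by column. This is a minimal sketch, assuming both runs produce rows in the same order (for example, the raw output tensor before NMS) with a layout of (cx, cy, w, h, score); the layout, the helper name `column_max_abs_diff`, and the sample rows are all hypothetical.

```python
def column_max_abs_diff(cpu_rows, npu_rows):
    """Per-column maximum absolute difference between two detection tables."""
    if len(cpu_rows) != len(npu_rows):
        raise ValueError("row counts differ")
    n_cols = len(cpu_rows[0]) if cpu_rows else 0
    diffs = [0.0] * n_cols
    for cpu_row, npu_row in zip(cpu_rows, npu_rows):
        if len(cpu_row) != n_cols or len(npu_row) != n_cols:
            raise ValueError("column counts differ")
        for i, (a, b) in enumerate(zip(cpu_row, npu_row)):
            diffs[i] = max(diffs[i], abs(a - b))
    return diffs


# Made-up rows: (cx, cy, w, h, score). Not real measurements.
cpu = [(0.5, 0.5, 0.25, 0.25, 0.75)]
npu = [(0.0, 1.5, 0.25, 0.75, 0.75)]
print(column_max_abs_diff(cpu, npu))
```

Large differences confined to the first four columns would point at the box-regression part of the output rather than the classification part, which narrows down where the two backends diverge.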

0 Kudos
819 Views
hy982530
Contributor I

I rolled the BSP back to 5.15.32 and the bounding boxes are correct!

I want to confirm whether version 5.15.71 is the cause of the incorrect results. Has anyone encountered the same problem?

Thanks

0 Kudos
229 Views
simoberny
Contributor II

I'm now using 5.15.71 and the results are correct with Yolov5.

To be precise I'm using a yolov5s model trained on car recognition. I partially followed the official guide that was uploaded on this topic, but in the conversion to .tflite I used a uint8 quantization as input. In this way the NPU is fully exploited.
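With a uint8 input tensor, the application has to feed quantized values rather than floats. The following is a minimal sketch of that step, assuming the standard TFLite affine scheme, q = round(x / scale) + zero_point, clamped to 0..255. The helper name `quantize_input` and the scale and zero-point values used here are hypothetical; the real ones would be read from the model's input tensor details.

```python
def quantize_input(values, scale, zero_point):
    """Quantize real-valued inputs to uint8: q = round(x / scale) + zero_point."""
    quantized = []
    for x in values:
        q = round(x / scale) + zero_point
        quantized.append(max(0, min(255, q)))  # clamp to the uint8 range
    return quantized


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
print(quantize_input([0.0, 1.0, 200.0, -10.0], scale=0.5, zero_point=10))
```

The last two sample values fall outside the representable range and are clamped, which is the behavior to expect when the preprocessing range does not match the model's input quantization.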

Results are very similar in terms of correctness to the CPU ones. 

What version of TFLite do you have? I'm currently on 2.9.1.

Bests

0 Kudos
225 Views
bb4567
Contributor I
Thanks. We are also running TFLite 2.9.1. Our model is trained to recognize 10 classes of objects, and it is also a yolov5s model before conversion. It runs fine on the CPU, and also runs fine with the Coral TPU, but the NPU is not loving the model. So I tested with the NXP pre-converted .tflite model for their YOLOv5 example, and that fails in the same manner. So something else has to be the issue in our configuration.
0 Kudos
222 Views
simoberny
Contributor II

Ah sorry, I only saw this message now; I had already replied to the other one.

OK, in your case it actually seems like something much more difficult to fix, and it doesn't seem related only to the versions. Anyway, as a last try if you want, take a look at the files I uploaded.

To be honest, in my opinion there is too much useless documentation on these NPUs, and too little of what is really needed. I don't know how many hours I wasted on it, only to find out, here on this forum, that a higher kernel version was needed.

0 Kudos
218 Views
bb4567
Contributor I
Yes, the documentation has been a frustration for me as well, and the explanations from NXP support have not been very good. For example, one of the issues we get is that, partway through execution, the code appears to decide it should now be running on the GPU and attempts to dynamically build a shader, which of course fails. A couple of others on the forums have reported a similar error, but with no explanation of what triggers it.
0 Kudos
1,382 Views
simoberny
Contributor II

Thanks for the response and the documentation. 

The guide actually describes what I already did.

To be thorough, I followed all the steps and recreated a new model. But the situation remains the same: on the CPU it works perfectly, while on the NPU I get nothing but random detections with really low confidence. I tried both INT8-quantized and FLOAT models.

I am on Yocto 5.15.52-2.1.0 which uses Tensorflow 2.5.0 as default. I'm now trying to compile a newer version. 

Another strange behavior is that when I use the VX delegate I can't close the application cleanly, because a segmentation fault occurs. The VX delegate is compiled from the latest version in the official git repo.

Thanks

0 Kudos
1,323 Views
simoberny
Contributor II

I wanted to clarify that the version I'm working on is 5.10.52.

Also, the yolov5_decode python script used in the guide is not accessible

0 Kudos
1,284 Views
Bio_TICFSL
NXP TechSupport

Hi,

 

You have to move to at least the 5.15.71 BSP.

Regards

235 Views
simoberny
Contributor II

Good morning, 

I confirm that starting from version 5.15.71, the NPU works perfectly even with more complex models such as Yolov5. I get great results with reliability equal to the CPU.

I was initially unable to upgrade as there were no updated Basler drivers available for the release.

I'm now getting ~10 FPS in a real-time vehicle detection and tracking application, using a Variscite i.MX8 Plus module on a custom board + Basler da2500-60mci camera + Yocto Kirkstone 5.15.71_2.2.0.

Thanks for the support,

Best regards

0 Kudos
228 Views
bb4567
Contributor I

Glad you got it to work for you.

 

Unfortunately, even though we are now on the 5.15.71 BSP, we continue to have the same issues when trying to use the NPU. This includes using the .tflite model that is in the zip file along with the example script mentioned in the NXP YOLOv5 document, so that should eliminate anything strange about our model.

So it sounds like there might still be a mismatch of some kind in the Yocto build for our board.  The board is from a different vendor than NXP, and the Yocto build details are from that company as well.  So at this point I'm thinking that there is a subtle difference in the build that is causing our issues.

0 Kudos
224 Views
simoberny
Contributor II

1. Who is the board vendor? 

2. Kernel version is 5.15.71, but what version of TFLite does the vendor preinstall in the Yocto recipe? Variscite preloads TensorFlow 2.9.1.

3. Your issue is now wrong bounding boxes, right?

 

I attach the small model I'm currently using with the inference code. I hope it helps you. 

0 Kudos
52 Views
bb4567
Contributor I

Thanks again for the model.  I've been traveling for a while and had meant to get back to you.

I tried with your model and it fails for us in the same way as our model.

What board are you running with?  We are currently working with a TechNexion development board.

We did a full Yocto build and the version for TFlite and others appear to match, but no joy.

This is even when running the Python test script that is in the zip file from the NXP YOLO how-to document, using the NXP model.

So we are still chasing some other issue.

Thanks,

 

0 Kudos
214 Views
bb4567
Contributor I
Thanks, I'll try your model to see if it behaves any differently, but I suspect it will be just like the one from the example in the NXP Yolov5 doc. We ultimately have bad bounding boxes, and invalid detections as well. So we will have to keep digging.
0 Kudos