i.MX 8M PLUS yolox_s.onnx converted to tflite using eIQ tool, but performance is not good.


1,408 Views
woohyoungshin
Contributor II

I downloaded yolox_s.onnx from https://yolox.readthedocs.io/en/latest/demo/onnx_readme.html

and converted it to TFLite using the eIQ tool.

The conversion works fine, and I run inference on the NXP i.MX 8M Plus using the TFLite model.

With the CPU delegate, performance is good,

but with libvx_delegate (NPU), performance is poor.

What is the reason?

I attached the converted TFLite file (yolox.tflite).

CPU INFERENCE RESULT: cpu infer.jpg

NPU INFERENCE RESULT: eIQ_converted_tflite.jpg

When I ran inference on the NPU, the following error occurred:

E [/usr/src/debug/tim-vx/1.1.39-r0/git/src/tim/vx/internal/src/vsi_nn_graph.c:vsi_nn_SetupGraph: 770] CHECK STATUS(-10: the supplied parameter information does not match the kernel contract. )
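For reference, NPU inference with TFLite on the i.MX 8M Plus is typically driven by loading libvx_delegate.so as an external delegate. A minimal Python sketch follows; the delegate path and model filename are the usual BSP defaults, assumed here rather than confirmed in this thread:

```python
# Sketch: run a TFLite model through the external VX (NPU) delegate,
# falling back to the CPU if the delegate cannot be loaded.
# VX_DELEGATE is the usual NXP BSP location (an assumption).
VX_DELEGATE = "/usr/lib/libvx_delegate.so"

def make_delegates(path):
    """Try to load the external delegate; return [] to fall back to the CPU."""
    try:
        from tflite_runtime.interpreter import load_delegate
        return [load_delegate(path)]
    except (ImportError, ValueError, OSError):
        return []  # delegate (or tflite_runtime) unavailable

def build_interpreter(model_path="yolox.tflite"):
    """Build an interpreter that uses the NPU when the delegate loads."""
    from tflite_runtime.interpreter import Interpreter
    interpreter = Interpreter(
        model_path=model_path,
        experimental_delegates=make_delegates(VX_DELEGATE),
    )
    interpreter.allocate_tensors()
    return interpreter
```

Note that the first inference through the VX delegate includes on-device graph compilation, so a warm-up run is usually excluded from any timing measurement.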

8 Replies

1,375 Views
brian14
NXP TechSupport

Hi @woohyoungshin

Thank you for contacting NXP Support.

Could you please tell me your BSP and eIQ version?

I will try to replicate this issue on my side and verify it.

Have a great day!


1,348 Views
woohyoungshin
Contributor II
Here is the information:

BSP : Yocto kirkstone-5.15.32

eIQ version : 2.9.9


1,319 Views
brian14
NXP TechSupport

Thank you for your reply.

I will test and contact you as soon as possible.

Have a great day!

1,276 Views
woohyoungshin
Contributor II
How is the work going? Is it going well?

1,234 Views
brian14
NXP TechSupport

Hi @woohyoungshin

Sorry for the delayed reply.

I have been working on your case, and I found that in our latest BSP release the benchmarks for the i.MX 8M Plus show the following:

For CPU using 4 cores:

brian14_0-1700180438218.png

For NPU:

brian14_1-1700180447233.png

With these results we can see a decrease from 1.427 seconds to 121.617 milliseconds (the NPU is roughly 11x faster than the CPU).
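The speedup quoted above can be checked directly from the two benchmark numbers:

```python
cpu_time_s = 1.427           # CPU inference time (4 threads), seconds
npu_time_s = 121.617 / 1000  # NPU inference time, converted from milliseconds
print(f"speedup: x{cpu_time_s / npu_time_s:.1f}")  # prints "speedup: x11.7"
```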

In addition, after the ONNX-to-TFLite conversion we can see that there are many Transpose and Conv2D operators, which significantly increase CPU inference time. Here is the op profiling:

brian14_2-1700180486288.png

We can see that CONV_2D and TRANSPOSE take around 1 second.
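As an aside, many of those TRANSPOSE operators typically come from layout conversion: ONNX models use NCHW tensor layout while TFLite prefers NHWC, so the converter inserts transposes around convolutions. A small illustration, assuming the standard 640x640 YOLOX-s input (an assumption, not confirmed in this thread):

```python
import numpy as np

# ONNX tensors are NCHW (batch, channels, height, width);
# TFLite expects NHWC, so converters insert Transpose ops like this one.
nchw = np.zeros((1, 3, 640, 640), dtype=np.float32)  # assumed YOLOX-s input
nhwc = np.transpose(nchw, (0, 2, 3, 1))
print(nhwc.shape)  # (1, 640, 640, 3)
```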

In contrast, on the NPU those operators are fully supported and accelerate the inference.

Based on the release notes for i.MX Machine Learning, bugs affecting those operators were fixed between your BSP version and the latest release.

Therefore, I would like to suggest upgrading to the latest BSP release and testing your model again.

I hope this information will be helpful.

Have a great day!


1,199 Views
woohyoungshin
Contributor II
From my understanding, while the execution speed was fast, there were issues with the inference results themselves. I'm curious whether the detections come out correctly when you run inference on images with that setup. Could you clarify this?

1,194 Views
woohyoungshin
Contributor II
As seen in the attached picture (eIQ_converted_tflite.jpg), there was an issue with accuracy.

1,201 Views
woohyoungshin
Contributor II
Could I know the BSP version at which the bug was fixed?