i.MX 8M Plus: yolox_s.onnx converted to TFLite using the eIQ tool, but performance is not good.

woohyoungshin
Contributor II

I downloaded yolox_s.onnx from https://yolox.readthedocs.io/en/latest/demo/onnx_readme.html

and converted it to TFLite using the eIQ program.

The conversion works fine, and I run inference with the TFLite model on the NXP i.MX 8M Plus.

The CPU delegate gives good performance,

but libvx_delegate (NPU) gives poor performance.

What is the reason?

I have attached the converted TFLite file (yolox.tflite).

CPU INFERENCE RESULT: cpu infer.jpg

NPU INFERENCE RESULT: eIQ_converted_tflite.jpg

When I ran inference on the NPU, the following ERROR occurred:

E [/usr/src/debug/tim-vx/1.1.39-r0/git/src/tim/vx/internal/src/vsi_nn_graph.c:vsi_nn_SetupGraph: 770] CHECK STATUS(-10: the supplied parameter information does not match the kernel contract. )
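
For reference, this is roughly how I run the model with each delegate (a minimal sketch using tflite_runtime; /usr/lib/libvx_delegate.so is where the VX delegate is installed on my Yocto image, adjust if yours differs):

# Sketch of how yolox.tflite is run with each delegate on the board.
import numpy as np
import tflite_runtime.interpreter as tflite

def run(model_path, use_npu=False):
    delegates = None
    if use_npu:
        # VX delegate path on the default i.MX Yocto image (assumption).
        delegates = [tflite.load_delegate("/usr/lib/libvx_delegate.so")]
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    # Dummy input just to exercise the graph; real code feeds a
    # preprocessed image here.
    interpreter.set_tensor(inp["index"],
                           np.zeros(inp["shape"], dtype=inp["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])

run("yolox.tflite", use_npu=False)  # CPU: results look correct
run("yolox.tflite", use_npu=True)   # NPU: poor results, error printed above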

8 Replies

brian14
NXP TechSupport

Hi @woohyoungshin

Thank you for contacting NXP Support.

Could you please tell me your BSP and eIQ version?

I will try to replicate this issue on my side and verify it.

Have a great day!

woohyoungshin
Contributor II
Here is the information:

BSP: Yocto kirkstone-5.15.32

eIQ version: 2.9.9


brian14
NXP TechSupport

Thank you for your reply.

I will test and contact you as soon as possible.

Have a great day!

woohyoungshin
Contributor II
How is the work going? Is it going well?

brian14
NXP TechSupport

Hi @woohyoungshin

Sorry for the delayed reply.

I have been working on your case, and I found that in our latest BSP release the benchmarks for the i.MX 8M Plus show the following:

For CPU using 4 cores:

brian14_0-1700180438218.png

For NPU:

brian14_1-1700180447233.png

With these results, we can see the inference time decrease from 1.427 seconds to 121.617 milliseconds (the NPU is roughly 11x faster than the CPU).
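
If you want a quick cross-check of CPU vs. NPU latency on your own board, here is a minimal Python timing sketch using tflite_runtime (just a sketch, not the tool used for the numbers above; the delegate path assumes the default BSP location, and the first NPU inference includes graph compilation, so a warm-up run is needed before timing):

# Rough latency comparison with tflite_runtime (sketch).
import time
import numpy as np
import tflite_runtime.interpreter as tflite

def avg_ms(model_path, delegates=None, runs=50):
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    interpreter.set_tensor(inp["index"],
                           np.zeros(inp["shape"], dtype=inp["dtype"]))
    interpreter.invoke()  # warm-up: the first NPU run includes graph compilation
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    return (time.perf_counter() - start) / runs * 1000.0  # ms per inference

print("CPU:", avg_ms("yolox.tflite"))
print("NPU:", avg_ms("yolox.tflite",
                     [tflite.load_delegate("/usr/lib/libvx_delegate.so")]))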

In addition, after the ONNX to TFLite conversion we can see that there are many TRANSPOSE and CONV_2D operators that significantly affect CPU inference time. Here is the op profiling:

brian14_2-1700180486288.png

We can see that CONV_2D and TRANSPOSE take around 1 second.
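
As a side note, you can inspect the converted model yourself on a host machine with the TFLite Analyzer API from the full tensorflow package (a minimal sketch; it prints the graph structure so you can count the TRANSPOSE and CONV_2D nodes the converter inserted, it does not profile timing):

# Host-side inspection of the converted model (requires the full
# tensorflow package, TF >= 2.7, not tflite_runtime).
import tensorflow as tf
tf.lite.experimental.Analyzer.analyze(model_path="yolox.tflite")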

In contrast, those operators are fully supported on the NPU, which accelerates the inference.

Based on the i.MX Machine Learning release notes, bugs in those operators have been fixed between your BSP version and the latest BSP version.

Therefore, I would like to suggest upgrading your BSP version to the latest release and testing your model.

I hope this information will be helpful.

Have a great day!

woohyoungshin
Contributor II
From my understanding, the execution speed was fast, but there were issues with the results themselves. I'm curious whether the detection results come out correctly when you run inference on images with that setup. Could you clarify this?

woohyoungshin
Contributor II
As seen in the attached picture (eIQ_converted_tflite.jpg), there was an issue with accuracy.

woohyoungshin
Contributor II
Could you tell me the BSP version in which the bug was fixed?