IMX8MP EVK / tflite Post Process execution core

tetsuro-okuyama
Contributor V

Hi all,


I tried to run an SSD MobileNet V2 .tflite model (trained and converted through the Object Detection API) on the IMX8MP EVK.

Here is the benchmark_model result for this .tflite model:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Operator-wise Profiling Info for Regular Benchmark Runs:
============================== Run Order ==============================
                 [node type]   [start]   [first]  [avg ms]       [%]    [cdf%]  [mem KB]  [times called]
                    QUANTIZE         0     0.615     0.607     2.66%     2.66%         0               1
         TfLiteNnapiDelegate     0.607    18.077    17.979    78.89%    81.56%         0               1
                  DEQUANTIZE    18.587     0.025     0.025     0.11%    81.67%         0               1
                  DEQUANTIZE    18.613     0.473     0.474     2.08%    83.75%         0               1
TFLite_Detection_PostProcess    19.087     3.746     3.704    16.25%   100.00%         0               1

============================== Top by Computation Time ==============================
                 [node type]   [start]   [first]  [avg ms]       [%]    [cdf%]  [mem KB]  [times called]
         TfLiteNnapiDelegate     0.607    18.077    17.979    78.89%    78.89%         0               1
TFLite_Detection_PostProcess    19.087     3.746     3.704    16.25%    95.15%         0               1
                    QUANTIZE         0     0.615     0.607     2.66%    97.81%         0               1
                  DEQUANTIZE    18.613     0.473     0.474     2.08%    99.89%         0               1
                  DEQUANTIZE    18.587     0.025     0.025     0.11%   100.00%         0               1

Number of nodes executed: 5
============================== Summary by node type ==============================
                 [Node type]   [count]  [avg ms]   [avg %]   [cdf %]  [mem KB]  [times called]
         TfLiteNnapiDelegate         1    17.978    78.90%    78.90%         0               1
TFLite_Detection_PostProcess         1     3.704    16.26%    95.15%         0               1
                    QUANTIZE         1     0.606     2.66%    97.81%         0               1
                  DEQUANTIZE         2     0.499     2.19%   100.00%         0               2

Timings (microseconds): count=50 first=22936 curr=22800 min=22694 max=22972 avg=22789 std=58
Memory (bytes): count=0
5 nodes observed

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems that the post-process step (TFLite_Detection_PostProcess) is executed on the CPU rather than on the NPU (via NNAPI).

If so, I assume this behavior differs from the Processor Reference Manual, which explains that the PPU in the NPU should execute post-processing (non-max suppression).

I also see an unexpected processing delay in the post-process step on the IMX8MP EVK.

So I need a countermeasure for this behavior.
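To make the split concrete, here is a minimal Python sketch that totals the per-node averages from the "Summary by node type" table. It rests on one assumption that the profile itself does not state: every node listed outside TfLiteNnapiDelegate (QUANTIZE, DEQUANTIZE, TFLite_Detection_PostProcess) was not claimed by the delegate and therefore ran on the CPU. The helper name `split_by_executor` is only illustrative.

```python
# Average per-node times in milliseconds, copied from the
# "Summary by node type" table in the benchmark_model output.
avg_ms = {
    "TfLiteNnapiDelegate": 17.978,
    "TFLite_Detection_PostProcess": 3.704,
    "QUANTIZE": 0.606,
    "DEQUANTIZE": 0.499,
}

# Assumption: only this node was handed to NNAPI; every other
# node in the profile is treated as CPU work.
DELEGATE_NODE = "TfLiteNnapiDelegate"


def split_by_executor(times):
    """Return (delegated_ms, cpu_ms) for a {node type: avg ms} mapping."""
    delegated = times.get(DELEGATE_NODE, 0.0)
    cpu = sum(ms for name, ms in times.items() if name != DELEGATE_NODE)
    return delegated, cpu


delegated_ms, cpu_ms = split_by_executor(avg_ms)
total_ms = delegated_ms + cpu_ms
post_ms = avg_ms["TFLite_Detection_PostProcess"]

print(f"NNAPI delegate : {delegated_ms:.3f} ms ({100 * delegated_ms / total_ms:.1f}% of total)")
print(f"Outside NNAPI  : {cpu_ms:.3f} ms ({100 * cpu_ms / total_ms:.1f}% of total)")
print(f"Post-process   : {post_ms:.3f} ms ({100 * post_ms / cpu_ms:.1f}% of non-delegated time)")
```

Under that assumption, roughly 4.8 ms of the 22.8 ms average inference time is spent outside the delegate, and most of it is the post-process node.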

Thanks in advance.

 

Best Regards,

1 Solution
tetsuro-okuyama
Contributor V

Hi all,

Sorry for the double post.
The post on the left was flagged as spam, so I posted it twice as a test.