IMX8MP EVK / tflite Post Process execution core


1,112 views
Kei-Ueda
Contributor I

 

Hi all,

I'm trying to run an SSD MobileNet V2 .tflite model (trained and converted with the TensorFlow Object Detection API) on the IMX8MP EVK.

Here is the benchmark_model result for the .tflite file:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Operator-wise Profiling Info for Regular Benchmark Runs:
============================== Run Order ==============================

                 [node type]   [start]   [first]  [avg ms]       [%]    [cdf%]  [mem KB]  [times called]
                    QUANTIZE         0     0.615     0.607     2.66%     2.66%         0               1
         TfLiteNnapiDelegate     0.607    18.077    17.979    78.89%    81.56%         0               1
                  DEQUANTIZE    18.587     0.025     0.025     0.11%    81.67%         0               1
                  DEQUANTIZE    18.613     0.473     0.474     2.08%    83.75%         0               1
TFLite_Detection_PostProcess    19.087     3.746     3.704    16.25%   100.00%         0               1

============================== Top by Computation Time ==============================

                 [node type]   [start]   [first]  [avg ms]       [%]    [cdf%]  [mem KB]  [times called]
         TfLiteNnapiDelegate     0.607    18.077    17.979    78.89%    78.89%         0               1
TFLite_Detection_PostProcess    19.087     3.746     3.704    16.25%    95.15%         0               1
                    QUANTIZE         0     0.615     0.607     2.66%    97.81%         0               1
                  DEQUANTIZE    18.613     0.473     0.474     2.08%    99.89%         0               1
                  DEQUANTIZE    18.587     0.025     0.025     0.11%   100.00%         0               1

Number of nodes executed: 5
============================== Summary by node type ==============================

                 [Node type]   [count]  [avg ms]   [avg %]   [cdf %]  [mem KB]  [times called]
         TfLiteNnapiDelegate         1    17.978    78.90%    78.90%         0               1
TFLite_Detection_PostProcess         1     3.704    16.26%    95.15%         0               1
                    QUANTIZE         1     0.606     2.66%    97.81%         0               1
                  DEQUANTIZE         2     0.499     2.19%   100.00%         0               2

 

Timings (microseconds): count=50 first=22936 curr=22800 min=22694 max=22972 avg=22789 std=58
Memory (bytes): count=0
5 nodes observed

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It seems that the post-processing op (TFLite_Detection_PostProcess) is executed on the CPU cores rather than on the NPU (via NNAPI).
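As a rough cross-check of that reading, the per-node averages from the "Summary by node type" table can be split into time spent inside the NNAPI delegate and time spent elsewhere. This is only a sketch: it assumes that every node reported outside TfLiteNnapiDelegate ran on the regular TFLite CPU kernels, and the helper name split_by_delegate is made up for illustration.

```python
# Average time per node type (ms), copied from the "Summary by node type" table.
avg_ms = {
    "TfLiteNnapiDelegate": 17.978,
    "TFLite_Detection_PostProcess": 3.704,
    "QUANTIZE": 0.606,
    "DEQUANTIZE": 0.499,
}


def split_by_delegate(times, delegate_node="TfLiteNnapiDelegate"):
    """Return (delegated_ms, other_ms) for a node-type -> avg-ms mapping.

    Assumption: nodes other than the delegate node were not offloaded.
    """
    delegated = sum(t for name, t in times.items() if name == delegate_node)
    other = sum(t for name, t in times.items() if name != delegate_node)
    return delegated, other


delegated_ms, other_ms = split_by_delegate(avg_ms)
total_ms = delegated_ms + other_ms

print(f"inside NNAPI delegate : {delegated_ms:.3f} ms ({100 * delegated_ms / total_ms:.1f}%)")
print(f"outside the delegate  : {other_ms:.3f} ms ({100 * other_ms / total_ms:.1f}%)")
print(f"total                 : {total_ms:.3f} ms")
```

The total should land close to the avg=22789 microseconds reported on the Timings line, which suggests the four node types account for essentially the whole inference.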

If so, this seems to differ from the processor Reference Manual, which says that the PPU in the NPU should execute post-processing (non-maximum suppression).

I am also seeing an unexpected processing delay in the post-processing step on the IMX8MP EVK.

So I need a workaround for this behavior.

Thanks in advance.

 

Best Regards,

1 Reply

1,073 views
Kei-Ueda
Contributor I

Hi all,

 

I checked the latest i.MX 8M Plus Applications Processor Reference Manual, Rev. 1 (https://www.nxp.com/webapp/Download?colCode=IMX8MPRM), and the list of supported neural network layers is no longer in it.

Does this mean that the NPU (via NNAPI) currently cannot run the PostProcess op, including non-maximum suppression?

Please let me know the details.

Thanks in advance.

 

Best Regards,
