How to use the NPU delegate for running inference from Python

amrithkrish
Contributor I

Hello team,

I have a custom TFLite model for object detection, and I want to run inference with it on an i.MX 8M Plus board.

The Python script I have written performs the inference with the default delegate, the "XNNPACK delegate for CPU."

I want to use the NPU for inference on the board, so I tried switching to libvx_delegate in the tflite.Interpreter in my script, as shown below:

 

import tflite_runtime.interpreter as tflite
# Load the VX (NPU) delegate and pass it to the interpreter
delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
ModelInterpreter = tflite.Interpreter(model_path=ModelPath, experimental_delegates=[delegate])

 

However, when I printed the delegate I loaded, it shows up as:

Delegate used :  <tflite_runtime.interpreter.Delegate object at 0xffff70472390>

 

 

What should I change or add in my Python script so that it uses the NPU for inference?

Thanks in advance!

amrithkrish
Contributor I

Hi Brian,

I checked the code that was shared and implemented a similar Python script. However, my custom model takes around 35 seconds to perform one inference when libvx_delegate.so is used (warm-up time excluded).
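
The measurement was done roughly along these lines (a sketch rather than the exact script; ModelPath, the dummy input, and the number of runs are placeholders):

import time
import numpy as np
import tflite_runtime.interpreter as tflite

ModelPath = '/path/to/model.tflite'  # placeholder
delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
interpreter = tflite.Interpreter(model_path=ModelPath, experimental_delegates=[delegate])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
# Dummy input with the shape and dtype the model expects
dummy = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy)

interpreter.invoke()  # warm-up run, not timed

runs = 10
start = time.monotonic()
for _ in range(runs):
    interpreter.invoke()
print("average inference time: %.3f s" % ((time.monotonic() - start) / runs))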

So my questions are:

1. Is this expected?

2. Does the model have any influence on the inference time taken (like the model size)?

3. Can this time be reduced if NNStreamer is used instead of running it from Python?

 

Thanks in advance!

brian14
NXP TechSupport

Thank you for your reply.

1. Is this expected?

It depends on your model, but 35 seconds per inference is very poor performance.

2. Does the model have any influence on the inference time taken (like the model size)?

Yes, in your implementation the model itself accounts for most of the time.

3. Can this time be reduced if NNStreamer is used instead of running it from Python?

It seems that the problem is in your model's optimization. I suggest you review your model in detail by profiling it and running benchmarks.
Please have a look at the i.MX Machine Learning User's Guide, specifically Section 9, "NN Execution on Hardware Accelerators":
i.MX Machine Learning User's Guide (nxp.com)
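
As a starting point, you could run the TFLite benchmark tool with the VX delegate and per-operator profiling enabled. A minimal sketch (the location of the benchmark_model binary depends on your BSP image, and the model path is a placeholder):

import subprocess

# Path to the TFLite benchmark tool; adjust for your image, it is typically
# installed under the tensorflow-lite examples directory on i.MX BSP releases.
benchmark_bin = "benchmark_model"

subprocess.run([
    benchmark_bin,
    "--graph=/path/to/model.tflite",                        # placeholder model path
    "--external_delegate_path=/usr/lib/libvx_delegate.so",  # run on the NPU
    "--enable_op_profiling=true",                           # per-operator timing
    "--num_runs=50",
    "--warmup_runs=1",
], check=True)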

Have a great day!

amrithkrish
Contributor I

Hi @brian14 

I had one more doubt. When I try running my custom YOLOv3 model using the inference script you mentioned before, I get a segmentation fault, but the script runs fine with other models.

Why is this error produced for this particular model? Can the size of the model be a factor?

 

Thanks in advance!

 

brian14
NXP TechSupport

Hi @amrithkrish,

Based on the error you describe, it is possible that it is related to the model size.
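
To narrow it down, one quick check (a sketch, not a definitive diagnosis; the model path is a placeholder) is to allocate the interpreter once without the delegate and once with it, to see whether the crash happens only when the VX delegate is involved:

import tflite_runtime.interpreter as tflite

MODEL = "/path/to/yolov3.tflite"  # placeholder path to your model

# 1) CPU only: if this already crashes, the problem is in the model itself
cpu = tflite.Interpreter(model_path=MODEL)
cpu.allocate_tensors()
print("CPU allocation OK")

# 2) With the VX delegate: if only this step crashes, the issue is in
#    delegating the model to the NPU (size, unsupported operators, ...)
vx = tflite.load_delegate('/usr/lib/libvx_delegate.so')
npu = tflite.Interpreter(model_path=MODEL, experimental_delegates=[vx])
npu.allocate_tensors()
print("NPU allocation OK")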

 

amrithkrish
Contributor I

Thank you for the information.

 

 

brian14
NXP TechSupport

Hi @amrithkrish

Thank you for contacting NXP Support!

I have reviewed your Python code, and it seems that you are correctly loading the external delegate.
However, I'm not sure why you are trying to print the delegate.

If you want to obtain the output from your model using the NPU, you will need to implement your code following this example:

tflite-vx-delegate-imx/examples/python/label_image.py at lf-6.1.36_2.1.0 · nxp-imx/tflite-vx-delegat...
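
In outline, the flow is as follows (a minimal sketch rather than the exact example code; the model path and the dummy input are placeholders, and a real script would feed a preprocessed image):

import numpy as np
import tflite_runtime.interpreter as tflite

vx_delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
interpreter = tflite.Interpreter(model_path='/path/to/model.tflite',
                                 experimental_delegates=[vx_delegate])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Input matching the shape and dtype the model expects
image = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], image)

# The first invoke is slow because the graph is prepared for the NPU;
# subsequent invokes run at the normal inference speed.
interpreter.invoke()

detections = interpreter.get_tensor(output_details[0]['index'])
print(detections.shape)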

I hope this information will be helpful.

Have a great day!

amrithkrish
Contributor I

Hi Brian,

Thank you for your support.

I printed the delegate to check whether it shows up as an external delegate (as expected when using the NPU).

I will look into the code you shared.

Thanks once again!

 
