How to use the NPU delegate for running inference from Python


1,593 Views
amrithkrish
Contributor I

Hello team,

I have a custom TFLite model for object detection, and I want to run inference with it on an i.MX 8M Plus board.

The Python script I have written performs the inference with the default delegate, "XNNPACK delegate for CPU."

I want to use the NPU for inference on the board, so I tried loading libvx_delegate into the tflite.Interpreter in my script as shown below:

import tflite_runtime.interpreter as tflite

delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
ModelInterpreter = tflite.Interpreter(model_path=ModelPath,
                                      experimental_delegates=[delegate])


However, when I print the delegate I used, it shows:

 

Delegate used :  <tflite_runtime.interpreter.Delegate object at 0xffff70472390>

 

 

What should I change or add in my Python script so that it uses the NPU for inference?

Thanks in advance!

7 Replies

1,503 Views
amrithkrish
Contributor I

Hi Brian,

I checked the code that was shared and implemented a similar Python script. However, my custom model takes around 35 seconds to perform one inference when libvx_delegate.so is used (I excluded the warm-up time).
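
Roughly, this is how I measured the time (a minimal sketch with a placeholder model path and dummy input; the first invoke is the warm-up and is excluded from the timing):

import time
import numpy as np
import tflite_runtime.interpreter as tflite

delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
interpreter = tflite.Interpreter(model_path='model.tflite',  # placeholder path
                                 experimental_delegates=[delegate])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))

interpreter.invoke()  # warm-up: the VX delegate compiles the graph on the first run

start = time.monotonic()
interpreter.invoke()  # timed inference
print('Inference time: %.2f s' % (time.monotonic() - start))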

So my questions are:

1. Is this expected?

2. Does the model itself influence the inference time (for example, the model size)?

3. Can this time be reduced by using NNStreamer instead of running the inference from Python?

 

Thanks in advance!


1,491 Views
brian14
NXP TechSupport

Thank you for your reply.

1. Is this expected?

It depends on your model, but 35 seconds per inference is very poor performance.

2. Does the model itself influence the inference time (for example, the model size)?

Yes. In your implementation, the model itself accounts for most of the time.

3. Can this time be reduced by using NNStreamer instead of running the inference from Python?

It seems that the problem is in your model optimization. I suggest you review your model in detail by profiling it and running benchmarks.
Please have a look at the i.MX Machine Learning User's Guide, specifically Section 9, "NN Execution on Hardware Accelerators":
i.MX Machine Learning User's Guide (nxp.com)
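
As a quick first check before deeper profiling, a small script along these lines can compare the average latency of the CPU path against the VX delegate (the model path and run count are placeholders):

import time
import numpy as np
import tflite_runtime.interpreter as tflite

def average_latency_ms(delegates, runs=10):
    interpreter = tflite.Interpreter(model_path='model.tflite',  # placeholder path
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))
    interpreter.invoke()  # warm-up run, not counted
    start = time.monotonic()
    for _ in range(runs):
        interpreter.invoke()
    return (time.monotonic() - start) / runs * 1000.0

print('CPU: %.1f ms' % average_latency_ms([]))
print('NPU: %.1f ms' % average_latency_ms(
    [tflite.load_delegate('/usr/lib/libvx_delegate.so')]))

If the NPU number stays far above the CPU number, some operations are likely falling back to the CPU, which the benchmarking tools described in the User's Guide can confirm.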

Have a great day!


1,446 Views
amrithkrish
Contributor I

Hi @brian14 

I have one more doubt. When I try running my custom YOLOv3 model with the inference script you mentioned before, I get a "Segmentation fault", but the script runs fine with other models.

Why is this error produced for this particular model? Can the size of the model be a factor?

 

Thanks in advance!

 


1,376 Views
brian14
NXP TechSupport

Hi @amrithkrish,

Based on your description, it is possible that this error is related to the model size.
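
To narrow down where the crash happens, you could print a marker before each stage; the last marker shown before the "Segmentation fault" tells you whether the delegate load, tensor allocation, or the inference itself is failing. A sketch with a placeholder model path:

import numpy as np
import tflite_runtime.interpreter as tflite

print('loading delegate')
delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')

print('creating interpreter')
interpreter = tflite.Interpreter(model_path='yolov3.tflite',  # placeholder path
                                 experimental_delegates=[delegate])

print('allocating tensors')
interpreter.allocate_tensors()

print('running inference')
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp['index'], np.zeros(inp['shape'], dtype=inp['dtype']))
interpreter.invoke()

print('done')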

 


1,358 Views
amrithkrish
Contributor I

Thank you for the information.

 

 


1,567 Views
brian14
NXP TechSupport

Hi @amrithkrish

Thank you for contacting NXP Support!

I have reviewed your Python code, and it seems that you are loading the external delegate correctly.
However, I'm not sure why you are trying to print the delegate.

If you want to obtain the output from your model using the NPU, you will need to implement your code based on the following example:

tflite-vx-delegate-imx/examples/python/label_image.py at lf-6.1.36_2.1.0 · nxp-imx/tflite-vx-delegat...
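
In essence, that example boils down to something like this sketch (the model and image paths are placeholders, and any input normalization your model needs is omitted):

import numpy as np
from PIL import Image
import tflite_runtime.interpreter as tflite

# Load the VX delegate so that supported operations run on the NPU.
delegate = tflite.load_delegate('/usr/lib/libvx_delegate.so')
interpreter = tflite.Interpreter(model_path='model.tflite',  # placeholder path
                                 experimental_delegates=[delegate])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Resize the image to the height and width the model expects.
_, height, width, _ = inp['shape']
image = Image.open('image.jpg').resize((width, height))  # placeholder image
data = np.expand_dims(np.asarray(image, dtype=inp['dtype']), axis=0)

interpreter.set_tensor(inp['index'], data)
interpreter.invoke()
print(interpreter.get_tensor(out['index']))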

I hope this information will be helpful.

Have a great day!


1,542 Views
amrithkrish
Contributor I

Hi Brian,

Thank you for your support.

I printed the delegate to check whether it shows as external (as expected when using the NPU).

I will look into the code you shared.

Thanks once again!

 
