Unable to run Vela models on i.MX93 NPU

jcase · ‎03-20-2024

Hello,

We have a custom board with an NXP i.MX93 (A0 silicon) processor. I am trying to run Vela-converted models in the demo applications provided by NXP, but the models fail to run. I also have an imx93evk (A0 silicon).

Custom Board & EVK Image:

Static hostname: maaxboardosm93
Icon name: computer
Machine ID: 548144e66f8842339dbc0cf9790b00a2
Boot ID: cef15273ca5f4569a5af7fa7089f4311
Operating System: NXP i.MX Release Distro 6.1-mickledore (mickledore)
Kernel: Linux 6.1.22+g37ae7d12b05b
Architecture: arm64

Setup:

In the custom board image, we have included the meta-imx layer, which includes the meta-ml packages. After building the Yocto image and deploying it to the board, I verified that the two Ethos-U libraries are present under /usr/lib/:

/usr/lib/libethosu_delegate.so
/usr/lib/libethosu.so.1.0.0

I also verify that I see demo applications under /usr/bin, for example:

/usr/bin/tensorflow-lite-2.10.0

Issue:

When I try to run the example demo above with a Vela-converted model, I get a segmentation fault.

root@maaxboardosm93:/usr/bin/tensorflow-lite-2.10.0/examples# ./label_image -m ../examples/output/mobilenet_v1_1.0_224_quant_vela.tflite
INFO: Loaded model ../examples/output/mobilenet_v1_1.0_224_quant_vela.tflite
INFO: resolved reporter
Segmentation fault

When I run this exact same example on the EVK, it runs without issues.

I suspect that dependencies or critical files are missing from our Yocto image, but after diffing the EVK image against our image, I find no major differences in the meta-ml layers. Can you please let me know which files are needed in order to exercise the Ethos-U microNPU?

Thanks!

jcase · ‎03-26-2024

Hello,

We discovered the issue with enabling NPU on our image.

The Ethos-U-Firmware was not ported over correctly from the base NXP image. After correcting for this, we are able to now run Vela models and observe the speed increase on the NPU.
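For anyone hitting the same symptom, a minimal preflight sketch in Python follows. It only checks that the files mentioned in this thread exist on the target. The library paths and the /dev/ethosu0 device node come from the posts and logs here; the firmware location /lib/firmware/ethosu_firmware is an assumption and may differ in your BSP, so adjust it to match the base NXP image. The helper names are hypothetical.

```python
import os

# Paths taken from this thread, plus one ASSUMED firmware location.
ETHOSU_PATHS = {
    "delegate library": "/usr/lib/libethosu_delegate.so",
    "driver library": "/usr/lib/libethosu.so.1.0.0",
    "device node": "/dev/ethosu0",
    # ASSUMPTION: firmware file name and directory; verify against the EVK image.
    "firmware (assumed path)": "/lib/firmware/ethosu_firmware",
}


def find_missing(paths):
    """Return {label: path} for every path that does not exist."""
    return {label: p for label, p in paths.items() if not os.path.exists(p)}


def report(paths=ETHOSU_PATHS):
    """Print one status line per path; return True if nothing is missing."""
    missing = find_missing(paths)
    for label, p in paths.items():
        state = "MISSING" if label in missing else "ok"
        print(f"{state:8s}{label}: {p}")
    return not missing


if __name__ == "__main__":
    report()
```

Running the same script on the EVK and on the custom board and comparing the two reports is a quick way to spot a component that did not make it into the custom image. A file that exists can still be the wrong build, so this is a first check only.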

Feel free to close this issue.

Thanks!

Jacob


Bio_TICFSL · ‎03-21-2024

Hello,

This is probably caused by rev0 silicon, which can produce this memory segmentation fault; in that case you need to upgrade the i.MX93 to rev1. Alternatively, the board may be low on memory. We can run the NPU models fine on the EVK.

Regards

jcase · ‎03-21-2024

Hello,

Thanks for your reply.

I verified that both the custom board and the EVK use A0 silicon (pimx9352CVUXKAA - assuming this is A0). I've run several demos on this silicon with the EVK and have had no issues.

Regarding memory, I run ./label_image on a static image with the Vela model while the board is otherwise idle, so no other programs or applications are running.

Any other thoughts or ideas on why this may be occurring? Also to note, when I run the same script with the Vela model and explicitly load the external delegate (which isn't required), I receive the following error. Does this give any more insight into the problem?
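One way to back up the "idle board" observation with a number is to read MemAvailable from /proc/meminfo just before launching the demo. This is a minimal sketch that assumes a standard Linux /proc/meminfo layout; the function names are hypothetical.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_mib(path="/proc/meminfo"):
    """Return MemAvailable in MiB (the kernel reports it in kB)."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemAvailable"] / 1024
```

On the board, `print(available_mib())` before running label_image shows how much memory is free at that moment; comparing the value on the EVK and the custom board helps rule memory in or out.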

INFO: EthosDelegate: 1 nodes delegated out of 1 nodes with 1 partitions.

ERROR: Failed to create ethos_u driver.

ERROR: Delegate kernal was not initialized

ERROR: Node number 1 (EthosDelegate) failed to prepare

ERROR: Failed to allocate tensors!

Bio_TICFSL · ‎03-21-2024

Hi,

A couple of things to check:

Is the model the float model or the quantized model? GPU backend only supports float.
Is the input tensor in float or in uint8? In other words, if you have the input tensor of size [1, 224, 224, 3], are you feeding 1x224x224x3 bytes or 1x224x224x3xsizeof(float) bytes?
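To make the byte-count question concrete, here is a small sketch that computes the buffer size for the [1, 224, 224, 3] input in both cases. The dtype table and function name are illustrative only and are not part of any TensorFlow Lite API.

```python
import math

# Element sizes in bytes for the dtypes discussed above.
BYTES_PER_ELEMENT = {"uint8": 1, "int8": 1, "float32": 4}


def tensor_nbytes(shape, dtype):
    """Number of bytes needed to hold a dense tensor of this shape and dtype."""
    return math.prod(shape) * BYTES_PER_ELEMENT[dtype]


shape = (1, 224, 224, 3)
print(tensor_nbytes(shape, "uint8"))    # 150528
print(tensor_nbytes(shape, "float32"))  # 602112
```

A quantized model expects the smaller buffer, so feeding float data of four times the size (or the reverse) is a mismatch worth ruling out.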

Regards

jcase · ‎03-21-2024

Hi,

For now I'm not using a custom model. I'm trying to exercise the NXP provided scripts as shown below but run into these errors.

The model (mobilenet) that is being passed in is provided in the meta-imx/machine learning layer in yocto. So that is basically included in our yocto image as well, but there seems to be an issue with getting the ethos driver working?

Running this same script on the EVK works fine.

root@maaxboardosm93:/usr/bin/tensorflow-lite-2.10.0/examples# python3 label_image.py -m mobilenet_v1_1.0_224_quant_vela.tflite 
Segmentation fault

root@maaxboardosm93:/usr/bin/tensorflow-lite-2.10.0/examples# python3 label_image.py -m mobilenet_v1_1.0_224_quant_vela.tflite -e /usr/lib/libethosu_delegate.so 
Loading external delegate from /usr/lib/libethosu_delegate.so with args: {}
INFO: Ethosu delegate: device_name set to /dev/ethosu0.
INFO: Ethosu delegate: cache_file_path set to .
INFO: Ethosu delegate: timeout set to 60000000000.
INFO: Ethosu delegate: enable_cycle_counter set to 0.
INFO: Ethosu delegate: enable_profiling set to 0.
INFO: Ethosu delegate: profiling_buffer_size set to 2048.
INFO: Ethosu delegate: pmu_event0 set to 0.
INFO: Ethosu delegate: pmu_event1 set to 0.
INFO: Ethosu delegate: pmu_event2 set to 0.
INFO: Ethosu delegate: pmu_event3 set to 0.
INFO: EthosuDelegate: 1 nodes delegated out of 1 nodes with 1 partitions.
Traceback (most recent call last):
  File "/usr/bin/tensorflow-lite-2.10.0/examples/label_image.py", line 92, in <module>
    interpreter.allocate_tensors()
  File "/usr/lib/python3.11/site-packages/tflite_runtime/interpreter.py", line 513, in allocate_tensors
    return self._interpreter.AllocateTensors()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Failed to create ethos_u driver.
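The failing step can be reproduced outside label_image.py with a few lines of Python, which makes it easier to compare the EVK and the custom board. The sketch below uses `load_delegate` and `Interpreter` from `tflite_runtime.interpreter`; the helper names are hypothetical, and it is an assumption that the delegate accepts option keys matching the names it prints in the log above (for example `device_name`).

```python
ETHOSU_DELEGATE = "/usr/lib/libethosu_delegate.so"  # path from this thread


def ethosu_options(device_name="/dev/ethosu0", **extra):
    """Build a string-valued options dict for the external delegate.

    ASSUMPTION: option keys mirror the names printed in the delegate's log.
    """
    opts = {"device_name": device_name}
    opts.update(extra)
    return {key: str(value) for key, value in opts.items()}


def make_interpreter(model_path, delegate_path=ETHOSU_DELEGATE, options=None):
    """Load the delegate, create the interpreter, and allocate tensors."""
    # Imported lazily so ethosu_options() works without tflite_runtime installed.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegate = load_delegate(delegate_path, options or ethosu_options())
    interpreter = Interpreter(model_path=model_path,
                              experimental_delegates=[delegate])
    try:
        interpreter.allocate_tensors()
    except RuntimeError as err:
        # This is the call that fails in the traceback above.
        raise RuntimeError(
            f"allocate_tensors() failed with the Ethos-U delegate: {err}"
        ) from err
    return interpreter
```

Calling `make_interpreter("mobilenet_v1_1.0_224_quant_vela.tflite")` on the board should either return a ready interpreter or raise with the same "Failed to create ethos_u driver" message, isolating the problem from the rest of the demo script.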

Thanks,

Jacob
