i.MX93 EVKCM EthosU NPU Example Error IOCTL failed

cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

i.MX93 EVKCM EthosU NPU Example Error IOCTL failed

nxf50230, NXP Employee

Hi,

I am facing an error when trying to run inference with AI models on the Ethos-U NPU on the i.MX93 EVKCM.

Used BSP: Linux imx93evk 6.1.1+g29549c7073bf

First I compiled the MobileNet model from the tensorflow-lite examples folder:

root@imx93evk:/usr/bin/tensorflow-lite-2.10.0/examples# vela mobilenet_v1_1.0_224_quant.tflite

Output:

Network summary for mobilenet_v1_1.0_224_quant
Accelerator configuration Ethos_U65_256
System configuration internal-default
Memory mode internal-default
Accelerator clock 1000 MHz
Design peak SRAM bandwidth 16.00 GB/s
Design peak DRAM bandwidth 3.75 GB/s

Total SRAM used 370.91 KiB
Total DRAM used 3622.39 KiB

CPU operators = 0 (0.0%)
NPU operators = 60 (100.0%)

Average SRAM bandwidth 4.73 GB/s
Input SRAM bandwidth 11.96 MB/batch
Weight SRAM bandwidth 9.70 MB/batch
Output SRAM bandwidth 0.00 MB/batch
Total SRAM bandwidth 21.77 MB/batch
Total SRAM bandwidth per input 21.77 MB/inference (batch size 1)

Average DRAM bandwidth 2.13 GB/s
Input DRAM bandwidth 1.52 MB/batch
Weight DRAM bandwidth 3.23 MB/batch
Output DRAM bandwidth 5.06 MB/batch
Total DRAM bandwidth 9.82 MB/batch
Total DRAM bandwidth per input 9.82 MB/inference (batch size 1)

Neural network macs 572406226 MACs/batch
Network Tops/s 0.25 Tops/s

NPU cycles 3891214 cycles/batch
SRAM Access cycles 1020041 cycles/batch
DRAM Access cycles 1677430 cycles/batch
On-chip Flash Access cycles 0 cycles/batch
Off-chip Flash Access cycles 0 cycles/batch
Total cycles 4604278 cycles/batch

Batch Inference time 4.60 ms, 217.19 inferences/s (batch size 1)
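For reference, the same compilation with the target and output directory made explicit would look like this (a sketch: ethos-u65-256 matches the "Ethos_U65_256" line in the summary above, and output/ is vela's default output directory):

# Explicit form of the vela invocation above:
vela --accelerator-config ethos-u65-256 \
     --output-dir output \
     mobilenet_v1_1.0_224_quant.tflite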

Then I tried the tflite label_image example:

root@imx93evk:/usr/bin/tensorflow-lite-2.10.0/examples# ./label_image -m mobilenet_v1_1.0_224_quant_vela.tflite external_delegate_path=/usr/lib/libethosu_delegate.so

Output:

INFO: Loaded model mobilenet_v1_1.0_224_quant_vela.tflite
INFO: resolved reporter
ERROR: Ethos_u inference failed

ERROR: Node number 0 (ethos-u) failed to invoke.
ERROR: Failed to invoke tflite! 

I also tried the inference_runner example, which throws the following error:

./inference_runner -n ./output/mobilenet_v1_1.0_224_quant_vela.tflite -i grace_hopper.bmp -l labels.txt -o output.txt
Send Ping
Send version request
Send capabilities request
Error: IOCTL failed
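
For context, the Ethos-U kernel driver and device node can be checked like this (assuming the device is exposed as /dev/ethosu0, as in the upstream Ethos-U Linux driver):

# Verify the NPU device node exists and scan the kernel log for
# driver errors (/dev/ethosu0 is an assumed name):
ls -l /dev/ethosu*
dmesg | grep -i ethosu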

Any suggestions on what I might be doing wrong?

 

brian14, NXP TechSupport (accepted solution)

Hi @nxf50230

I tried to replicate this issue and was able to run the example model mobilenet_v1_1.0_224_quant.tflite located at /usr/bin/tensorflow-lite-2.10.0/examples successfully.

In your example you are using the command:

root@imx93evk:/usr/bin/tensorflow-lite-2.10.0/examples# ./label_image -m mobilenet_v1_1.0_224_quant_vela.tflite external_delegate_path=/usr/lib/libethosu_delegate.so

For the i.MX93 you don't need to specify an external_delegate_path. A model compiled with the vela command is automatically detected and runs on the Ethos-U NPU.
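
If you do want to pass the delegate explicitly, label_image expects it as a flag rather than a bare argument (a sketch, assuming the standard TensorFlow Lite label_image options):

# Explicit delegate form; note the leading dashes on the flag:
./label_image -m output/mobilenet_v1_1.0_224_quant_vela.tflite \
              -i grace_hopper.bmp -l labels.txt \
              --external_delegate_path=/usr/lib/libethosu_delegate.so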

Please try with the following command:

root@imx93evk:/usr/bin/tensorflow-lite-2.10.0/examples# ./label_image -m output/mobilenet_v1_1.0_224_quant_vela.tflite -i grace_hopper.bmp -l labels.txt

You will see an output like this:

[Screenshot: label_image output with the vela model running on the Ethos-U NPU]

Notice that the average inference time is 3.885 ms.

If we use mobilenet_v1_1.0_224_quant.tflite without the vela compilation, the model runs on the CPU.

[Screenshot: label_image output with the non-vela model running on the CPU]

Notice the average inference time is 135.001 ms.

I hope this information will be helpful.

Have a great day!

nxf50230, NXP Employee

Hi @brian14,

Thanks a lot for your support! I can confirm that the MobileNet V1 model works with the Ethos-U accelerator, both with and without specifying the delegate.

I just discovered that the described error occurred after I tried to run a converted EfficientDet-Lite2 model (from TensorFlow Lite Model Maker), which doesn't seem to work. After I ran this model, which threw the same error, I also couldn't run the MobileNet models on the Ethos-U anymore. Only a reboot of the board fixed this for the MobileNet models.

I am still wondering why the EfficientDet-Lite2 model doesn't work, because the vela compiler doesn't throw any errors, only warnings about placing unsupported operations on the CPU.
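
For reference, this is how those placements can be inspected (a sketch; the EfficientDet filename is a placeholder, and --supported-ops-report is available in recent vela releases):

# Generate SUPPORTED_OPS.md listing the operators vela can map to the NPU:
vela --supported-ops-report
# Grep the compile log for operators that fall back to the CPU
# (the model filename is a placeholder):
vela efficientdet-lite2_int8.tflite 2>&1 | grep -i cpu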

Do you have any idea why executing the EfficientDet-Lite model throws this error?

 

brian14, NXP TechSupport

Hi @nxf50230

Please help me clarify your request.

Are you using TensorFlow Lite Model Maker to train your model? (I'm not familiar with this tool.)

I suspect there could be an error in the training step or in the conversion; you could try doing this conversion with the eIQ Toolkit from NXP.

I will try to run this model on the i.MX93 and will post an update here as soon as possible.

Have a great day!

nxf50230, NXP Employee

Hi @brian14,

Yes, TensorFlow Lite Model Maker is used to train the model and convert it to a fully 8-bit quantized version. There is no error during training or conversion, since I can run the model on the i.MX8M Plus NPU without any problems.

brian14, NXP TechSupport

Hi

Sorry for the delayed reply.

I have been reviewing this case, but I couldn't successfully use the Google Colab tool.
Do you have any updates on your side about running the EfficientDet-Lite model on the i.MX93?
Could you please share the model so I can try it on my side?

Have a great day!

Best regards, Brian.
