We're looking to accelerate the YOLOv5 model with the NPU on Android (i.MX 8M Plus).
On Linux, this is done with the libvx_delegate.so backend.
Example on Linux:
> ./benchmark_model --graph=yolov5n-int8-250.tflite --external_delegate_path=/usr/lib/libvx_delegate.so
<output trimmed>
Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
count=60 first=16580 curr=16497 min=16464 max=16735 avg=16536.7 std=49
On Android, this delegate is not present. However, libovxlib.so is, and it appears to be the backend that the Android HAL layer uses to reach the NPU via NNAPI.
NNAPI, however, cannot accelerate the YOLOv5 model.
Example on Android:
> ./benchmark_model --graph=yolov5n-int8-250.tflite --use_nnapi=true
STARTING!
Log parameter values verbosely: [0]
Graph: [yolov5n-int8-250.tflite]
Use NNAPI: [1]
NNAPI accelerators available: [vsi-npu,nnapi-reference]
Loaded model yolov5n-int8-250.tflite
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite delegate for NNAPI.
NNAPI delegate created.
WARNING: NNAPI SL driver did not implement SL_ANeuralNetworksDiagnostic_registerCallbacks!
VERBOSE: Replacing 273 node(s) with delegate (TfLiteNnapiDelegate) node, yielding 7 partitions.
WARNING: NNAPI SL driver did not implement SL_ANeuralNetworksDiagnostic_registerCallbacks!
WARNING: NNAPI SL driver did not implement SL_ANeuralNetworksDiagnostic_registerCallbacks!
WARNING: NNAPI SL driver did not implement SL_ANeuralNetworksDiagnostic_registerCallbacks!
WARNING: NNAPI SL driver did not implement SL_ANeuralNetworksDiagnostic_registerCallbacks!
Explicitly applied NNAPI delegate, and the model graph will be partially executed by the delegate w/ 4 delegate kernels.
The input model file size (MB): 2.16466
Initialized session in 939.695ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
ERROR: NN API returned error ANEURALNETWORKS_OP_FAILED at line 5140 while running computation.
ERROR: Node number 284 (TfLiteNnapiDelegate) failed to invoke.
count=1 curr=1374449
Benchmarking failed.
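As a side note, the failure can probably be narrowed down before bypassing NNAPI entirely: the benchmark tool can pin a specific accelerator, and Android's NNAPI verbose logging should show which operation the driver rejects. A diagnostic sketch (the `--nnapi_accelerator_name` flag and `debug.nn.vlog` property are documented for the TFLite benchmark tool and Android NNAPI respectively; the `/data/local/tmp` paths are assumptions):

```shell
# Enable NNAPI verbose logging on the device
adb shell setprop debug.nn.vlog 1

# Re-run the benchmark pinned to the NPU driver, excluding the CPU fallback
adb shell /data/local/tmp/benchmark_model \
    --graph=/data/local/tmp/yolov5n-int8-250.tflite \
    --use_nnapi=true \
    --nnapi_accelerator_name=vsi-npu

# Inspect the driver-side log for the op that fails
adb logcat | grep -i -E "nnapi|neuralnetworks"
```

If the run still fails with vsi-npu pinned, the problem is in the vsi-npu NNAPI driver itself rather than in how NNAPI partitions the graph.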
The NPU hardware is clearly capable of running this model, as the Linux tests show. So my question is: how can we compile the vx_delegate that works on Linux for Android, or alternatively use ovxlib directly to bypass NNAPI?
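On the compile question, the VX delegate source is public (github.com/VeriSilicon/tflite-vx-delegate) and builds with CMake, so in principle it can be cross-compiled with the Android NDK toolchain file and loaded through the same `--external_delegate_path` flag used on Linux. A rough, untested sketch; the NDK path, ABI, platform level, and device paths are all assumptions:

```shell
git clone https://github.com/VeriSilicon/tflite-vx-delegate.git
cd tflite-vx-delegate

# Configure with the standard Android NDK toolchain file (paths are assumptions)
cmake -B build \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-33
cmake --build build -j

# Push the delegate and load it exactly as on Linux
adb push build/libvx_delegate.so /data/local/tmp/
adb shell /data/local/tmp/benchmark_model \
    --graph=/data/local/tmp/yolov5n-int8-250.tflite \
    --external_delegate_path=/data/local/tmp/libvx_delegate.so
```

One caveat: the delegate depends on TIM-VX and the VeriSilicon OpenVX driver stack underneath, so those would also need Android builds or need to resolve against the libovxlib.so already on the device; I haven't verified that linkage works.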
* Tests done with Android 13 2.0.0 on an imx8mpevk board, using the TensorFlow Lite 2.10.1 benchmark utility