Using GPU on i.MX8qmek for DNN inference

ullasbharadwaj
Contributor III

I am currently evaluating different inference engines with TensorFlow and TensorFlow Lite models on the i.MX8 QM MEK. I am following the eIQ guide from NXP and using the L4.14 release.

I tried the OpenCV DNN module, the TFLite Interpreter, and Arm NN, and I was not able to use the GPU with any of them. I know OpenCV does not run on the GPU because of an OpenCL compatibility issue on i.MX8, but can I not use the GPU with TFLite and Arm NN either?

On the other hand, the Arm NN examples shipped with eIQ do not provide an option to use the GPU at all.

In the thread "Arm NN support for the i.MX 8 GPUs", vanessamaegima suggested that only the TFLite engine supports the GPU at the moment.

So it is unclear to me whether the GPU is of any use on the imx8qmek for running DNN inference.

If there is a way to use the GPU, kindly let me know. I have been scratching my head over this for quite a while now.

Best Regards 

Ullas Bharadwaj

1 Solution
diego_dorta
NXP Employee

Hi Ullas,

Which BSP version are you using? 5.4.3_1.0.0? If so, it will not work properly. You must build an image using the 5.4.3_2.0.0 version, which is required by PyeIQ.

About the internet connection, you must connect your board because PyeIQ requires a connection to retrieve what it needs to work.

Thanks,

Diego

27 Replies
ullasbharadwaj
Contributor III

Hi Vanessa and Diego,

Finally, everything seems to be solved with 5.4.3_2.0.0. I was able to use the GPU with the TFLite Interpreter from my custom C++ application and also run the PyeIQ samples. I will also investigate Arm NN further.
Thanks for all your leads :-)


However, I am facing a problem with OpenCV. Opening an mp4 video file for processing is very slow, and imshow() produces incorrect video output. I guess it is some problem with decoding. The same file plays perfectly fine with gplay-1.0 at around 60 fps.
I have attached the OpenCV config and a sample of the image output.

I believe OpenCV uses the GStreamer backend, and I get the same log messages as in the attached screenshot with both gplay and VideoCapture.open(). I am not sure what the problem is.


Can you please help?

Also, logging is enabled by default for TFLite and OpenCV. I disable it for TFLite using VSI_NN_LOG_LEVEL; how can I do the same for the OpenCV DNN module?
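
For the TFLite side, the VSI_NN_LOG_LEVEL setting can also be applied from a launcher script before the inference process starts. The following is only a minimal sketch, not an NXP-documented recipe: the helper name `quiet_env` and the level value "0" are assumptions, and it covers only the variable named here, not the OpenCV DNN logs.

```python
import os

def quiet_env(base_env=None, level="0"):
    """Return a copy of the environment with VSI_NN_LOG_LEVEL set.

    The level "0" is an assumption; check which values your BSP accepts.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["VSI_NN_LOG_LEVEL"] = level
    return env

# Example: pass the result as env= to subprocess.run() when starting the app.
env = quiet_env({"PATH": "/usr/bin"})
print(env["VSI_NN_LOG_LEVEL"])
```

Building a copy instead of mutating os.environ keeps the setting limited to the launched inference process.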

Best Regards
Ullas Bharadwaj

vanessa_maegima
NXP Employee

Hi ullasbharadwaj‌,
Can you please share the logs you want to disable from OpenCV DNN?

marcofranchi‌, is there anything you can share with Ullas regarding the response above on the video processing?


Thanks,
Vanessa

ullasbharadwaj
Contributor III

Hi Vanessa,

Please find the attached log message screenshot.

Best Regards

Ullas Bharadwaj

vanessa_maegima
NXP Employee

Hi Ullas,

You can disable these logs by removing the linked line from the OpenCV code: modules/dnn/src/dnn.cpp in opencv-imx (i.MX OpenCV).

One question: are you evaluating eIQ for a company or for a personal project? Can you share details on this?


Thanks,
Vanessa

ullasbharadwaj
Contributor III

Hi Vanessa,
Thanks for the solution:-)

I am actually evaluating i.MX8 for AI solutions in our company's projects. We already use i.MX-based boards extensively.

Best Regards

Ullas Bharadwaj

vanessa_maegima
NXP Employee

Hi Ullas,

Thanks! Is it possible to share which company that is?

Vanessa

ullasbharadwaj
Contributor III

Hi Vanessa,

I am asking this question in this thread as it is related to the GPU performance.

I ran SSD MobileNet v2 object detection model using TfLite on the GPU of i.MX8 Quad Max MEK as suggested above.

I got an average frame rate of ~10 FPS. I just wanted to know whether the i.MX8 QM MEK can be compared to the Nvidia Jetson Nano, because according to the Jetson Nano benchmark, SSD MobileNet v2 runs at 39 FPS. Are the GPUs on the i.MX8 QM MEK much inferior to the Maxwell GPU on the Jetson Nano?

Best Regards

Ullas Bharadwaj   


Hi ullasbharadwaj‌,

The i.MX 8QM VPU requires the GPU to handle the linear tile format.

So you have two solutions in this case:

1 - Use imxvideoconvert_g2d in order to display it correctly:
$ gst-launch-1.0 filesrc location=<your_file>.mp4 ! decodebin ! imxvideoconvert_g2d ! waylandsink

2 - Enable use-g2d=1 in /etc/xdg/weston/weston.ini.

With the second solution, you enable the GPU to handle Weston/Wayland, so imxvideoconvert_g2d is no longer required.
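
Since the question is about OpenCV's VideoCapture, the first option can also be written as a pipeline string handed to OpenCV's GStreamer backend. This is only a sketch under assumptions: the function name and the file name video.mp4 are hypothetical, the `videoconvert ! video/x-raw,format=BGR ! appsink` tail is the usual way to feed frames to OpenCV but has not been verified on this BSP, and OpenCV must be built with GStreamer support.

```python
def build_capture_pipeline(path, use_g2d=True):
    """Build a GStreamer pipeline string that ends in appsink for OpenCV."""
    stages = ["filesrc location=" + path, "decodebin"]
    if use_g2d:
        # Element name taken from the gst-launch-1.0 command above.
        stages.append("imxvideoconvert_g2d")
    # Assumed tail: convert to BGR, the layout OpenCV expects from appsink.
    stages += ["videoconvert", "video/x-raw,format=BGR", "appsink"]
    return " ! ".join(stages)

pipeline = build_capture_pipeline("video.mp4")
print(pipeline)
# With OpenCV available: cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
```

Passing use_g2d=False drops the G2D element, which matches the case where weston.ini already has use-g2d=1.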

Regarding the slow results in the ML application: in short, even with a good inference time, the whole process of grabbing the video frame, resizing it, processing it, and displaying it takes longer than the reported inference time, which results in slow video. We already have some good solutions for this, and they will be released in PyeIQ v2.0.

Best Regards,

Marco

ullasbharadwaj
Contributor III

Hi Marco, 

Thanks for the reply. 

1. With Solution 1 (gst-launch), the video plays properly with or without imxvideoconvert_g2d.

2. use_g2d=1 didn't solve the problem.


The mp4 file plays properly with gplay-1.0; the problem (as shown in the screenshot in my previous message) occurs only when I use OpenCV VideoCapture to open the file.

Regarding the slowness, with the L4.14 BSP there was no such problem. For example, when the inference time was 100 ms, the FPS was roughly 10. I am facing this issue only in 5.4.3_2.0.0. I guess the image output and slowness problems are related to each other.
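
The relation between per-frame time and frame rate can be made concrete with a small calculation. The sketch below uses the 100 ms inference time quoted here; the capture, resize, and display timings are made-up placeholders, not measurements from the board.

```python
def fps_from_stage_times(stage_ms):
    """Frames per second when the stages run one after another per frame."""
    total_ms = sum(stage_ms.values())
    return 1000.0 / total_ms

# Inference alone, using the 100 ms figure from the post.
print(fps_from_stage_times({"inference": 100.0}))

# Hypothetical extra per-frame costs (placeholders, not measured values).
full_loop = {"capture": 40.0, "resize": 10.0, "inference": 100.0, "display": 50.0}
print(fps_from_stage_times(full_loop))
```

With these placeholder numbers the loop drops from 10 FPS to 5 FPS even though the inference time is unchanged, which is why slow decoding or display can dominate the observed frame rate.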

Best Regards
Ullas Bharadwaj 


Hi Ullas Bharadwaj,

If you are facing this performance issue due to the BSP release, it will be better to open a new thread to check it.

Feel free to assign it to me as well.

Best Regards,

Marco

ullasbharadwaj
Contributor III

Hi, 
I am trying to build 5.4.3_2.0.0 on Ubuntu 18.04 and I get the error below while building the "optee" package. The build does not get past it, even after removing this package in local.conf. Can you please help?

ERROR: optee-os-3.7.0.imx-r0 do_compile: oe_runmake failed
ERROR: optee-os-3.7.0.imx-r0 do_compile: Execution of '/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/temp/run.do_compile.18914' failed with exit code 1:
make: Entering directory '/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/git'
GEN /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/core/ta_pub_key.c
CHK /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/conf.mk
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/memory_buffer_alloc.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/platform_time.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/camellia.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/config.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/rsa.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/havege.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/nist_kw.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/chachapoly.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/asn1write.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/entropy.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/ssl_cache.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/version.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/sha512.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/blowfish.h
Traceback (most recent call last):
File "scripts/pem_to_pub_c.py", line 61, in <module>
main()
File "scripts/pem_to_pub_c.py", line 24, in main
from Crypto.PublicKey import RSA
File "/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/recipe-sysroot-native/usr/lib/python3.7/site-packages/Crypto/PublicKey/RSA.py", line 585
except ValueError, IndexError:
^
SyntaxError: invalid syntax
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/ecdsa.h
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/debug.h
mk/subdir.mk:159: recipe for target '/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/core/ta_pub_key.c' failed
make: *** [/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/core/ta_pub_key.c] Error 1
make: *** Waiting for unfinished jobs....
INSTALL /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/build.mx8qmmek/export-ta_arm64/include/mbedtls/compat-1.3.h
make: Leaving directory '/home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/git'
WARNING: exit code 1 from a shell command.

ERROR: Logfile of failure stored in: /home/bharadwaj/Desktop/yocto_imx8/imx-yocto-bsp/build_5_4_3_2_0/tmp/work/imx8qmmek-poky-linux/optee-os/3.7.0.imx-r0/temp/log.do_compile.18914
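
The traceback stops at `except ValueError, IndexError:`, which is Python 2 syntax, while the path shows the file sitting under python3.7/site-packages. That suggests the Crypto package picked up in the native sysroot was written for Python 2; this is an inference from the log, not something confirmed in the thread. A minimal sketch (the helper name is hypothetical) to check how the running interpreter treats the two forms:

```python
import sys

def compiles_on_this_python(source):
    """Return True if `source` parses with the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

# Same construct as the line flagged in the traceback (Python 2 style).
PY2_STYLE = "try:\n    pass\nexcept ValueError, IndexError:\n    pass\n"
# Parenthesised tuple form, accepted by Python 3.
PY3_STYLE = "try:\n    pass\nexcept (ValueError, IndexError):\n    pass\n"

print(sys.version_info[:2])
print("unparenthesised form compiles:", compiles_on_this_python(PY2_STYLE))
print("parenthesised form compiles:", compiles_on_this_python(PY3_STYLE))
```

Python 3.7, the version in the path above, rejects the unparenthesised form with the same SyntaxError; very recent Python releases may treat it differently, so the script prints the interpreter version alongside the result.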

vanessa_maegima
NXP Employee

Hi Ullas,

When you migrated from 5.4.3_1.0.0 to 5.4.3_2.0.0, did you use the same build directory or did you create a new one for the new build?

Could you please try:

bitbake -c cleanall  optee-os

bitbake imx-image-full

ullasbharadwaj
Contributor III

Hi Vanessa,
I created a new directory and also tried the clean and rebuild as you suggested, but it does not solve the problem.
I have just added these lines below in the conf/local.conf. (I used these in earlier BSP versions)

PACKAGECONFIG_append_pn-opencv_mx8 = " dnn python3 qt5 jasper test"
CORE_IMAGE_EXTRA_INSTALL += " python3 python3-pip"
IMAGE_INSTALL_append = " ffmpeg gputop"
TOOLCHAIN_TARGET_TASK += "tensorflow-lite-dev tensorflow-lite-staticdev"

COMMERCIAL_LICENSE ?= "lame gst-fluendo-mp3 libmad mpeg2dec ffmpeg qmmp"
LICENSE_FLAGS_WHITELIST = 'commercial'

Could some of these be causing conflicts?

Sorry for bothering you too much :-(


Best Regards
Ullas Bharadwaj

vanessa_maegima
NXP Employee

Hi Ullas,

Unfortunately the L5.4.3_2.0.0 release is an Alpha release. Before we try to debug this issue can you please confirm if you had pip3 (python3-pip) installed on your L5.4.3_1.0.0 image when you tried PyeIQ on this release?

We discussed internally and PyeIQ should work with L5.4.3_1.0.0 if you have pip3 installed on the board.

Thanks,
Vanessa

ullasbharadwaj
Contributor III

Hi Vanessa,

Yes, pip3 was available. pip3 install eiq.tar.gz gave some import errors.

1. The "requests" package was missing, so I downloaded the whl file and installed it with pip3 install.

2. Then it gave errors about the PIL package when importing it. I tried to install that from a whl file as well, but then it gave errors about missing zlib libraries.

Also, when I tried without altering local.conf as above, optee-os somehow built successfully. This was strange and I did not expect it.

Best Regards,
Ullas Bharadwaj

ullasbharadwaj
Contributor III

Thanks for the reply.


Then maybe the Python packages were missing because of 5.4.3_1.0.0, and I wrongly assumed that PyeIQ would download them as well.

I will try with 5.4.3_2.0.0 and get back to you :-)

Best Regards 
Ullas Bharadwaj

ullasbharadwaj
Contributor III

Hi Vanessa,
Thanks, I tried your suggestions.
IMAGE_INSTALL_append = "packagegroup-imx-qt5" gave me the error " ERROR: Nothing RPROVIDES 'packagegroup-imx-qt5' ".

IMAGE_INSTALL_append = "packagegroup-imx-ml" is added to local.conf.

Still, libtensorflow-lite.a is not included in the SDK under sysroots/usr/lib/. This issue was not there with L4.14.

And the compilation of my TFLite app in C++ fails with: undefined reference to `tflite::Interpreter::ModifyGraphWithDelegate'.

Best Regards
Ullas Bharadwaj

vanessa_maegima
NXP Employee

Hi Ullas,

Sorry, please try the following: packagegroup-qt5-imx

For your compilation issue, please try adding the needed directories to your Makefile/make command with -I instead of using the static library. This issue appears to be fixed in upcoming releases.

Thanks,

Vanessa

ullasbharadwaj
Contributor III

Hi Vanessa,

I tried PyeIQ with the 5.4.3 BSP. Many dependencies are missing, for example "requests", which I installed manually from a whl file because the board has no internet connection. Now I am facing import errors with PIL.

Do you know if there is a config that includes all of these, or am I missing something in the eIQ installation?

Also, the reference manual for 5.4.3 mentions: "Arm NN does not currently support the i.MX 8 GPUs due to Arm NN OpenCL requirements which are not met by i.MX8 GPUs." But you mentioned above that it does support the GPU, so it is a bit confusing.


Best Regards
Ullas Bharadwaj 

vanessa_maegima
NXP Employee

Hi ullasbharadwaj‌,

Also, the reference manual for 5.4.3 mentions: "Arm NN does not currently support the i.MX 8 GPUs due to Arm NN OpenCL requirements which are not met by i.MX8 GPUs." But you mentioned above that it does support the GPU, so it is a bit confusing.

Please check the Arm Compute Library chapter in the Linux Reference Manual. Arm NN uses ACL delegates for GPU acceleration.

diegodorta‌, could you please check the PyeIQ queries?

Thanks,
Vanessa 
