Segmentation fault while testing a C++-based TFLite application on the imx8mqevk

rmani0029 · ‎10-20-2020

I am testing an OpenCV-based Lane Departure Warning System (LDWS) application on the imx8mqevk with BSP 5.4.3_1.0.0. It works on the CPU, but it uses about 180 percent CPU, which makes the camera output slow. I therefore need to run the same code on the GPU to improve the camera output response. I took the eIQ TFLite-based face recognition application as a reference for GPU acceleration and implemented the same code structure in the LDWS app. I tried different TFLite models and enabled the NNAPI delegate, but the application then hits a segmentation fault. Without NNAPI it runs fine on the CPU, though with a slow camera response. I am attaching the log file for reference.

Can you please help me resolve the segmentation fault and suggest how to use the imx8 GPU effectively for ADAS applications such as LDWS and traffic light recognition?

I took the reference source from the following link:

https://source.codeaurora.org/external/imxsupport/eiq_sample_apps/tree/examples-tflite/face_recognit...

I have a doubt only about the following sections of the LDWS run_inference code; kindly clarify them.

lane_future = pool.enqueue(init_lanealgorithm);
if (!lane.empty())
    pred_future = pool.enqueue(get_output_tensor, lane, s);

The init_lanealgorithm function contains the OpenCV code for lane detection, and its return type is a Mat object.

lane - Mat object
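
One thing worth checking in the snippet above is whether lane has actually been filled from lane_future before lane.empty() is tested. The following is only a minimal sketch of that ordering, not the LDWS code itself: Frame stands in for cv::Mat, detect_lanes and run_model stand in for init_lanealgorithm and get_output_tensor, and std::async stands in for pool.enqueue, which is assumed here (not confirmed by the posted code) to return a std::future.

```cpp
#include <future>
#include <vector>

// Hypothetical stand-in for cv::Mat so the sketch builds without OpenCV.
struct Frame {
    std::vector<unsigned char> data;
    bool empty() const { return data.empty(); }
};

// Hypothetical stand-in for init_lanealgorithm: returns a processed frame.
Frame detect_lanes() {
    return Frame{std::vector<unsigned char>(16, 255)};
}

// Hypothetical stand-in for get_output_tensor: consumes a finished frame.
int run_model(const Frame& f) {
    return static_cast<int>(f.data.size());
}

// Returns the model result, or -1 when lane detection produced no frame.
int process_one_frame() {
    // std::async is used in place of pool.enqueue (assumed to return a future).
    std::future<Frame> lane_future =
        std::async(std::launch::async, detect_lanes);

    // Block until the worker has finished and take the result out of the future.
    Frame lane = lane_future.get();

    // Only after get() does testing the frame say anything about this run.
    if (lane.empty()) {
        return -1;
    }

    // Hand the finished frame to the inference step and wait for its result.
    std::future<int> pred_future =
        std::async(std::launch::async, run_model, lane);
    return pred_future.get();
}
```

If lane is read while the worker is still writing it, or before it has ever been assigned, the behaviour depends on timing. That could be one source of intermittent crashes, although the log alone cannot confirm it.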

Bio_TICFSL · ‎10-21-2020

Hello rmani0029,

With your log, it is very difficult to figure out what caused the segmentation fault after running inference.

Also, there is no stack trace pointing to where the segmentation fault happens.

Could you run your application under gdb to see whether you can get a backtrace pointing to where the segmentation fault happens?

Alternatively, you could share your application and show us how to set up the test so that we can debug it.
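
If running under gdb on the target is inconvenient, another option is to have the application print its own call stack when it receives SIGSEGV. The following is a minimal sketch, assuming a glibc-based root filesystem that provides execinfo.h; the function names capture_backtrace, on_segv and install_segv_handler are made up for illustration. Note that backtrace() is not strictly async-signal-safe, so treat this as a debugging aid only.

```cpp
#include <csignal>
#include <execinfo.h>
#include <unistd.h>

// Capture the current call stack into frames; returns the number of frames.
int capture_backtrace(void** frames, int max_frames) {
    return backtrace(frames, max_frames);
}

// Signal handler: write the call stack to stderr, then terminate the process.
extern "C" void on_segv(int sig) {
    void* frames[64];
    int n = capture_backtrace(frames, 64);
    // backtrace_symbols_fd writes straight to the descriptor without malloc.
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(128 + sig);
}

// Intended to be called once at the top of main(), before the camera loop.
bool install_segv_handler() {
    return std::signal(SIGSEGV, on_segv) != SIG_ERR;
}
```

Building with -g and -rdynamic usually makes the printed frames show function names instead of bare addresses. Under gdb itself, the bt command after the crash gives the same kind of information with more detail.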

As far as I know, OpenCV's Mat object uses CPU memory and is very slow. We once had a traffic recognition demo that used GPU physical memory allocated by g2d and implemented zero copy for the Mat data buffer for OpenCL GPU data processing, which improved performance and reduced CPU usage.

Regards

rmani0029 · ‎10-23-2020

Hi,

As per my understanding, g2d support is not available for the imx8mq-evk target. Can you please confirm this? If I am wrong, please share the source reference link for the traffic recognition demo; it would help me understand OpenCL GPU data processing on the imx8 board.

Regards,

Manikandan.R

rmani0029 · ‎10-22-2020

Hi,

Thanks for your reply!

Please find attached the source code and binary of the LDWS app. I took the face-recognition app as a reference and tested it; it runs on both the CPU and the GPU with better performance.

https://source.codeaurora.org/external/imxsupport/eiq_sample_apps/tree/examples-tflite/face_recognit...

Test Environment:

Target: imx8mq-evk

BSP: 5.4.3_1.0.0

For imx8 GPU acceleration, if I use the TFLite NNAPI delegate for the Lane Departure Warning System (LDWS) application, do I need a custom .tflite model specifically for LDWS? Is it possible to use the face recognition .tflite model for LDWS?

Please rename the lanedaparture_helpers and lanedeparture .c files to .cpp (I was unable to attach .cpp files).

The mfn.h, ThreadPool.h, profiling.h, lanedeparture_helpers_impl.h, and Makefile.linux files are the same as in the face recognition application.

The LaneDeparture.bz2 file contains the binary of the LDWS application.

Regards,

Manikandan.R