I'm trying to run the dlib "webcam_face_pose_ex" example on an IMX8QXP board.
The code below takes approximately 2161 milliseconds, and CPU usage is above 120%:
std::vector<rectangle> faces = detector(cimg, 0);
Whereas on a Raspberry Pi 3 the same code takes 675 milliseconds (just for reference).
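For anyone wanting to reproduce the timings, this is roughly how such a measurement could be taken. It is only a minimal sketch: `time_ms` is a hypothetical helper name, not part of dlib, and it assumes a single wall-clock measurement with `std::chrono::steady_clock` is good enough.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of dlib): calls `fn` once and returns the
// elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();  // run the code under test exactly once
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

With `detector`, `cimg` and `faces` declared as in the example, the call would look something like `double ms = time_ms([&] { faces = detector(cimg, 0); });`. Averaging over several frames would probably give a steadier number than one call.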
IMX8QXP:
CPU: 4 x Cortex-A35
RAM: 3 GB
OS: 64-bit Linux
Raspberry Pi 3:
CPU: 4 x Cortex-A53
RAM: 1 GB
OS: 32-bit Raspbian Buster
I have tried compiler flags such as -mcpu, among others.
I also tried the OpenBLAS library, with no improvement on the IMX8.
I have a high-end CPU, yet performance is still lagging.
How can I improve the performance?