Why is the GPU taking more time than the CPU for inference?

Swapnil_Shah — Thu, 21 Apr 2022 06:14:09 GMT

Image: LF_v5.15.5-1.0.0_images_IMX8MQEVK

Hi @Bio_TICFSL, the GPU seems to be underperforming when I run the prebuilt model file: the same model runs much faster on the CPU. The average inference times for the CPU and the GPU are listed below.

For CPU ==>

thread = 1:

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 1
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 179.697 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

thread = 2:

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 2
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 92.645 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

thread = 3:

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 3
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 64.785 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

thread = 4:

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 4
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 48.975 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

For GPU ==>

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -a 1
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: Created TensorFlow Lite delegate for NNAPI.
NNAPI delegate created.
INFO: Applied NNAPI delegate.
W [query_hardware_caps:71]Unsupported evis version
INFO: invoked
INFO: average time: 103.217 ms
INFO: 0.784314: 653 military uniform
INFO: 0.105882: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.00784314: 466 bulletproof vest
INFO: 0.00392157: 835 suit

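With any number of CPU threads greater than 1, the GPU run (103.217 ms) is slower than the CPU run (92.645 ms with 2 threads, down to 48.975 ms with 4); only the single-threaded CPU run (179.697 ms) is slower than the GPU. Is there a way to speed up GPU inference so that it is faster than the CPU?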
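To make the comparison explicit, here is a small Python sketch that computes the GPU-to-CPU time ratio for each thread count. The numbers are the average times copied from the runs above; the helper name `cpu_vs_gpu` is made up for illustration and is not part of label_image or TensorFlow Lite.

```python
# Average inference times in ms, copied from the label_image runs in this post.
cpu_ms = {1: 179.697, 2: 92.645, 3: 64.785, 4: 48.975}  # keyed by -t <threads>
gpu_ms = 103.217  # run with -a 1 (NNAPI delegate)


def cpu_vs_gpu(cpu_times, gpu_time):
    """Return {threads: gpu_time / cpu_time}; a value above 1 means the CPU was faster.

    Hypothetical helper, written only to summarize the numbers in this post.
    """
    return {threads: gpu_time / ms for threads, ms in cpu_times.items()}


ratios = cpu_vs_gpu(cpu_ms, gpu_ms)
for threads, ratio in sorted(ratios.items()):
    faster = "CPU" if ratio > 1 else "GPU"
    print(f"{threads} thread(s): GPU/CPU time ratio = {ratio:.2f} ({faster} faster)")
```

With these figures the GPU only wins against the single-threaded CPU run; with 4 threads the CPU is roughly twice as fast as the GPU.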
