Why GPU taking more time than CPU for inference ?


983 Views
Swapnil_Shah
Contributor I

Image : LF_v5.15.5-1.0.0_images_IMX8MQEVK

Hi @Bio_TICFSL, the GPU seems to be underperforming when I run the prebuilt model: running the same model on the CPU gives a much faster result. The average inference times for CPU and GPU are listed below.

For CPU  ==> 

thread = 1:

 

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 1
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 179.697 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

 

thread = 2:

 

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 2
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 92.645 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

 

thread = 3 :

 

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 3
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 64.785 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit

 

thread = 4 :

 

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -t 4
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: invoked
INFO: average time: 48.975 ms
INFO: 0.764706: 653 military uniform
INFO: 0.121569: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.0117647: 466 bulletproof vest
INFO: 0.00784314: 835 suit
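As a sanity check on the CPU scaling, the logged averages correspond to near-linear speedup with thread count. A minimal Python calculation using the times from the runs above:

```python
# CPU average inference times (ms) taken from the label_image runs above
cpu_ms = {1: 179.697, 2: 92.645, 3: 64.785, 4: 48.975}

for threads, ms in cpu_ms.items():
    speedup = cpu_ms[1] / ms
    print(f"{threads} thread(s): {ms:7.3f} ms -> {speedup:.2f}x vs 1 thread")
```

This gives roughly 1.94x, 2.77x, and 3.67x for 2, 3, and 4 threads, i.e. the four Cortex-A53 cores on the i.MX8MQ are being used effectively.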

 

 

For GPU ==>

 

./label_image -i grace_hoopper.bmp -l lables.txt -m mobilenet_v1_1.0_224_quant.tflite -a 1
INFO: Loaded model mobilenet_v1_1.0_224_quant.tflite
INFO: resolved reporter
INFO: Created TensorFlow Lite delegate for NNAPI.
NNAPI delegate created.
INFO: Applied NNAPI delegate.
W [query_hardware_caps:71]Unsupported evis version
INFO: invoked
INFO: average time: 103.217 ms
INFO: 0.784314: 653 military uniform
INFO: 0.105882: 907 Windsor tie
INFO: 0.0156863: 458 bow tie
INFO: 0.00784314: 466 bulletproof vest
INFO: 0.00392157: 835 suit

 

 

For any number of threads > 1, the GPU is slower than the CPU. Is there a way to accelerate GPU inference so that it is faster than the CPU?
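Comparing the logged averages directly (a minimal sketch using only the numbers above; whether the GPU average still includes any one-time NNAPI graph-preparation overhead is an assumption worth ruling out, e.g. by looping more invocations):

```python
# Average inference times (ms) from the logs above
gpu_ms = 103.217      # NNAPI delegate (-a 1)
cpu_1t_ms = 179.697   # CPU, 1 thread
cpu_4t_ms = 48.975    # CPU, 4 threads

print(f"GPU vs 1-thread CPU: {cpu_1t_ms / gpu_ms:.2f}x faster on GPU")
print(f"GPU vs 4-thread CPU: {gpu_ms / cpu_4t_ms:.2f}x slower on GPU")
```

So the GPU run does beat a single CPU thread (about 1.74x) but loses to four threads (about 2.11x), which matches the observation in the question.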

0 Replies