Batch size and inference time on NPU

kevin_allen
Contributor I

Hello there, 

I am using a board from the company PhyTec that is equipped with the i.MX 8M Plus chip and its NPU. I am using a TensorFlow Lite model with a ResNet50 architecture to perform classification. The model is quantized (8-bit unsigned integers) to run on the NPU. The inference time for a single image of shape 224x224x3 is approximately 23 ms. When I increase the batch size, the inference time is approximately 23 ms * batchSize.

I wonder whether this is normal. On larger desktop GPUs, processing a small batch of images (e.g., 16) usually takes less time than the single-image inference time multiplied by the batch size. On the NPU of the i.MX 8M Plus, I don't see such a gain. Should a faster inference time be expected when processing images with a batch size larger than one? A rough sketch of how I am measuring this is below.
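In case it helps, this is roughly how I benchmark the different batch sizes (a minimal sketch for my setup; the model path, the VX delegate path, and the batch sizes are placeholders, and resizing the batch dimension may not be supported by every delegate):

```python
import time
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the quantized model with the VX delegate so it runs on the NPU.
# The delegate and model paths below are assumptions for my image.
delegate = tflite.load_delegate("/usr/lib/libvx_delegate.so")
interpreter = tflite.Interpreter(
    model_path="resnet50_uint8.tflite",
    experimental_delegates=[delegate],
)

input_index = interpreter.get_input_details()[0]["index"]

for batch in (1, 2, 4, 8):
    # Resize the input tensor to the desired batch size, then reallocate.
    interpreter.resize_tensor_input(input_index, [batch, 224, 224, 3])
    interpreter.allocate_tensors()

    dummy = np.zeros((batch, 224, 224, 3), dtype=np.uint8)
    interpreter.set_tensor(input_index, dummy)

    # Warm-up: the first invoke after allocation triggers graph
    # compilation for the NPU, so it is excluded from the timing.
    interpreter.invoke()

    runs = 10
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    elapsed_ms = (time.perf_counter() - start) * 1000 / runs

    print(f"batch={batch}: {elapsed_ms:.1f} ms per invoke "
          f"({elapsed_ms / batch:.1f} ms per image)")
```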

Any feedback will be appreciated.

kevin_allen
Contributor I

Thanks for the quick reply. 

I am not sure I fully understand your answer. So, if it takes 23 ms to process a single image (using a batch size of 1), you would expect a batch of 6 images to take approximately 23 ms x 6 (or 138 ms) to process. Is this right?

Bio_TICFSL
NXP TechSupport

Hello,

Yes, it is expected, but not at 23 ms * batchSize; it is more like 10 ms * batchSize, depending on the size.

Regards
