imx8mp, NNAPI running inference at 10 Hz puts lots of load on the CPU

colinbroderick
Contributor III

We're running a float32 model on the NPU of the imx8mp via NNAPI. However, when the model is running it appears to put a lot of load on the CPU, at least according to htop, which shows all four cores at about 60% utilization.

I initially thought it might be data transfers or casts, but the data itself is very small (1200 floats) and already in the form the model expects. Besides, the model only runs at about 10 Hz, so even if the data were larger or needed casting, it would be fairly trivial work for the CPU.

We're running the model using onnxruntime in Python (for now):

import numpy as np
import onnxruntime as ort

# Dummy input matching the shape and dtype the model expects.
data = np.random.rand(1, 6, 200).astype('float32')
ort_sess = ort.InferenceSession("model.onnx", providers=["NnapiExecutionProvider"])

for _ in range(1000):
    outputs = ort_sess.run(None, {'input': data})

Can anyone give advice on why the CPU is so taxed, or on what we can do to diagnose the cause?
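
For reference, onnxruntime's built-in profiler can show how long each node ran and which execution provider ran it, which should make any CPU fallback visible. A minimal sketch, assuming the input name 'input' from the snippet above and that the trace's per-node events record the provider (worth verifying against your onnxruntime version):

import numpy as np
import onnxruntime as ort

so = ort.SessionOptions()
so.enable_profiling = True   # write a Chrome-trace JSON of per-node timings
so.log_severity_level = 0    # verbose logs also print the NNAPI/CPU graph partitioning

sess = ort.InferenceSession(
    "model.onnx",
    sess_options=so,
    providers=["NnapiExecutionProvider", "CPUExecutionProvider"],
)

data = np.random.rand(1, 6, 200).astype("float32")
for _ in range(100):
    sess.run(None, {"input": data})

# Returns the trace file name; each node event should name the
# execution provider that ran it, so CPU fallbacks become visible.
print(sess.end_profiling())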

(Note that I know this NPU is not well suited to float32 inference; we're also working on quantization but having problems there too.)

Is there a way to directly measure load on the NPU, to confirm the model is even running there?
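
Short of a direct NPU load counter, one indirect check is to time the same model under the CPU EP and under the NNAPI EP; if the per-inference latency is essentially identical, the NNAPI path is probably falling back to the CPU. A minimal sketch:

import time
import numpy as np
import onnxruntime as ort

data = np.random.rand(1, 6, 200).astype("float32")

def mean_latency(providers, runs=200):
    sess = ort.InferenceSession("model.onnx", providers=providers)
    sess.run(None, {"input": data})  # warm-up; NNAPI compiles the model here
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {"input": data})
    return (time.perf_counter() - start) / runs

print("CPU EP:  ", mean_latency(["CPUExecutionProvider"]))
print("NNAPI EP:", mean_latency(["NnapiExecutionProvider", "CPUExecutionProvider"]))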

Thanks

Bio_TICFSL
NXP TechSupport

Hello Colin,

The i.MX 8M Plus NPU doesn't support float32.
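
Since the NPU needs a quantized model, a minimal static-quantization sketch using onnxruntime's quantization tooling is below; the RandomReader is illustrative only, and real calibration should use representative input data:

import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantType,
    quantize_static,
)

class RandomReader(CalibrationDataReader):
    # Illustrative only: feeds random calibration batches of the
    # model's input shape; use real representative data in practice.
    def __init__(self, n=50):
        self._batches = iter(
            {"input": np.random.rand(1, 6, 200).astype("float32")}
            for _ in range(n)
        )

    def get_next(self):
        return next(self._batches, None)

# Produce a fully uint8 model that the NPU can accept.
quantize_static(
    "model.onnx",
    "model_uint8.onnx",
    RandomReader(),
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QUInt8,
)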

Regards
