Hi Aleksandr,
Unlike the C++ API, the TensorFlow Lite Python bindings do not let you choose a delegate (CPU, GPU/NPU). By default, the NNAPI delegate is used; when you run a demo, you can confirm this from the following log message: INFO: Created TensorFlow Lite delegate for NNAPI. The NNAPI delegate automatically offloads the inference to the GPU/NPU.
As for Arm NN, it does work with the GPU, as described in the table. The table only indicates whether a backend is supported, not necessarily which one is the default (we will try to make this clearer in the next version). To run inference on the GPU/NPU, you do need to change the backend in the code from Cpu to VsiNpu.
PyeIQ focuses on MPlus, so we chose CPU as the default in this case ONLY because this particular model (fire detection, float32) is not quantized (uint8), and non-quantized models run better on the CPU. If you have a quantized model, please change the backend to VsiNpu, which will be much faster :)
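To illustrate the rule of thumb above, here is a minimal sketch that picks a backend name from the model's input data type. The function name `choose_backend`, the constant `QUANTIZED_DTYPES`, and the backend string "CpuAcc" are assumptions made for this example and are not taken from the PyeIQ sources; only "VsiNpu" and the float32/uint8 distinction come from the explanation above.

```python
# Hypothetical helper, not part of PyeIQ or Arm NN.
# It only encodes the rule described above:
#   quantized (uint8) model     -> run on the GPU/NPU backend
#   non-quantized (e.g. float32) -> stay on the CPU backend

# Data types treated as quantized in this sketch (assumption: uint8 only).
QUANTIZED_DTYPES = {"uint8"}

# Backend names. "VsiNpu" is the name mentioned above; "CpuAcc" is an
# assumed name for the CPU backend and may differ in your code.
NPU_BACKEND = "VsiNpu"
CPU_BACKEND = "CpuAcc"


def choose_backend(input_dtype: str) -> str:
    """Return the backend name suggested for a model with this input dtype."""
    if input_dtype.strip().lower() in QUANTIZED_DTYPES:
        return NPU_BACKEND
    return CPU_BACKEND


if __name__ == "__main__":
    # The fire detection model is float32, so it stays on the CPU.
    print("float32 ->", choose_backend("float32"))
    # A quantized model would be sent to the GPU/NPU instead.
    print("uint8   ->", choose_backend("uint8"))
```

The returned string would then be used wherever your code currently hard-codes the Cpu backend; how the input data type is read from the model is left to the caller.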
Thanks,
Diego