Hello.
Normally, inference runs on the GPU or the NPU depending on the USE_GPU_INFERENCE=0/1 environment variable, but I would like to make this choice in code. The Machine Learning guide says this variable is "directly read by the HW acceleration driver".

Is it feasible to modify the GPU/NPU unified driver to expose this choice to higher abstraction layers? Where can I find its source?
If not, what else could I do?
Thanks