I'm profiling a Qt app for an iMX6 IoT device to try to diagnose low frame rate.
I'm thinking we are probably limited by CPU not GPU, but I have one thing from vAnalyser that worries me and that I don't understand. The shader utilisation is very high.
Below is a screenshot of the frame analysis for a typical frame. I see that driver utilisation and GPU utilisation are both really low, which is fine. But the shader utilisation is at over 200%, and this is the same for ALL frames.
Can anyone explain in basic terms what it means for the shader utilisation to be above 200% consistently, and what effect that is likely to have on our application performance?
Thanks!
Hello ATPark,
That part of the advice is local to the shader core - load/store is the highest pipeline load, so that's the critical path for shader execution (ignoring external memory effects).
In addition to compute, load/store is used for reading and writing vertex data. Check your geometry complexity (primitive count, vertex count, and number of bytes per vertex) and your CPU-side culling to make sure you are not processing more geometry than the GPU can cope with.
In general, for mobile, aim for fewer than 300-500K input vertices per frame and roughly 32 bytes per vertex.
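To put those two numbers together, here is a rough sketch of the arithmetic. The vertex layout (position, normal, and one texture coordinate, all 32-bit floats) and the 60 fps figure are assumptions chosen for illustration; they are not taken from your Qt app, and the function names are made up for this example.

```python
import struct

# Hypothetical vertex layout (an assumption for illustration only):
# position (3 floats), normal (3 floats), texture coordinate (2 floats).
VERTEX_FORMAT = "<3f3f2f"


def bytes_per_vertex(fmt: str = VERTEX_FORMAT) -> int:
    """Size in bytes of one tightly packed vertex with the given layout."""
    return struct.calcsize(fmt)


def vertex_bytes_per_frame(vertex_count: int, stride: int) -> int:
    """Vertex data read per frame, assuming every vertex is fetched once."""
    return vertex_count * stride


def vertex_bandwidth_mb_per_s(vertex_count: int, stride: int, fps: float) -> float:
    """Vertex fetch bandwidth in MB/s (1 MB = 10**6 bytes) at a given frame rate."""
    return vertex_bytes_per_frame(vertex_count, stride) * fps / 1e6


if __name__ == "__main__":
    stride = bytes_per_vertex()
    print("bytes per vertex:", stride)
    print("bytes per frame at 500K vertices:", vertex_bytes_per_frame(500_000, stride))
    print("MB/s at 60 fps:", vertex_bandwidth_mb_per_s(500_000, stride, 60))
```

With this assumed layout a vertex comes out at 32 bytes, so the upper end of the guideline is about 16 MB of vertex data per frame. If your real counts are far below that, vertex data alone is unlikely to explain the load/store pressure.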
Driver utilization is only 1%, so you do not need to worry about the DRAM effects.
Regards
Thanks for getting back to me on this!
Below is a typical frame with details of the vertex count and primitive count. Both are reasonably stable across frames, except for a few frames where the input primitives are '12' and the 'trivially reject count' is zero. Either way, the counts are much, much lower than 300-500k.
However, I don't know how to see the bytes per vertex; is it possible that that is very high for some reason?
With respect to 'CPU-side culling', do you know how I could check that? We aren't doing any graphics programming directly - we are drawing using Qt controls. Do you know anything we should check about our use of Qt?
Thanks,
Tony
Hi,
There is no way to see how many bytes per vertex are used. For CPU-side culling, this blog post may help:
https://bruop.github.io/frustum_culling/
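As a rough idea of what CPU-side culling means, here is a minimal sketch of a bounding-sphere test against a set of planes. It is not code from the blog or from Qt; the names `sphere_visible`, `cull`, and `BOX_PLANES` are hypothetical, and the "frustum" is simplified to an axis-aligned box so the numbers are easy to follow. Since you draw through Qt controls, the toolkit issues the draw calls for you, so treat this as an illustration of the concept rather than something to drop into your app.

```python
from typing import List, Sequence, Tuple

# A plane is (nx, ny, nz, d) with a unit normal pointing into the visible
# volume, so a point p is on the inside when dot(n, p) + d >= 0.
Plane = Tuple[float, float, float, float]
# A bounding sphere is (cx, cy, cz, radius).
Sphere = Tuple[float, float, float, float]


def sphere_visible(sphere: Sphere, planes: Sequence[Plane]) -> bool:
    """Return False only if the sphere lies entirely behind some plane."""
    cx, cy, cz, radius = sphere
    for nx, ny, nz, d in planes:
        signed_distance = nx * cx + ny * cy + nz * cz + d
        if signed_distance < -radius:
            return False  # completely outside this plane: skip the draw
    return True


def cull(spheres: Sequence[Sphere], planes: Sequence[Plane]) -> List[Sphere]:
    """Keep only the objects whose bounding spheres may be visible."""
    return [s for s in spheres if sphere_visible(s, planes)]


# Illustrative stand-in for a frustum: a box spanning -10..10 on each axis.
BOX_PLANES: List[Plane] = [
    (1.0, 0.0, 0.0, 10.0), (-1.0, 0.0, 0.0, 10.0),
    (0.0, 1.0, 0.0, 10.0), (0.0, -1.0, 0.0, 10.0),
    (0.0, 0.0, 1.0, 10.0), (0.0, 0.0, -1.0, 10.0),
]

if __name__ == "__main__":
    objects = [(0.0, 0.0, 0.0, 1.0), (20.0, 0.0, 0.0, 1.0), (10.5, 0.0, 0.0, 1.0)]
    print(cull(objects, BOX_PLANES))
```

The test is conservative: an object that straddles a plane is kept, and an object just outside a corner can also be kept, but nothing visible is ever dropped.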
Regards