We have a custom board based on the SabreSD. On Android 4.3 we were able to decode and play back H.264 files at 1080p@60fps without skipped frames. Since moving to Android 5.1.1, we can no longer reach the same performance level.
While investigating the issue, one of the most striking differences I observed was the drop in the VPU average write burst size, as reported by the mmdc profiling tool while playing an H.264 1080p@60fps file:
|Metric|Android 4.3|Android 5.1|
|---|---|---|
|Total cycles count|264478000|264073304|
|Busy cycles count|244209294|246661781|
|Read accesses count|7721279|5923053|
|Write accesses count|1598636|4961559|
|Read bytes count|220912824|172561608|
|Write bytes count|98518144|81773840|
|Avg. Read burst size (bytes)|28|29|
|Avg. Write burst size (bytes)|61|16|
|Read|420.52 MB/s|329.14 MB/s|
|Write|187.53 MB/s|155.97 MB/s|
|Total|608.05 MB/s|485.11 MB/s|
|Overall Bus Load|92%|93%|
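For anyone who wants to cross-check the numbers, here is a minimal sketch of how the derived rows appear to follow from the raw counters. It assumes the average burst size is simply bytes divided by accesses and the bus load is busy cycles divided by total cycles; that is my reading of the figures, not something taken from the tool's source. The dictionary keys and the `derive` helper are names I made up for illustration.

```python
# Raw VPU counters copied from the mmdc profiling output above.
counters = {
    "Android 4.3": {
        "total_cycles": 264478000, "busy_cycles": 244209294,
        "read_accesses": 7721279, "write_accesses": 1598636,
        "read_bytes": 220912824, "write_bytes": 98518144,
    },
    "Android 5.1": {
        "total_cycles": 264073304, "busy_cycles": 246661781,
        "read_accesses": 5923053, "write_accesses": 4961559,
        "read_bytes": 172561608, "write_bytes": 81773840,
    },
}


def derive(c):
    """Recompute the derived metrics (assumed formulas, see lead-in)."""
    return {
        "avg_read_burst": c["read_bytes"] / c["read_accesses"],     # bytes per read access
        "avg_write_burst": c["write_bytes"] / c["write_accesses"],  # bytes per write access
        "bus_load": c["busy_cycles"] / c["total_cycles"],           # fraction of busy cycles
    }


for name, c in counters.items():
    d = derive(c)
    print(f"{name}: read burst {d['avg_read_burst']:.2f} B, "
          f"write burst {d['avg_write_burst']:.2f} B, "
          f"bus load {d['bus_load']:.1%}")

# How many more write accesses 5.1 issues compared to 4.3.
ratio = counters["Android 5.1"]["write_accesses"] / counters["Android 4.3"]["write_accesses"]
print(f"write access ratio 5.1 / 4.3: {ratio:.2f}")
```

Truncating the computed values to integers reproduces the burst sizes and bus load shown in the table. It also makes the core of the problem visible: 5.1 writes fewer bytes in total, but needs roughly 3.1 times as many write accesses to do it.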
I understand that the overall problem is essentially the MMDC throughput limit. On average the overall AXI bus load is ~4% higher in 5.1, and I suspect that in 4.3 we were already very close to the limit for decoding and playing H.264 at 60 fps, whereas in 5.1 we're over it. It also seems that the VPU has suddenly become less efficient in how it writes to memory. Why that happened is what I'm trying to figure out.
Just some additional info:
- In 5.1 when the VPU is made to work without the IPU taking a piece of the DDR bandwidth (when decoded frames are simply dropped and not displayed, and the bus load drops below 90%), the write burst size is still 16, so it is not a "fight against the IPU" throughput issue.
- In 5.1, when the VPU is decoding the frames in this fashion, it takes <16 ms on average for each frame (like in 4.3). When the decoded frames are being converted and displayed by the IPU, as they should be, it takes >16 ms on average for each frame to be decoded, hence the inability to do 60 fps.
- I played with VPU QoS settings - didn't change anything
- I tried mxc-vpu-test with all sorts of parameters - didn't change anything (actually, the best result was ~50 fps with G2D output in 5.1, whereas in 4.3 outputting with V4L2 easily gave ~60 fps)
- I put the 4.3 VPU firmware (2.3.10) instead of the 5.1 VPU firmware (3.1.1) - didn't change anything.
- On Android 6, which I tried on a SabreSD, the write burst size is even lower (~12!) and there are lots of skipped frames.
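To put the per-frame timings above in context, here is a small sketch of the arithmetic behind the 60 fps requirement. The helper names are hypothetical and the only inputs are the frame rates already mentioned in this post.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def sustained_fps(avg_frame_ms: float) -> float:
    """Frame rate that can be sustained at a given average per-frame time."""
    return 1000.0 / avg_frame_ms


# Budget per frame for 60 fps playback.
print(f"60 fps budget: {frame_budget_ms(60):.2f} ms per frame")

# The ~50 fps seen with mxc-vpu-test corresponds to this per-frame time.
print(f"50 fps means:  {frame_budget_ms(50):.2f} ms per frame")
```

So 60 fps leaves about 16.67 ms per frame, and the ~50 fps I get from mxc-vpu-test in 5.1 corresponds to about 20 ms per frame on average.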
Do you have any idea why this drop in write burst size has happened? Is there a way I can make the VPU use the memory bandwidth more efficiently, so that the bus load decreases, utilization increases, and the IPU and VPU can both use the DRAM happily and let me watch my H.264 1080p@60 in peace, like before?