Hi,
We are currently evaluating the i.MX8 for building a digital signage system. One of the key features we require is picture-in-picture. To evaluate the performance, we run four GStreamer pipelines in parallel, each playing a 1080p@30 video. For better bandwidth utilization, the GStreamer pipelines enable Hantro tiling.
Two of the Wayland clients end up on the two available DRM overlay planes and work fine, but the remaining two clients, which are rendered on the primary plane, are not rendered correctly. If we disable tiling, all clients are shown correctly, but we are not able to meet the required frame rate.
We see the same behavior in our own Wayland compositor and when using weston-imx. The DMA buffers submitted by waylandsink are successfully imported with EGL_EXT_image_dma_buf_import_modifiers, and we also checked that the modifier has the expected value.
We would like to know whether this is a limitation of the Vivante GPU or whether this is expected to work.
Br,
Christian
Hi,
Which processor are you using? Could you share the command to reproduce the issue?
Regards
Hi,
We are currently trying to integrate the i.MX8M Quad.
To test the performance, we use weston-imx with the DRM backend and run four GStreamer pipelines with the Wayland sink in parallel. Two of them get assigned to the DRM underlay planes and work fine; the remaining two are rendered with OpenGL ES on the Vivante GPU and are broken.
For testing we use the following GStreamer pipeline:
gst-launch-1.0 -v souphttpsrc location="http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_30fps_normal.mp4" ! queue ! qtdemux ! h264parse ! vpudec ! queue ! waylandsink enable-tile=true window-width=960 window-height=540
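A minimal sketch of how the four parallel clients could be scripted for reproduction. It only builds the argument lists for the pipeline above; the helper names `build_pipeline` and `build_all` are hypothetical, and every element and property is copied from the command line as posted.

```python
import shlex

# URL copied from the pipeline above.
VIDEO_URL = ("http://distribution.bbb3d.renderfarming.net/video/mp4/"
             "bbb_sunflower_1080p_30fps_normal.mp4")


def build_pipeline(enable_tile, width=960, height=540):
    """Return the gst-launch-1.0 argument list for one client."""
    tile = "true" if enable_tile else "false"
    return [
        "gst-launch-1.0", "-v",
        "souphttpsrc", f"location={VIDEO_URL}", "!",
        "queue", "!", "qtdemux", "!", "h264parse", "!", "vpudec", "!",
        "queue", "!",
        "waylandsink", f"enable-tile={tile}",
        f"window-width={width}", f"window-height={height}",
    ]


def build_all(count=4, enable_tile=True):
    """Return one argument list per client (four tiled clients by default)."""
    return [build_pipeline(enable_tile) for _ in range(count)]


if __name__ == "__main__":
    # Print the commands; each list can be passed to subprocess.Popen as-is.
    for cmd in build_all():
        print(shlex.join(cmd))
```

Passing each list to `subprocess.Popen` on the target starts the clients in parallel; calling `build_pipeline(False)` gives the non-tiled variant for comparison.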
Thanks!
Br, Christian
Hi,
Running multiple pipelines with enable-tile is not covered by our test cases. For reference, the property is documented as:
enable-tile: When enabled, the sink proposes the VSI tile modifier to the VPU. Flags: readable, writable. Boolean. Default: false
I can reproduce this issue on the i.MX8MQ EVK by comparing enable-tile=false with enable-tile=true: with enable-tile=true, the clients on the primary plane are not rendered correctly.
Is this the behavior you are seeing? If yes, it is a bug in our system and will be fixed. I will follow up with the results.
Regards and thanks for the catch.
Hi,
Yes, that is exactly the issue we face, thanks for confirmation.
Looking forward to seeing this resolved, as this is currently a blocking issue for us.
Br,
Christian
Hi,
"enable-tile=true" forces the VPU to output frames in tiled format, but only two video planes support detiling; this is a hardware limitation. The other two tiled streams are processed by the GPU, but the GPU cannot process the VPU's tile format.
Regards
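The hardware limitation described above can be illustrated with a small model. This is only a sketch: the names `Client` and `assign` are hypothetical, and the rule that the most recently launched clients are considered for planes first is an assumption chosen to match the observations reported in this thread, not Weston's actual assignment logic.

```python
from dataclasses import dataclass

LINEAR = "linear"
VPU_TILED = "vpu-tiled"  # stands in for the VSI/Hantro tile modifier


@dataclass
class Client:
    name: str
    modifier: str


def assign(clients_in_launch_order, detiling_planes=2):
    """Map each client name to (path, ok).

    path is "plane" or "gpu". ok is False when a VPU-tiled buffer
    falls back to the GPU, which cannot read that layout.
    """
    result = {}
    free = detiling_planes
    # Assumption: newest clients are offered a video plane first.
    for client in reversed(clients_in_launch_order):
        if free > 0:
            free -= 1
            result[client.name] = ("plane", True)
        else:
            result[client.name] = ("gpu", client.modifier == LINEAR)
    return result


if __name__ == "__main__":
    all_tiled = [Client(f"client{i}", VPU_TILED) for i in range(4)]
    for name, (path, ok) in assign(all_tiled).items():
        print(name, path, "ok" if ok else "broken")
```

With four tiled clients, the model marks the two that fall back to the GPU as broken. With two linear clients launched first and two tiled clients launched afterwards, all four come out fine, because the tiled buffers land on the detiling planes.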
Hi,
We did another test in which we mixed tiled and non-tiled buffers: first we launched the two non-tiled pipelines, and after that the two tiled ones. This way the two tiled pipelines should end up on the two planes and the two non-tiled ones get rendered by the GPU.
This seems to work fine, but when we close one of the tiled clients the complete output is broken, and the issue persists until the next reboot; restarting weston is not enough.
After a bit of testing, it seems the issue occurs when the format/modifier is switched on the plane; in this case it switches from G1 Hantro to linear.
Br,
Christian
Hi,
Ok, understood, thanks!
But then there is probably a bug in the EGL implementation: I would expect an import via EGL_EXT_image_dma_buf_import_modifiers to fail if the format/modifier combination is not supported by the GPU. Can you please clarify this?
Thanks,
Christian
Hi,
Yes, I can reproduce it. We will get back to you when it is fixed.
Regards
We are still analysing this issue. Here is the feedback from RD:
' I was able to replicate this with both 5.4.70 and 5.15.71. But only when all windows are 960x540... Not exactly sure what's going on here... I'll dig a little deeper to see what's causing this. Usually, down-scaling makes the DDR bandwidth spike-up for the duration the window is displayed on screen because DCSS needs to fetch the entire buffer from memory in a fraction of the time. Down-scaling a 1080p frame by 2:1 will double the DDR bandwidth because the buffer fetching needs to happen in half the time. Since DCSS is missing the QoS lines, we cannot signal that we need more bandwidth... But, this might be something else. At first sight, it looks like Weston is moving a linear buffer in place of the already closed tiled one. I'll need to look at the DRM flow to see what's going on.
Anyway, I'll do some debugging and let you know my findings.'
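The bandwidth argument in the feedback above can be made concrete with a rough estimate. This is a sketch under stated assumptions: the decoded frames are taken to be 12 bits per pixel (as for NV12, which is not confirmed in this thread), and the peak fetch rate is assumed to scale linearly with the down-scaling factor, as the feedback describes. The function names are hypothetical.

```python
def frame_bytes(width, height, bits_per_pixel=12):
    """Size of one decoded frame in bytes (12 bpp is an assumption)."""
    return width * height * bits_per_pixel // 8


def fetch_bandwidth(width, height, fps, downscale=1.0, bits_per_pixel=12):
    """Peak fetch rate in bytes/s while the window is scanned out.

    Model from the feedback above: a buffer down-scaled by N:1 must be
    fetched in 1/N of the time, so the peak rate grows by a factor of N.
    """
    return frame_bytes(width, height, bits_per_pixel) * fps * downscale


if __name__ == "__main__":
    full = fetch_bandwidth(1920, 1080, 30)
    half = fetch_bandwidth(1920, 1080, 30, downscale=2)
    print(f"1080p30 unscaled:        {full / 1e6:.1f} MB/s")
    print(f"1080p30 down-scaled 2:1: {half / 1e6:.1f} MB/s")
```

Under these assumptions a single 1080p30 stream needs about 93.3 MB/s unscaled and about 186.6 MB/s peak when shown at 960x540, which is the doubling mentioned in the feedback.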
Regards