We would like to do 1080p30 video encoding (H.264 if possible) and preview (video loopback) at the same time on i.MX6. Is that possible?
We did some measurements, and while we can do either 1080p30 encoding or video loopback on its own, doing both at the same time seems problematic.
Measurements:
I measured VPU video encoding performance with the mxc_vpu_test.out unit test. It is not possible to encode a live stream from the camera while mxc_v4l2_overlay.out or a GStreamer pipeline with mfw_v4lsrc is running (that would be two processes accessing one V4L device), so I saved 100 frames from the camera to a ramdisk (/tmp) via filesink and used that file to measure the encoder performance. The input data are therefore always exactly the same.
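As a rough sketch of how much ramdisk space such a capture needs: the 1080p resolution and the 100 frames come from the description above, but the pixel format is an assumption (planar YUV 4:2:0, 12 bits per pixel), since the post does not say which format the camera delivers. The function names are made up for this illustration.

```python
# Rough size estimate for a raw capture stored in a ramdisk.
# ASSUMPTION: frames are planar YUV 4:2:0 (I420), i.e. 3/2 bytes per pixel.
# The actual camera format is not stated in the post.

WIDTH, HEIGHT = 1920, 1080  # 1080p, as in the post
FRAMES = 100                # number of captured frames, as in the post


def raw_frame_bytes(width, height, num=3, den=2):
    """Size of one raw frame in bytes for a format using num/den bytes per pixel."""
    return width * height * num // den


def capture_bytes(frames, width=WIDTH, height=HEIGHT):
    """Total size in bytes of `frames` raw frames."""
    return frames * raw_frame_bytes(width, height)


total = capture_bytes(FRAMES)
print(f"one frame:  {raw_frame_bytes(WIDTH, HEIGHT)} bytes")
print(f"{FRAMES} frames: {total} bytes ({total / 2**20:.1f} MiB)")
```

With a different pixel format (for example a 16-bit packed YUV format) the figure scales with the bytes per pixel, so the numbers here are only an order-of-magnitude check.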
The table below shows the results (enc fps from mxc_vpu_test.out) for several different scenarios.
We use a 32-bit framebuffer, because 16 bits means fewer colors and ugly banding in the image on the display. I have also included results measured with a 16-bit framebuffer, so you can compare.
Only mxc_vpu_test.out running, bpp=16:
codec / gopsize | gop = 1 | gop = 5 | gop = 10 | gop = 15
MPEG4           | 53,27   | 46,92   | 46,23    | 46,02
H.263           | 52,78   | 48,56   | 48,09    | 47,95
H.264           | 48,57   | 46,49   | 46,24    | 46,18
MJPG            | 139,31

Only mxc_vpu_test.out running, bpp=32:
codec / gopsize | gop = 1 | gop = 5 | gop = 10 | gop = 15
MPEG4           | 53,27   | 46,02   | 45,26    | 45,02
H.263           | 52,77   | 48,52   | 48,05    | 47,91
H.264           | 48,56   | 45,87   | 45,54    | 45,46
MJPG            | 139,47

mxc_vpu_test.out & gst-launch mfw_v4lsrc ! mfw_isink, bpp=16:
codec / gopsize | gop = 1 | gop = 5 | gop = 10 | gop = 15
MPEG4           | 40,75   | 29,08   | 28,03    | 27,74
H.263           | 40      | 28,95   | 27,98    | 27,72
H.264           | 35,22   | 26,83   | 26,04    | 25,81
MJPG            | 110,73

mxc_vpu_test.out & gst-launch mfw_v4lsrc ! mfw_isink, bpp=32:
codec / gopsize | gop = 1 | gop = 5 | gop = 10 | gop = 15
MPEG4           | 30,31   | 21,53   | 20,71    | 20,51
H.263           | 30,06   | 21,41   | 20,63    | 20,44
H.264           | 26,39   | 19,74   | 19,11    | 18,96
MJPG            | 82,03

mxc_vpu_test.out & mxc_v4l2_overlay.out:
codec / gopsize | gop = 1 | gop = 5 | gop = 10 | gop = 15
MPEG4           | 52,97   | 39,87   | 38,69    | 38,33
H.263           | 52,68   | 40,53   | 39,39    | 39,06
H.264           | 47,67   | 38,06   | 37,12    | 36,88
MJPG            | 135,51
As you can see, there is a huge performance drop when isink is used for video loopback. With GOP sizes larger than 1 the encoder falls below 30 fps, which is not sufficient for encoding the video stream in real time, so the resulting stream is missing some frames. Using mxc_v4l2_overlay.out seems to be a much better alternative; unfortunately, we are not sure whether we can combine it with GStreamer.
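The comparison against the 30 fps real-time requirement can be sketched as below. The H.264 figures at gop = 15 are copied from the table (decimal commas written as points); the helper names are made up for this illustration, and the overlay row carries no bpp label because the table does not state one.

```python
# Check measured H.264 encoder rates (gop = 15) against the 1080p30 target.
# Values are copied from the measurement table; helper names are hypothetical.

TARGET_FPS = 30.0  # real-time requirement for 1080p30

h264_gop15 = {
    "encoder only, bpp=16": 46.18,
    "encoder only, bpp=32": 45.46,
    "encoder + mfw_isink, bpp=16": 25.81,
    "encoder + mfw_isink, bpp=32": 18.96,
    "encoder + mxc_v4l2_overlay": 36.88,
}


def headroom(fps, target=TARGET_FPS):
    """Measured rate as a fraction of the target; >= 1.0 means real time is met."""
    return fps / target


def relative_drop(baseline, loaded):
    """Fraction of the baseline rate lost once the loopback is running."""
    return (baseline - loaded) / baseline


for name, fps in h264_gop15.items():
    verdict = "real time ok" if headroom(fps) >= 1.0 else "too slow"
    print(f"{name:30s} {fps:6.2f} fps  {headroom(fps):.2f}x target  {verdict}")

for bpp in (16, 32):
    base = h264_gop15[f"encoder only, bpp={bpp}"]
    loaded = h264_gop15[f"encoder + mfw_isink, bpp={bpp}"]
    print(f"isink drop at bpp={bpp}: {relative_drop(base, loaded):.0%}")
```

The same check can be repeated for the other codecs and GOP sizes by swapping in the corresponding table rows.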
Is it possible to use mxc_v4l2_overlay.out for video loopback and GStreamer for saving the video at the same time?
Thanks a lot for any answer.
Hi Ivo
The i.MX6S supports 1080p30 encode + decode, but this is the maximum
bare-metal performance of the VPU. It is usually achieved in an OS-less
environment, which avoids OS side effects; in other words, it is the
performance of the VPU module itself. It can probably be reached in your
case too, but the software has to be optimized to get these maximum figures.
In particular, the VPU frequency should be configured to 350 MHz
to obtain the maximum performance.
Best regards
igor
Hi Igor,
well, it seems that the VPU itself is powerful enough to do 1080p30 encoding quite well, but the mfw_isink used to display the image at the same time causes a major performance drop. I would like to avoid mfw_isink for video loopback and use a more direct path, as mxc_v4l2_overlay.out probably does. But I have no idea whether it is possible to configure the IPU/VPU to pass the image to the display directly AND save the video stream at the same time (preferably using GStreamer). Is there any way to do that?
I have tried locking the VPU clock to 352 MHz (CONFIG_MX6_VPU_352M=y in the kernel), but there seems to be no performance change whatsoever. The results in the first post were measured with CONFIG_MX6_VPU_352M enabled.
Best regards
Ivo
Hi Ivo
You could try the latest BSP; it has improved VPU firmware and GStreamer 1.x support:
L3.10.53_1.1.0_iMX6QDLS_Bundle : i.MX 6Quad, i.MX 6Dual, i.MX 6DualLite, i.MX 6Solo
Linux Binary Demo Files and Linux BSP Documentation
Best regards
igor
Hi Igor,
since we use a 3rd-party SoM card, it is quite difficult to just try the latest BSP from Freescale until the SoM manufacturer integrates the needed changes. We currently use a BSP based on fsl-L3.10.17_1.0.0GA release 3.0.35 kernel (sorry, I am unsure which BSP release it is) with GStreamer 0.10. Is there any way to do that in this BSP?
Best regards
Ivo
Hi Igor,
thanks for the tip. I was not aware of the -L option of the VPU unit test. I tried it a few minutes ago and the latency is terrible. The processing introduces a delay of more than 400 ms, which is unacceptable. That is much worse than the GStreamer pipeline with isink, which delays the signal by about 140 ms.
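To put the two delays in perspective, they can be converted into frame periods at 30 fps. This is a small sketch; the function names are made up for this illustration, and the 400 ms and 140 ms figures are the ones quoted above.

```python
# Express a loopback delay as a number of frame periods at a given frame rate.

def frame_period_ms(fps=30.0):
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def delay_in_frames(delay_ms, fps=30.0):
    """How many frame periods a given delay spans."""
    return delay_ms * fps / 1000.0


for label, delay_ms in (("VPU unit test with -L", 400.0), ("GStreamer + isink", 140.0)):
    print(f"{label:22s} {delay_ms:5.0f} ms = {delay_in_frames(delay_ms):.1f} frames "
          f"(frame period {frame_period_ms():.1f} ms)")
```

At 30 fps the 400 ms path is therefore about 12 frames behind the camera, against roughly 4 frames for the isink pipeline.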
Best regards,
Ivo
Hi Ivo
The i.MX6 VPU documentation does not provide figures for latency (delay);
only fps figures are provided (guaranteed).
Best regards
igor