I am using LTIB for iWave, Linux-3.0.35. Cameras are not native to the iMX6 platform; the camera capture chip is a TW6869 on the PCI-e bus.
Currently working implementation:
CamCap => [RGB DMA buffer] => IPU => [YUV420 DMA buffer] => GStreamer => [YUV420 DMA buffer] => VPU => [AVC DMA buffer] => GStreamer => final sink
gst-launch -v --gst-debug-level=1 v4l2src device=/dev/video5 ! video/x-raw-yuv,format=(fourcc)I420,width=(int)640,height=(int)480,framerate=30/1 ! vpuenc codec=6 seqheader-method=3 bitrate=500000 gopsize=15 quant=10 framerate-nu=30 framerate-de=1 force-framerate=true ! mpegtsmux ! tcpserversink port=5005
The Camera Capture Driver directly invokes the IPU Kernel Driver "mxc_ipu.ko" and submits the conversion tasks to it. IPU driver sources are in the ".../linux-3.0.35/drivers/mxc/ipu3/" folder.
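For reference, a minimal sketch of how such a conversion task can be submitted from kernel code, assuming the ipu_queue_task() interface exported by mxc_ipu.ko in the 3.0.35 BSP; the exact struct ipu_task field names and pixel-format constants should be verified against include/linux/ipu.h:

#include <linux/types.h>
#include <linux/string.h>
#include <linux/ipu.h>

/* Sketch: ask the IPU to convert one RGB frame (from the TW6869 DMA buffer)
 * into a YUV420 frame, both buffers given by physical address. */
static int camcap_submit_csc(dma_addr_t rgb_paddr, dma_addr_t yuv_paddr,
                             u32 width, u32 height)
{
	struct ipu_task task;

	memset(&task, 0, sizeof(task));

	task.input.width   = width;
	task.input.height  = height;
	task.input.format  = IPU_PIX_FMT_RGB565;   /* capture format, adjust as needed */
	task.input.paddr   = rgb_paddr;

	task.output.width  = width;
	task.output.height = height;
	task.output.format = IPU_PIX_FMT_YUV420P;  /* I420 expected downstream */
	task.output.paddr  = yuv_paddr;

	/* Blocking call into mxc_ipu.ko; check the return code against
	 * the conventions in drivers/mxc/ipu3/ipu_device.c. */
	return ipu_queue_task(&task);
}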
The YUV420 buffer, as delivered by the IPU, is exported to GStreamer by remapping its physical address to a user-space virtual address. Full efficiency and performance here.
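The export itself is just the standard mmap()/remap_pfn_range() path; a minimal sketch (the function name camcap_mmap and the uncached mapping are illustrative, not the actual driver code):

#include <linux/fs.h>
#include <linux/mm.h>

/* Sketch: expose the YUV420 DMA buffer to user space (GStreamer) through
 * the capture driver's mmap() handler. vma->vm_pgoff is assumed to carry
 * the buffer's physical address in pages, as passed by the caller of mmap(). */
static int camcap_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;

	/* Uncached so the CPU reads what the IPU actually wrote. */
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
			    size, vma->vm_page_prot))
		return -EAGAIN;

	return 0;
}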
GStreamer invokes the VPU through this stack of modules:
libgstvideo4linux2.so -- GST plugin; source files in ".../ltib/rpm/BUILD/gst-plugins-good-0.10.30/sys/v4l2/" folder; VPU-related source is in "gstv4l2object.c"
libvpu.so.4 -- VPU Application Driver; source files in ".../ltib/rpm/BUILD/imx-lib-3.0.35-4.0.0/vpu/" folder; portions of these sources have been ported to the kernel
mxc_vpu.ko -- VPU Kernel Driver; source files in ".../linux-3.0.35/drivers/mxc/vpu/" folder
The GStreamer plugin "libgstvideo4linux2.so" is responsible for the memcpy operations. Even a plain visual inspection of the source file "gstv4l2object.c" reveals the extent of their use.
In addition, for each frame subject to VPU compression, the GStreamer plugin executes the following inefficient sequence (sketched in code below):
GetVpuPhysMem -- remap to VAddr -- copy YUV frame -- invoke VPU -- get compressed frame -- unmap VAddr -- FreeVpuPhysMem
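In user-space terms, this sequence maps onto the imx-lib VPU API roughly as follows; encoder setup (vpu_EncOpen, vpu_EncRegisterFrameBuffer, ...) is omitted and the field names should be checked against vpu_lib.h / vpu_io.h, so treat this as a sketch of the pattern rather than the plugin's literal code:

#include <string.h>
#include "vpu_lib.h"
#include "vpu_io.h"

/* Sketch: one frame through the inefficient path described above. */
static int encode_frame_with_copy(EncHandle handle,
                                  const unsigned char *yuv_user_ptr,
                                  int frame_size, int stride, int height)
{
	vpu_mem_desc src = { 0 };
	FrameBuffer fb = { 0 };
	EncParam param = { 0 };
	EncOutputInfo out = { 0 };

	/* 1. GetVpuPhysMem: allocate a contiguous VPU-visible buffer. */
	src.size = frame_size;
	if (IOGetPhyMem(&src) != 0)
		return -1;

	/* 2. Remap it to a user virtual address. */
	if (IOGetVirtMem(&src) == -1) {
		IOFreePhyMem(&src);
		return -1;
	}

	/* 3. Copy the YUV420 frame in -- the memcpy to be eliminated. */
	memcpy((void *)src.virt_uaddr, yuv_user_ptr, frame_size);

	/* 4. Invoke the VPU on the copied frame. */
	fb.strideY = stride;
	fb.strideC = stride / 2;
	fb.bufY    = src.phy_addr;
	fb.bufCb   = fb.bufY + stride * height;
	fb.bufCr   = fb.bufCb + (stride / 2) * (height / 2);
	param.sourceFrame = &fb;
	vpu_EncStartOneFrame(handle, &param);
	while (vpu_IsBusy())
		vpu_WaitForInt(100);

	/* 5. Fetch the compressed frame (out.bitstreamBuffer / bitstreamSize). */
	vpu_EncGetOutputInfo(handle, &out);

	/* 6./7. Unmap and free the temporary buffer. */
	IOFreeVirtMem(&src);
	IOFreePhyMem(&src);
	return 0;
}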
This is the performance bottleneck I want to eliminate, by invoking the VPU directly, in the same manner as I invoke the IPU directly from the Camera Capture Driver.
VPU interrupts work perfectly well in this scenario; the native libvpu.so.4 and mxc_vpu.ko drivers manage interrupts smoothly.
The TNVP format mentioned above does not appear to be accepted as a FOURCC by either GStreamer or the mxc_ipu.ko Kernel Driver. The IPUv3 software deals with all kinds of YUV formats, yet the TNVP FOURCC is not seen in any of the IPUv3 source files. It seems that this format is only applicable to the VPU decoder? If I am to try TNVP, what GStreamer command line would be suggested?
Practical efficient implementation under construction:
CamCap => [RGB DMA buffer] => IPU => [YUV420 DMA buffer] => VPU => [AVC DMA buffer] => GStreamer => final sink
gst-launch -v --gst-debug-level=1 v4l2src device=/dev/video5 ! video/x-h264,format=(fourcc)H264,width=(int)640,height=(int)480,framerate=30/1 ! mpegtsmux ! tcpserversink port=5005
This is the work in progress I need help with. In this architecture, the YUV420 output from the IPU is fed directly to the VPU, and only the compressed AVC frame is passed to GStreamer.
In all cases there has to be a memcpy from the VPU output DMA buffer to the GStreamer user buffer, because the frame header must be prepended to I-frames when they are delivered (see the sketch below).
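A minimal sketch of that single remaining copy, with the destination GStreamer buffer, the stored sequence header and the I-frame flag all assumed to be available (names are illustrative):

#include <string.h>
#include <stddef.h>

/* Sketch: copy the compressed frame out of the VPU DMA buffer into the
 * GStreamer buffer, prepending the stored sequence header on I-frames
 * so they are self-contained. Returns the number of bytes written. */
static size_t deliver_compressed_frame(unsigned char *gst_buf,
                                       const unsigned char *vpu_out, size_t vpu_len,
                                       const unsigned char *seq_hdr, size_t seq_len,
                                       int is_iframe)
{
	size_t off = 0;

	if (is_iframe) {
		memcpy(gst_buf, seq_hdr, seq_len);
		off = seq_len;
	}
	memcpy(gst_buf + off, vpu_out, vpu_len);
	return off + vpu_len;
}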
Only in this ported code are the VPU interrupts failing; this is definitely a problem with the port and not with the original implementation.
Because the H264 FOURCC is not recognized by the libgstvideo4linux2.so plugin, I had to apply a patch to the source file "gstv4l2object.c" and rebuild the library.
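For reference, a sketch of the kind of change involved, assuming the usual layout of gstv4l2object.c in gst-plugins-good 0.10.x (the gst_v4l2_formats[] table and gst_v4l2_object_v4l2fourcc_to_structure()); the actual patch applied here may differ in detail:

#ifndef V4L2_PIX_FMT_H264
#define V4L2_PIX_FMT_H264  v4l2_fourcc ('H', '2', '6', '4')   /* older kernel headers lack it */
#endif

/* 1. Advertise the fourcc in the plugin's format table. */
static const GstV4L2FormatDesc gst_v4l2_formats[] = {
  /* ... existing entries ... */
  {V4L2_PIX_FMT_H264, TRUE},
};

/* 2. Map the fourcc to caps when the source pad caps are built. */
static GstStructure *
gst_v4l2_object_v4l2fourcc_to_structure (guint32 fourcc)
{
  GstStructure *structure = NULL;

  switch (fourcc) {
      /* ... existing cases ... */
    case V4L2_PIX_FMT_H264:
      structure = gst_structure_new ("video/x-h264", NULL);
      break;
  }
  return structure;
}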
It would be interesting to know whether, in this direct-call-to-VPU scenario, ring-buffer mode may be applicable.
That is, would it be possible to link IPU and VPU execution together, so that when the camera delivers an RGB frame, that frame can be fed to the IPU and the ultimate output be delivered as a compressed frame by the VPU? That would be the ideal dream-stream implementation.
I.D.