Hi,
In our project we are using an MX6Q to encode live video and transfer the H.264/AVC encoded video stream over a 3G wireless network.
The platform is MX6Q/Linux (LTIB L3.0.35_4.1.0_130816).
The video input port used on the MX6Q is CSI0.
Case 1. The input video is PAL format (720x576@25fps) from a single camera sensor.
The sender and receiver pipelines are as follows:
//////////////////////////////////////////////////////
Sender(MX6Q):
gst-launch -v tvsrc device=/dev/video0 capture-mode=0 ! 'video/x-raw-yuv,format=(fourcc)UYVY,width=720,height=576,framerate=12/1' ! mfw_ipucsc ! 'video/x-raw-yuv,format=(fourcc)I420,width=640,height=480,framerate=12/1' ! vpuenc codec=avc bitrate=300000 gopsize=3 ! video/x-h264,width=640,height=480 ! rtph264pay mtu=1024 ! udpsink host=<host_ip> port=5004 sync=false async=false
Receiver (host PC):
gst-launch -v udpsrc port=5004 caps='application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, sprop-parameter-sets=(string)\"Z0JAHqaAtBJkAA\\=\\=\\,aM4wpIAA\", payload=(int)96, ssrc=(uint)1949063162, clock-base=(uint)1878546753, seqnum-base=(uint)42864' ! rtph264depay ! decodebin ! xvimagesink sync=false async=false
//////////////////////////////////////////////////////
Note that since the input video is 720x576 in UYVY format, the sender pipeline uses the 'mfw_ipucsc' element to scale the image down to 640x480 (to reduce the network bandwidth requirement) and to convert the format to I420 (which 'vpuenc' requires).
Among the 'vpuenc' properties, the video encoding standard is set to H.264/AVC, the bitrate to a fixed 300 kbps, and gopsize to 3 (IPPIPP..., i.e. one I frame followed by two P frames). The frame rate is reduced from the original 25 fps to 12 fps so as to reduce the bandwidth required on the 3G network.
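For reference, these settings leave an average bit budget per encoded frame of roughly:
300,000 bits/s / 12 frames/s = 25,000 bits ≈ 3 KB per frame
and since an I frame occurs every third frame, the rate control has little headroom for the more expensive intra-coded frames.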
In this case, with the above settings, both the sender and receiver pipelines work well: the receiver can receive and decode the video stream, and the quality of the decoded video is good and satisfying.
Case 2. The MX6Q CSI0 video input is the output of an Intersil TW2828 video muxer. The TW2828 accepts 8 independent cameras (each 720x576@25fps) and combines them into a single 1920x1080@25fps image, laid out as a 3x3 grid in which each cell shows the image from one camera. (Enclosed please find a screen snapshot of one combined video frame.) The combined 1920x1080@25fps video stream is fed into the CSI0 port, encoded by the VPU, and then transferred over the same 3G wireless network.
Here are the sender and receiver pipelines:
//////////////////////////////////////////////////////
Sender(MX6Q):
gst-launch -v tvsrc device=/dev/video0 capture-mode=1 ! 'video/x-raw-yuv,format=(fourcc)UYVY,width=1920,height=1080,framerate=12/1' ! mfw_ipucsc ! 'video/x-raw-yuv,format=(fourcc)I420,width=640,height=480,framerate=12/1' ! vpuenc codec=avc bitrate=300000 gopsize=3 ! video/x-h264,width=640,height=480 ! rtph264pay mtu=1024 ! udpsink host=<host_ip> port=5004 sync=false async=false
Receiver (host PC):
gst-launch -v udpsrc port=5004 caps='application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, sprop-parameter-sets=(string)\"Z0JAHqaAtBJkAA\\=\\=\\,aM4wpIAA\", payload=(int)96, ssrc=(uint)1949063162, clock-base=(uint)1878546753, seqnum-base=(uint)42864' ! rtph264depay ! decodebin ! xvimagesink sync=false async=false
//////////////////////////////////////////////////////
From the sender pipeline it can be seen that the video fed into 'vpuenc' is still 640x480.
However, when all 8 cameras are connected to the TW2828, the combined video frame contains 8 different video inputs, one per grid cell.
Even though the combined video is also scaled down to 640x480, and all the properties of the 'vpuenc' element are kept the same as in the first case, the quality of the decoded video on the receiver side is much worse than in the first case. There are quite a lot of frames that can NOT be correctly decoded.
Compared to the first case, the input video size and the 'vpuenc' properties are the same, and the network conditions are similar. The only noteworthy difference is the combined video input from 8 different cameras. Since a single video frame now contains 8 different images from 8 cameras, the intra- and inter-prediction of the H.264/AVC algorithm will certainly perform much worse than in the first case.
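As a rough check of how little content each camera retains after scaling, each cell of the 3x3 mosaic in the 640x480 frame occupies only about:
640 / 3 ≈ 213 by 480 / 3 = 160 pixels
with unrelated content on each side of every grid boundary.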
I'm not sure whether 300 kbps is enough for the second case. Should it be increased?
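For example, we could repeat the test with only the bitrate raised (the value 800000 below is just an illustrative guess, not a known-good number):
//////////////////////////////////////////////////////
gst-launch -v tvsrc device=/dev/video0 capture-mode=1 ! 'video/x-raw-yuv,format=(fourcc)UYVY,width=1920,height=1080,framerate=12/1' ! mfw_ipucsc ! 'video/x-raw-yuv,format=(fourcc)I420,width=640,height=480,framerate=12/1' ! vpuenc codec=avc bitrate=800000 gopsize=3 ! video/x-h264,width=640,height=480 ! rtph264pay mtu=1024 ! udpsink host=<host_ip> port=5004 sync=false async=false
//////////////////////////////////////////////////////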
Also enclosed please find the source code of the gstreamer element 'vpuenc'.
Many of the vpuenc properties can be found in the source file.
We note that 'EnableAutoSkip' is set to zero, which means it is disabled.
If 300 kbps is not large enough, should we enable 'EnableAutoSkip'?
Or are there any other properties/parameters of 'vpuenc' that should be adjusted?
How can we obtain better decoded video quality for the second case?
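For instance, would a larger GOP help? With gopsize=3 an (expensive) I frame is produced every third frame, so a larger GOP such as gopsize=12 would leave more bits for each frame at the same 300 kbps target, at the cost of slower recovery after packet loss (the value 12 is only an example):
//////////////////////////////////////////////////////
gst-launch -v tvsrc device=/dev/video0 capture-mode=1 ! 'video/x-raw-yuv,format=(fourcc)UYVY,width=1920,height=1080,framerate=12/1' ! mfw_ipucsc ! 'video/x-raw-yuv,format=(fourcc)I420,width=640,height=480,framerate=12/1' ! vpuenc codec=avc bitrate=300000 gopsize=12 ! video/x-h264,width=640,height=480 ! rtph264pay mtu=1024 ! udpsink host=<host_ip> port=5004 sync=false async=false
//////////////////////////////////////////////////////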
Thanks,
Robbie
Original Attachment has been moved to: vpuenc.c.zip
I think the optimal parameters should be determined experimentally.
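For example, you could capture a fixed number of frames to local files at several candidate bitrates and compare the decoded quality offline, which takes the 3G network out of the loop. A sketch (the bitrate list is arbitrary, and 'num-buffers' is assumed to be available on tvsrc via the GStreamer base source class):
//////////////////////////////////////////////////////
for br in 300000 500000 800000; do
  gst-launch tvsrc device=/dev/video0 capture-mode=1 num-buffers=300 ! 'video/x-raw-yuv,format=(fourcc)UYVY,width=1920,height=1080,framerate=12/1' ! mfw_ipucsc ! 'video/x-raw-yuv,format=(fourcc)I420,width=640,height=480,framerate=12/1' ! vpuenc codec=avc bitrate=$br gopsize=3 ! filesink location=test_$br.264
done
//////////////////////////////////////////////////////
Each resulting .264 file should be a raw H.264 elementary stream that can be played back on the PC to judge the quality at each bitrate.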
Have a great day,
Yuri