I have an H264 file encoded with ffmpeg from a YUV420 I420 image file (Y plane size = 1920*1080).
Now I am trying to make an app that decodes this H264 file using the VPU wrapper provided by imx-gst1.0-plugin. I expected the H264 file to be decoded back to the original YUV420 I420 image.
Inspecting the actual decoded image file in a binary editor, I found two things that differ from my expectation:
1. YUV type is of YUV420 NV12, even though the original YUV image input is YUV420 I420.
2. YUV size is different. Actual Y size is "1920*1088", combined UV size is "(960*544)*2".
It seems the height is automatically aligned to a multiple of 16 pixels by the VPU.
Probably the width as well, but since 1920 and 960 are already multiples of 16, the width alignment isn't visible here.
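As a sanity check, the observed sizes can be reproduced with the alignment arithmetic (a sketch; `align16` and `yuv420_size` are hypothetical helpers, not part of the VPU wrapper):

```c
/* Round a dimension up to the next multiple of 16 (the H.264 macroblock size). */
static unsigned align16(unsigned v) {
    return (v + 15) & ~15u;
}

/* Total 4:2:0 buffer size for the aligned dimensions: one full-resolution
   luma plane plus two quarter-resolution chroma planes (NV12 and I420 have
   the same total size, only the plane layout differs). */
static unsigned yuv420_size(unsigned w, unsigned h) {
    unsigned aw = align16(w), ah = align16(h);
    return aw * ah + 2 * (aw / 2) * (ah / 2);
}
```

For 1920x1080 this gives an aligned height of 1088 and a total buffer size of 1920*1088 + (960*544)*2 = 3133440 bytes, matching the sizes observed above.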
Questions are:
1. Is there a way to set the VPU decoder's output color format to YUV420, specifically I420?
2. Should the original input image's width and height be multiples of 16? Are there any implications if the decoded image's size differs from the original input image's size?
Environment
Board: IMX8MP EVK
BSP: yocto-real-time-edge
I just found this in the reference manual:
16.1.2.1.1 Raster-scan format
The output picture of the decoder is in semi-planar YCbCr 4:2:0 format, i.e. luminance
data forms one plane in memory, and chrominance data forms another. The output picture
has to be stored linearly and contiguously in the memory. The interleaved chrominance
block has to be stored right after the luminance block in external memory as shown in the
following figure. The number of luminance pixels in one row must be divisible by 16.
Looks like my interpretation was right: the VPU decoder output is fixed to YUV420 NV12 (semi-planar). I would appreciate confirmation from NXP.
Hi,
YUV420 I420 and NV12 are basically the same; only the pixel organisation in memory is different. I420 is planar (Y plane, then U plane, then V plane), whereas NV12 is semi-planar (Y plane first, then an interleaved UVUV... plane).
The reason you get 1920x1088 instead of 1920x1080 is that H.264 encodes in 16x16 macroblocks. It's actually your ffmpeg that encodes it that way. You will see that the last 8 rows of pixels in your output image are black. You are free to manually discard these rows.
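Discarding the padded rows is a plain copy of the visible region. A minimal sketch for the semi-planar layout described above (`nv12_crop_height` is a hypothetical helper; it assumes no extra stride padding beyond the aligned height):

```c
#include <string.h>

/* Crop a padded NV12 frame (e.g. 1920x1088) down to its display size
   (e.g. 1920x1080) by copying only the visible rows of each plane. */
static void nv12_crop_height(const unsigned char *src, unsigned char *dst,
                             unsigned width, unsigned padded_h,
                             unsigned display_h) {
    /* Luma plane: keep the first display_h rows. */
    memcpy(dst, src, (size_t)width * display_h);
    /* Interleaved UV plane starts after the padded luma plane; it has
       padded_h/2 rows of `width` bytes, of which display_h/2 are visible. */
    const unsigned char *src_uv = src + (size_t)width * padded_h;
    unsigned char *dst_uv = dst + (size_t)width * display_h;
    memcpy(dst_uv, src_uv, (size_t)width * (display_h / 2));
}
```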
Also, don't expect to recover your input YUV420 I420 raw image exactly after decoding, because H.264 compression is lossy even when encoded with quantizer=1. To be precise, a lossless mode is possible, but only with some advanced profiles that most encoders/decoders don't support anyway.
Hi @malik_cisse
Thanks for the detailed response. I learned a lot!
>YUV420 I420 and NV12 are basically the same.
I understood the similarities/differences of these 2 types.
I want to clarify whether the VPU can be configured to output I420, or whether the VPU decoder's output is strictly NV12.
If it's the latter, then I guess my app needs some post-decode processing to convert NV12 to I420 manually.
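For reference, such a post-decode step is small: the Y plane is identical in both formats, and the chroma just needs de-interleaving. A sketch (`nv12_to_i420` is a hypothetical helper, not a wrapper API; assumes even width/height and no stride padding):

```c
#include <stddef.h>

/* Convert an NV12 frame to I420 by copying the luma plane unchanged
   and splitting the interleaved UV plane into separate U and V planes. */
static void nv12_to_i420(const unsigned char *nv12, unsigned char *i420,
                         unsigned width, unsigned height) {
    size_t y_size = (size_t)width * height;
    size_t c_size = y_size / 4;              /* one chroma plane at 4:2:0 */
    const unsigned char *uv = nv12 + y_size;
    unsigned char *u = i420 + y_size;
    unsigned char *v = u + c_size;

    for (size_t i = 0; i < y_size; i++)      /* luma is bit-identical */
        i420[i] = nv12[i];
    for (size_t i = 0; i < c_size; i++) {    /* split UVUV... into U and V */
        u[i] = uv[2 * i];
        v[i] = uv[2 * i + 1];
    }
}
```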
>It's actually your ffmpeg that encodes that way. You will see the last 8 rows of pixels in your output image are black.
I see, meaning the original YUV (1920*1080) was encoded to H264 (1920*1088) by ffmpeg.
So when this ffmpeg-encoded H264 file is decoded by the VPU, the output YUV's size is naturally 1920*1088.
>Also don't expect to recover your input YUV420 I420 Raw image after decoding because H.264 compression is lossy, even if encoded with quantizer=1.
Thanks for this additional information.
Yes, you can configure the VPU output format to be I420 as shown below; I checked on my i.MX8MP board. It's the "output-format" property:
root@myboard-imx8mp-3:~# gst-inspect-1.0 | grep vpu
vpu: vpudec: IMX VPU-based video decoder
vpu: vpuenc_h264: IMX VPU-based AVC/H264 video encoder
vpu: vpuenc_hevc: IMX VPU-based HEVC video encoder
root@myboard-imx8mp-3:~# gst-inspect-1.0 vpudec
Factory Details:
Rank primary + 1 (257)
Long-name IMX VPU-based video decoder
Klass Codec/Decoder/Video
Description Decode compressed video to raw data
Author Multimedia Team <shmmmw@freescale.com>
Plugin Details:
Name vpu
Description VPU video codec
Filename /usr/lib/gstreamer-1.0/libgstvpu.so
Version 4.7.2
License LGPL
Source module imx-gst1.0-plugin
Binary package Freescle Gstreamer Multimedia Plugins
Origin URL http://www.freescale.com
GObject
+----GInitiallyUnowned
+----GstObject
+----GstElement
+----GstVideoDecoder
+----GstVpuDec
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
video/x-h265
video/x-vp9
video/x-vp8
video/x-h264
SRC template: 'src'
Availability: Always
Capabilities:
video/x-raw
format: { (string)NV12, (string)I420, (string)YV12, (string)Y42B, (string)NV16, (string)Y444, (string)NV24, (string)NV12_10LE }
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
Element has no clocking capabilities.
Element has no URI handling capabilities.
Pads:
SINK: 'sink'
Pad Template: 'sink'
SRC: 'src'
Pad Template: 'src'
Element Properties:
automatic-request-sync-point-flags: Flags to use when automatically requesting sync points
flags: readable, writable
Flags "GstVideoDecoderRequestSyncPointFlags" Default: 0x00000003, "corrupt-output+discard-input"
(0x00000001): discard-input - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_DISCARD_INPUT
(0x00000002): corrupt-output - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_CORRUPT_OUTPUT
automatic-request-sync-points: Automatically request sync points when it would be useful
flags: readable, writable
Boolean. Default: false
disable-reorder : disable vpu reorder when end to end streaming
flags: readable, writable
Boolean. Default: false
discard-corrupted-frames: Discard frames marked as corrupted instead of outputting them
flags: readable, writable
Boolean. Default: false
frame-drop : enable adaptive frame drop for smoothly playback
flags: readable, writable
Boolean. Default: true
frame-plus : set number of addtional frames for smoothly playback
flags: readable, writable
Unsigned Integer. Range: 0 - 16 Default: 3
max-errors : Max consecutive decoder errors before returning flow error
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: 10
min-force-key-unit-interval: Minimum interval between force-keyunit requests in nanoseconds
flags: readable, writable
Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 0
name : The name of the object
flags: readable, writable, 0x2000
String. Default: "vpudec0"
output-format : set raw video format for output (Y42B NV16 Y444 NV24 only for MJPEG)
flags: readable, writable
Enum "GstVpuDecOutputFormat" Default: 0, "auto"
(0): auto - enable chroma interleave. (default)
(1): NV12 - NV12 format
(2): I420 - I420 format
(3): YV12 - YV12 format
(4): Y42B - Y42B format
(5): NV16 - NV16 format
(6): Y444 - Y444 format
(7): NV24 - NV24 format
parent : The parent of the object
flags: readable, writable, 0x2000
Object of type "GstObject"
qos : Handle Quality-of-Service events from downstream
flags: readable, writable
Boolean. Default: true
use-vpu-memory : use vpu allocate video frame buffer
flags: readable, writable
Boolean. Default: true
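With the element above, the property can be exercised in a pipeline like this (an untested sketch; file names are placeholders):

```shell
# Decode an H.264 elementary stream with the VPU and request I420 output
gst-launch-1.0 filesrc location=input.h264 ! h264parse ! \
    vpudec output-format=I420 ! filesink location=output.i420
```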
Hi @malik_cisse!
Thanks for the reply.
I believe those are the options for the GST plugin "vpudec", which itself uses the VPU wrapper, so those options are not for the VPU wrapper itself.
My app uses the VPU wrapper directly and doesn't use any GST APIs or plugins.
I tried looking for an option similar to "output-format" in the VPU wrapper, and the closest I could find is VpuDecOpenParam.nChromaInterleave. I already set it to 0 (= non-interleaved = I420), so the output should be I420.
Hi @b_m,
Ok I see.
I am also interested in the VPU wrapper since it exposes more encoding options.
I would like to use it to encode a live video stream coming from a /dev/video V4L2 camera device.
Do you know if this would be feasible? Any pointers to sample code to get started?
Thx
@malik_cisse
>Do you know if this would be feasible?
I think so. You can check the sample implementation provided by GST in gstvpuenc.c.
I am not sure about the details, though.
My app's current spec is to encode/decode image files directly, not from a live stream, so I am focusing on that.