I have an H264 file encoded with ffmpeg from a YUV420 I420 image file (Y plane size = 1920*1080).
Now I am trying to make an app that decodes this H264 file using the VPU wrapper provided by imx-gst1.0-plugin. I expected the H264 file to be decoded back to the original YUV420 I420 image.
Inspecting the actual decoded image file in a binary editor, I found two things that differ from my expectation:
1. YUV type is of YUV420 NV12, even though the original YUV image input is YUV420 I420.
2. YUV size is different. Actual Y size is "1920*1088", combined UV size is "(960*544)*2".
It seems the height is automatically aligned to a multiple of 16 pixels by the VPU.
Probably the width as well, but since 1920 and 960 are already multiples of 16, the width alignment isn't visible here.
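As a sanity check, the observed sizes can be reproduced with the alignment arithmetic (a sketch; `align16` and `yuv420_size` are hypothetical helpers, not part of the VPU wrapper):

```c
/* Round a dimension up to the next multiple of 16 (the H.264 macroblock size). */
static unsigned align16(unsigned v) {
    return (v + 15) & ~15u;
}

/* Total 4:2:0 buffer size for the aligned dimensions: one full-resolution
   luma plane plus two quarter-resolution chroma planes (NV12 and I420 have
   the same total size, only the plane layout differs). */
static unsigned yuv420_size(unsigned w, unsigned h) {
    unsigned aw = align16(w), ah = align16(h);
    return aw * ah + 2 * (aw / 2) * (ah / 2);
}
```

For 1920x1080 this gives an aligned height of 1088 and a total buffer size of 1920*1088 + (960*544)*2 = 3133440 bytes, matching the sizes observed above.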
Questions are:
1. Is there a way to set the VPU decoder's output color format to YUV420, specifically I420?
2. Should the original input image's width and height be multiples of 16? Are there any implications if the decoded image's size differs from the original input image's size?
Environment
Board: IMX8MP EVK
BSP: yocto-real-time-edge
I just found this in the reference manual:
16.1.2.1.1 Raster-scan format
The output picture of the decoder is in semi-planar YCbCr 4:2:0 format, i.e. luminance
data forms one plane in memory, and chrominance data forms another. The output picture
has to be stored linearly and contiguously in the memory. The interleaved chrominance
block has to be stored right after the luminance block in external memory as shown in the
following figure. The number of luminance pixels in one row must be divisible by 16.
Looks like my interpretation was right: the VPU decoder output is fixed to YUV420 NV12 (semi-planar). I would appreciate confirmation from NXP.
Hi,
YUV420 I420 and NV12 are basically the same; only the pixel organisation in memory is different. I420 is planar (Y plane, then U plane, then V plane), whereas NV12 is semi-planar (Y plane first, then an interleaved UVUV... plane).
The reason you get 1920x1088 instead of 1920x1080 is that H.264 encodes in 16x16 macroblocks. It's actually your ffmpeg that encodes it that way. You will see that the last 8 rows of pixels in your output image are black. You are free to manually discard these rows.
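Discarding the padded rows is a plain copy of the visible region. A minimal sketch for the semi-planar layout described above (`nv12_crop_height` is a hypothetical helper; it assumes no extra stride padding beyond the aligned height):

```c
#include <string.h>

/* Crop a padded NV12 frame (e.g. 1920x1088) down to its display size
   (e.g. 1920x1080) by copying only the visible rows of each plane. */
static void nv12_crop_height(const unsigned char *src, unsigned char *dst,
                             unsigned width, unsigned padded_h,
                             unsigned display_h) {
    /* Luma plane: keep the first display_h rows. */
    memcpy(dst, src, (size_t)width * display_h);
    /* Interleaved UV plane starts after the padded luma plane; it has
       padded_h/2 rows of `width` bytes, of which display_h/2 are visible. */
    const unsigned char *src_uv = src + (size_t)width * padded_h;
    unsigned char *dst_uv = dst + (size_t)width * display_h;
    memcpy(dst_uv, src_uv, (size_t)width * (display_h / 2));
}
```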
Also, don't expect to recover your input YUV420 I420 raw image exactly after decoding, because H.264 compression is lossy even when encoded with quantizer=1. To be precise, a lossless mode is possible, but only with some advanced profiles that most encoders/decoders don't support anyway.
Hi @malik_cisse
Thanks for the detailed response. I learned a lot!
>YUV420 I420 and NV12 are basically the same.
I understood the similarities/differences of these 2 types.
I want to clarify whether the VPU can be configured to output I420, or whether the VPU decoder's output is strictly NV12.
If it's the latter, then I guess my app needs some post-decode processing to convert NV12 to I420 manually.
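For reference, such a post-decode step is small: the Y plane is identical in both formats, and the chroma just needs de-interleaving. A sketch (`nv12_to_i420` is a hypothetical helper, not a wrapper API; assumes even width/height and no stride padding):

```c
#include <stddef.h>

/* Convert an NV12 frame to I420 by copying the luma plane unchanged
   and splitting the interleaved UV plane into separate U and V planes. */
static void nv12_to_i420(const unsigned char *nv12, unsigned char *i420,
                         unsigned width, unsigned height) {
    size_t y_size = (size_t)width * height;
    size_t c_size = y_size / 4;              /* one chroma plane at 4:2:0 */
    const unsigned char *uv = nv12 + y_size;
    unsigned char *u = i420 + y_size;
    unsigned char *v = u + c_size;

    for (size_t i = 0; i < y_size; i++)      /* luma is bit-identical */
        i420[i] = nv12[i];
    for (size_t i = 0; i < c_size; i++) {    /* split UVUV... into U and V */
        u[i] = uv[2 * i];
        v[i] = uv[2 * i + 1];
    }
}
```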
>It's actually your ffmpeg that encodes that way. You will see the last 8 rows of pixels in your output image are black.
I see, meaning the original YUV (1920*1080) was encoded to H264 (1920*1088) by ffmpeg.
So when this ffmpeg-encoded H264 file is decoded by the VPU, the output YUV's size is naturally 1920*1088.
>Also don't expect to recover your input YUV420 I420 Raw image after decoding because H.264 compression is lossy, even if encoded with quantizer=1.
Thanks for this additional information.
Yes, you can configure the VPU output format to be I420 as shown below; I checked on my i.MX8MP board. It's the "output-format" property:
root@myboard-imx8mp-3:~# gst-inspect-1.0 | grep vpu
vpu: vpudec: IMX VPU-based video decoder
vpu: vpuenc_h264: IMX VPU-based AVC/H264 video encoder
vpu: vpuenc_hevc: IMX VPU-based HEVC video encoder
root@myboard-imx8mp-3:~# gst-inspect-1.0 vpudec
Factory Details:
Rank primary + 1 (257)
Long-name IMX VPU-based video decoder
Klass Codec/Decoder/Video
Description Decode compressed video to raw data
Author Multimedia Team <shmmmw@freescale.com>
Plugin Details:
Name vpu
Description VPU video codec
Filename /usr/lib/gstreamer-1.0/libgstvpu.so
Version 4.7.2
License LGPL
Source module imx-gst1.0-plugin
Binary package Freescle Gstreamer Multimedia Plugins
Origin URL http://www.freescale.com
GObject
+----GInitiallyUnowned
+----GstObject
+----GstElement
+----GstVideoDecoder
+----GstVpuDec
Pad Templates:
SINK template: 'sink'
Availability: Always
Capabilities:
video/x-h265
video/x-vp9
video/x-vp8
video/x-h264
SRC template: 'src'
Availability: Always
Capabilities:
video/x-raw
format: { (string)NV12, (string)I420, (string)YV12, (string)Y42B, (string)NV16, (string)Y444, (string)NV24, (string)NV12_10LE }
width: [ 1, 2147483647 ]
height: [ 1, 2147483647 ]
framerate: [ 0/1, 2147483647/1 ]
Element has no clocking capabilities.
Element has no URI handling capabilities.
Pads:
SINK: 'sink'
Pad Template: 'sink'
SRC: 'src'
Pad Template: 'src'
Element Properties:
automatic-request-sync-point-flags: Flags to use when automatically requesting sync points
flags: readable, writable
Flags "GstVideoDecoderRequestSyncPointFlags" Default: 0x00000003, "corrupt-output+discard-input"
(0x00000001): discard-input - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_DISCARD_INPUT
(0x00000002): corrupt-output - GST_VIDEO_DECODER_REQUEST_SYNC_POINT_CORRUPT_OUTPUT
automatic-request-sync-points: Automatically request sync points when it would be useful
flags: readable, writable
Boolean. Default: false
disable-reorder : disable vpu reorder when end to end streaming
flags: readable, writable
Boolean. Default: false
discard-corrupted-frames: Discard frames marked as corrupted instead of outputting them
flags: readable, writable
Boolean. Default: false
frame-drop : enable adaptive frame drop for smoothly playback
flags: readable, writable
Boolean. Default: true
frame-plus : set number of addtional frames for smoothly playback
flags: readable, writable
Unsigned Integer. Range: 0 - 16 Default: 3
max-errors : Max consecutive decoder errors before returning flow error
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: 10
min-force-key-unit-interval: Minimum interval between force-keyunit requests in nanoseconds
flags: readable, writable
Unsigned Integer64. Range: 0 - 18446744073709551615 Default: 0
name : The name of the object
flags: readable, writable, 0x2000
String. Default: "vpudec0"
output-format : set raw video format for output (Y42B NV16 Y444 NV24 only for MJPEG)
flags: readable, writable
Enum "GstVpuDecOutputFormat" Default: 0, "auto"
(0): auto - enable chroma interleave. (default)
(1): NV12 - NV12 format
(2): I420 - I420 format
(3): YV12 - YV12 format
(4): Y42B - Y42B format
(5): NV16 - NV16 format
(6): Y444 - Y444 format
(7): NV24 - NV24 format
parent : The parent of the object
flags: readable, writable, 0x2000
Object of type "GstObject"
qos : Handle Quality-of-Service events from downstream
flags: readable, writable
Boolean. Default: true
use-vpu-memory : use vpu allocate video frame buffer
flags: readable, writable
Boolean. Default: true
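With the element above, the property can be exercised in a pipeline like this (an untested sketch; file names are placeholders):

```shell
# Decode an H.264 elementary stream with the VPU and request I420 output
gst-launch-1.0 filesrc location=input.h264 ! h264parse ! \
    vpudec output-format=I420 ! filesink location=output.i420
```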
Hi @malik_cisse!
Thanks for the reply.
I believe those are the options for the GST plugin "vpudec", which itself uses the VPU wrapper, so those options are not for the VPU wrapper itself.
My app uses the VPU wrapper directly and doesn't use any GST APIs or plugins.
I tried looking for an option similar to "output-format" in the VPU wrapper, and the closest I could find is VpuDecOpenParam.nChromaInterleave. I already set it to 0 (= non-interleaved = I420), so the output should be I420.
Hi @b_m,
Ok I see.
I am also interested in the VPU wrapper since it exposes more encoding options.
I would like to use it to encode a live video stream coming from a /dev/video V4L2 camera device.
Do you know if this would be feasible? Any pointers to sample code to get started?
Thx
@malik_cisse
>Do you know if this would be feasible?
I think so. You can check the sample implementation provided by GST in gstvpuenc.c.
I am not sure about the details, though.
My app's current spec is to encode/decode image files directly, not from a live stream, so I am focusing on that.