We are currently building this GStreamer pipeline:
appsrc ! "video/x-h264, width=(int)1024, height=(int)768, framerate=(fraction)0/1, stream-format=(string)avc" ! queue max-size-time=0 ! vpudec output-format=2 ! "video/x-raw, format=(string)I420, width=(int)1024, height=(int)768" ! queue ! appsink
In the appsink callback we try to isolate the Y, U and V byte-streams and their strides.
According to the I420 specification, for an NxM frame we should get:
Y - first NxM bytes
U - next NxM/4 bytes
V - last NxM/4 bytes.
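For reference, the tightly packed layout described above can be computed like this (plain C, no GStreamer dependency; the struct and function names are just for illustration):

```c
#include <stddef.h>
#include <stdio.h>

/* Tightly packed I420 layout for a width x height frame:
 * Y plane: width*height bytes at offset 0
 * U plane: (width/2)*(height/2) bytes immediately after Y
 * V plane: (width/2)*(height/2) bytes immediately after U
 * (assumes width and height are even, and no per-row padding) */
typedef struct {
    size_t y_offset, y_size;
    size_t u_offset, u_size;
    size_t v_offset, v_size;
    size_t total;
} I420Layout;

static I420Layout i420_tight_layout(size_t width, size_t height)
{
    I420Layout l;
    l.y_offset = 0;
    l.y_size   = width * height;
    l.u_offset = l.y_offset + l.y_size;
    l.u_size   = (width / 2) * (height / 2);
    l.v_offset = l.u_offset + l.u_size;
    l.v_size   = l.u_size;
    l.total    = l.v_offset + l.v_size;  /* = width*height*3/2 */
    return l;
}
```

For 1024x768 this gives Y at offset 0 (786432 bytes), U at 786432 (196608 bytes), V at 983040 (196608 bytes), 1179648 bytes in total.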
So for 1024x768 we should get a frame of 1179648 bytes. However, we actually get a frame of 1376288 bytes, and we are unable to work out the correct Y, U, V offsets and sizes.
As far as we understand, the format we are using (I420, section 14.1.3.1) should be 12 bits per pixel.
However, it is not clear how many bits per pixel the frame we get from the vpudec actually uses.
For 1024x768 at 12 bpp we expect a 1179648-byte frame, while the vpudec gives us 1376288 bytes.
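One guess on our side (an assumption only, not something we have confirmed for vpudec) is that the decoder pads each plane's stride and row count to some alignment boundary. A generic helper for such a padded layout would look like the sketch below; the alignment parameters are placeholders, and their actual values are exactly what we are asking about:

```c
#include <stddef.h>

/* Hypothetical padded I420 layout: each plane's stride and row count are
 * rounded up to an alignment boundary. The alignment values are
 * placeholders, not confirmed vpudec behavior. */
static size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) / a * a;
}

typedef struct {
    size_t y_size, u_size, v_size, total;
} PaddedI420;

static PaddedI420 i420_padded_layout(size_t width, size_t height,
                                     size_t stride_align, size_t height_align)
{
    PaddedI420 p;
    size_t y_stride = align_up(width, stride_align);      /* luma stride */
    size_t y_rows   = align_up(height, height_align);     /* padded rows */
    size_t c_stride = align_up(width / 2, stride_align);  /* chroma stride */
    size_t c_rows   = align_up(height / 2, height_align);
    p.y_size = y_stride * y_rows;
    p.u_size = c_stride * c_rows;
    p.v_size = p.u_size;
    p.total  = p.y_size + p.u_size + p.v_size;
    return p;
}
```

Note that since 1024 and 768 are already multiples of the common 16- and 32-byte alignments, simple stride padding of this kind still yields 1179648 bytes, so it does not obviously account for the extra 196640 bytes we observe; that is part of our confusion.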
This difference makes it almost impossible to parse the frame correctly.
Can you please detail the vpudec's I420 output frame structure (offsets, sizes, strides)?