Hi community!
We're using an i.MX6 Quad board with an Android 4.4.2 (kernel 3.0.35) BSP to decode an RTP/VP8 stream. Our implementation receives the RTP stream, assembles VP8 frames from the RTP packets, and uses Android MediaCodec (and thus the VPU) to decode the stream.
The system works well and we're able to decode a 1080p VP8 stream at 30 fps without any issues. But when the incoming VP8 stream uses "temporal scalability", the VPU seems unable to handle the frames and heavy glitches appear on the screen. We tried decoding the same stream with the Android software decoder (OMX.google.vp8.decoder), using the same receiver implementation, and with GStreamer on a workstation: it is decoded correctly in both cases.
To be more precise, "temporal scalability" is used to adapt the video bitrate to the target. The stream is encoded in a way that allows some encoded frames to be dropped to use less bandwidth when necessary. The encoded frames are distributed across 1 to 3 layers: layer 0 is self-contained, layer 1 depends on layer 0, and layer 2 depends on layer 1. Thus the emitter, the receiver, or a bridge between them can choose to drop one or two layers to save bandwidth.
In our situation, the emitter sends a stream divided into two layers (0 and 1). If we feed both layers to the decoder (VPU), the video is full of garbage. If we drop layer 1 and feed only layer 0 to the VPU, it is decoded correctly.
Is VP8 temporal scalability supported by the VPU? Do I really need to filter out layers 1 and 2?
Some documentation on temporal scalability and VP8 streams:
- HOWTO Use temporal scalability to adapt video bitrates
- RFC 7741 - RTP Payload Format for VP8 Video (VP8 payload descriptor)
I can provide a sample of the RTP stream and the GStreamer pipeline to read it if necessary.
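For reference, a workstation pipeline along these lines plays such a stream correctly (the port and payload type here are placeholders; our actual pipeline is available on request):

```shell
# Hypothetical receive pipeline; adjust port and payload type to the sample.
gst-launch-1.0 udpsrc port=5004 \
    caps="application/x-rtp,media=video,encoding-name=VP8,clock-rate=90000,payload=96" \
  ! rtpjitterbuffer ! rtpvp8depay ! vp8dec ! videoconvert ! autovideosink
```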