VPU decoder not reporting latency to GStreamer causing timing issues with live sources

This will be a long post to explain this issue clearly, so apologies in advance!

Summary

When the GStreamer pipeline contains a live source (and therefore runs against a latency-adjusted clock), buffer deadlines are always missed whenever vpudec is included in the pipeline. This seems to be because gst_video_decoder_set_latency() is never called from gstvpudecobject when the video settings are negotiated after the initial frames, so the decoder's processing time is never added to the pipeline latency and buffers arrive at the sink late.

What should this latency be set to, and how should it be patched in the i.MX GStreamer plugins?

Context

I have been working with the i.MX 8MP EVK using an "imx-multimedia" Yocto image built on the L5.10.9_1.0.0 BSP, with the kernel patch applied and the libavtp library added to GStreamer as described in the GStreamer AVTP tutorial for the i.MX 8MP. The tutorial is a little stale, but in general the listed pipelines work as stated.

However, the listed pipelines do not use the VPU hardware codec and instead use libav (FFmpeg). The software codecs are CPU-bound to roughly 480p15 on this board, so HD video at proper framerates is not achievable with them. Here is the test pipeline I have been using to verify this; note that it is effectively a local loopback with the avtpsrc and avtpsink elements skipped:

gst-launch-1.0 clockselect. \( clock-id=tai \
    videotestsrc is-live=true ! video/x-raw,width=720,height=480,framerate=15/1 ! \
    clockoverlay ! x264enc bframes=0 key-int-max=1 speed-preset=1 tune=4 ! h264parse config-interval=-1 ! \
    avtpcvfpay processing-deadline=35000000 mtu=1470 mtt=2000000 tu=125000 streamid=0xAABBCCDDEEFF000A \
    measurement-interval=125000 max-interval-frames=5 ! \
    queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    avtpcvfdepay streamid=0xAABBCCDDEEFF000A ! queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    avdec_h264 ! videoconvert ! clockoverlay halignment=right ! autovideosink \)

So I have attempted to make use of the VPU via the imx-gst1.0-plugin to reach HD quality by taking load off the CPU. First I replaced the x264enc element with vpuenc_h264, resulting in the following pipeline capable of at most 1080p15:

gst-launch-1.0 clockselect. \( clock-id=tai \
    videotestsrc is-live=true ! video/x-raw,width=1920,height=1080,framerate=15/1 ! \
    clockoverlay ! vpuenc_h264 ! h264parse config-interval=-1 ! \
    avtpcvfpay processing-deadline=35000000 mtu=1470 mtt=2000000 tu=125000 streamid=0xAABBCCDDEEFF000A \
    measurement-interval=125000 max-interval-frames=5 ! \
    queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    avtpcvfdepay streamid=0xAABBCCDDEEFF000A ! queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    avdec_h264 ! videoconvert ! clockoverlay halignment=right ! autovideosink \)

To get the full 1080p30 that the VPU is advertised as capable of, I then replaced the avdec_h264 decoder with vpudec:

gst-launch-1.0 clockselect. \( clock-id=tai \
    videotestsrc is-live=true ! video/x-raw,width=1920,height=1080,framerate=15/1 ! \
    clockoverlay ! vpuenc_h264 ! h264parse config-interval=-1 ! \
    avtpcvfpay processing-deadline=35000000 mtu=1470 mtt=2000000 tu=125000 streamid=0xAABBCCDDEEFF000A \
    measurement-interval=125000 max-interval-frames=5 ! \
    queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    avtpcvfdepay streamid=0xAABBCCDDEEFF000A ! queue max-size-bytes=0 max-size-buffers=0 max-size-time=0 ! \
    vpudec ! videoconvert ! clockoverlay halignment=right ! autovideosink \)

However, this immediately runs into severe timing issues in GStreamer, causing heavy stutter and the following warning repeated in the terminal:

WARNING: from element /GstClockSelect:clockselect0/GstAutoVideoSink:autovideosink0/GstWaylandSink:autovideosink0-actual-sink-wayland: A lot of buffers are being dropped.
Additional debug info:
../git/libs/gst/base/gstbasesink.c(3142): gst_base_sink_is_too_late (): /GstClockSelect:clockselect0/GstAutoVideoSink:autovideosink0/GstWaylandSink:autovideosink0-actual-sink-wayland:
There may be a timestamping problem, or this computer is too slow.
WARNING: from element /GstClockSelect:clockselect0/GstAutoVideoSink:autovideosink0/GstWaylandSink:autovideosink0-actual-sink-wayland: A lot of buffers are being dropped.
Additional debug info:
../git/libs/gst/base/gstbasesink.c(3142): gst_base_sink_is_too_late (): /GstClockSelect:clockselect0/GstAutoVideoSink:autovideosink0/GstWaylandSink:autovideosink0-actual-sink-wayland:
There may be a timestamping problem, or this computer is too slow.

To narrow it down further, the AVTP elements can be removed entirely, leaving this simple pipeline that still misbehaves, indicating that the vpudec issue occurs whenever a live source element is present:

gst-launch-1.0 videotestsrc is-live=true ! vpuenc_h264 ! vpudec ! waylandsink

If the "is-live=true" parameter is removed, everything plays correctly.

In other words, whenever the pipeline contains a live source and playback runs against a latency-adjusted clock, the buffer deadlines are always late.
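
The latency the sink ends up working with can also be checked programmatically. Below is a minimal sketch of my own (an illustration I wrote for this post, not part of the BSP or the original setup) that builds the repro pipeline and prints the aggregated pipeline latency; with vpudec in place it reports only the upstream latency:

/* latency-check.c: minimal standalone harness (illustrative sketch).
 * Build: gcc latency-check.c -o latency-check \
 *        $(pkg-config --cflags --libs gstreamer-1.0) */
#include <gst/gst.h>

int
main (int argc, char **argv)
{
  gboolean live;
  GstClockTime min_latency, max_latency;

  gst_init (&argc, &argv);

  GstElement *pipeline = gst_parse_launch (
      "videotestsrc is-live=true ! vpuenc_h264 ! vpudec ! waylandsink", NULL);

  gst_element_set_state (pipeline, GST_STATE_PLAYING);
  /* live pipeline: give it a moment to negotiate caps and start flowing */
  g_usleep (2 * G_USEC_PER_SEC);

  GstQuery *query = gst_query_new_latency ();
  if (gst_element_query (pipeline, query)) {
    gst_query_parse_latency (query, &live, &min_latency, &max_latency);
    /* with vpudec this prints only the upstream minimum latency, because
     * the decoder never adds its own via gst_video_decoder_set_latency() */
    g_print ("live=%d min=%" GST_TIME_FORMAT " max=%" GST_TIME_FORMAT "\n",
        live, GST_TIME_ARGS (min_latency), GST_TIME_ARGS (max_latency));
  }
  gst_query_unref (query);

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}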

Tracing the issue

After quite a bit of debugging I think I have tracked the issue down to gst_video_decoder_set_latency() never being called from gstvpudecobject when the video settings are negotiated after the initial frames/resolution change event. This means that when the downstream Wayland video sink queries the upstream latency, a value that is too low is returned (traces like the following can be captured with GST_DEBUG=basesink:6,videodecoder:6):

    basesink gstbasesink.c:1229:gst_base_sink_query_latency:<autovideosink0-actual-sink-wayland> we are ready for LATENCY query
videodecoder gstvideodecoder.c:1847:gst_video_decoder_src_query:<vpudec0> received query 7683, latency
videodecoder gstvideodecoder.c:1713:gst_video_decoder_src_query_default:<vpudec0> handling query: latency query: 0xffff680018f0, GstQueryLatency, live=(boolean)false, min-latency=(guint64)0, max-latency=(guint64)18446744073709551615;
videodecoder gstvideodecoder.c:1810:gst_video_decoder_src_query_default:<vpudec0> Peer qlatency: live 1, min 0:00:00.066666666 max 99:99:99.999999999
    basesink gstbasesink.c:1287:gst_base_sink_query_latency:<autovideosink0-actual-sink-wayland> latency query: live: 1, have_latency 1, upstream_live 1, min(0:00:00.081666666)=upstream(0:00:00.066666666)+processing_deadline(0:00:00.015000000)+render_delay(0:00:00.000000000), max(99:99:99.999999999)=upstream(99:99:99.999999999)+render_delay(0:00:00.000000000)
    basesink gstbasesink.c:5537:default_element_query:<autovideosink0-actual-sink-wayland> query latency returns 1
    basesink gstbasesink.c:4982:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> handling event 0xffff80005920 latency event: 0xffff80005920, time 99:99:99.999999999, seq-num 123, GstEventLatency, latency=(guint64)81666666;
    basesink gstbasesink.c:4999:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> latency set to 0:00:00.081666666
    basesink gstbasesink.c:5033:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> sending event 0xffff80005920 latency event: 0xffff80005920, time 99:99:99.999999999, seq-num 123, GstEventLatency, latency=(guint64)81666666;
videodecoder gstvideodecoder.c:1698:gst_video_decoder_src_event:<vpudec0> received event 56321, latency
videodecoder gstvideodecoder.c:1598:gst_video_decoder_src_event_default:<vpudec0> received event 56321, latency
    basesink gstbasesink.c:5121:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> handled event: 1

Compare this to avdec_h264, which returns the following:

    basesink gstbasesink.c:1229:gst_base_sink_query_latency:<autovideosink0-actual-sink-wayland> we are ready for LATENCY query
videodecoder gstvideodecoder.c:1847:gst_video_decoder_src_query:<avdec_h264-0> received query 7683, latency
videodecoder gstvideodecoder.c:1713:gst_video_decoder_src_query_default:<avdec_h264-0> handling query: latency query: 0xffff5c0018f0, GstQueryLatency, live=(boolean)false, min-latency=(guint64)0, max-latency=(guint64)18446744073709551615;
videodecoder gstvideodecoder.c:1810:gst_video_decoder_src_query_default:<avdec_h264-0> Peer qlatency: live 1, min 0:00:00.066666666 max 99:99:99.999999999
    basesink gstbasesink.c:1287:gst_base_sink_query_latency:<autovideosink0-actual-sink-wayland> latency query: live: 1, have_latency 1, upstream_live 1, min(0:00:00.215000000)=upstream(0:00:00.200000000)+processing_deadline(0:00:00.015000000)+render_delay(0:00:00.000000000), max(99:99:99.999999999)=upstream(99:99:99.999999999)+render_delay(0:00:00.000000000)
    basesink gstbasesink.c:5537:default_element_query:<autovideosink0-actual-sink-wayland> query latency returns 1
    basesink gstbasesink.c:4982:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> handling event 0xffff740074f0 latency event: 0xffff740074f0, time 99:99:99.999999999, seq-num 125, GstEventLatency, latency=(guint64)215000000;
    basesink gstbasesink.c:4999:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> latency set to 0:00:00.215000000
    basesink gstbasesink.c:5033:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> sending event 0xffff740074f0 latency event: 0xffff740074f0, time 99:99:99.999999999, seq-num 125, GstEventLatency, latency=(guint64)215000000;
videodecoder gstvideodecoder.c:1698:gst_video_decoder_src_event:<avdec_h264-0> received event 56321, latency
videodecoder gstvideodecoder.c:1598:gst_video_decoder_src_event_default:<avdec_h264-0> received event 56321, latency
    basesink gstbasesink.c:5121:gst_base_sink_send_event:<autovideosink0-actual-sink-wayland> handled event: 1

You can see in the query handler logs where the latency configured by gst_video_decoder_set_latency() is added to the upstream latency: with avdec_h264 the minimum latency the sink sees grows from the peer's 0:00:00.066666666 to 0:00:00.200000000, whereas gstvpudecobject leaves min_latency at its default of 0 and the peer value passes through unchanged.
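
For reference, this is roughly what the GST_QUERY_LATENCY handling in the base class boils down to (paraphrased from gst_video_decoder_src_query_default() in gstvideodecoder.c, not copied verbatim):

case GST_QUERY_LATENCY:
{
  gboolean res, live;
  GstClockTime min_latency, max_latency;

  /* first ask upstream for its latency... */
  res = gst_pad_peer_query (decoder->sinkpad, query);
  if (res) {
    gst_query_parse_latency (query, &live, &min_latency, &max_latency);

    GST_OBJECT_LOCK (decoder);
    /* ...then add the decoder's own latency on top. priv->min_latency
     * stays 0 unless the subclass calls gst_video_decoder_set_latency(),
     * which is why vpudec reports only the upstream value above. */
    min_latency += decoder->priv->min_latency;
    if (max_latency == GST_CLOCK_TIME_NONE
        || decoder->priv->max_latency == GST_CLOCK_TIME_NONE)
      max_latency = GST_CLOCK_TIME_NONE;
    else
      max_latency += decoder->priv->max_latency;
    GST_OBJECT_UNLOCK (decoder);

    gst_query_set_latency (query, live, min_latency, max_latency);
  }
  break;
}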

Potential fixes

Therefore gstvpudec either needs to:

  • retrieve this value from the VPU via the wrapper, or
  • hard-code a fixed latency.

libav takes the first approach: it calls gst_video_decoder_set_latency() from its GstVideoDecoderClass->set_format implementation, and again when renegotiating after the first frame inside GstVideoDecoderClass->handle_frame, calculating the latency from the framerate that FFmpeg reports in its codec context.
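
From memory, the libav calculation looks roughly like the following (paraphrased from gstffmpegviddec.c; the exact code may differ between GStreamer versions):

/* Paraphrased from gst_ffmpegviddec_set_format(): the decoder can hold back
 * has_b_frames frames, so the latency is that many frame durations at the
 * negotiated framerate. */
if (fps_n > 0) {
  GstClockTime latency;

  latency = gst_util_uint64_scale_ceil (
      ffmpegdec->context->has_b_frames * GST_SECOND, fps_d, fps_n);
  gst_video_decoder_set_latency (GST_VIDEO_DECODER (ffmpegdec),
      latency, latency);
}

With the 15 fps stream above, the jump from 0:00:00.066666666 to 0:00:00.200000000 in the avdec_h264 trace corresponds to exactly two frame durations (0:00:00.133333334), presumably has_b_frames = 2.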

However, the VPU wrapper does not report a framerate when VPU_DecGetInitialInfo() is called: both the returned nFrameRateRes and nFrameRateDiv members are set to zero. A workaround would be to derive the latency from the framerate negotiated on the sink pad instead, as sketched below.
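
As a starting point for discussion, here is an untested sketch of what such a patch to gstvpudecobject.c might look like (the function name, the input_state argument, and the one-frame/100 ms figures are all my assumptions, not verified against the plugin internals or the VPU's real behaviour):

#include <gst/video/gstvideodecoder.h>

/* Hypothetical helper, called once VPU_DecGetInitialInfo() has succeeded
 * (the wrapper's nFrameRateRes/nFrameRateDiv are zero, so the negotiated
 * sink-pad caps are used instead). */
static void
gst_vpu_dec_object_update_latency (GstVideoDecoder * dec,
    GstVideoCodecState * input_state)
{
  GstVideoInfo *info = &input_state->info;
  GstClockTime latency;

  if (info->fps_n > 0) {
    /* assume the VPU buffers roughly one frame: one frame duration at the
     * framerate negotiated upstream */
    latency = gst_util_uint64_scale_ceil (GST_SECOND,
        info->fps_d, info->fps_n);
  } else {
    /* no framerate in the caps: fall back to a conservative fixed value */
    latency = 100 * GST_MSECOND;
  }

  gst_video_decoder_set_latency (dec, latency, latency);
}

Whether one frame duration is actually the right amount, or whether the VPU pipelines several frames internally, is exactly what I am hoping NXP can answer below.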

Questions

So, to sum up this long post, I am asking the following:

  1. Is NXP already aware of this issue? If so, is there a patch coming in an upcoming BSP?
  2. If not, how should the decode latency for the VPU be set? Should it be hard-coded, or is there a better query to the VPU wrapper?
     a. If the value should be hard-coded, what is a sensible value, considering it may differ between video formats?
     b. If the value should be queried from the VPU, what query should be used, and is this use case even supported?

Please let me know if you need help reproducing these issues or would like some detailed log files from GStreamer.

Thanks!
