2295192_en-US

i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

Hello,

I am currently evaluating the H265 decode → encode pipeline on i.MX95 and I am observing significantly worse latency and stability compared to i.MX8MP, using similar workloads and configurations.

Context

On i.MX8MP, using:

  • imxvpudec_h265

  • vpuenc_hevc

I am able to achieve:

  • Very low end-to-end latency (≈ 10 ms)

  • Stable operation with multiple streams

  • No visible freezes or artifacts

On i.MX95, using:

  • v4l2h265dec

  • v4l2h265enc

I observe:

  • ~90 ms latency for a single decode → encode pipeline

  • Freezes and visual artifacts when running 4 simultaneous camera streams


Test pipeline

To reproduce the issue, I used IP cameras (H265 over RTSP) with the following pipeline:

test-launch "( 
  rtspsrc location=rtsp://10.42.0.85 drop-on-latency=true latency=0 buffer-mode=4 !
  rtph265depay !
  h265parse config-interval=1 !
  v4l2h265dec !
  v4l2h265enc !
  rtph265pay name=pay0 pt=96
)"

This pipeline works correctly with a single camera, but when scaling to 4 cameras, freezes and artifacts start to appear.
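A minimal sketch of how the four-camera case could be launched for reproduction, assuming one test-launch instance per camera, each on its own port (test-launch's `-p` option). The camera addresses other than 10.42.0.85 and all port numbers are hypothetical placeholders, and the helper names are illustrative. By default the script only prints the commands (DRY_RUN=1) so it can be checked without hardware.

```shell
#!/bin/sh
# Sketch: one test-launch instance per camera, each on its own RTSP port.
# Camera URLs after the first one and all port numbers are placeholders.

build_pipeline() {
  # $1 = RTSP URL of the camera
  printf '( rtspsrc location=%s drop-on-latency=true latency=0 buffer-mode=4 ! rtph265depay ! h265parse config-interval=1 ! v4l2h265dec ! v4l2h265enc ! rtph265pay name=pay0 pt=96 )' "$1"
}

launch_all() {
  port=8554
  for url in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the command instead of running it.
      printf 'test-launch -p %s "%s"\n' "$port" "$(build_pipeline "$url")"
    else
      test-launch -p "$port" "$(build_pipeline "$url")" &
    fi
    port=$((port + 1))
  done
  # Only wait for background servers when something was actually started.
  [ "${DRY_RUN:-1}" = "1" ] || wait
}

launch_all rtsp://10.42.0.85 rtsp://10.42.0.86 rtsp://10.42.0.87 rtsp://10.42.0.88
```

Running with `DRY_RUN=0` starts the servers. Starting them one at a time makes it easier to see at which stream count the artifacts first appear.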


Questions

  1. Why does the H265 VPU pipeline on i.MX95 have significantly higher latency than on i.MX8MP?

  2. Is DMABUF zero-copy fully supported between v4l2h265dec and v4l2h265enc on i.MX95?

    • If not, is there an implicit memory copy that could explain the additional latency and bandwidth pressure?

  3. Are there known limitations in the current i.MX95 VPU driver regarding:

    • Low-latency operation

    • Multi-stream decode + encode

    • Internal buffering depth

  4. Are there recommended V4L2 controls or io-modes (capture/output-io-mode) to minimize latency on i.MX95?

  5. Is the i.MX95 VPU driver expected to reach performance parity with i.MX8MP in future BSP releases, or is the higher latency an inherent design tradeoff?


Goal

My objective is real-time, low-latency video processing (decode → process → encode), similar to what is achievable on i.MX8MP.

Thank you for your support.

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

Hello,

Which BSP version are you using on i.MX8MP and i.MX95 to recreate this issue?

Are you using the same GST pipeline (other than the decoder/encoder elements) on both 8MP and 95?

AFAIK, there is no known limitation that results in high latency on i.MX95.

On i.MX95, the encoder supports mmap and dmabuf, so a pipeline can pass dma-bufs between the decoder and the encoder without a copy.

Does the IP camera encode without B frames? If so, decoder frame reordering can be disabled, which will reduce the latency. Please check this. You can disable frame reordering with the following V4L2 controls:
v4l2h265dec extra-controls="decode,display_delay_enable=1,display_delay=0"

You can list all V4L2 controls with the command below, passing the video device node after -d:
v4l2-ctl -l -d <device>


Regards

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

The kernel versions:

  • imx95: 6.6.101-0
  • imx8mp: 6.6.84-0

Yes, it is the same GStreamer pipeline.

The IP camera encodes without B frames. I tried your command, but it doesn't solve the issue. There are still artifacts on the screen and the latency is still high. Qualitatively, however, it seems to remove the freezes.


```
v4l2-ctl -l -d /dev/v4l/by-path/platform-4c480000.vpu-video-index0

User Controls

min_number_of_capture_buffers 0x00980927 (int) : min=1 max=32 step=1 default=1 value=1 flags=read-only
thumbnail_mode 0x00981901 (bool) : default=0 value=0 flags=write-only

Codec Controls

h264_profile 0x00990a6b (menu) : min=0 max=4 default=0 value=0 (Baseline)
hevc_profile 0x00990b67 (menu) : min=0 max=0 default=0 value=0 (Main)
display_delay 0x00990b8d (int) : min=0 max=0 step=1 default=0 value=0
display_delay_enable 0x00990b8e (bool) : default=0 value=0
```

```
v4l2-ctl -l -d /dev/v4l/by-path/platform-4c480000.vpu-video-index1

User Controls

horizontal_flip 0x00980914 (bool) : default=0 value=0
vertical_flip 0x00980915 (bool) : default=0 value=0
rotate 0x00980922 (int) : min=0 max=270 step=90 default=0 value=0 flags=modify-layout
min_number_of_output_buffers 0x00980928 (int) : min=1 max=32 step=1 default=1 value=1 flags=read-only

Codec Controls

video_gop_size 0x009909cb (int) : min=0 max=2047 step=1 default=30 value=30
video_bitrate_mode 0x009909ce (menu) : min=0 max=1 default=1 value=1 (Constant Bitrate) flags=update
video_bitrate 0x009909cf (int) : min=1 max=1500000000 step=1 default=2097152 value=2097152
frame_level_rate_control_enable 0x009909d7 (bool) : default=1 value=1
h264_mb_level_rate_control 0x009909da (bool) : default=1 value=1
number_of_mbs_in_a_slice 0x009909dc (int) : min=0 max=262143 step=1 default=1 value=1
slice_partitioning_method 0x009909dd (menu) : min=0 max=1 default=0 value=0 (Single)
force_key_frame 0x009909e5 (button) : value=0 flags=write-only, execute-on-write
intra_refresh_period 0x009909ec (int) : min=0 max=2160 step=1 default=0 value=0
intra_refresh_period_type 0x009909ed (menu) : min=0 max=1 default=1 value=1 (Cyclic)
h264_i_frame_qp_value 0x00990a5e (int) : min=0 max=51 step=1 default=30 value=30
h264_p_frame_qp_value 0x00990a5f (int) : min=0 max=51 step=1 default=30 value=30
h264_b_frame_qp_value 0x00990a60 (int) : min=0 max=51 step=1 default=30 value=30
h264_minimum_qp_value 0x00990a61 (int) : min=0 max=51 step=1 default=8 value=8
h264_maximum_qp_value 0x00990a62 (int) : min=0 max=51 step=1 default=51 value=51
h264_8x8_transform_enable 0x00990a63 (bool) : default=1 value=1
h264_cpb_buffer_size 0x00990a64 (int) : min=0 max=18750000 step=1 default=0 value=0
h264_entropy_mode 0x00990a65 (menu) : min=0 max=1 default=1 value=1 (CABAC)
h264_i_frame_period 0x00990a66 (int) : min=0 max=2047 step=1 default=0 value=0
h264_level 0x00990a67 (menu) : min=0 max=16 default=14 value=14 (5)
h264_loop_filter_alpha_offset 0x00990a68 (int) : min=-6 max=6 step=1 default=0 value=0
h264_loop_filter_beta_offset 0x00990a69 (int) : min=-6 max=6 step=1 default=0 value=0
h264_loop_filter_mode 0x00990a6a (menu) : min=0 max=2 default=0 value=0 (Enabled)
h264_profile 0x00990a6b (menu) : min=0 max=4 default=4 value=4 (High)
vertical_size_of_sar 0x00990a6c (int) : min=0 max=65535 step=1 default=0 value=0
horizontal_size_of_sar 0x00990a6d (int) : min=0 max=65535 step=1 default=0 value=0
aspect_ratio_vui_enable 0x00990a6e (bool) : default=0 value=0
vui_aspect_ratio_idc 0x00990a6f (menu) : min=0 max=17 default=0 value=0 (Unspecified)
h264_constrained_intra_pred 0x00990a7f (int) : min=0 max=1 step=1 default=0 value=0
h264_chroma_qp_index_offset 0x00990a80 (int) : min=-12 max=12 step=1 default=0 value=0
hevc_minimum_qp_value 0x00990b58 (int) : min=0 max=51 step=1 default=8 value=8
hevc_maximum_qp_value 0x00990b59 (int) : min=0 max=51 step=1 default=51 value=51
hevc_i_frame_qp_value 0x00990b5a (int) : min=0 max=51 step=1 default=30 value=30
hevc_p_frame_qp_value 0x00990b5b (int) : min=0 max=51 step=1 default=30 value=30
hevc_b_frame_qp_value 0x00990b5c (int) : min=0 max=51 step=1 default=30 value=30
hevc_profile 0x00990b67 (menu) : min=0 max=0 default=0 value=0 (Main)
hevc_level 0x00990b68 (menu) : min=0 max=8 default=7 value=7 (5)
hevc_loop_filter 0x00990b6c (menu) : min=0 max=2 default=1 value=1 (Enabled)
hevc_loop_filter_beta_offset 0x00990b6d (int) : min=-6 max=6 step=1 default=0 value=0
hevc_loop_filter_tc_offset 0x00990b6e (int) : min=-6 max=6 step=1 default=0 value=0
hevc_refresh_type 0x00990b6f (menu) : min=0 max=2 default=2 value=2 (IDR)
hevc_num_of_i_frame_b_w_2_idr 0x00990b70 (int) : min=0 max=2047 step=1 default=0 value=0
hevc_constant_intra_prediction 0x00990b72 (int) : min=0 max=1 step=1 default=0 value=0
hevc_strong_intra_smoothing 0x00990b76 (int) : min=0 max=1 step=1 default=1 value=1
hevc_tmv_prediction 0x00990b79 (int) : min=0 max=1 step=1 default=1 value=1
prepend_sps_and_pps_to_idr 0x00990b84 (int) : min=0 max=1 step=1 default=1 value=1
frame_skip_mode 0x00990b86 (menu) : min=0 max=2 default=0 value=0 (Disabled)
average_qp_value 0x00990b91 (int) : min=0 max=51 step=1 default=0 value=0 flags=read-only
```
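A small sketch for reading individual fields out of listings like the two above, assuming the `name id (type) : key=value ...` line layout shown there. The function name `ctrl_field` is illustrative. In the decoder listing, for example, display_delay reports min=0 max=0, so 0 is the only value the driver accepts for it.

```shell
#!/bin/sh
# Sketch: read one field (min, max, default, value) of a control from
# `v4l2-ctl -l` output supplied on stdin.

ctrl_field() {
  # $1 = control name, $2 = field name
  awk -v name="$1" -v field="$2" '
    $1 == name {
      for (i = 2; i <= NF; i++) {
        # Match tokens of the form field=... and print the part after "=".
        if (index($i, field "=") == 1) {
          print substr($i, length(field) + 2)
          exit
        }
      }
    }
  '
}

# Example on the target (device path taken from the listing above):
#   v4l2-ctl -l -d /dev/v4l/by-path/platform-4c480000.vpu-video-index0 \
#     | ctrl_field display_delay max
```

The same helper can be pointed at the encoder node to check the range of any control before passing it through extra-controls.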

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

The BSP versions are:

  • imx8mp: lf-6.6.52-2.2.0
  • imx95: lf-6.6.52-2.2.1

Thank you

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

Yes, the IP camera encodes without B frames. I tried your command, but it doesn't solve the issue. There are still artifacts on the screen and the latency is still high. Qualitatively, however, it seems to remove the freezes.

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

OK. Also, please confirm the B-frame question I mentioned above.

Regards

Re: i.MX95 VPU H265 latency and performance(v4l2h265enc/v4l2h265dec)

Hello,

To confirm: the IP cameras encode without B frames (I-P only). I have applied display_delay_enable=1, display_delay=0 on the decoder. This helped remove freezes, but the latency remains at ~90 ms.

To isolate the bottleneck, I used the GStreamer element-latency tracer on this pipeline:

```
rtspsrc location=rtsp://... drop-on-latency=true latency=0 buffer-mode=4 !
  rtph265depay !
  h265parse config-interval=1 !
  v4l2h265dec extra-controls="decode,display_delay_enable=1,display_delay=0" capture-io-mode=4 output-io-mode=4 !
  v4l2h265enc extra-controls="encode,video_bitrate_mode=1,video_bitrate=2097152,frame_level_rate_control_enable=0,video_gop_size=30" capture-io-mode=4 output-io-mode=4 !
  h265parse config-interval=1 !
  rtph265pay name=pay0 pt=96
```
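For reference, a sketch of how such per-element numbers can be captured and averaged, assuming GStreamer's built-in latency tracer with `flags=element` and its usual `element-latency` log records, whose `time` field is in nanoseconds. The log file name and the function name are illustrative and not part of the original test.

```shell
#!/bin/sh
# Sketch: average the element-latency tracer records per element.
# Input: a GStreamer debug log on stdin. Output: "<element> <mean ms>".

summarize_element_latency() {
  awk '
    /element-latency/ {
      name = ""; t = ""
      n = split($0, parts, ", ")
      for (i = 1; i <= n; i++) {
        if (parts[i] ~ /^element=\(string\)/) {
          name = parts[i]; sub(/^element=\(string\)/, "", name)
        }
        if (parts[i] ~ /^time=\(guint64\)/) {
          t = parts[i]; sub(/^time=\(guint64\)/, "", t); sub(/;.*$/, "", t)
        }
      }
      if (name != "" && t != "") { sum[name] += t; cnt[name]++ }
    }
    # Convert the mean from nanoseconds to milliseconds.
    END { for (e in sum) printf "%s %.1f\n", e, sum[e] / cnt[e] / 1e6 }
  '
}

# Capture on the target (the pipeline is the one shown above):
#   GST_TRACERS="latency(flags=element)" GST_DEBUG="GST_TRACER:7" \
#   GST_DEBUG_NO_COLOR=1 GST_DEBUG_FILE=trace.log test-launch "( ... )"
#   summarize_element_latency < trace.log
```

The output order of the elements is not defined, so pipe the result through `sort` if a stable order is needed.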

Per-element latency results (640x480 @ 30fps, single stream):

  • v4l2h265enc: ~67 ms
  • v4l2h265dec: ~2 ms
  • all other elements < 1 ms

The encoder alone accounts for ~67 ms (~2 frame periods). This is consistent across runs. The decoder is fine at ~2 ms.
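The frame-period figure follows from simple arithmetic: at 30 fps one frame period is 1000 / 30 ≈ 33.3 ms. A short sketch of the conversion (the function name is illustrative):

```shell
#!/bin/sh
# Express a latency in frame periods: latency_ms / (1000 / fps).

frames_of_latency() {
  # $1 = latency in ms, $2 = frame rate in fps
  awk -v ms="$1" -v fps="$2" 'BEGIN { printf "%.2f\n", ms * fps / 1000 }'
}

frames_of_latency 67 30   # encoder: prints 2.01
frames_of_latency 2 30    # decoder: prints 0.06
```

A value close to a whole number of frames, as for the encoder here, suggests that frames are being held in a queue. That is an inference from the numbers, not a confirmed driver behaviour.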

I have tried:

  • frame_level_rate_control_enable=0
  • capture-io-mode=4 and output-io-mode=4 (DMABUF) on both encoder and decoder
  • Minimal queue buffering between elements (max-size-buffers=1)

None of these significantly reduce the encoder latency.

My questions, focused on the encoder:

  1. Is the Wave6 encoder expected to hold 2 frames internally before producing output? Is there a way to reduce this internal buffering (similar to display_delay=0 on the decoder side)?
  2. Does output-io-mode=5 (DMABUF_IMPORT) work on the encoder to achieve zero-copy from upstream? Would this help latency?
  3. Is there a V4L2 control or driver parameter to enable a low-latency / zero-delay encoding mode on the Wave6?
  4. On i.MX8MP, vpuenc_hevc achieves ~10 ms for the same workload. Is this latency gap with v4l2h265enc on i.MX95 expected to improve in future BSP releases?
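The variant asked about in question 2 can be sketched as follows, assuming the usual GStreamer V4L2 io-mode numbering (4 = dmabuf, 5 = dmabuf-import): the decoder exports dma-bufs on its capture side and the encoder imports them on its output side. Whether the Wave6 driver accepts this combination is exactly the open question, so this is untested on i.MX95. Here fakesink stands in for the RTSP payloader, the camera URL is the one from the first post, and the function name is illustrative. By default the script only prints the command.

```shell
#!/bin/sh
# Sketch: decoder exports dma-bufs (capture-io-mode=4), encoder imports
# them (output-io-mode=5). Not verified on i.MX95.

zero_copy_pipeline() {
  # $1 = RTSP URL of the camera
  printf '%s' "rtspsrc location=$1 drop-on-latency=true latency=0 buffer-mode=4 ! rtph265depay ! h265parse config-interval=1 ! v4l2h265dec capture-io-mode=4 extra-controls=decode,display_delay_enable=1,display_delay=0 ! v4l2h265enc output-io-mode=5 ! h265parse ! fakesink sync=false"
}

if [ "${DRY_RUN:-1}" = "1" ]; then
  printf 'gst-launch-1.0 -v %s\n' "$(zero_copy_pipeline rtsp://10.42.0.85)"
else
  # Intentionally unquoted: gst-launch-1.0 expects the pipeline as separate words.
  gst-launch-1.0 -v $(zero_copy_pipeline rtsp://10.42.0.85)
fi
```

With `-v`, the negotiated caps are printed, which helps confirm whether buffers are actually passed as dma-bufs or whether the elements fall back to another mode.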

Thank you for your help.
