Simultaneous Ethernet Camera Streaming Solution

Wobaffet · 06-16-2026

We are developing a custom i.MX8MP-based board for an automotive camera system. We have a working single-camera H.264 RTP streaming pipeline using the VPU hardware decoder on our 5.15.71 NXP Yocto BSP. We are now planning to scale to 4 simultaneous camera streams via an Ethernet switch. The switch will aggregate 4x 100BASE-T1 camera ports and connect to the i.MX8MP EQOS MAC via a gigabit RGMII uplink, with each camera isolated on its own VLAN.

Our display is 1280x800. For a simultaneous 4-stream view, each stream only needs to fill a quarter of the display (about 640x400 or lower at 30 fps), so we can scale the camera resolution down as needed.

1. Is there an NXP i.MX8MP reference design or application note covering multi-channel video decode that we can use as a starting point?

2. Can the H.264 VPU decoder handle multiple independent instances from separate sources, or is a pre-decode composition step required to reduce the number of processes and context switches? If composition is required, what is the recommended approach on i.MX8MP?

3. What is the overall recommended software architecture for this use case from NXP's perspective?

Our future application is surround view, so scalability and low latency are key requirements.

Thanks!

Zhiming_Liu

Hi @Wobaffet

The i.MX8MP can serve as a candidate platform for this 4-channel Ethernet H.264 camera quad-view application; however, only the i.MX8QM and i.MX95 have reference designs for surround view.

The recommended architecture is to feed the four RTP/H.264 streams into separate GStreamer pipelines, decode each one on the VPU hardware decoder, and then apply hardware-assisted scaling and composition to the decoded frames for output on the 1280×800 display. Composition before H.264 decoding is not recommended: separate H.264 streams cannot be composited into a single display frame in the compressed domain unless decoding and re-encoding have already been done upstream at the switch or camera.
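As a rough illustration of the decode-then-compose flow described above, the following Python sketch builds a `gst-launch-1.0` style description with four RTP/H.264 decode branches feeding one compositor in a 2x2 layout. It is only a sketch under assumptions: the element names `v4l2h264dec`, `imxcompositor_g2d` and `waylandsink`, the UDP ports, the RTP payload type and the jitter-buffer latency are illustrative and should be checked with `gst-inspect-1.0` on the target BSP. It also places all four branches in one pipeline description so they can share a compositor, which is only one possible way to arrange the separate decode paths.

```python
# Sketch only: element names, ports, payload type and latency are assumptions
# to verify on the target BSP (for example with gst-inspect-1.0).
DISPLAY_W, DISPLAY_H = 1280, 800


def quad_layout(display_w, display_h):
    """Return (x, y, width, height) for each tile of a 2x2 grid, row by row."""
    tile_w, tile_h = display_w // 2, display_h // 2
    return [(col * tile_w, row * tile_h, tile_w, tile_h)
            for row in range(2) for col in range(2)]


def build_pipeline(ports, decoder="v4l2h264dec",
                   compositor="imxcompositor_g2d",
                   sink="waylandsink", latency_ms=50):
    """Build one pipeline description with one decode branch per UDP port."""
    tiles = quad_layout(DISPLAY_W, DISPLAY_H)
    # Per-pad placement properties on the compositor sink pads.
    pad_props = " ".join(
        f"sink_{i}::xpos={x} sink_{i}::ypos={y} "
        f"sink_{i}::width={w} sink_{i}::height={h}"
        for i, (x, y, w, h) in enumerate(tiles[:len(ports)])
    )
    parts = [f"{compositor} name=comp {pad_props} ! {sink}"]
    rtp_caps = ("application/x-rtp,media=video,clock-rate=90000,"
                "encoding-name=H264,payload=96")
    for i, port in enumerate(ports):
        # Each camera: receive RTP, de-jitter, depayload, parse, decode.
        parts.append(
            f'udpsrc port={port} caps="{rtp_caps}" '
            f"! rtpjitterbuffer latency={latency_ms} "
            f"! rtph264depay ! h264parse ! {decoder} ! queue ! comp.sink_{i}"
        )
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical port numbers, one per camera.
    print("gst-launch-1.0 " + build_pipeline([5000, 5001, 5002, 5003]))
```

Keeping the layout arithmetic in `quad_layout` makes it easy to change the display size or tile arrangement without touching the pipeline string.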

For a display requirement of 4 streams at approximately 640×400@30fps, a rough estimate based on pixel rate indicates that the load is significantly lower than that of 1080p60 decoding, so this approach is reasonable. However, whether this ultimately meets the low-latency and stability requirements for surround view will require system-level verification, taking into account actual camera bitrate, profile, GOP, RTP jitter, DDR bandwidth, the GStreamer zero-copy path, and the composition method.
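The pixel-rate comparison mentioned above can be worked through with a few lines of Python. This is only a first-order estimate: it counts decoded pixels per second and ignores everything else listed above (bitrate, profile, GOP structure, per-stream overhead, DDR bandwidth).

```python
def pixel_rate(width, height, fps, streams=1):
    """Decoded pixels per second for a number of identical streams."""
    return width * height * fps * streams


# Four quarter-screen streams as described in the question.
quad_view = pixel_rate(640, 400, 30, streams=4)
# A single 1080p60 stream, used here as the comparison point.
full_hd_60 = pixel_rate(1920, 1080, 60)

ratio = quad_view / full_hd_60

if __name__ == "__main__":
    print(f"4 x 640x400@30: {quad_view:,} px/s")
    print(f"1 x 1080p60:    {full_hd_60:,} px/s")
    print(f"ratio:          {ratio:.1%}")
```

On this measure the four streams together amount to roughly a quarter of the 1080p60 pixel rate (30,720,000 versus 124,416,000 pixels per second).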

Best Regards,
Zhiming

Wobaffet

Hello,

Thank you for the answer!

As a follow-up:

1. We are planning to run 4 independent GStreamer pipelines simultaneously, each decoding its own H.264 stream using the VPU hardware. We want to confirm that the VPU driver in the 5.15.71 BSP supports 4 truly concurrent decode instances. Is there any known configuration or limitation in this BSP that would prevent all four pipelines from decoding in parallel, or is a scheduling method required? If so, are any examples available?

2. Our setup has an Ethernet switch connecting 4 cameras to the i.MX8MP via a single RGMII uplink, with each camera on its own VLAN. On the Linux side we plan to create VLAN subinterfaces on the EQOS port and have each GStreamer pipeline receive UDP/RTP from its own subinterface. Two questions on this: (a) Does the EQOS driver in the 5.15.71 BSP handle 802.1Q VLAN-tagged frames correctly out of the box, or is additional configuration needed? (b) Are there any known receive-path tuning recommendations for running 4 simultaneous UDP streams on this interface in this BSP?
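For reference, the VLAN subinterface plan described above could be set up along these lines. This is only a sketch: the parent interface name `eth1`, the VLAN IDs 10/20/30/40 and the `192.168.<vid>.1/24` addressing are hypothetical placeholders. The helper only generates standard iproute2 commands and does not run them.

```python
def vlan_commands(parent, vlan_ids, prefix_len=24):
    """Return iproute2 commands creating one 802.1Q subinterface per VLAN ID.

    Addressing (192.168.<vid>.1) is a placeholder chosen for illustration.
    """
    commands = []
    for vid in vlan_ids:
        sub = f"{parent}.{vid}"
        # Create the tagged subinterface on top of the parent port.
        commands.append(f"ip link add link {parent} name {sub} type vlan id {vid}")
        # Give it an address in a per-camera subnet.
        commands.append(f"ip addr add 192.168.{vid}.1/{prefix_len} dev {sub}")
        # Bring the subinterface up.
        commands.append(f"ip link set dev {sub} up")
    return commands


if __name__ == "__main__":
    # Hypothetical parent interface and VLAN IDs, one per camera.
    for cmd in vlan_commands("eth1", [10, 20, 30, 40]):
        print(cmd)
```

Each pipeline would then receive from the address of its own subinterface, for example by setting the `address` property on its `udpsrc`.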

Thanks!

Wobaffet

reminder