<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic in i.MX Processors: [iMX8MP] V4L2 Buffer Copy Time</title>
    <link>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2329284#M244402</link>
    <description>&lt;P&gt;Hello All,&lt;/P&gt;&lt;P&gt;I need to iterate over an image captured from the camera subsystem on the i.MX8M Plus. The image is greyscale 1920×1200 and copying the buffer currently takes about 15 ms. This suggests that the buffer memory is mapped as non-cacheable.&lt;/P&gt;&lt;P&gt;On previous ARM systems I have worked with, DMA buffers could be mapped as cache-coherent (for example using the dma-coherent device-tree property or similar mechanisms), which reduced a similar copy operation to roughly 2 ms.&lt;/P&gt;&lt;P&gt;Currently the V4L2 capture buffers appear to be uncached, so both copying the buffer and iterating over it directly (zero-copy processing) are quite slow.&lt;/P&gt;&lt;P&gt;Is there a mechanism on the i.MX8M Plus to enable cache-coherent mappings for these buffers (for example via a device-tree configuration), or another recommended approach to improve CPU access performance?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Mon, 09 Mar 2026 22:23:31 GMT</pubDate>
    <dc:creator>DIM</dc:creator>
    <dc:date>2026-03-09T22:23:31Z</dc:date>
    <item>
      <title>[iMX8MP] V4L2 Buffer Copy Time</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2329284#M244402</link>
      <description>&lt;P&gt;Hello All,&lt;/P&gt;&lt;P&gt;I need to iterate over an image captured from the camera subsystem on the i.MX8M Plus. The image is greyscale 1920×1200 and copying the buffer currently takes about 15 ms. This suggests that the buffer memory is mapped as non-cacheable.&lt;/P&gt;&lt;P&gt;On previous ARM systems I have worked with, DMA buffers could be mapped as cache-coherent (for example using the dma-coherent device-tree property or similar mechanisms), which reduced a similar copy operation to roughly 2 ms.&lt;/P&gt;&lt;P&gt;Currently the V4L2 capture buffers appear to be uncached, so both copying the buffer and iterating over it directly (zero-copy processing) are quite slow.&lt;/P&gt;&lt;P&gt;Is there a mechanism on the i.MX8M Plus to enable cache-coherent mappings for these buffers (for example via a device-tree configuration), or another recommended approach to improve CPU access performance?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
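The copy times quoted in the question can be turned into an effective bandwidth, which makes the gap easier to compare against other platforms. The following is only a rough sketch: it assumes 8-bit greyscale (1 byte per pixel) and a tightly packed frame with no line padding, neither of which is stated in the thread, and the 2 ms figure is the poster's number for a similar copy on an earlier system, not a measurement on the i.MX8M Plus. The helper names frame_bytes and copy_bandwidth_mb_s are hypothetical.

```python
# Back-of-the-envelope copy bandwidth for the frame described in the post.
# Assumption (not stated in the thread): 8-bit greyscale, i.e. 1 byte per pixel.

def frame_bytes(width, height, bytes_per_pixel=1):
    """Size of one tightly packed frame in bytes (no line padding assumed)."""
    return width * height * bytes_per_pixel


def copy_bandwidth_mb_s(n_bytes, copy_ms):
    """Effective copy bandwidth in MB/s, with 1 MB taken as 10**6 bytes."""
    return n_bytes / (copy_ms / 1000.0) / 1e6


size = frame_bytes(1920, 1200)
cases = (
    ("observed on i.MX8M Plus", 15.0),         # copy time quoted in the post
    ("earlier cache-coherent system", 2.0),    # poster's figure for a similar copy
)
for label, ms in cases:
    print(f"{label}: {size} bytes in {ms} ms = {copy_bandwidth_mb_s(size, ms):.1f} MB/s")
```

Under these assumptions the observed copy runs at roughly 154 MB/s against roughly 1150 MB/s on the earlier system, a factor of 7.5. That is consistent with the suspicion of an uncached mapping, but it does not prove it on its own.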
      <pubDate>Mon, 09 Mar 2026 22:23:31 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2329284#M244402</guid>
      <dc:creator>DIM</dc:creator>
      <dc:date>2026-03-09T22:23:31Z</dc:date>
    </item>
    <item>
      <title>Re: [iMX8MP] V4L2 Buffer Copy Time</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2331099#M244447</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="image.png" style="width: 999px;"&gt;&lt;img src="https://community.nxp.com/t5/image/serverpage/image-id/379044i4C8226DD69172EFD/image-size/large?v=v2&amp;amp;px=999" role="button" title="image.png" alt="image.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;You can try dmabuf. The dmabuf I/O mode uses hardware DMA buffers to implement a zero-copy pipeline, as shown below:&lt;BR /&gt;$ gst-launch-1.0 v4l2src device=/dev/video0 num-buffers=300 io-mode=dmabuf ! \&lt;BR /&gt;'video/x-raw,format=(string)NV12,width=1920,height=1080,framerate=(fraction)30/1' ! \&lt;BR /&gt;queue ! v4l2h264enc output-io-mode=dmabuf-import ! avimux ! filesink location=test.avi&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2026 06:59:29 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2331099#M244447</guid>
      <dc:creator>joanxie</dc:creator>
      <dc:date>2026-03-12T06:59:29Z</dc:date>
    </item>
    <item>
      <title>Re: [iMX8MP] V4L2 Buffer Copy Time</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2331838#M244463</link>
      <description>&lt;P&gt;Hi joanxie, thank you for your reply.&lt;/P&gt;&lt;P&gt;I understand that I can pass the data zero-copy via DMABUF; however, accessing that data from the CPU when I process it is very slow. Ultimately I want minimal processing latency. I don't think the GStreamer pipeline represents my use case, because it processes the buffers in parallel, so per-buffer latency doesn't matter there.&lt;/P&gt;&lt;P&gt;Is there some way to process the buffer with minimal latency? I believe this would require cache coherence, typically implemented on ARM processors via the &lt;A href="https://developer.arm.com/documentation/ddi0500/j/Functional-Description/Interfaces/Accelerator-Coherency-Port" target="_self"&gt;Accelerator Coherency Port&lt;/A&gt;. Does the i.MX8MP have this feature?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Thu, 12 Mar 2026 20:20:07 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/iMX8MP-V4L2-Buffer-Copy-Time/m-p/2331838#M244463</guid>
      <dc:creator>DIM</dc:creator>
      <dc:date>2026-03-12T20:20:07Z</dc:date>
    </item>
  </channel>
</rss>

