I have a monochrome image sensor connected to my i.MX6Q board, capturing 1280x960 8-bit grayscale images at 30 fps using GStreamer and the mfw_v4lsrc element. Converting from 8-bit grayscale to I420 should be quick, but memcpy out of a DMA-coherent buffer appears to be very inefficient. What other tricks can I use to do this conversion in hardware? Converting to I420 should only involve growing the buffer by 614,400 bytes (the two 640x480 chroma planes) and filling that area with 128.
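To be concrete about the conversion I mean, here is a minimal CPU-side sketch in C. The names `i420_size` and `gray8_to_i420` are hypothetical helpers made up for illustration, not part of GStreamer or the Freescale plugins, and the sketch assumes even width and height with no row padding (stride equal to width). It only shows the target layout; it does not solve the slow read from the DMA-coherent buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size in bytes of an I420 frame: a full-resolution Y plane followed by
 * quarter-resolution U and V planes. Assumes even width and height. */
static size_t i420_size(size_t width, size_t height)
{
    return width * height + 2 * ((width / 2) * (height / 2));
}

/* Hypothetical helper: build an I420 frame in dst from an 8-bit grayscale
 * frame. The gray samples become the Y plane unchanged, and both chroma
 * planes are filled with 128 (neutral chroma, i.e. no color).
 * If dst == gray, the buffer is assumed to be large enough already, so the
 * copy is skipped and only the chroma fill is done.
 * Returns 0 on success, -1 if dst is too small. */
static int gray8_to_i420(const uint8_t *gray, uint8_t *dst, size_t dst_len,
                         size_t width, size_t height)
{
    size_t y_len = width * height;
    size_t total = i420_size(width, height);

    if (dst_len < total)
        return -1;

    if (dst != gray)
        memcpy(dst, gray, y_len);           /* Y plane: plain copy */

    memset(dst + y_len, 128, total - y_len); /* U and V planes */
    return 0;
}
```

For 1280x960, `i420_size` gives 1,843,200 bytes, which is the 1,228,800-byte grayscale frame plus the 614,400 bytes of chroma. If the capture buffer could be allocated at the I420 size up front, the in-place case (`dst == gray`) would need only the `memset`, which is the part I would like to avoid or offload.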