i.MX53 V4L2: Is De-Interlace in Overlay mode possible?

2,291 Views
TomE
Specialist II

i.MX53 based on the Karo TX53 board. ADV7180 capturing composite (and therefore Interlaced) video.

This is an Automotive application, so the video is the Reversing Camera. A very common application that should be easily supported by the hardware and software.

I would expect there to be Application Notes detailing how to set up and use all of these cases, but the only thing that is available seems to be copies of Freescale's software testing "unit tests". They're made to regression-test the drivers rather than demonstrate their use in an Application.

Running Freescale Linux 2.6.35 taken from "git://git.freescale.com/imx/linux-2.6-imx.git" and starting from the imx_2.6.35_maintain branch. As far as I know this is the latest version of Linux to run on the i.MX53. That's the version linked from the i.MX53 Product Pages anyway. The last patch to that, the latest 2.6.35 branch, is dated 30 Oct 2013.

I can run "imx-test-11.09.01/test/mxc_v4l2_test/mxc_v4l2_tvin.c" and it runs very well, but with one major problem.

It uses V4L2_MEMORY_MMAP buffers on both the Capture and Output sides, and uses memcpy() to copy all data between the interfaces/drivers. The 720x480x2 buffers require copying 30 times a second, resulting in the CPU copying 20 megabytes/second.

Because the rescaling is only supported on the output driver, even if I'm only showing a postage-stamp-sized view of the camera it still has to copy the whole full resolution frame.

Without video, our Application takes 25% of the CPU doing what it does. With Video enabled, this rises to 88%. With a slightly more "busy" version of our Application which runs at 35% CPU loading, the total is now nudging 100%. That's no good. We can't handle the "reversing camera" taking 60% of the CPU. We need some spare CPU for USB data storage and for handling CAN traffic.

60% of the CPU taken copying 20 megabytes/sec? That's 5 mega-32-bit-words/sec read and the same written (total 10 MW/s) on a 32-bit DDR3 memory system running at 400 MHz. That should only be 1/40 of the memory bandwidth and not 60%!
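As a rough check of the arithmetic above, here is a minimal C sketch. The function names are made up for illustration, and it assumes, as the post does, 2 bytes per pixel, 30 frames per second, and that each copied byte is read once and written once over a 32-bit bus.

```c
#include <stdint.h>

/* Bytes per second the CPU must copy for an uncompressed video stream. */
uint64_t copy_bytes_per_sec(uint32_t width, uint32_t height,
                            uint32_t bytes_per_pixel, uint32_t fps)
{
    return (uint64_t)width * height * bytes_per_pixel * fps;
}

/*
 * 32-bit bus words moved per second by a plain copy:
 * every word is read once and written once, hence the factor of 2.
 */
uint64_t bus_words_per_sec(uint64_t bytes_per_sec)
{
    return 2 * (bytes_per_sec / 4);
}
```

For 720x480x2 at 30 frames/s, copy_bytes_per_sec() gives 20,736,000 bytes/s and bus_words_per_sec() gives 10,368,000 words/s. If the memory could move one 32-bit word per clock at 400 MHz (a simplification), that is about 2.6% of the bandwidth, in line with the "1/40" estimate.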

Replacing the "memcpy()" with one written using NEON opcodes drops the overhead from 60% down to 40%. That's still way too much.

The obvious fix is to rewrite mxc_v4l2_tvin.c to use V4L2_MEMORY_USERPTR buffers, so the user code only has to swap pointers instead of copying data.

That's supported by the OUTPUT drivers in "drivers/media/video/mxc/output", but is NOT supported by the CAPTURE drivers in "drivers/media/video/mxc/capture". They only support V4L2_MEMORY_MMAP. I find that the later versions based on 3.14 and supporting subdevices do support USERPTR, but that's no good for 2.6-based drivers.

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/media/platform/mxc/capture/...

I can also run "imx-test-11.09.01/test/mxc_v4l2_test/mxc_v4l2_overlay.c", and it runs with ZERO CPU overhead as it is doing everything in hardware. That program doesn't support setting up the hardware for deinterlacing, so the video appears on the screen as two squashed frames, being the odd field and the even field. Inspection of the drivers indicates that only the OUTPUT drivers know how to deinterlace, and the OVERLAY driver is an INPUT one, and doesn't seem to.

I know there's a patch that adds deinterlacing to the Capture interface, but it is for "a rather old BSP version" meaning 3.10, and not 2.6.35 and is "old and unsupported" anyway.

    De-interlace capture device

    https://community.freescale.com/docs/DOC-93633

So does anyone have any suggestions for getting interlaced CVBS video from the camera to the display without killing the CPU?

Thanks,

Tom

4 Replies

929 Views
TomE
Specialist II

Nobody knows how to do this? Nobody else has used this chip in a car as a "Reversing Camera"?

There's been a suggestion I'm about to look into, but if I'm going to waste a day trying and someone else knows it can't work, I'd like to know this first.

The CAPTURE device can't use V4L2_MEMORY_USERPTR, but the OUTPUT device can.

So maybe I can allocate V4L2_MEMORY_MMAP buffers for the CAPTURE side and then MMAP them as per usual.

Then I allocate V4L2_MEMORY_USERPTR buffers for the OUTPUT, and then (the tricky bit)...

Point the OUTPUT buffer pointers at the CAPTURE MMAPed buffers?

Is there any reason that shouldn't work?

Thanks,

Tom

929 Views
TomE
Specialist II

I wrote:

> So maybe I can allocate V4L2_MEMORY_MMAP buffers for the CAPTURE side and then

> MMAP them as per usual.

> Then I allocate V4L2_MEMORY_USERPTR buffers for the OUTPUT, and then (the tricky bit)...

> Point the OUTPUT buffer pointers at the CAPTURE MMAPed buffers?

> Is there any reason that shouldn't work?

It should work, but I can't get it to. Before it locks up the kernel and has the watchdog reset the whole thing, it shows a dark green screen instead of my video feed from the capture interface.

V4L2_MEMORY_USERPTR is meant by V4L2 to allow any "user memory" to be used. The only "example" given is the "output" test in the latest i.MX6 version of the "Unit Test" code. These are provided as "unit tests" and are not really a source of working user examples. That code gets its "user memory" from an IOCTL call to the IPU.

So I still don't have a solution to get reversing camera video (interlaced) to the Frame Buffer without taking half of the CPU to very slowly copy uncached memory buffers.

Another possible solution to my problems would be a user-space interface to the SDMA so that it can do the copies for me, freeing the CPU for other work that needs doing. I can't find a driver to do that, and the only example given in the following post is for a test module that performs transfers within the driver and not to and from user space:

Re: i.MX6 SDMA

Does anyone have a working driver from the SDMA that lets user programs use it for M2M copies? I can't afford to waste half of the GHz CPU doing useless video copies.

Or an "overlay" driver that can deinterlace?

Tom

929 Views
TomE
Specialist II

> It should work, but I can't get it to. Before it locks up the kernel and has the watchdog reset the whole thing

> it shows a dark green screen instead of my video feed from the capture interface.

I managed to fix this and get it working.

The V4L2 specification says that in order to queue a buffer, the VIDIOC_QBUF ioctl is called. The "buf.m.userptr" field is to be filled in:

For the single-planar API and when memory is V4L2_MEMORY_USERPTR this is a pointer to

the buffer (casted to unsigned long type) in virtual memory, set by the application.

http://linuxtv.org/downloads/v4l-dvb-apis/buffer.html

http://linuxtv.org/downloads/v4l-dvb-apis/vidioc-qbuf.html

The Freescale drivers don't seem to follow the above specification. Here's the example Unit Test code filling in that parameter:

imx-test-3.14.28-1.0.0/test/mxc_v4l2_test/mxc_v4l2_output.c:

int mxc_v4l_output_test(FILE *in)
    ...

            memset(&buf, 0, sizeof (buf));
            buf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
            buf.memory = g_mem_type;
            buf.index = i;
            if (ioctl(fd_v4l, VIDIOC_QUERYBUF, &buf) < 0)
            { ... }

            buffers[i].length = buf.length;
            buffers[i].offset = (size_t) buf.m.offset;

    ...

            if (g_mem_type == V4L2_MEMORY_USERPTR) {
                buf.m.userptr = (unsigned long) buffers[i].offset;
                buf.length = buffers[i].length;
            }

The "buf.m.offset" value filled in by the VIDIOC_QUERYBUF ioctl call is the PHYSICAL ADDRESS of the buffer, and not the VIRTUAL ADDRESS. The above test code isn't using "user memory" buffers, but allocates the buffers by making an IPU_ALLOC ioctl() call to the IPU device, also getting physical addresses. This also has the advantage of getting contiguous buffers, which is probably why it is doing this.

When I changed my code to pass the PHYSICAL address of the buffer in "buf.m.userptr" it started working. I can now see the video on the screen without forcing the CPU to copy every pixel.

So I think I have a workaround, but this is unexpected. The Driver is meant to be responsible for mapping the user's (usually) virtual-contiguous but physically discontiguous buffers into something the DMA controllers can handle or can scatter-gather. Not supporting this sure makes the drivers simpler, at the expense of apparently not following the standard.

Tom

929 Views
gusarambula
NXP TechSupport

Hello Tom Evans,

Thanks for posting your findings! I'm sure they will help other Community users!

If it helps, please find the attached "mxc_v4l2_tvin_no_memcpy.zip", which shows how to implement zero-memcpy de-interlacing in an application.

We also have a patch for on-the-fly de-interlacing (CSI->VDI->IC->MEM). In this case, if you set the overlay frame buffer as the capture buffer, it can work the same way as the mxc_v4l2_overlay test case:

iMX53 camera patch to support CSI->VDI->IC->MEM capture
