hi.
I'm using an EVK with the imx6ull and a connected mt9j003 sensor. I developed a driver to support the mt9j003, and it works together with the mx6s_capture driver. The Linux kernel used is linux-imx-4.9.88; I had to update the kernel because of bugs in the Ethernet driver.
Now I can capture images and save them to a file or transfer them over USB to the host PC. But I found that my frame rate is too low (about 2 FPS, where 4 FPS is expected). So I measured the time needed for every operation. I captured the test pattern generated by the mt9j003 at the full resolution of 10 Mpix. As expected, capturing itself is possible at about 4 FPS; the bottleneck is the memcpy. For 10 Mbyte it needs over 250 ms! To access the video buffers I used the mmap method.
Many threads on the internet confirm my suspicion that memory allocated via mmap is not cached and that access to it is very slow. I didn't find any solution for this problem, only a proposed workaround: use the USERPTR method. I tested this, but it does not work as expected. First I got errno -22 on the VIDIOC_QBUF call. After I replaced malloc with memalign like this:
// buffers[n_buffers].start = malloc(buffer_size);
buffers[n_buffers].start = memalign(page_size,buffer_size);
I got another error, -14 (bad address), and the message "contiguous mapping is too small 4096/10444800". Probably the user-allocated memory is fragmented in physical memory and DMA can't work with this type of memory.
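For reference, the page-aligned allocation described above could be sketched as follows. This is only a sketch: it uses posix_memalign as a stand-in for memalign, and the helper names alloc_page_aligned and is_page_aligned are hypothetical. Note that alignment only concerns the virtual address. It does not make the pages physically contiguous, which is what the "contiguous mapping is too small" message appears to be complaining about.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns a page-aligned buffer of at least `size`
 * bytes (rounded up to a whole number of pages), or NULL on failure.
 * The caller releases it with free(). */
static void *alloc_page_aligned(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (page <= 0 || size == 0)
        return NULL;

    size_t psz = (size_t)page;
    size_t rounded = (size + psz - 1) / psz * psz;

    /* posix_memalign returns 0 on success and leaves errno untouched. */
    if (posix_memalign(&p, psz, rounded) != 0)
        return NULL;
    return p;
}

/* Hypothetical helper: checks that a pointer starts on a page boundary. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

A buffer obtained this way passes the alignment check, but a driver that needs one physically contiguous region for DMA can still reject it, as seen with the -14 error.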
Now I don't know what I can do to make the memcpy of a captured frame fast. 250 ms for 10 Mbyte is only 40 Mbyte/s. An 8051 may be faster than the imx6...
Hi Andrej
you can look at the memcpy improvement suggestions on
or try sdma memory copy example
mxc_sdma_memcopy_test.c\module_test - imx-test - i.MX Driver Test Application Software
Best regards
igor
Hi Igor,
thank you for your answer. The first link may be helpful, but I have a slightly different problem. My system runs at the expected speed, and a copy within malloc memory runs up to 10 times faster than a copy from the mmap memory allocated via /dev/video0.
About the second link: this is a module that tests memory-to-memory (m2m) transfer. On write, the example driver initializes 4 wbufs and starts a DMA transfer from the wbufs to the rbufs. On read, the data is compared. I don't see how it can help me copy frames captured by video4linux to other user-allocated memory with the same performance as a copy from user-allocated memory to user-allocated memory.
Maybe I didn't describe my problem well enough? Sorry for my bad English.
Best Regards
Andrej.
Hi Andrej
as your task is specific to memcpy improvements, it may be recommended to use
Best regards
igor
Hi Igor,
thank you for the answer. I don't know what is specific about my question. I only want to get the captured image from the v4l device, perform some operations on the image and send it over USB to the PC. Sending to the PC works as expected. Image capturing does too; only copying the captured v4l frame to user memory runs 10 times slower than copying user memory to user memory.
I'll try to explain it another way:
void *buf1, *buf2; // i have two buffers
const size_t size = 10*1024*1024; // both buffers are 10 Mb
case1:
buf1 = malloc(size);
buf2 = malloc(size);
memcpy(buf1, buf2, size); // this operation is approx 20 ms
case2:
buf1 = malloc(size);
xioctl(fd, VIDIOC_QUERYBUF, &v4l_buf_struct); // this step is necessary for v4l2 to allocate buffer
buf2 = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, v4l_buf_struct.m.offset); // get memory allocated in v4l2
memcpy(buf1, buf2, size); // this operation needs 250 ms!!!
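To make the two cases comparable, the copy could be timed with a small helper like the one below. This is a minimal sketch, not the code actually used for the numbers above: timed_memcpy_ms and throughput_mb_s are hypothetical names, the clock is assumed to be CLOCK_MONOTONIC, and 1 Mbyte is taken as 1024*1024 bytes.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copies `size` bytes from src to dst and returns
 * the elapsed wall-clock time in milliseconds. */
static double timed_memcpy_ms(void *dst, const void *src, size_t size)
{
    struct timespec t0, t1;

    assert(dst != NULL && src != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    memcpy(dst, src, size);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) * 1e3 +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Hypothetical helper: converts a byte count and a duration in
 * milliseconds into Mbyte/s (1 Mbyte = 1024*1024 bytes). */
static double throughput_mb_s(size_t size, double ms)
{
    return ((double)size / (1024.0 * 1024.0)) / (ms / 1e3);
}
```

Calling timed_memcpy_ms(buf1, buf2, size) once with buf2 from malloc and once with buf2 from mmap gives the two timings. With the figures quoted here, throughput_mb_s(size, 250.0) works out to 40 Mbyte/s and throughput_mb_s(size, 20.0) to about 500 Mbyte/s.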
Best regards
Andrej
Hi Andrej
what is specific are the requirements of your task (in particular the memcpy implementation);
NXP provides software which does not meet them. Performance requirements, as they
are board specific, are usually supported through NXP Professional Services.
Best regards
igor
Hi Igor,
thank you for the fast answer. I can confirm that it is the memcpy operation that causes the problem, but only with memory allocated by the mx6s_capture module. :-) I guess this module is from Freescale/NXP, so I tried asking NXP :-)
The mx6s_capture module provides the v4l2 interface. This module returns memory that performs badly when copied to userspace. I can try to implement a for-loop that copies the memory over incremented pointers, but I guess the result will be the same.
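Such a for-loop might look like the sketch below. It is only an illustration: copy_words is a hypothetical name, and both pointers are assumed to be 4-byte aligned (true for page-aligned mmap buffers and for malloc results). Whether it performs any differently from memcpy on the mmap memory can only be found out by measuring, and an optimizing compiler may turn a loop like this back into a memcpy call anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies `size` bytes from src to dst using 32-bit
 * loads and stores over incremented pointers, then finishes any
 * remaining bytes one at a time. Both pointers must be 4-byte aligned. */
static void copy_words(void *dst, const void *src, size_t size)
{
    assert(((uintptr_t)dst % sizeof(uint32_t)) == 0);
    assert(((uintptr_t)src % sizeof(uint32_t)) == 0);

    uint32_t *d = dst;
    const uint32_t *s = src;
    size_t words = size / sizeof(uint32_t);

    for (size_t i = 0; i < words; i++)
        *d++ = *s++;

    /* Tail: 0 to 3 bytes that do not fill a whole word. */
    unsigned char *db = (unsigned char *)d;
    const unsigned char *sb = (const unsigned char *)s;
    for (size_t i = 0; i < size % sizeof(uint32_t); i++)
        db[i] = sb[i];
}
```

It would be called as copy_words(buf1, buf2, size) in place of the memcpy in case 2.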
I think the task of copying a captured image to userspace is nothing special. ;-)
Best regards
Andrej.