i.MX6 Memory issues after long encoding

jcc273 · ‎05-09-2018

Hello,

I have a recording device using an i.MX 6 Solo that encodes to H.264 video. The problem is that after I record for a while I cannot start a new recording because I get memory errors. For example, if I record for 25 minutes straight I have no issues during the recording, but as soon as that recording is stopped I cannot start another one without hitting a page allocation failure:

cam-source:src: page allocation failure: order:9, mode:0xd1
CPU: 0 PID: 16107 Comm: cam-source:src Not tainted 4.1.15-0.3-224221-g6bef30f-dirty #29
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[<80015904>] (unwind_backtrace) from [<80011b7c>] (show_stack+0x10/0x14)
[<80011b7c>] (show_stack) from [<80658220>] (dump_stack+0x6c/0xb4)
[<80658220>] (dump_stack) from [<800a26b0>] (warn_alloc_failed+0xe8/0x114)
[<800a26b0>] (warn_alloc_failed) from [<800a4f44>] (__alloc_pages_nodemask+0x5fc/0x788)
[<800a4f44>] (__alloc_pages_nodemask) from [<8001a16c>] (__dma_alloc_buffer+0x2c/0x168)
[<8001a16c>] (__dma_alloc_buffer) from [<8001a444>] (__dma_alloc+0x19c/0x258)
[<8001a444>] (__dma_alloc) from [<8001a620>] (arm_dma_alloc+0x84/0x90)
[<8001a620>] (arm_dma_alloc) from [<8046a8bc>] (mxc_v4l_do_ioctl+0x132c/0x1bc4)
[<8046a8bc>] (mxc_v4l_do_ioctl) from [<80451da4>] (video_usercopy+0x230/0x3f4)
[<80451da4>] (video_usercopy) from [<8044d51c>] (v4l2_ioctl+0x5c/0x114)
[<8044d51c>] (v4l2_ioctl) from [<800e3e30>] (do_vfs_ioctl+0x4ec/0x5b0)
[<800e3e30>] (do_vfs_ioctl) from [<800e3f28>] (SyS_ioctl+0x34/0x5c)
[<800e3f28>] (SyS_ioctl) from [<8000eb40>] (ret_fast_syscall+0x0/0x3c)
Mem-Info:
active_anon:1399 inactive_anon:19380 isolated_anon:0
active_file:3833 inactive_file:12905 isolated_file:0
unevictable:1 dirty:1 writeback:0 unstable:0
slab_reclaimable:549 slab_unreclaimable:907
mapped:2149 shmem:19390 pagetables:70 bounce:0
free:2857 free_pcp:76 free_cma:0
Normal free:11428kB min:1704kB low:2128kB high:2556kB active_anon:5596kB inactive_anon:77520kB active_file:15332kB inactive_file:51620kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:182068kB mlocked:4kB dirty:4kB writeback:0kB mapped:8596kB shmem:77560kB slab_reclaimable:2196kB slab_unreclaimable:3628kB kernel_stack:584kB pagetables:280kB unstable:0kB bounce:0kB free_pcp:304kB local_pcp:304kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Normal: 335*4kB (MR) 225*8kB (UMR) 118*16kB (UMR) 56*32kB (U) 28*64kB (U) 8*128kB (UMR) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB = 11428kB
36127 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
131072 pages RAM
0 pages HighMem/MovableOnly
85555 pages reserved
0 pages cma reserved
ERROR: v4l2 capture: mxc_allocate_frame_buf failed.
GST Error: Internal data flow error.

I added this call after the end of every recording:

echo 3 | tee /proc/sys/vm/drop_caches

It helped immensely: before that, the failure would happen even after a couple of short recordings. Now it takes longer to hit the issue, but it still happens. It is as if the encoder has a leak and is slowly taking all the CMA memory, until eventually I don't have enough left to start a new recording process!

I am running the rel_imx_4.1.15_1.1.0_ga kernel.

Any ideas?

jcc273 · ‎05-18-2018

Figured this out. The main problem was that CMA was not properly enabled in the kernel config. CMA itself was enabled, but the option CONFIG_DMA_CMA was set to no. Because of this the kernel was just ignoring the >256MB I had assigned as reserved CMA memory. This means that not only was the system not using CMA for the capture buffers, it was also working with half the memory.
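
For anyone wanting to check their own kernel for the same misconfiguration, here is a minimal sketch. The helper name check_cma_config is made up for illustration, and it assumes the kernel config is available as text (for example via /proc/config.gz, which only exists if the kernel was built with CONFIG_IKCONFIG_PROC):

```shell
#!/bin/sh
# Hypothetical helper: read a kernel .config on stdin and report whether
# both CONFIG_CMA and CONFIG_DMA_CMA are built in (=y).
# Exits 0 only when both are enabled.
check_cma_config() {
    awk '
        /^CONFIG_CMA=y$/     { cma = 1 }
        /^CONFIG_DMA_CMA=y$/ { dma = 1 }
        END {
            print "CONFIG_CMA=" (cma ? "y" : "n"), "CONFIG_DMA_CMA=" (dma ? "y" : "n")
            exit !(cma && dma)
        }'
}
```

On the target this could be run as `zcat /proc/config.gz | check_cma_config`. The "0 pages cma reserved" and "free_cma:0kB" lines in the allocation-failure dump above are the runtime symptom of the same problem; on kernels that expose them, `grep -i cma /proc/meminfo` should show non-zero CmaTotal once the reservation is actually in effect.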

After running a recording for a while, all the memory allocations from file system writes would fragment memory so badly that on a second recording the driver could not find space for all 6 of the almost 2MB contiguous buffers it needed, and it would fail. Even calling drop_caches would only help for a short while.
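
The fragmentation is visible in the dump in the first post: the failed request is order:9, i.e. 2^9 contiguous 4kB pages = 2048kB, while the "Normal:" free list shows the largest free block is a single 1024kB one (order 8) and nothing at 2048kB or above, even though 11428kB is free in total. A rough sketch for checking the same thing on a live system from /proc/buddyinfo, assuming 4kB pages (the function names are made up for illustration):

```shell
#!/bin/sh
# Size in kB of one allocation of the given order, assuming 4 kB pages.
order_kb() {
    echo $(( (1 << $1) * 4 ))
}

# Read /proc/buddyinfo-style lines on stdin and print, for each zone, the
# highest order that still has at least one free block (-1 if none).
# Line format: Node 0, zone   Normal   <count order 0> <count order 1> ...
max_free_order() {
    awk '{
        best = -1
        for (i = 5; i <= NF; i++)
            if ($i > 0) best = i - 5
        print $4, best
    }'
}
```

Running `max_free_order < /proc/buddyinfo` before starting a recording shows whether an order-9 request can currently be satisfied from the normal page allocator; if the reported order is below 9, a 2MB contiguous buffer has to come from CMA or the allocation fails.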

So I correctly enabled CMA so that contiguous memory would be available, and that fixed it. For good measure I also modified mxc_v4l2_capture.c to not release the buffers once allocated. On first run it allocates the buffers and then keeps them until the driver is removed (never); if a bigger size is needed because pix.sizeimage changes, it releases and reallocates them. This is much the way mxc_vpu works as well. I figured it didn't make sense to ever release them, since this is a recording device and I will be using them almost constantly.

igorpadykov · ‎05-10-2018

Hi Jarrod

Possible issues with long videos are described in the attached Release Notes, sect. 6.6 "Known issues and limitations for multimedia":

"As the maximum buffer size of the playbin multi-queue is 2 MB, problems may be seen with some long audio or video interleaved streams. You can enlarge this buffer size to support these special use cases."

Best regards
igor

jcc273 · ‎05-10-2018

I am recording, not playing back, so I am not using playbin or multiqueue. The issue also happens if I record many short videos instead of just one long one. It's like there is some sort of memory leak: after recording enough, it just runs out of memory.

igorpadykov · ‎05-10-2018

Please try to reproduce the issue on the SabreSD reference board with the Demo Images:

https://www.nxp.com/webapp/Download?colCode=L4.1.15_2.0.0_iMX7D&appType=license&location=null&Parent...

You can also drop disk caches during video playback via the VPU with a small Perl script, as described in the post dated [May 22, 2014 12:46 PM]:

https://community.nxp.com/thread/323018

jcc273 · ‎05-11-2018

I do not have a SabreSD board. I have a CompuLab board and an e-con Systems board, but I'm guessing those won't help you here? I am already dropping disk caches every few seconds during recording and again after the recording ends, and it still crashes. It doesn't seem to be related to the encoder though. I was able to create a simple pipeline that reproduces it:

gst-launch-1.0 -e imxv4l2videosrc ! avimux ! filesink location=/home/root/tempTest.avi

If I let this run for a few minutes (it creates a 2GB AVI file) and then stop it, I get the issue described: I cannot start another recording without restarting because it can't get memory. So it's like the actual videosrc buffer memory is not getting released? It's weird because it doesn't run out of memory during recording. I can record for as long as I want with no issues, but after recording for several minutes and then stopping, I cannot start another recording.

igorpadykov · ‎05-13-2018

The support policy for non-NXP boards is described in the FSL Community BSP Release Notes 2.3 documentation:
"..every new board must have someone assigned as maintainer..
The maintainer duties:
Responsible to keep that machine working (that means, booting and with some stability)
Keep kernel, u-boot updated/tested/working.
Keep release notes updated
Keep test cycle updated.."
So for software issues with the CompuLab and e-con Systems boards, please contact the vendors of those boards.

Best regards
igor
