Hi,
I'm using an i.MX8MP with 2GB RAM (UHD display + camera + AI algorithm),
and I can't get rid of the Weston crash below:
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [6e400, 703a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72400, 743a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72400, 744a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72400, 745a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72400, 746a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72800, 747a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72800, 748a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72800, 749a4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72800, 74aa4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: [72800, 74ba4) PFNs busy
Mar 28 04:38:38 imx8mp-lpddr4-evk weston[474]: g2d_alloc: alloc memory fail with size 33177600!
Mar 28 04:38:39 imx8mp-lpddr4-evk kernel: audit: type=1701 audit(1616906318.984:4): auid=0 uid=0 gid=0 ses=2 pid=474 comm="weston" exe="/usr/bin/weston" sig=11 res=1
Mar 28 04:38:38 imx8mp-lpddr4-evk audit[474]: ANOM_ABEND auid=0 uid=0 gid=0 ses=2 pid=474 comm="weston" exe="/usr/bin/weston" sig=11 res=1
Mar 28 04:38:39 imx8mp-lpddr4-evk systemd-logind[351]: Session c1 logged out. Waiting for processes to exit.
Mar 28 04:38:39 imx8mp-lpddr4-evk ergoai_test[496]: Error reading events from display: Broken pipe
Mar 28 04:38:39 imx8mp-lpddr4-evk systemd[1]: weston.service: Main process exited, code=killed, status=11/SEGV
Mar 28 04:38:39 imx8mp-lpddr4-evk systemd[1]: weston.service: Failed with result 'signal'.
Mar 28 04:38:39 imx8mp-lpddr4-evk systemd[1]: session-c1.scope: Succeeded.
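The numbers in the log above can be decoded with a little arithmetic. This is a minimal sketch, assuming 4 KiB pages (consistent with `alloc=23*4096` in the boot log in this thread) and assuming the failed buffer is a UHD surface at 4 bytes per pixel. The second assumption is a guess based on the size; the log itself does not say what the buffer is for.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pfn_range_bytes(start_pfn_hex, end_pfn_hex):
    """Size in bytes of a half-open PFN range [start, end) as printed by alloc_contig_range."""
    return (int(end_pfn_hex, 16) - int(start_pfn_hex, 16)) * PAGE_SIZE


# From "g2d_alloc: alloc memory fail with size 33177600!"
requested = 33177600

# Assumption: one 3840x2160 frame at 4 bytes per pixel
uhd_frame = 3840 * 2160 * 4

# From the first "PFNs busy" line: [6e400, 703a4)
first_range = pfn_range_bytes("6e400", "703a4")

print("request equals one UHD 32bpp frame:", requested == uhd_frame)
print("first busy range equals the request:", first_range == requested)
print("request in MiB:", requested / 2**20)
```

Both comparisons hold, so the first busy range is exactly one such buffer, and the `PFNs busy` lines look like repeated attempts to place that single buffer in contiguous memory.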
Here is my memory info from the boot log:
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Booting Linux on physical CPU 0x0000000000 [0x410fd034]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Linux version 5.10.72+ergoai-0.2.1+g366d92006 (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.36.1.20210209) #1 SMP PREEMPT Mon Sep 26 03:23:54 UTC 2022
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Machine model: NXP i.MX8MPlus board
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: efi: UEFI not found.
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Reserved memory: created CMA memory pool at 0x0000000058000000, size 512 MiB
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Reserved memory: created DMA memory pool at 0x0000000079f00000, size 1 MiB
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: OF: reserved mem: initialized node vdev0buffer@79f00000, compatible id shared-dma-pool
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: NUMA: No NUMA configuration found
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: NUMA: Faking a node at [mem 0x0000000040000000-0x00000000bfffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: NUMA: NODE_DATA [mem 0xbfc48700-0xbfc4afff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Zone ranges:
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: DMA [mem 0x0000000040000000-0x00000000bfffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: DMA32 empty
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Normal empty
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Movable zone start for each node
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Early memory node ranges
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: node 0: [mem 0x0000000040000000-0x0000000077ffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: node 0: [mem 0x0000000078000000-0x0000000079ffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: node 0: [mem 0x000000007a000000-0x00000000943fffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: node 0: [mem 0x0000000094400000-0x00000000a43fffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: node 0: [mem 0x00000000a4400000-0x00000000bfffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Initmem setup node 0 [mem 0x0000000040000000-0x00000000bfffffff]
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: On node 0 totalpages: 524288
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: DMA zone: 8192 pages used for memmap
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: DMA zone: 0 pages reserved
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: DMA zone: 524288 pages, LIFO batch:63
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: psci: probing for conduit method from DT.
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: psci: PSCIv1.1 detected in firmware.
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: psci: Using standard PSCI v0.2 function IDs
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: psci: MIGRATE_INFO_TYPE not supported.
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: psci: SMC Calling Convention v1.2
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: percpu: Embedded 23 pages/cpu s56280 r8192 d29736 u94208
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: pcpu-alloc: s56280 r8192 d29736 u94208 alloc=23*4096
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Detected VIPT I-cache on CPU0
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: CPU features: detected: ARM erratum 845719
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: CPU features: detected: GIC system register CPU interface
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Built 1 zonelists, mobility grouping on. Total pages: 516096
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Policy zone: DMA
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Kernel command line: console=ttymxc1,115200 root=/dev/mmcblk2p2 rootwait rw
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: mem auto-init: stack:off, heap alloc:off, heap free:off
Mar 24 10:25:19 imx8mp-lpddr4-evk kernel: Memory: 1203184K/2097152K available (15744K kernel code, 1388K rwdata, 6080K rodata, 10752K init, 540K bss, 369680K reserved, 524288K cma-reserved)
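As a cross-check, the busy PFN ranges from the crash log can be compared against the CMA pool reported at boot (`0x58000000`, 512 MiB). This is a small sketch, again assuming 4 KiB pages; the function names are illustrative only.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

CMA_BASE = 0x58000000          # "created CMA memory pool at 0x0000000058000000"
CMA_SIZE = 512 * 1024 * 1024   # "size 512 MiB"
CMA_END = CMA_BASE + CMA_SIZE  # exclusive end of the pool


def pfn_to_phys(pfn):
    """Convert a page frame number to a physical address."""
    return pfn << PAGE_SHIFT


def range_in_cma(start_pfn, end_pfn):
    """True if the half-open PFN range lies entirely inside the CMA pool."""
    return CMA_BASE <= pfn_to_phys(start_pfn) and pfn_to_phys(end_pfn) <= CMA_END


# A few of the ranges printed by alloc_contig_range in the crash log
busy_ranges = [(0x6E400, 0x703A4), (0x72400, 0x743A4), (0x72800, 0x74BA4)]

for start, end in busy_ranges:
    print(hex(pfn_to_phys(start)), hex(pfn_to_phys(end)), range_in_cma(start, end))

# "524288K cma-reserved" in the Memory: line is the same 512 MiB
print(524288 * 1024 == CMA_SIZE)
```

All of the sampled ranges fall inside the 512 MiB pool, so the failing allocations are being attempted in the `linux,cma` region itself.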
CmaFree and top output around the time of the Weston crash:
Mar 28 04:38:31 CmaFree: 71240 kB
Mar 28 04:38:32 CmaFree: 70728 kB
Mar 28 04:38:33 CmaFree: 71536 kB
Mar 28 04:38:34 CmaFree: 71672 kB
Mar 28 04:38:36 CmaFree: 70540 kB
Mar 28 04:38:37 CmaFree: 71280 kB
Mar 28 04:38:38 CmaFree: 70936 kB
Mar 28 04:38:39 CmaFree: 200108 kB
Mar 28 04:38:40 CmaFree: 200372 kB
Mar 28 04:38:42 CmaFree: 200456 kB
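Right before the crash, `CmaFree` was around 70 MB, which is more than the 33177600 bytes that `g2d_alloc` asked for. Total free CMA is therefore not the whole story, because the allocation needs a single physically contiguous run. Below is a minimal sketch for reading both CMA counters from `/proc/meminfo`-style text; the helper name and the sample text are illustrative.

```python
import re


def parse_cma(meminfo_text):
    """Return {'CmaTotal': kB, 'CmaFree': kB} from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        m = re.match(r"(CmaTotal|CmaFree):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


# Illustrative sample; CmaFree is the value logged at 04:38:38
sample = "MemTotal: 1738240 kB\nCmaTotal: 524288 kB\nCmaFree: 70936 kB\n"

cma = parse_cma(sample)
requested_kb = 33177600 // 1024  # size from the g2d_alloc failure message

print(cma)
print("free CMA exceeds the request:", cma["CmaFree"] > requested_kb)
```

On the target, the same function can be fed the real file with `parse_cma(open("/proc/meminfo").read())`.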
top - 04:38:37 up 3 days, 18:13, 2 users, load average: 3.43, 4.02, 4.22
Tasks: 121 total, 2 running, 119 sleeping, 0 stopped, 0 zombie
%Cpu(s): 33.8 us, 26.8 sy, 0.0 ni, 35.2 id, 0.0 wa, 2.8 hi, 1.4 si, 0.0 st
MiB Mem : 1697.5 total, 642.0 free, 659.7 used, 395.8 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 882.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
496 root 20 0 2290104 242480 118396 R 187.5 13.9 8136:44 ergoai_+
494 root 20 0 227840 2036 1780 S 6.2 0.1 55:49.76 MotorUa+
3620738 root 0 -20 0 0 0 I 6.2 0.0 0:10.21 kworker+
3674367 root 20 0 4836 2324 1984 R 6.2 0.1 0:00.02 top
1 root 20 0 92540 7448 5648 S 0.0 0.4 0:10.88 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.38 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par+
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_perc+
top - 04:38:38 up 3 days, 18:13, 2 users, load average: 3.43, 4.02, 4.22
Tasks: 122 total, 2 running, 119 sleeping, 0 stopped, 1 zombie
%Cpu(s): 32.8 us, 29.9 sy, 0.0 ni, 34.3 id, 0.0 wa, 1.5 hi, 1.5 si, 0.0 st
MiB Mem : 1697.5 total, 642.0 free, 659.7 used, 395.8 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 882.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
496 root 20 0 2290104 242480 118396 R 188.2 13.9 8136:47 ergoai_+
3620738 root 0 -20 0 0 0 I 5.9 0.0 0:10.24 kworker+
1 root 20 0 92540 7448 5648 S 0.0 0.4 0:10.88 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.38 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par+
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_perc+
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tas+
10 root 20 0 0 0 0 S 0.0 0.0 0:45.06 ksoftir+
top - 04:38:39 up 3 days, 18:13, 2 users, load average: 3.24, 3.97, 4.20
Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 1.5 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 1697.5 total, 1109.1 free, 361.3 used, 227.1 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1246.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 92540 7448 5648 S 0.0 0.4 0:10.97 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.38 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par+
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_perc+
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tas+
10 root 20 0 0 0 0 S 0.0 0.0 0:45.06 ksoftir+
11 root 20 0 0 0 0 I 0.0 0.0 3:41.90 rcu_pre+
12 root rt 0 0 0 0 S 0.0 0.0 0:02.18 migrati+
Hi, @Bio_TICFSL
Here are my CMA settings.
As-is : DDR RAM 2GB
        GPU(DMA) range 1GB
        CMA size 512MB (0x20000000)
        Galcore CMA 256MB (cat /sys/module/galcore/parameters/contiguousSize, 268435456)
To-be : DDR RAM 2GB
        GPU(DMA) range 1GB
        CMA size 704MB (0x2C000000)
        Galcore CMA ????
Is there a recommended Galcore CMA size when the CMA size is 704MB?
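For reference, the sizes quoted above convert as follows. This is plain arithmetic, not a recommendation for the Galcore value. The last line uses the 33177600-byte request from the crash log as an example buffer size, which is an assumption about what the GPU will be asked for.

```python
MIB = 1024 * 1024


def to_mib(nbytes):
    """Convert a byte count to MiB."""
    return nbytes / MIB


sizes = {
    "CMA as-is (0x20000000)": 0x20000000,
    "CMA to-be (0x2C000000)": 0x2C000000,
    "galcore contiguousSize (268435456)": 268435456,
}

for name, nbytes in sizes.items():
    print(f"{name}: {to_mib(nbytes):.0f} MiB")

# How many buffers of the size that failed in the crash log fit in 256 MiB
example_buffer = 33177600
buffers_in_galcore = 268435456 // example_buffer
print("whole buffers that fit in 256 MiB:", buffers_in_galcore)
```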
Sorry for the delay.
The best approach would be to first increase only the CMA size; that change alone may be enough to fix the crash. After that, you can try increasing the Galcore CMA size to improve performance.
As a guideline, the Galcore CMA should be at least as large as the maximum amount of memory the graphics applications will use at runtime. Also, if you use CMA for non-GPU tasks, keep some room outside the Galcore CMA for them.
Regards
Hi @Bio_TICFSL
Could you tell me how to measure the memory used by graphics applications at runtime?
I already know about /proc/meminfo and top, but I want to check specifically "the memory size of graphics applications at runtime".
I think what you are looking for is the gpuinfo tool, a script that gathers GPU runtime status through the debugfs interface. It can be found at /unit_tests/GPU/gpuinfo.sh and can report total memory usage, either for a specific process or for all processes. The gputop tool can also be used to trace the memory your application uses at runtime.
More information on these tools and how to use them can be found in chapters "14.1 gpuinfo tool" and "14.2 gputop tool" of the i.MX Graphics User's Guide (nxp.com).
Regards
Hi @Bio_TICFSL
When I increase the Galcore CMA from 256MB to 384MB, the warning message is printed more often than before.
I think that is bad, because more warnings suggest a worse situation.
Surely there should be fewer warnings, right?
before : DDR RAM 2GB
GPU(DMA) range 1GB
CMA size 704MB
Galcore CMA 256MB (cat /sys/module/galcore/parameters/contiguousSize, 268435456)
// Suppressed callback count is at most 54 and the interval between warnings is long
Line 1745: Mar 24 12:45:01 imx8mp-lpddr4-evk kernel: alloc_contig_range: 20 callbacks suppressed
Line 1756: Mar 24 12:45:09 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1767: Mar 24 12:45:17 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1778: Mar 24 12:45:25 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1789: Mar 24 12:46:01 imx8mp-lpddr4-evk kernel: alloc_contig_range: 22 callbacks suppressed
Line 1800: Mar 24 12:46:09 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1811: Mar 24 12:46:17 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1822: Mar 24 13:19:32 imx8mp-lpddr4-evk kernel: alloc_contig_range: 54 callbacks suppressed
Line 1833: Mar 24 14:08:55 imx8mp-lpddr4-evk kernel: alloc_contig_range: 21 callbacks suppressed
Line 1844: Mar 24 14:17:41 imx8mp-lpddr4-evk kernel: alloc_contig_range: 22 callbacks suppressed
Line 1855: Mar 24 14:19:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: 22 callbacks suppressed
Line 1866: Mar 25 21:37:46 imx8mp-lpddr4-evk kernel: alloc_contig_range: 22 callbacks suppressed
---------------------------------------------------------------------------------------------------------------------------------------
after : DDR RAM 2GB
GPU(DMA) range 1.2GB
CMA size 800MB
Galcore CMA 384MB
// Suppressed callback count is at most 815 and the interval between warnings is short (almost every 5~6 sec)
Line 6415: Mar 24 11:10:49 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6426: Mar 24 11:10:54 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6437: Mar 24 11:11:00 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6448: Mar 24 11:11:05 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6459: Mar 24 11:11:11 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6470: Mar 24 11:11:16 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6481: Mar 24 11:11:21 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6492: Mar 24 11:11:27 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6503: Mar 24 11:11:32 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6514: Mar 24 11:11:38 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6525: Mar 24 11:11:43 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6536: Mar 24 11:11:48 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6547: Mar 24 11:11:54 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6558: Mar 24 11:11:59 imx8mp-lpddr4-evk kernel: alloc_contig_range: 270 callbacks suppressed
Line 6569: Mar 24 11:12:04 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6580: Mar 24 11:12:10 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6591: Mar 24 11:12:15 imx8mp-lpddr4-evk kernel: alloc_contig_range: 270 callbacks suppressed
Line 6602: Mar 24 11:12:20 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6613: Mar 24 11:12:26 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6624: Mar 24 11:12:31 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6635: Mar 24 11:12:36 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6646: Mar 24 11:12:42 imx8mp-lpddr4-evk kernel: alloc_contig_range: 453 callbacks suppressed
Line 6657: Mar 24 11:12:47 imx8mp-lpddr4-evk kernel: alloc_contig_range: 815 callbacks suppressed
Line 6668: Mar 24 11:12:52 imx8mp-lpddr4-evk kernel: alloc_contig_range: 547 callbacks suppressed
Line 6679: Mar 24 11:12:57 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6690: Mar 24 11:13:02 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6701: Mar 24 11:13:08 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6712: Mar 24 11:13:13 imx8mp-lpddr4-evk kernel: alloc_contig_range: 522 callbacks suppressed
Line 6723: Mar 24 11:13:18 imx8mp-lpddr4-evk kernel: alloc_contig_range: 770 callbacks suppressed
Line 6734: Mar 24 11:13:23 imx8mp-lpddr4-evk kernel: alloc_contig_range: 774 callbacks suppressed
Line 6745: Mar 24 11:13:28 imx8mp-lpddr4-evk kernel: alloc_contig_range: 546 callbacks suppressed
Line 6756: Mar 24 11:13:34 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6767: Mar 24 11:13:39 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6778: Mar 24 11:13:44 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
Line 6789: Mar 24 11:13:49 imx8mp-lpddr4-evk kernel: alloc_contig_range: 242 callbacks suppressed
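To compare the two runs on numbers rather than by eye, the `callbacks suppressed` lines can be summarized. As far as I know these lines come from the kernel's printk rate limiting, which by default works in 5-second windows, so the roughly 5-second spacing in the second log would reflect that window and not the allocation pattern itself. This is a minimal sketch: the regular expression assumes the journal format shown above, and the year is an assumption because the log lines do not carry one.

```python
import re
from datetime import datetime

PATTERN = re.compile(
    r"(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: "
    r"alloc_contig_range: (\d+) callbacks suppressed"
)


def summarize(log_text, year=2021):
    """Summarize 'callbacks suppressed' reports; 'year' is assumed, not parsed."""
    events = []
    for m in PATTERN.finditer(log_text):
        ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        events.append((ts, int(m.group(2))))
    if not events:
        return None
    counts = [n for _, n in events]
    span = (events[-1][0] - events[0][0]).total_seconds()
    return {
        "reports": len(events),
        "suppressed": sum(counts),
        "max": max(counts),
        "span_s": span,
    }


# Three lines copied from the second log above
sample = """\
Line 6646: Mar 24 11:12:42 imx8mp-lpddr4-evk kernel: alloc_contig_range: 453 callbacks suppressed
Line 6657: Mar 24 11:12:47 imx8mp-lpddr4-evk kernel: alloc_contig_range: 815 callbacks suppressed
Line 6668: Mar 24 11:12:52 imx8mp-lpddr4-evk kernel: alloc_contig_range: 547 callbacks suppressed
"""

result = summarize(sample)
print(result)
```

Running `summarize` over each full log and dividing `suppressed` by `span_s` gives a rough messages-per-second figure for each configuration.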
Hi,
The "alloc_contig_range" warning messages relate to the GPU's use of CMA memory. They are not harmful and not a bug, so they are not going to be fixed. A cma_alloc failure in the kernel log, on the other hand, would indicate a real problem.
Regards
Hi @Bio_TICFSL
But when comparing the two settings in the same scenario, the one with fewer warning messages is the better one, isn't it?
BR,
Hello,
I did some research, and it seems that the GPU's memory is reserved from CMA (Contiguous Memory Allocator). g2d allocations also come from CMA, which, as the name suggests, hands out physically contiguous memory.
In this case, g2d_alloc may be failing because the contiguous memory size is exceeded. A fixed amount of physically contiguous memory is reserved at boot time, and the GPU driver allocates from that region.
If you want to check the size of the contiguous memory, run the following command:
cat /sys/module/galcore/parameters/contiguousSize
Also, you can dynamically change the size of the physically contiguous memory by doing the following (the value is in bytes; 201326592 is 192 MiB):
echo '201326592' > /sys/module/galcore/parameters/contiguousSize
Maybe this can help with the problem. If you have any further questions do not hesitate to ask.
Regards