imx6q: PL310 caching issues?

kregl
Contributor I

Hello!

We are running an imx6q platform with kernel version 3.10.108.

We see an increased error rate in the memory stress test step.
During this step, up to 10 "memtester 35M" processes are spawned and run on all 4 cores without core bindings.
Attached are multiple memtester logs from the same device. Various memtester subtests fail from time to time.

I extended memtester to resolve the virtual addresses to physical ones, as described here:
https://shanetully.com/2014/12/translating-virtual-addres...
This showed that the failing physical addresses jump around, so I do not think it is a defective-memory problem.
Binding all processes to a single core, or running a single instance with all available memory, also showed no problems.
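
For readers who want to reproduce the address translation, here is a minimal sketch of a lookup through /proc/self/pagemap. It is not the patch that was applied to memtester; the function names pagemap_entry_to_phys and virt_to_phys are made up for illustration. It assumes the documented pagemap entry layout (bit 63 = page present, bits 0-54 = page frame number) and a kernel that exposes the PFN to the calling process. That holds on 3.10, while newer kernels return a zero PFN to processes without CAP_SYS_ADMIN.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* One 64-bit pagemap entry: bit 63 = present, bits 0-54 = PFN. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)

/* Decode a raw pagemap entry (hypothetical helper).
 * Returns 0 and stores the physical address, or -1 if the page is not present. */
static int pagemap_entry_to_phys(uint64_t entry, uintptr_t vaddr,
                                 uint64_t pagesize, uint64_t *phys)
{
    if (!(entry & PAGEMAP_PRESENT))
        return -1;
    *phys = (entry & PAGEMAP_PFN_MASK) * pagesize + (vaddr % pagesize);
    return 0;
}

/* Look up the physical address behind a virtual address of this process.
 * Returns 0 on success, -1 on any error. */
static int virt_to_phys(const volatile void *ptr, uint64_t *phys)
{
    uintptr_t vaddr = (uintptr_t)ptr;
    uint64_t pagesize = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t entry;
    ssize_t n;
    int fd = open("/proc/self/pagemap", O_RDONLY);

    if (fd < 0)
        return -1;
    /* The file holds one 8-byte entry per virtual page. */
    n = pread(fd, &entry, sizeof(entry),
              (off_t)(vaddr / pagesize * sizeof(entry)));
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return -1;
    return pagemap_entry_to_phys(entry, vaddr, pagesize, phys);
}
```

In compare_regions this could be called as virt_to_phys(p1, &phys) at the point where a mismatch is reported. The buffer is mlocked, so the page should be present.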

Also attached is the memtester routine for the "Block Sequential" subtest.
memtester splits the given memory into two buffers and compares them after each memory modification. It looks like the compare step reads an old value from one of the buffers.
Block Sequential : testing 105
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d1c.
The correct value at step 105 is 105 == 0x69 => 0x69696969, so the 0x68686868 that was read back is the value from the previous step (104).
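
The arithmetic above can be sketched in a few lines of C. This is only an illustration for the 32-bit case: fill_byte32 and step_of_word are hypothetical helper names, not memtester functions, and fill_byte32 is assumed to produce the same pattern as memtester's UL_BYTE macro on a 32-bit build.

```c
#include <stdint.h>

/* Replicate the low byte of step j into all four bytes of a 32-bit word,
 * e.g. 105 (0x69) -> 0x69696969. */
static uint32_t fill_byte32(unsigned int j)
{
    return (uint32_t)(j & 0xffu) * 0x01010101u;
}

/* Map a word read from the buffer back to the step that wrote it.
 * Returns the step (0..255), or -1 if the four bytes are not identical. */
static int step_of_word(uint32_t word)
{
    unsigned int b = word & 0xffu;
    return (fill_byte32(b) == word) ? (int)b : -1;
}
```

Applied to the log excerpt, step_of_word(0x68686868) gives 104 while the test was at step 105, which is what one would expect if one buffer still returns the data of the previous pass.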

As you can see in the attached failure logs, every failure consists of a contiguous mismatch of 8 * 32 bit = 32 bytes.
Could this be related to the PL310 cache, whose cache line is 32 bytes long?
A short test with a newer 4.x kernel did not show this issue, but I still have to run it for longer.
As noted above, binding all processes to the same core also makes the issue disappear.
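
The 32-byte observation can be checked against the logged offsets with a small helper. This is a rough sketch with hypothetical names (lines_touched, run_is_line_aligned). The offsets memtester prints are relative to the start of the test buffer, so the result only reflects real cache lines under the assumption that the buffer base itself is 32-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u /* line size in bytes, as discussed in the post */

/* Number of 32-byte lines covered by `words` consecutive 32-bit words
 * starting at byte offset first_off (assumes a line-aligned buffer base). */
static size_t lines_touched(uint32_t first_off, size_t words)
{
    uint32_t last_byte = first_off + (uint32_t)(words * sizeof(uint32_t)) - 1;
    return (size_t)(last_byte / CACHE_LINE - first_off / CACHE_LINE) + 1;
}

/* 1 if the failing run starts exactly on a line boundary, else 0. */
static int run_is_line_aligned(uint32_t first_off)
{
    return first_off % CACHE_LINE == 0;
}
```

With the offsets from the logs, the run starting at 0x00a60f00 starts on a line boundary and covers exactly one line, while the run starting at 0x01045d1c would cover parts of two lines under the alignment assumption. Comparing against the resolved physical addresses instead of the offsets removes that assumption.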

The PL310 driver and the many PL310 errata workarounds have been heavily reworked since the 3.10.108 kernel.
Has anybody run into the same issue in the past?
Can anybody remember which patch fixed it and point me in the right direction?

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 35MB (36700160 bytes)
got 35MB (36700160 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 105
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d1c.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d20.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d24.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d28.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d2c.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d30.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d34.
FAILURE: 0x69696969 != 0x68686868 at offset 0x01045d38.
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Done.

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 35MB (36700160 bytes)
got 35MB (36700160 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 123
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f00.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f04.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f08.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f0c.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f10.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f14.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f18.
FAILURE: 0x7a7a7a7a != 0x7b7b7b7b at offset 0x00a60f1c.
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Done.

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 35MB (36700160 bytes)
got 35MB (36700160 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 247
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb1c.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb20.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb24.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb28.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb2c.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb30.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb34.
FAILURE: 0xf7f7f7f7 != 0xf6f6f6f6 at offset 0x0079bb38.
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Done.

memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 35MB (36700160 bytes)
got 35MB (36700160 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : testing 37
FAILURE: 0xffffffff != 0x00000000 at offset 0x0097c27c.
FAILURE: 0x00000000 != 0xffffffff at offset 0x0097c280.
FAILURE: 0xffffffff != 0x00000000 at offset 0x0097c284.
FAILURE: 0x00000000 != 0xffffffff at offset 0x0097c288.
FAILURE: 0xffffffff != 0x00000000 at offset 0x0097c28c.
FAILURE: 0x00000000 != 0xffffffff at offset 0x0097c290.
FAILURE: 0xffffffff != 0x00000000 at offset 0x0097c294.
FAILURE: 0x00000000 != 0xffffffff at offset 0x0097c298.
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Done.
int compare_regions(ulv *bufa, ulv *bufb, size_t count) {
    int r = 0;
    size_t i;
    ulv *p1 = bufa;
    ulv *p2 = bufb;
    off_t physaddr;

    for (i = 0; i < count; i++, p1++, p2++) {
        if (*p1 != *p2) {
            if (use_phys) {
                physaddr = physaddrbase + (i * sizeof(ul));
                fprintf(stderr,
                        "FAILURE: 0x%08lx != 0x%08lx at physical address "
                        "0x%08lx.\n",
                        (ul) *p1, (ul) *p2, physaddr);
            } else {
                fprintf(stderr,
                        "FAILURE: 0x%08lx != 0x%08lx at offset 0x%08lx.\n",
                        (ul) *p1, (ul) *p2, (ul) (i * sizeof(ul)));
            }
            /* printf("Skipping to next test..."); */
            r = -1;
        }
    }
    return r;
}

int test_blockseq_comparison(ulv *bufa, ulv *bufb, size_t count) {
    ulv *p1 = bufa;
    ulv *p2 = bufb;
    unsigned int j;
    size_t i;

    printf(" ");
    fflush(stdout);
    for (j = 0; j < 256; j++) {
        printf("\b\b\b\b\b\b\b\b\b\b\b");
        p1 = (ulv *) bufa;
        p2 = (ulv *) bufb;
        printf("setting %3u", j);
        fflush(stdout);
        for (i = 0; i < count; i++) {
            *p1++ = *p2++ = (ul) UL_BYTE(j);
        }
        printf("\b\b\b\b\b\b\b\b\b\b\b");
        printf("testing %3u", j);
        fflush(stdout);
        if (compare_regions(bufa, bufb, count)) {
            return -1;
        }
    }
    printf("\b\b\b\b\b\b\b\b\b\b\b \b\b\b\b\b\b\b\b\b\b\b");
    fflush(stdout);
    return 0;
}

 

 

2 Replies

kregl
Contributor I

The U-Boot version supplied by the board vendor is 2015.07.

CONFIG_ARM_ERRATA_845369 is missing in mainline U-Boot up to some 2017 version.
But the board vendor has included this fix in their 2015.07 U-Boot. All other errata in your link are also active.

So maybe it has something to do with the combination of U-Boot and kernel?


igorpadykov
NXP Employee

Hi kregl

 

Kernel version 3.10.108 is not supported by NXP, but some ARM errata are fixed by "define CONFIG_ARM_ERRATA" entries in imx_v2014.04, which is used with Linux L3.10.53_1.1.0_ga, and one can try them:

https://source.codeaurora.org/external/imx/uboot-imx/tree/include/configs/mx6_common.h?h=nxp/imx_v20...

 

Best regards
igor
