ECC injection test with EDAC driver always triggers EDAC interrupt on LX2080A

RoseWen
Contributor I

Hi,

When I test ECC error injection with the EDAC driver on an LX2080A board, the EDAC interrupt keeps triggering even after error injection is disabled. Is this normal?

Please refer to the information below and the attached test log.

LSDK version: 21.08

Linux version: 5.10.35

Kernel Configs:

CONFIG_EDAC=y

CONFIG_EDAC_DEBUG=y

CONFIG_EDAC_SUPPORT=y

CONFIG_EDAC_LAYERSCAPE=y

-----------------------------------------------------------------------------------------

In the test log, I run the following commands in sequence to enable error injection, set the low data path mask, disable error injection, and clear the mask.

echo 0x100 > /sys/devices/system/edac/mc/mc0/inject_ctrl

echo 0x1 > /sys/devices/system/edac/mc/mc0/inject_data_lo

echo 0x0 > /sys/devices/system/edac/mc/mc0/inject_ctrl

echo 0x0 > /sys/devices/system/edac/mc/mc0/inject_data_lo
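The sequence above can be wrapped in a small helper that also compares `ce_count` before and after, which makes it easier to check the "exactly one corrected error" expectation. This is only a sketch: `run_injection` is a hypothetical name, the sysfs directory is the one used in this post, and the one-second settle time is an arbitrary assumption.

```shell
# run_injection [MC_DIR] [SETTLE_SECONDS]
# Replays the four writes from the post and prints the change in ce_count.
# Hypothetical helper, not part of the EDAC driver.
run_injection() {
    mc_dir="${1:-/sys/devices/system/edac/mc/mc0}"
    before=$(cat "$mc_dir/ce_count")

    echo 0x100 > "$mc_dir/inject_ctrl"      # enable error injection
    echo 0x1 > "$mc_dir/inject_data_lo"     # low data path mask (value from the post)
    echo 0x0 > "$mc_dir/inject_ctrl"        # disable error injection
    echo 0x0 > "$mc_dir/inject_data_lo"     # clear the mask

    sleep "${2:-1}"                         # assumed settle time for the handler
    after=$(cat "$mc_dir/ce_count")
    echo "ce_count delta: $((after - before))"
}
```

Calling `run_injection` with no arguments uses the `mc0` path; a delta that keeps growing on repeated reads of `ce_count` would match the behaviour described here.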

dmesg keeps printing a large number of error detection messages even after I clear ERR_DETECT manually:

devmem 0x1080e40 32 0x80000004
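To see which flags a value such as `0x80000004` touches, a small decoder can help. This is a sketch under assumptions: the bit masks for MME, MBE, and SBE are taken from the Linux `fsl_ddr_edac` driver header as I recall it (`DDR_EDE_MME`, `DDR_EDE_MBE`, `DDR_EDE_SBE`) and should be verified against the LX2080A reference manual; `decode_err_detect` is a hypothetical helper name.

```shell
# decode_err_detect VALUE
# Prints the names of the ERR_DETECT flags set in VALUE, or "none".
# Masks are assumptions (see lead-in), not confirmed by this thread.
decode_err_detect() {
    val=$(( $1 ))
    flags=""
    if [ $(( $val & 0x80000000 )) -ne 0 ]; then flags="$flags MME"; fi
    if [ $(( $val & 0x00000008 )) -ne 0 ]; then flags="$flags MBE"; fi
    if [ $(( $val & 0x00000004 )) -ne 0 ]; then flags="$flags SBE"; fi
    if [ -z "$flags" ]; then
        echo "none"
    else
        echo "${flags# }"
    fi
}
```

Under these assumed masks, `decode_err_detect 0x80000004` reports the multiple-error and single-bit-error flags, which is consistent with the write above being an attempt to clear both.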

What I expected is that, after error injection is disabled, the DDR controller corrects the ECC data, the interrupt is triggered only once, and ce_count is 1.

 

Q1: Are the above test steps correct?

Q2: How do I clear ERR_DETECT?

Q3: Does the driver fsl_ddr_edac.c support the LX2080A platform?

 

Thanks,

Rose

LFGP
NXP TechSupport

The ECC in the LX2080A is SECDED, meaning single-bit error correction and double-bit error detection. Each single-bit error increments the SBEC (single-bit error counter) until it reaches the SBET (single-bit error threshold). Once the threshold is reached, the SBE flag in the ERR_DETECT register is set. You can detect single-bit errors by checking either SBEC or the ERR_DETECT register.

For multi-bit errors, only detection of any two-bit flip is guaranteed, in which case the ERR_DETECT MBE flag is set. If your test flips more than two bits, the ECC may or may not detect it.
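The threshold behaviour described above can be illustrated with a toy model. This is a sketch only: `sbe_threshold_demo` is a hypothetical name, and the model covers just the "count until the threshold, then set the flag" description. It does not model what the hardware does with SBEC after the threshold is reached.

```shell
# sbe_threshold_demo ERRORS SBET
# Toy model: count corrected single-bit errors and report when the
# count reaches the threshold. Not a model of the real register.
sbe_threshold_demo() {
    errors=$1
    sbet=$2
    sbec=0
    while [ "$sbec" -lt "$errors" ]; do
        sbec=$((sbec + 1))                 # one corrected single-bit error
        if [ "$sbec" -ge "$sbet" ]; then
            echo "SBE flag set after $sbec error(s)"
            return 0
        fi
    done
    echo "SBE flag not set ($sbec of $sbet)"
}
```

With a threshold of 1, a single injected error is already enough to set the flag in this model; with a threshold of 3, two errors are counted silently.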

RoseWen
Contributor I

Hi,

The information you mentioned can already be found in the LX2080A datasheet. After waiting so long, this answer does not help with my question.

My question is: when SBEC equals SBET and error injection has already been disabled, why does the EDAC driver still count errors even after I clear it manually?

What I expected is that, after error injection is disabled, the DDR controller detects and corrects the single-bit error (ERR_DETECT[SBE] is 1 and ERR_SBE[SBEC] is 1 too), and the interrupts then stop.

But the test keeps detecting errors, and the error address is different every time. This may mean that the EDAC driver keeps injecting errors in the background.

Why does this happen?

I have attached the test log again; please look at it carefully. Thanks.

 

Rose

LFGP
NXP TechSupport

(Apps team) The proper procedure for ECC error injection:

Make sure caches are enabled (4/15: a much better, more controlled process).

1) Enable both the instruction cache and the data cache.

2) Mark the DDR address range to be tested as cache-inhibited (not cacheable).

3) Set the error bit in the DATA_ERR_INJECT_HI/LO DDR register.

4) Enable error injection by setting ECC_ERR_INJECT[EIEN] = 1.

5) Write one cache line or less, starting at a cache-line-aligned address.

6) Disable error injection by setting ECC_ERR_INJECT[EIEN] = 0.

7) Read one cache line or less from the same memory location (at this point the DDR error registers are set correctly).

8) Write to the same location where the error was injected. This corrects the content of that location and stops ECC errors on subsequent reads from it.

9) Do whatever is needed with the discovered error: for example, verify that it is exactly the expected error, test the interrupt handler for this error, or repeat the read step to incur more interrupts before the first one is processed, which causes a machine check.
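Steps 3 to 8 can be sketched as a `devmem` sequence. To stay safe, the helper below only prints the commands instead of running them. Everything specific in it is an assumption to be checked against the reference manual: the DDR controller register base `0x1080000` (inferred from ERR_DETECT at `0x1080e40` earlier in this thread), the offsets `0xe04` for DATA_ERR_INJECT_LO and `0xe08` for ECC_ERR_INJECT (as I recall them from the Linux `fsl_ddr_edac` driver header), EIEN = `0x100` (the value written to `inject_ctrl` earlier), and the test address, which is hypothetical and must be cache-line aligned and non-cacheable. `print_inject_plan` is a hypothetical name.

```shell
# print_inject_plan TEST_ADDR
# Prints (does not execute) a devmem sequence for steps 3 to 8.
# Register base, offsets, and EIEN value are assumptions (see lead-in).
print_inject_plan() {
    addr=$1
    base=0x1080000
    printf 'devmem 0x%x 32 0x1\n' $(($base + 0xe04))     # step 3: error mask
    printf 'devmem 0x%x 32 0x100\n' $(($base + 0xe08))   # step 4: EIEN = 1
    printf 'devmem %s 32 0x0\n' "$addr"                  # step 5: write test location
    printf 'devmem 0x%x 32 0x0\n' $(($base + 0xe08))     # step 6: EIEN = 0
    printf 'devmem %s 32\n' "$addr"                      # step 7: read it back
    printf 'devmem %s 32 0x0\n' "$addr"                  # step 8: rewrite to fix the data
}
```

Whether `devmem` can actually reach a DDR test address from Linux depends on kernel settings such as CONFIG_STRICT_DEVMEM, so the printed plan should be reviewed before any line is run on hardware.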

RoseWen
Contributor I

Hi,

How do I enable the instruction cache and data cache?

The dcache status is ON at the U-Boot stage, as shown below. Does this mean the cache is already enabled?

----------------------------------------------

=> dcache
Data (writethrough) Cache is ON

-----------------------------------------------

In the procedure you provided, performing only steps 3, 4, and 6 is enough for the ERR_DETECT register to report a single-bit error. Besides these steps, is it necessary to read and write the same memory location for the error injection test?

Please also reply to the questions in my first post:

-----------------------------------------------------------

Q1: Are the EDAC driver test steps correct?

Q2: How do I clear ERR_DETECT?

Q3: Does the driver fsl_ddr_edac.c support the LX2080A platform?

-----------------------------------------------------------

Thank you!

 

Regards,

Rose

RoseWen
Contributor I

Hi,

Any updates?

Rose
