Machine Check Exception occurs with data cache enabled on MPC5746C multi-core project

1045302770
Contributor III

Dear NXP team,

I am developing a multi-core project on the MPC5746C. The instruction cache and data cache of the Z4 core are enabled at startup. To avoid cache coherency problems, the SMPU is enabled and a region in system RAM is configured as cache-inhibited for inter-core communication. A machine check exception occurs with MAV, LD, and BUS_RDERR set in MCSR. The example code for the Z4 and Z2 cores is as follows:

 

/************************************
 * Z4 core code
 ***********************************/
#pragma ghs section data = ".NONE_CACHE_RAM"   /* .NONE_CACHE_RAM is the cache-inhibited SMPU region */
unsigned int z4_write_var = 1;
#pragma ghs section data = default

void test_code_z4(void)
{
    while (1)
    {
        z4_write_var++;
    }
}

/************************************
 * Z2 core code
 ***********************************/
extern unsigned int z4_write_var;
unsigned int z2_read_var;

void test_code_z2(void)
{
    while (1)
    {
        z2_read_var = z4_write_var;
    }
}

 

When the program runs, a machine check exception occurs with MAV, LD, and BUS_RDERR set in MCSR, and MCAR holds the address of z4_write_var. I have checked the .map file and confirmed that z4_write_var is located within the cache-inhibited region ".NONE_CACHE_RAM". I have also checked the MEMU module, and nothing there indicates an ECC or EDC error.
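
For readers who want to log the same flags from a machine check handler, the following is a minimal host-testable sketch of decoding an MCSR value into flag names. The mask values are assumptions based on the commonly documented e200z4 MCSR layout and should be verified against the core reference manual. The helper name mcsr_describe is hypothetical, and the flag the post calls BUS_RDERR is assumed to be the data-read bus error field, named BUS_DRERR here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMED e200z4 MCSR masks; verify against the core reference manual. */
#define MCSR_MAV        0x00080000u  /* MCAR holds a valid address     */
#define MCSR_IF         0x00010000u  /* error during instruction fetch */
#define MCSR_LD         0x00008000u  /* error during a load            */
#define MCSR_ST         0x00004000u  /* error during a store           */
#define MCSR_BUS_IRERR  0x00000010u  /* bus error on instruction read  */
#define MCSR_BUS_DRERR  0x00000008u  /* bus error on data read         */
#define MCSR_BUS_WRERR  0x00000004u  /* bus error on write             */

typedef struct {
    uint32_t    mask;
    const char *name;
} mcsr_flag_t;

static const mcsr_flag_t mcsr_flags[] = {
    { MCSR_MAV,       "MAV"       },
    { MCSR_IF,        "IF"        },
    { MCSR_LD,        "LD"        },
    { MCSR_ST,        "ST"        },
    { MCSR_BUS_IRERR, "BUS_IRERR" },
    { MCSR_BUS_DRERR, "BUS_DRERR" },
    { MCSR_BUS_WRERR, "BUS_WRERR" },
};

/* Hypothetical helper: writes the space-separated names of all set flags
 * into out (always NUL-terminated when out_size > 0) and returns out. */
char *mcsr_describe(uint32_t mcsr, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return out;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof mcsr_flags / sizeof mcsr_flags[0]; i++) {
        if ((mcsr & mcsr_flags[i].mask) != 0u) {
            int n = snprintf(out + used, out_size - used, "%s%s",
                             used ? " " : "", mcsr_flags[i].name);
            if (n < 0 || (size_t)n >= out_size - used) {
                break;  /* buffer too small: stop with what fits */
            }
            used += (size_t)n;
        }
    }
    return out;
}
```

On the target, the value passed in would come from reading the MCSR special-purpose register inside the handler, with MCAR read alongside it when MAV is set. How those registers are read depends on the compiler intrinsics in use.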

 

I would like to know what causes the exception and how to deal with it.

 

Looking forward to your reply!

 

Best Regards!

Victor

davidtosenovjan
NXP TechSupport

Have you checked the CMPU settings for the second core? Is access perhaps restricted?

1045302770
Contributor III
I have checked it. Word 2 of the NONE_CACHE_RAM memory region descriptor is 0xF3FCF000. The exception occurs only after z4_write_var has been counting for a while, and z2_read_var updates normally until the exception occurs.

davidtosenovjan
NXP TechSupport

Does this error depend on the Z4 core running? If you stop the Z4 and let only the Z2 core run, does the exception still occur?

1045302770
Contributor III

Some additional information: when the data cache is disabled or the SMPU is disabled, no exception occurs.

davidtosenovjan
NXP TechSupport

There should be no difference between running with the cache disabled and accessing an SMPU region that has the cache-inhibited attribute. Please check the SMPU settings. By the way, the documentation states that aborted core accesses generate instruction storage (ISI) or data storage (DSI) interrupts, which would fit the described behavior.

1045302770
Contributor III

Sorry, I was on vacation for the past few days. I have checked the SMPU settings. There is no overlap between SMPU regions, and the NONE_CACHE_RAM region is configured as follows:

REG_WRITE32(SMPU_0_RGD7_W2, 0xf3fcf000);
REG_WRITE32(SMPU_0_RGD7_W3, 0x00000002);
REG_WRITE32(SMPU_0_RGD7_W4, 0x00000000);
REG_WRITE32(SMPU_0_RGD7_W5, 0x00000001);
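
To make the permission word written to SMPU_0_RGD7_W2 above easier to review, the following is a minimal sketch that extracts the per-master field from it. It rests on an assumption that must be checked against the MPC5746C reference manual: that the word holds sixteen 2-bit permission fields, with bus master 0 in the two most significant bits. The function name smpu_master_field is hypothetical.

```c
#include <stdint.h>

/* ASSUMPTION (verify in the reference manual): the permission word packs
 * sixteen 2-bit fields, master 0 in bits 31..30 (MSB side), master 15 in
 * bits 1..0. A field value of 0 means neither permission bit is set and
 * 3 means both are set. */
#define SMPU_MASTER_COUNT 16u

/* Hypothetical helper: returns the raw 2-bit permission field for the
 * given bus master index, or 0 if the index is out of range. */
unsigned int smpu_master_field(uint32_t perm_word, unsigned int master)
{
    if (master >= SMPU_MASTER_COUNT) {
        return 0u;
    }
    /* Master 0 needs a shift of 30, master 15 a shift of 0. */
    return (unsigned int)((perm_word >> (30u - 2u * master)) & 0x3u);
}
```

Under this assumed layout, 0xf3fcf000 yields a field of 3 for masters 0, 1, 3 to 6, 8, and 9, and a field of 0 for masters 2, 7, and 10 to 15. Which master indices correspond to the Z4 and Z2 cores has to be taken from the reference manual.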

I also don't know what "aborted core accesses" means. Could you please give me a specific example?

davidtosenovjan
NXP TechSupport

I mean that an access can be terminated (aborted) by the SMPU. You could inspect the SMPU status registers at the time of the error.

[Attached image: davidtosenovjan_0-1636969706809.png]

 

1045302770
Contributor III

The Error Status Register of the SMPU did not indicate any errors.

Does this problem have anything to do with the store buffer?

I just tried disabling the store buffer with the following code:

__MTSPR(976,0x00000008)

The program then ran for 30 minutes with no error. I will test for a longer time and update you with the result tomorrow.

Thanks again for your help.

1045302770
Contributor III

"If you stop the Z4 and let only the Z2 core run, does the exception still occur?"

No. I tested for several minutes with only the Z2 core running, and no exception occurred.

davidtosenovjan
NXP TechSupport

Hi, in this test case I would say a possible cause is the XBAR priority setting.

[Attached image: davidtosenovjan_0-1635502497901.png]

I would try a round-robin configuration for both cores. Let me know whether it helps.

1045302770
Contributor III

Hi, thanks for your reply. 

I have tried the round-robin configuration, but it does not help; the exception still occurs. My original XBAR configuration is as follows:

[Attached image: 1045302770_0-1635508780887.png]

 

1045302770
Contributor III

@lukaszadrapa @davidtosenovjan Sorry to bother you, but could you please give me some advice on solving this problem?
