Machine Check Exception occurs with data cache enabled on MPC5746C multi-core project


2,633 Views
1045302770
Contributor III

Dear NXP team,

I am developing a multi-core project on the MPC5746C. The instruction cache and data cache of the Z4 core are enabled at startup. To avoid cache coherency problems, the SMPU is enabled and a region in system RAM is configured as cache-inhibited for inter-core communication. A machine check exception occurs with MAV/LD/BUS_RDERR set in MCSR. The example code for the Z4 core and the Z2 core is as follows:

 

/************************************
 * Z4 core Code
 ***********************************/
#pragma ghs section data = ".NONE_CACHE_RAM"   /*NONE_CACHE_RAM refers the cache inhibited region of SMPU*/
unsigned int z4_write_var = 1;
#pragma ghs section data = default

void test_code_z4(void)
{
    while(1)
    {
        z4_write_var++;
    }
}

/************************************
 * Z2 core Code
 ***********************************/
extern unsigned int z4_write_var;
unsigned int z2_read_var;

void test_code_z2(void)
{
    while(1)
    {
        z2_read_var = z4_write_var;
    }
}

 

When the program runs, a machine check exception occurs: MAV/LD/BUS_RDERR are set in MCSR, and MCAR holds the address of z4_write_var. I have checked the .map file and I am sure that z4_write_var is located within the cache-inhibited region ".NONE_CACHE_RAM". I have also checked the MEMU module, and nothing there indicates an ECC or EDC error.
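As a complement to the .map file check, the following is a minimal sketch of a runtime check that an object lies entirely inside the cache-inhibited window. The function name addr_in_region is hypothetical and not taken from the project. The region bounds are assumed to be an inclusive start and end address, which would have to come from the project's own linker file and SMPU configuration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns true when the object [addr, addr + size) lies entirely inside
 * the inclusive address range [start, end]. Illustrative helper only. */
static bool addr_in_region(uint32_t addr, uint32_t size,
                           uint32_t start, uint32_t end)
{
    if (size == 0u || addr < start || addr > end) {
        return false;
    }
    /* Compare against the remaining space to avoid 32-bit overflow. */
    return (size - 1u) <= (end - addr);
}
```

On a 32-bit target, a call such as addr_in_region((uint32_t)&z4_write_var, sizeof z4_write_var, start, end) could be asserted once during startup, with start and end supplied by the project.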

 

I would like to know the reason for the exception and how to deal with it.

 

Looking forward to your reply!

 

Best Regards!

Victor

12 Replies

2,568 Views
davidtosenovjan
NXP TechSupport

Have you checked the CMPU settings for the second core? Could access be restricted there?


2,558 Views
1045302770
Contributor III
I have checked it: word 2 of the NONE_CACHE_RAM memory region is 0xF3FCF000. The exception occurs only after z4_write_var has been counting for a while, and z2_read_var updates normally until the exception occurs.
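To make the value 0xF3FCF000 easier to read, the following minimal sketch splits a word 2 value into sixteen 2-bit fields, most significant field first. It assumes that word 2 uses a per-master permission format with one 2-bit field per bus master. That layout, the meaning of each field value, and the mapping from master number to core are assumptions to verify against the SMPU chapter of the reference manual. The function name smpu_w2_field is hypothetical.

```c
#include <stdint.h>

/* Extracts the 2-bit field with the given index (0..15) from a word 2
 * value, ASSUMING field 0 occupies the two most significant bits.
 * Returns 0 for an out-of-range index. */
static unsigned smpu_w2_field(uint32_t word2, unsigned index)
{
    if (index > 15u) {
        return 0u;
    }
    return (unsigned)((word2 >> (30u - 2u * index)) & 0x3u);
}
```

Under this assumed layout, 0xF3FCF000 gives the value 0b11 for fields 0, 1, 3, 4, 5, 6, 8 and 9, and 0b00 for all other fields.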

2,547 Views
davidtosenovjan
NXP TechSupport

Does this error depend on the Z4 core running? If you stop the Z4 core and let only the Z2 core run, does the exception still occur?


2,537 Views
1045302770
Contributor III

Some additional information: when the data cache is disabled or the SMPU is disabled, no exception occurs.


2,523 Views
davidtosenovjan
NXP TechSupport

There should not be any difference between a disabled cache and an access to an SMPU region with the cache-inhibited attribute. Please check the SMPU settings. By the way, it is stated that aborted core accesses generate instruction storage (ISI) or data storage (DSI) interrupts, which would fit the described behavior.


2,456 Views
1045302770
Contributor III

Sorry, I have been on vacation for the past few days. I have checked the SMPU settings: there is no overlap between SMPU regions, and the settings for NONE_CACHE_RAM are as follows:

REG_WRITE32(SMPU_0_RGD7_W2, 0xf3fcf000);
REG_WRITE32(SMPU_0_RGD7_W3, 0x00000002);
REG_WRITE32(SMPU_0_RGD7_W4, 0x00000000);
REG_WRITE32(SMPU_0_RGD7_W5, 0x00000001);
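The "no overlap between regions" check can also be automated. The following minimal sketch compares region descriptors pairwise, assuming each region is described by an inclusive start and end address. The struct and function names are hypothetical, and the addresses passed in would come from the project's own SMPU table.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    uint32_t start; /* first address of the region (inclusive) */
    uint32_t end;   /* last address of the region (inclusive)  */
} mem_region_t;

/* Two inclusive ranges overlap when each one starts no later than
 * the other one ends. */
static bool regions_overlap(const mem_region_t *a, const mem_region_t *b)
{
    return (a->start <= b->end) && (b->start <= a->end);
}

/* Returns true if any pair of regions in the table overlaps. */
static bool any_overlap(const mem_region_t *regions, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        for (size_t j = i + 1; j < count; j++) {
            if (regions_overlap(&regions[i], &regions[j])) {
                return true;
            }
        }
    }
    return false;
}
```

Running any_overlap over the full region table at startup would catch an accidental overlap introduced by a later linker file or configuration change.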

Also, I am not sure what "aborted core accesses" means. Could you please give a specific example?


2,443 Views
davidtosenovjan
NXP TechSupport

I mean that an access can be terminated (aborted) by the SMPU. You could inspect the SMPU status registers at the time of the error.

davidtosenovjan_0-1636969706809.png

 


2,372 Views
1045302770
Contributor III

The SMPU Error Status Register did not indicate any errors.

Could this problem have anything to do with the store buffer?

I just tried disabling the store buffer with the following code:

__MTSPR(976,0x00000008)

The program then ran for 30 minutes without an error. I will test for longer and update you with the result tomorrow.

Thanks again for your help.


2,541 Views
1045302770
Contributor III

If you stop the Z4 core and let only the Z2 core run, does the exception still occur?

No. I tested this for several minutes and there was no exception.


2,606 Views
davidtosenovjan
NXP TechSupport

Hi, in this test case I would say a possible cause is the XBAR priority setting.

davidtosenovjan_0-1635502497901.png

I would try using a round-robin configuration for both cores. Let me know whether it helps.


2,600 Views
1045302770
Contributor III

Hi, thanks for your reply. 

I have tried the round-robin configuration, but it does not help; the exception still occurs. My original XBAR configuration is as follows:

1045302770_0-1635508780887.png

 


2,613 Views
1045302770
Contributor III

@lukaszadrapa @davidtosenovjan Sorry to bother you, but could you please give me some advice on solving this problem?
