
K70 DDR2 read failure with increasing temperature

Discussion created by spiderman on Apr 6, 2017
Latest reply on Apr 21, 2017 by spiderman

Hello everybody.

 

We are using a K70 controller (MK70FN1M0VMJ15 rev. 3N96B).


The board architecture is the same as that of the TWRK70F120 tower board supplied by NXP.

 

To reproduce the issue, we run a simple test program that, in a loop, writes each value from 0x00 to 0xff across the whole 128 MB of external RAM (Samsung K4T1G164QG-BCE7) and reads it back.

Once the board is heated to 40 °C or more, the RAM test fails. Note that once the problem appears, the faulty reads persist on subsequent retries.
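
A minimal sketch of a test of this kind is shown below for reference. It is not the actual test software: the function names and parameters are hypothetical, and on the target the base address and size would have to match the DDR2 mapping of the board.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the MQX BSP): fill the region with one
 * pattern, read it back, and return the offset of the first mismatching
 * byte, or -1 if every byte reads back correctly. */
long ram_check_pattern(volatile uint8_t *base, size_t size, uint8_t pattern)
{
    size_t i;

    for (i = 0; i < size; i++) {
        base[i] = pattern;
    }
    for (i = 0; i < size; i++) {
        if (base[i] != pattern) {
            return (long)i;
        }
    }
    return -1;
}

/* Run the check for every pattern from 0x00 to 0xFF and return the number
 * of patterns for which at least one byte was read back incorrectly. */
unsigned ram_test_all_patterns(volatile uint8_t *base, size_t size)
{
    unsigned failures = 0;
    unsigned p;

    for (p = 0; p <= 0xFFu; p++) {
        if (ram_check_pattern(base, size, (uint8_t)p) >= 0) {
            failures++;
        }
    }
    return failures;
}
```

A uniform fill like this mainly exercises the data path. It does not catch addressing faults, since every location holds the same value at any given time.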

 

Our software runs on the MQX 4.1.0 operating system, and the DDR controller is initialized by _bsp_ddr2_setup, provided in the init_hw.c module of the BSP library, used as is, without changes:

 

void _bsp_ddr2_setup (void)
{
SIM_MemMapPtr sim = SIM_BASE_PTR;
DDR_MemMapPtr ddr = DDR_BASE_PTR;
MCM_MemMapPtr mcm = MCM_BASE_PTR;

/* Enable DDR controller clock */
sim->SCGC3 |= SIM_SCGC3_DDR_MASK;

/* Enable DDR pads and set slew rate */
sim->MCR |= 0xC4; /* bits were left out of the manual so there isn't a macro right now */

ddr->RCR |= DDR_RCR_RST_MASK;

* (volatile uint32_t *)(0x400Ae1ac) = 0x01030203;

/* TC's init */
ddr->CR00 = 0x00000400;
ddr->CR02 = 0x02000031;
ddr->CR03 = 0x02020506;
ddr->CR04 = 0x06090202;
ddr->CR05 = 0x02020302;
ddr->CR06 = 0x02904002;
ddr->CR07 = 0x01000303;
ddr->CR08 = 0x05030201;
ddr->CR09 = 0x020000c8;
ddr->CR10 = 0x03003207;
ddr->CR11 = 0x01000000;
ddr->CR12 = 0x04920031;
ddr->CR13 = 0x00000005;
ddr->CR14 = 0x00C80002;
ddr->CR15 = 0x00000032;
ddr->CR16 = 0x00000001;
ddr->CR20 = 0x00030300;
ddr->CR21 = 0x00040232;
ddr->CR22 = 0x00000000;
ddr->CR23 = 0x00040302;
ddr->CR25 = 0x0A010201;
ddr->CR26 = 0x0101FFFF;
ddr->CR27 = 0x01010101;
ddr->CR28 = 0x00000003;
ddr->CR29 = 0x00000000;
ddr->CR30 = 0x00000001;
ddr->CR34 = 0x02020101;
ddr->CR36 = 0x01010201;
ddr->CR37 = 0x00000200;
ddr->CR38 = 0x00200000;
ddr->CR39 = 0x01010020;
ddr->CR40 = 0x00002000;
ddr->CR41 = 0x01010020;
ddr->CR42 = 0x00002000;
ddr->CR43 = 0x01010020;
ddr->CR44 = 0x00000000;
ddr->CR45 = 0x03030303;
ddr->CR46 = 0x02006401;
ddr->CR47 = 0x01020202;
ddr->CR48 = 0x01010064;
ddr->CR49 = 0x00020101;
ddr->CR50 = 0x00000064;
ddr->CR52 = 0x02000602;
ddr->CR53 = 0x03c80000;
ddr->CR54 = 0x03c803c8;
ddr->CR55 = 0x03c803c8;
ddr->CR56 = 0x020303c8;
ddr->CR57 = 0x01010002;

_ASM_NOP();

ddr->CR00 |= 0x00000001;

while ((ddr->CR30 & 0x400) != 0x400) {
}

mcm->CR |= MCM_CR_DDRSIZE(1);
}

 

Not all boards show the problem; roughly 60 % of them do.

 

We suspect that the issue may be related to the K70 DDR controller.

 

We tried to apply the advice in this document (erratum ID e10521), and also tried various RCR values other than those read by the procedure explained in e10521, but without success.

 

Update 2017-04-07:

We tried a different DDR2 setup, using the output produced by Freescale's K70memctrl tool (we found it here) with the following command:

 

K70memctrl c MT47H64M16.mem ddr2setup.c

 

and copying the contents of the generated ddr2setup.c into our initialization function:

 

void _bsp_ddr2_setup_modified (void)
{
SIM_MemMapPtr sim = SIM_BASE_PTR;
DDR_MemMapPtr ddr = DDR_BASE_PTR;
MCM_MemMapPtr mcm = MCM_BASE_PTR;

/* Enable DDR controller clock */
sim->SCGC3 |= SIM_SCGC3_DDR_MASK;

/* Enable DDR pads and set slew rate */
sim->MCR |= 0xC4; /* bits were left out of the manual so there isn't a macro right now */

ddr->RCR |= DDR_RCR_RST_MASK;

* (volatile uint32_t *)(0x400Ae1ac) = 0x01030203;

/* TC's init */
ddr->CR00 = 0x00000400;
ddr->CR02 = 0x02007530;
ddr->CR03 = 0x02020707;
ddr->CR04 = 0x07090202;
ddr->CR05 = 0x02020302;
ddr->CR06 = 0x00290402;
ddr->CR07 = 0x01010303;
ddr->CR08 = 0x06030301;
ddr->CR09 = 0x020000c8;
ddr->CR10 = 0x02000808;
ddr->CR11 = 0x01000000;
ddr->CR12 = 0x048a001e;
ddr->CR13 = 0x00000005;
ddr->CR14 = 0x00c70002;
ddr->CR15 = 0x00000015;
ddr->CR16 = 0x00000001;
ddr->CR20 = 0x00030300;
ddr->CR21 = 0x24040232;
// ddr->CR22 = 0x00000000;
// ddr->CR23 = 0x00040302;
ddr->CR25 = 0x0A010201;
ddr->CR26 = 0x0101FFFF;
ddr->CR27 = 0x00010101;
ddr->CR28 = 0x00000001;
// ddr->CR29 = 0x00000000;
ddr->CR30 = 0x00000001;
ddr->CR34 = 0x00000101;
// ddr->CR36 = 0x01010201;
ddr->CR37 = 0x00000200;
ddr->CR38 = 0x00200000;
ddr->CR39 = 0x00000020;
ddr->CR40 = 0x00002000;
ddr->CR41 = 0x01010020;
ddr->CR42 = 0x00002000;
ddr->CR43 = 0x02020020;
// ddr->CR44 = 0x00000000;
ddr->CR45 = 0x00070b0f;
ddr->CR46 = 0x0f004000;
ddr->CR47 = 0x0100070b;
ddr->CR48 = 0x0b0f0040;
ddr->CR49 = 0x00020007;
ddr->CR50 = 0x00000040;
ddr->CR52 = 0x02000602;
// ddr->CR53 = 0x03c80000;
// ddr->CR54 = 0x03c803c8;
// ddr->CR55 = 0x03c803c8;
ddr->CR56 = 0x02030000;
ddr->CR57 = 0x01000000;

_ASM_NOP();

ddr->CR00 |= 0x00000001;

while ((ddr->CR30 & 0x400) != 0x400) {
}

mcm->CR |= MCM_CR_DDRSIZE(1);
}

 

With this setup, the problem on faulty boards occurs much less often within a series of tests, but it is still present.
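
To see whether the remaining failures cluster on particular data lines, the read-back comparison could be extended to count errors per bit. The sketch below is only an illustration under assumptions: the 16-bit bus width, the structure, and the function name are hypothetical and not part of the BSP or of the test software described above.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_BITS 16u /* assumption: 16-bit wide DDR2 data bus */

/* Hypothetical statistics record accumulated over one or more test runs. */
typedef struct {
    uint32_t words_checked;
    uint32_t words_failed;
    uint32_t bit_errors[DATA_BITS]; /* error count per data bit */
} ddr_error_stats_t;

/* Compare n_words words against the expected value and record, for each
 * mismatching word, which bits differ (XOR of read and expected value). */
void ddr_stats_check(ddr_error_stats_t *s, const volatile uint16_t *mem,
                     size_t n_words, uint16_t expected)
{
    size_t i;
    unsigned bit;

    for (i = 0; i < n_words; i++) {
        uint16_t diff = (uint16_t)(mem[i] ^ expected);

        s->words_checked++;
        if (diff != 0) {
            s->words_failed++;
            for (bit = 0; bit < DATA_BITS; bit++) {
                if (diff & (1u << bit)) {
                    s->bit_errors[bit]++;
                }
            }
        }
    }
}
```

If the counts turn out to be concentrated on one byte lane rather than spread evenly, that might help narrow the search, although it would not by itself identify the cause.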

 

Any hints?
Thanks in advance.
