Hi all,
Whilst torture-testing a design that is largely based on the i.MX28 EVK, we are encountering intermittent soft reboot/reset failures. The behavior is the same as that observed earlier during the design process when we discovered that we were exceeding the 100mA inrush current limitation imposed by the power supply subsystem (i.MX28 Applications Processor Reference Manual, Rev.2 p. 947). This prevents the startup sequence from completing, and leaves the board in a totally locked up state.
The occurrence is very rare, but because of the nature of the product, this type of behavior is simply not something we can ship with.
We believe we have traced the root cause to the DDR2 memory, which is powered from the VDDA rail. We are using a 256MB part: Micron Technology, Inc. - MT47H128M16RT-25E.
Using a current probe, upon reset we see the inrush hitting a 100mA ceiling. Driving the DDR2 from a separate rail lowers that current to around 60mA. This isn't a suitable solution because it prevents us from being able to operate under battery power.
We are seeking a recommendation on how to resolve this. For example, any suggestions for alternative 256MB DDR2 parts that operate within the 100mA constraints. We are approaching the end of the development cycle, so we really have to implement the least invasive fix possible.
Thanks in advance,
Tim
Hi Tim, thank you for your post. I believe we may have a similar issue, found during reboot testing of our i.MX28 design. I was running an overnight reboot test to check a fix for the Windows CE Ethernet driver, which would occasionally fail to reconnect after reboot (it turns out the PHY reset conditions were not obeyed). Anyway, a couple of times in the hundreds of reboots, our board failed to restart.
I think this is similar because we do not even see any of the early debug output. I have yet to check the PMU supply rails in this state. It seems to make sense that the reset may occur while the SDRAM is active and drawing a lot of current. Your reboot seems a good workaround to prevent this.
Did you try the alternative reboot that includes the PMU? I believe there are two reset bits: one is a processor reset and the other also includes the PMU. This was the next thing I was going to try.
As for your application, although this seems rare, we need to be sure that if our systems are remotely rebooted they do come back online.
mark
Hi Mark,
This is an excerpt from our modified mach-mxs.c, based on the 4.4 kernel source. It sounds like you're using Windows CE, so I'm unable to offer any porting advice.
Will provide a formal patch soon.
Note that rtc_addr (0x80056000) and dram_addr (0x800e0000) base addresses are obtained from our device tree:
#define MXS_CLKCTRL_RESET_CHIP (1 << 1)
#define MXS_PWRDOWN_DRAM (1 << 16)
#define MXS_DRAM_CTR16 0x40
#define STMP3XXX_RTC_CTRL 0x0
#define STMP3XXX_RTC_WATCHDOG 0x50
#define STMP3XXX_RTC_PERSISTENT1 0x70
#define STMP3XXX_RTC_CTRL_WATCHDOGEN 0x00000010
#define STMP3XXX_RTC_PERSISTENT1_FORCE_UPDATER 0x80000000

static void mxs_restart(enum reboot_mode mode, const char *cmd)
{
	//set HW_RTC_WATCHDOG to 2 seconds:
	if (rtc_addr) {
		writel(2000, rtc_addr + STMP3XXX_RTC_WATCHDOG);
		__mxs_setl(STMP3XXX_RTC_CTRL_WATCHDOGEN, rtc_addr + STMP3XXX_RTC_CTRL);
		__mxs_setl(STMP3XXX_RTC_PERSISTENT1_FORCE_UPDATER, rtc_addr + STMP3XXX_RTC_PERSISTENT1);
	}

	if (dram_addr) {
		pr_err("Powering down DRAM...\n");
		mdelay(50);
		__raw_writel(MXS_PWRDOWN_DRAM, dram_addr + MXS_DRAM_CTR16);
		mdelay(50);
		//only reached if the power-down did not take effect
		pr_err("Failed to power down the DRAM\n");
		mdelay(50);
	} else {
		pr_err("dram_addr not mapped, bypassing DRAM power-down\n");
		mdelay(50);
	}

	//falls through to the original reboot mechanism here
Hi Mark, does your design also have 256 MB RAM?
Yes it does
Actually, further to my comment above, HW_CLKCTRL_ENET_RESET_BY_SW_CHIP won't help, as it seems the current limit is enabled by default during startup. I will try your suggestion.
Just want to follow up - after discussing with NXP FAEs, we established that during the reboot, the RAM can occasionally be left in a state where it's drawing over 100mA.
This has been resolved by editing mach-mxs.c's machine_restart function. Instead of asserting the chip reset, we configure the watchdog for a two second timeout, then power down the DRAM by setting the POWER_DOWN bit in HW_DRAM_CTL16.
Thanks,
Tim
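For readers who want to see the order of register writes without kernel context, the sequence Tim describes can be sketched on the host against a mock register block. This is only an illustration under stated assumptions: the offsets and bit values are copied from the excerpt earlier in the thread, while struct mock_block, mock_writel, mock_setl and sketch_restart are hypothetical names invented here. On real hardware __mxs_setl writes the mask to the register's SET alias; the mock simply ORs the bits in.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and bit values copied from the excerpt earlier in the thread. */
#define STMP3XXX_RTC_CTRL                      0x0
#define STMP3XXX_RTC_WATCHDOG                  0x50
#define STMP3XXX_RTC_PERSISTENT1               0x70
#define STMP3XXX_RTC_CTRL_WATCHDOGEN           0x00000010u
#define STMP3XXX_RTC_PERSISTENT1_FORCE_UPDATER 0x80000000u
#define MXS_DRAM_CTR16                         0x40
#define MXS_PWRDOWN_DRAM                       (1u << 16)

#define MOCK_WORDS 64

/* Hypothetical stand-in for a memory-mapped register block (256 bytes). */
struct mock_block {
    uint32_t regs[MOCK_WORDS];
};

/* Plain 32-bit store at a byte offset, like writel() on a mapped block. */
static void mock_writel(struct mock_block *b, uint32_t val, uint32_t off)
{
    assert(off % 4 == 0 && off / 4 < MOCK_WORDS);
    b->regs[off / 4] = val;
}

/* Set bits without disturbing the others (mock of the SET-alias write). */
static void mock_setl(struct mock_block *b, uint32_t mask, uint32_t off)
{
    assert(off % 4 == 0 && off / 4 < MOCK_WORDS);
    b->regs[off / 4] |= mask;
}

/* Arm the watchdog first, then request DRAM power-down. */
static void sketch_restart(struct mock_block *rtc, struct mock_block *dram)
{
    /* 2000 is the value the excerpt uses for a two second timeout. */
    mock_writel(rtc, 2000, STMP3XXX_RTC_WATCHDOG);
    mock_setl(rtc, STMP3XXX_RTC_CTRL_WATCHDOGEN, STMP3XXX_RTC_CTRL);
    mock_setl(rtc, STMP3XXX_RTC_PERSISTENT1_FORCE_UPDATER,
              STMP3XXX_RTC_PERSISTENT1);

    /* With the watchdog already armed, power the DRAM down. */
    mock_writel(dram, MXS_PWRDOWN_DRAM, MXS_DRAM_CTR16);
}
```

The ordering is the point of the sketch: the watchdog has to be armed before the DRAM is touched, because the watchdog is what brings the chip back once the DRAM has been powered down.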
Hi Tim,
I'm interested in this issue and need more information. How often does it occur? Do you use a mainline kernel? Is there a specific test scenario?
Hi Stefan - sorry for the delay. It happens once every few hundred reboots, and probably depends on the RAM configuration you're using. We use a 4.4.0 mainline kernel.
I have a patch I want to submit, but I have no idea who to send it to.
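To put "once every few hundred reboots" into test-planning terms, here is a rough sketch of how many reboot cycles a torture test needs before a clean run says anything about a fix. It assumes each reboot fails independently with the same probability, which is a simplification; the function name reboots_needed and the example rate of 1 in 300 are assumptions made for illustration, not figures from the thread.

```c
#include <assert.h>

/*
 * Smallest number of reboots n for which the chance of seeing zero
 * failures, (1 - p_fail)^n, drops below miss_prob.
 * Assumes independent reboots with a constant failure probability.
 */
static unsigned reboots_needed(double p_fail, double miss_prob)
{
    assert(p_fail > 0.0 && p_fail < 1.0);
    assert(miss_prob > 0.0 && miss_prob < 1.0);

    double survive = 1.0;   /* probability of no failure so far */
    unsigned n = 0;

    while (survive >= miss_prob) {
        survive *= 1.0 - p_fail;
        n++;
    }
    return n;
}
```

For example, reboots_needed(1.0 / 300.0, 0.05) comes out at roughly 900 cycles, so an overnight run of a couple of hundred reboots with no failure is weak evidence that the problem is gone.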
Hi Tim,
Please run ./scripts/get_maintainer.pl -f arch/arm/mach-mxs/mach-mxs.c to know the people and mailing lists you should send this patch to.
Thanks
Most likely, the issue is caused by an inappropriate design of the power supply section. Is your system powered from a 5V-only source, from a battery-only source, or from a mix of the two? Do you use linear regulators or a DCDC converter as the main power supply when the system is up and running? Also, please provide your system's schematic so that I can check it.
Best Regards,
Artur
Thanks for the reply Artur, my follow-up is below.