We are running a 2.6.29 kernel (CFLINUX_20100901) on an MCF5474-based design. When we execute a reboot command from Linux, this eventually calls machine_restart(), which calls mach_reset() (= coldfire_reboot()). That function disables interrupts, enables a short watchdog timeout and then sits in a forever loop until the watchdog resets the chip. What we have found is that the reset by the watchdog intermittently fails, maybe 1 time in 10.
We have a BDM with GDB attached and a logic analyzer on the FlexBus. When the CPU fails to start, it has hung at address zero. The chip select registers appear to have been correctly reset (only CS0 enabled, with max wait states). If we try to display the beginning of the NOR flash, where the reset vector resides, using GDB, we read back all zeroes, yet the logic analyzer shows /CS0 being asserted and the NOR flash driving the reset vector and initial stack pointer onto the FlexBus.
Any ideas what the problem may be?
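For anyone following along, the reboot sequence described above can be modelled on a host PC roughly as follows. This is only a sketch: `board_model`, `model_tick` and `model_reboot` are hypothetical names invented for illustration, the fields are not real MCF5474 registers, and this is not the kernel's coldfire_reboot() code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model; these fields are NOT real MCF5474 registers. */
struct board_model {
    bool irqs_enabled;        /* CPU interrupt state */
    bool watchdog_armed;      /* watchdog enabled with a short timeout */
    uint32_t watchdog_ticks;  /* ticks remaining before the watchdog fires */
    bool reset_asserted;      /* set when the modelled watchdog resets the chip */
};

/* One tick of simulated time: count the watchdog down and fire at zero. */
static void model_tick(struct board_model *b)
{
    if (b->watchdog_armed && b->watchdog_ticks > 0 && --b->watchdog_ticks == 0)
        b->reset_asserted = true;
}

/*
 * Follows the order described in the post: mask interrupts, arm a short
 * watchdog, then spin. Returns true if the modelled reset fired within
 * max_spin ticks; false means the CPU would still be stuck in the loop.
 */
static bool model_reboot(struct board_model *b, uint32_t timeout_ticks,
                         uint32_t max_spin)
{
    b->irqs_enabled = false;
    b->watchdog_armed = true;
    b->watchdog_ticks = timeout_ticks;
    b->reset_asserted = false;

    /* On real hardware this is the bare "forever" loop. */
    for (uint32_t i = 0; i < max_spin && !b->reset_asserted; i++)
        model_tick(b);

    return b->reset_asserted;
}
```

Calling `model_reboot(&b, 5, 100)` returns true because the modelled watchdog fires after five ticks. The failure we see on the board happens after this point: the watchdog does fire, but the CPU does not come back out of reset correctly.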
It might be a similar one to this:
These chips all have "force a few address or data pins to specific levels during reset" to set a bunch of startup options. With the MCF5301X it seems there is at least one pin that must be driven on reset, but none of the documentation lists it. It is unlikely the same would be happening with your chip, but it is something to check.
More likely is that you're not driving all of AD[12:8] properly on reset, or that some other chip on the bus is intermittently jamming them during reset.
Another suggestion on reset: make sure you've disabled the cache, in case the chip is being reset during a cache flush operation.
Thanks for your two answers to my question; they are much appreciated. We will certainly check out all of your suggestions and post our findings back here.
It was brought to our attention that the M5474LITE development board is configured to run with a 66MHz CLKIN and 1:2 bus ratio which results in 66MHz PCI/FlexBus, 133MHz SDRAM and 266MHz Core frequencies. Our own MCF5474-based design was configured to run with 33MHz CLKIN and 1:4 bus ratio which results in 33MHz PCI/FlexBus, 133MHz SDRAM and 266MHz Core frequencies.
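The two clock configurations can be compared with a few lines of C. This is a minimal sketch that assumes the relationships implied by the figures above: FlexBus/PCI runs at CLKIN, the SDRAM bus at CLKIN times the bus ratio, and the core at twice the SDRAM bus. The names `clocks_mhz` and `derive_clocks` are made up for this example, and the nominal 33.33/66.67 MHz inputs are an assumption (the post rounds them to 33 and 66).

```c
#include <stdio.h>

/* Derived clock frequencies in MHz. */
struct clocks_mhz {
    double flexbus;  /* PCI/FlexBus clock, equal to CLKIN */
    double sdram;    /* SDRAM bus clock */
    double core;     /* CPU core clock */
};

/*
 * Assumption taken from the numbers quoted in the post:
 * SDRAM = CLKIN * bus_ratio, core = 2 * SDRAM.
 */
static struct clocks_mhz derive_clocks(double clkin_mhz, int bus_ratio)
{
    struct clocks_mhz c;
    c.flexbus = clkin_mhz;
    c.sdram = clkin_mhz * bus_ratio;
    c.core = c.sdram * 2.0;
    return c;
}

/* Print one configuration on a single line. */
static void print_clocks(const char *label, double clkin_mhz, int bus_ratio)
{
    struct clocks_mhz c = derive_clocks(clkin_mhz, bus_ratio);
    printf("%s: FlexBus %.1f MHz, SDRAM %.1f MHz, core %.1f MHz\n",
           label, c.flexbus, c.sdram, c.core);
}
```

Calling `print_clocks("M5474LITE", 200.0 / 3.0, 2)` and `print_clocks("our board", 100.0 / 3.0, 4)` shows that, under these assumptions, only the FlexBus/PCI clock differs between the two setups; the SDRAM and core clocks come out the same.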
After modifying our design to run 66MHz CLKIN with a 1:2 bus ratio, we retested with an auto reset on U-Boot (bootdelay 10, bootcmd reset) and successfully restarted more than 27000 times. Previously it hung roughly 1 in 10 restart attempts and I had never seen more than 28 restarts without it hanging.
Erratum SECF077 documents a 1:4 bus ratio fault that can result in a FlexBus hang, but does not mention anything to do with a watchdog reset hang. The workaround given in SECF077 did not affect the reliability of the watchdog reset.
Thanks TomE for your suggestions, but we didn't have any success with selectively shutting down components prior to initiating the watchdog reset with 33MHz CLKIN.
Check the Errata. There's nothing obvious in there, but make sure you don't have an old chip with SECF060.
Are you generating a "very short reset", like one of the other posters had, that might not be resetting your flash chips properly? Is it possible to switch the CPU to a slower clock before triggering the watchdog reset? You might be able to get a longer reset that way.
It is always worth deliberately shutting everything down prior to a reset if you can: reset all controllers, stop DMA and interrupts, disable the DRAM, reset the GPIO pins, slow down the CPU and so on. If you do that and the problem goes away, you can find out which change did it and zero in on the original problem. Do you have pullups/pulldowns on any sensitive pins that tri-state during reset?
Do the FlexBus and the DDR SDRAM controller share any data pins, like the Series 2 and 3 ones do (it looks like they don't)? Do you have any other peripherals on your data bus? A lot of these chips have problems when they are reset during a DRAM cycle, as the DRAM can then lock up and jam the buses. The same can happen with some peripherals if they are active when their bus cycle is aborted by the reset. Refer to the descriptions here: