Hello BP,
Improving the performance of the equipment when it is subjected to this type of abuse is likely to involve a review of three different areas:
- The mechanical design of the equipment, including packaging and materials.
- The electrical design of the hardware, including transient suppression measures.
- The firmware design.
I guess the first step is to determine the manner in which the transient voltages are reaching the MCU and other active devices, whether by conduction or induction, or perhaps a combination of both. I presume that you can easily replicate the problems in a laboratory environment.
I assume the connections to the utility lines are already well protected against induced voltage surges due to lightning activity. The solution may involve additional shielding of internal components, or further transient suppression measures in the path to the various I/O of the MCU. Obviously, these sorts of issues cannot be readily addressed within the forum.
However, there are some measures that should be taken within the firmware design, if you do not have them implemented already.
I/O and peripheral registers:
Most examples of code that I see in these forums initialise all the registers that operate at other than their POR default, then enter the main loop and forget about any registers that are not updated during the normal operation of the code. It is assumed that, once the registers are set to the required configuration, they will remain so indefinitely. This cannot be guaranteed, as you have discovered.
To improve reliability, you must assume that any register may spuriously change, and consider in detail which registers could have a detrimental effect on the operation of the equipment if they were to alter. These registers should be re-initialised frequently, probably each time the main loop executes. As a minimum, I would always re-initialise the DDRs, but you will likely need to give consideration to many more, including registers where you rely on the power-up default. Obviously, you can't re-initialise the registers that are dynamically changing during program operation.
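As a rough illustration of the refresh idea, here is a minimal sketch in C that can be compiled on a PC. The register names (SIM_DDRA, SIM_DDRB, SIM_TIMER_CTRL) and their values are my own placeholders, not taken from any device header; on a real target the table would point at the actual memory-mapped registers. The refresh routine also counts how many registers it found altered, which may be useful for diagnostics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins for memory-mapped registers (hypothetical names).
   On a real target these would come from the device header file. */
static volatile uint8_t SIM_DDRA, SIM_DDRB, SIM_TIMER_CTRL;

typedef struct {
    volatile uint8_t *reg;   /* register to refresh */
    uint8_t           value; /* value it must always hold */
} reg_init_t;

/* Every register whose value is fixed during normal operation,
   including ones that happen to use their power-up default. */
static const reg_init_t static_config[] = {
    { &SIM_DDRA,       0xF0u },
    { &SIM_DDRB,       0x0Fu },
    { &SIM_TIMER_CTRL, 0x00u },  /* POR default, restated explicitly */
};

#define STATIC_CONFIG_COUNT (sizeof static_config / sizeof static_config[0])

/* Rewrite every entry; return how many were found altered. */
static unsigned refresh_static_config(void)
{
    unsigned altered = 0;
    for (size_t i = 0; i < STATIC_CONFIG_COUNT; i++) {
        if (*static_config[i].reg != static_config[i].value) {
            altered++;
        }
        *static_config[i].reg = static_config[i].value;
    }
    return altered;
}

/* Host-side demonstration: corrupt one register, then refresh. */
static void demo_refresh(void)
{
    (void)refresh_static_config();   /* initial configuration */
    SIM_DDRB = 0xFFu;                /* simulate a transient-induced change */
    unsigned altered = refresh_static_config();
    assert(altered == 1u);
    assert(SIM_DDRB == 0x0Fu);
    (void)altered;
}
```

In the firmware, refresh_static_config() would simply be called at the top of the main loop. Registers that change dynamically during operation are deliberately left out of the table.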
Interrupts:
You must assume that any unused interrupt may spuriously occur. There should be an interrupt handler for every interrupt source. For the unused interrupts, the minimal response would be to clear the interrupt flag, and to disable any further interrupts from that source.
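The minimal response for an unused source might look something like the following sketch, again written so that it compiles on a PC. The status and control registers, and the bit positions of the flag and enable bits, are hypothetical. On real hardware the handler would be installed in the vector table, and the flag-clearing sequence is device specific (some flags need a read followed by a write, others a dedicated acknowledge bit), so check the data sheet for each peripheral.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins for one peripheral's registers.
   Names and bit positions are hypothetical. */
static volatile uint8_t SIM_STATUS;    /* bit 7: interrupt flag   */
static volatile uint8_t SIM_CONTROL;   /* bit 6: interrupt enable */
#define SIM_FLAG_MASK    0x80u
#define SIM_ENABLE_MASK  0x40u

/* Saturating count of unexpected interrupts, for diagnostics only. */
static volatile uint16_t spurious_irq_count;

/* Handler for a source that the application never enables. */
static void unused_source_isr(void)
{
    SIM_STATUS  &= (uint8_t)~SIM_FLAG_MASK;    /* clear the flag */
    SIM_CONTROL &= (uint8_t)~SIM_ENABLE_MASK;  /* block further interrupts */
    if (spurious_irq_count < UINT16_MAX) {
        spurious_irq_count++;
    }
}

/* Host-side demonstration: a transient sets both flag and enable bits. */
static void demo_spurious_interrupt(void)
{
    SIM_STATUS  |= SIM_FLAG_MASK;
    SIM_CONTROL |= SIM_ENABLE_MASK;
    unused_source_isr();
    assert((SIM_STATUS  & SIM_FLAG_MASK)   == 0u);
    assert((SIM_CONTROL & SIM_ENABLE_MASK) == 0u);
}
```

The counter is optional, but a non-zero value in the field would be a useful hint that transients are reaching the MCU.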
RAM:
If you are accumulating data in RAM registers over an extended period, as I suspect would be the case, you would need to allow for the data being lost or corrupted. I assume you would already provide frequent dumps to NV memory. To detect when corruption has occurred, you might also need to allow for duplicated RAM registers, and provide sanity checks against the last value stored in NV memory.
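One possible way to arrange the duplicated registers and the sanity check is sketched below. This is only an illustration under my own assumptions: the second copy is stored bit-inverted (a variation on a plain duplicate, so that a block of RAM cleared to zero also fails the check), the NV memory is represented by an ordinary variable, and the plausibility limit MAX_GAIN_BETWEEN_SAVES is an invented figure. When a check fails, the accumulator restarts from the last NV value, so anything gathered since the last dump is lost.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GAIN_BETWEEN_SAVES 1000u   /* hypothetical plausibility limit */

typedef struct {
    uint32_t value;
    uint32_t inverted;   /* always ~value while the data is intact */
} guarded_u32;

static guarded_u32 accumulator;    /* working copy in RAM */
static uint32_t    nv_last_saved;  /* stand-in for the value held in NV memory */

static void guarded_write(guarded_u32 *g, uint32_t v)
{
    g->value    = v;
    g->inverted = ~v;
}

/* Non-zero if both copies agree and the value is credible
   relative to the last value stored in NV memory. */
static int guarded_is_plausible(const guarded_u32 *g, uint32_t nv)
{
    if (g->value != (uint32_t)~g->inverted) return 0;        /* copies disagree */
    if (g->value < nv) return 0;                             /* went backwards  */
    if (g->value - nv > MAX_GAIN_BETWEEN_SAVES) return 0;    /* jumped too far  */
    return 1;
}

/* Add to the accumulator. Returns 1 if the RAM data was intact,
   0 if it had to be restored from the NV value first. */
static int accumulator_add(uint32_t delta)
{
    int ok = guarded_is_plausible(&accumulator, nv_last_saved);
    if (!ok) {
        guarded_write(&accumulator, nv_last_saved);
    }
    guarded_write(&accumulator, accumulator.value + delta);
    return ok;
}

/* Dump to NV only if the RAM data passes the checks. Returns 1 on success. */
static int accumulator_save(void)
{
    if (!guarded_is_plausible(&accumulator, nv_last_saved)) return 0;
    nv_last_saved = accumulator.value;
    return 1;
}

/* Host-side demonstration of power-up recovery and a flipped bit. */
static void demo_guarded_accumulator(void)
{
    accumulator.value = 0u;            /* simulate uninitialised RAM */
    accumulator.inverted = 0u;
    nv_last_saved = 500u;              /* pretend NV memory holds 500 */
    int ok = accumulator_add(10u);     /* invalid RAM: restored from NV */
    assert(ok == 0 && accumulator.value == 510u);
    accumulator.value ^= 0x00010000u;  /* simulate a corrupted bit */
    ok = accumulator_add(1u);          /* mismatch: restart from 500 */
    assert(ok == 0 && accumulator.value == 501u);
    (void)ok;
}
```

The same check doubles as the power-up test, since uninitialised RAM is very unlikely to hold a matching inverted copy.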
I apologise if I am stating the obvious.
Regards,
Mac
Message Edited by bigmac on 2008-02-27 02:01 AM