MCU (MCF52110) freezes and needs a power cycle

olivierdionne
Contributor I

Hi,

For an unknown reason, our MCU (MCF52110CAE80) freezes, and the only way to restart it is a power cycle.

When the MCU freezes, pulling the RSTI pin low does not reset it. We have a Voltage Supervisor with Watchdog Timer connected to the RSTI pin, but it cannot do its job when the MCU is in this state.

When the MCU is in this state, the RSTO pin is always low. It seems the MCU is stuck in reset and cannot start. Also, the PSTCLK/CLKOUT pin is steady (high or low) on the scope instead of oscillating at 80 MHz, so it may be that the on-chip oscillator or the PLL is not working.

Sometimes the same MCU freezes 5 times in a day and then does not freeze for a week. We do not know the cause of the problem, but it is much more common in the winter when the heating system is running, so it might be ESD.

Does anyone know a way to prevent this "stuck in reset" state?

Note: The MCU page of our schematics is attached.

TomE
Specialist II

> schematics is attached

20,400 by 13,200 PNG file. It took a while to find something that could handle that!

Why the 10k pullup on XTAL? I haven't seen that before. Is that recommended somewhere, or was it on the Evaluation Board? Edit: That pin is sampled during Reset to determine the initial clocking mode, which is:

CLKMOD0: Pull Down
CLKMOD1: Not Connected??? Assuming internal pulldown
XTAL: Pull Up

That Reset state corresponds to "PLL disabled, clock driven by on-chip oscillator". Is that what you expect? I assume you then switch over to the external oscillator as part of the startup.

Note 6 on Table 2.1 says "CLKMOD0 and CLKMOD1 have internal pull-down resistors; however, the use of external resistors is very strongly recommended."

You could restrap the board with CLKMOD0 pulled high (start on external crystal) and see if that fixes the problem.

Have you checked "SECF195: OSC: Limited input voltage range on EXTAL pin."? That pullup may violate the voltage requirement (unlikely, I wrote this before I read about the pullup required for the oscillator mode, but check the voltage on EXTAL anyway).

You may be getting the "SECF194:" problem that happens with crystals over 25MHz with your design.

> Also, the PSTCLK/CLKOUT pin is steady

Has it got an INPUT clock? Is the Crystal still running? If not, can you get it running again by "glitching" the crystal pins? You may have a bad crystal or a bad oscillator design. Except you're configured to start on the internal Oscillator, so I don't get it.

Is this a single board that might have an intermittent hardware problem, or are all of them doing it (in hotel rooms all over the place)?

Have you run a "negative resistance margin test" on your crystal and oscillator component selection? If you don't know what that is, ask Google.

You can get a CPU locked up if the JTAG signals aren't pulled up or down properly. Make sure they're not floating or that you haven't left off a needed pullup or pulldown. They look OK on the schematic, but compare every signal with the Evaluation Board setup.

> so it might be ESD.

So see if it passes a proper ESD test, or if that causes a lockup. Add all the ESD protection you can think of to a test unit and see if that fixes the problem. Add ground straps all over the place. Modem? Are these things connected to fixed phone lines? Add an ESD filter to the phone line.

Tom

olivierdionne
Contributor I

Hi Tom,

Thanks for the reply. Here is some information that may help:

> Why the 10k pullup on XTAL

I have never seen a Reference Design or an Evaluation Board with the MCF52110, so the 10k is just a standard pull-up value.

> CLKMOD1: Not Connected???

We use the MCF52110CAE80 which is the LQFP-64 version. CLKMOD1 is internal only on this version.

> That Reset state corresponds to "PLL disabled, clock driven by on-chip oscillator". Is that what you expect? I assume you then switch over to the external oscillator as part of the startup.

Yes, it is the expected Reset state. But we do not switch to an external oscillator. On the schematics, the external 8-MHz crystal is DNP, so we always use the on-chip oscillator. However, during start-up we enable the PLL to run the system clock at 80 MHz.

> You could restrap the board with CLKMOD0 pulled high (start on external crystal) and see if that fixes the problem.

I have already tried and the problem occurs even if we boot from the external crystal.

> Has it got an INPUT clock? Is the Crystal still running?

Since we use the on-chip oscillator and CLKOUT is steady, it seems the problem is that the oscillator is not working. Or it could be that the MCU is wrongly configured to use an external clock source and, since there is none, it stays stuck in reset. What do you think?

> Is this a single board that might have an intermittent hardware problem, or are all of them doing it (in hotel rooms all over the place)?

This is not a single board issue. It can be seen on many boards. The problem occurs more often in some hotels and some rooms.

> So see if it passes a proper ESD test.

An ESD test can cause the "stuck in reset" state. The board is already in a grounded metal box; maybe this is not enough. But even if ESD is the main cause of the problem, I would prefer to fix the "stuck in reset" state itself. It is not normal for an MCU to behave this way, and it worries me a lot since there could be causes other than ESD.

> Modem? Are these things connected to fixed phone lines?

The MODEM is connected to the TV coaxial network of the hotel.

TomE
Specialist II

> If I brown out the 3V3 supply to 1.8V before the supervisor can pull RSTI pin low,

> when the supply goes back to 3.3V and the supervisor releases RSTI pin,

> the MCU is locked up and it can not be reset except by a power cycle.

Congratulations on finding a repeatable test.

The MCF52110 has a minimum voltage specification of 3.0V. You're using an MCP1320T-29LE part, which is specified to trip at 2.828/2.90/2.973. So that is slightly outside of specification. You should be using a 3.1V monitor at least.

You're not measuring it tripping at 1.8V. It trips at 2.9V, but is specified to have a 650us delay from the low trip point until when it drives RESET, given a slow ramp. You're giving it an astonishingly fast ramp.

The only way to remain fully within specification is to have a large enough power supply capacitor so that it takes at least 650us to fall from the 3.1V where the monitor trips until the minimum CPU spec of 3.0V. On your oscilloscope I'm seeing a fall from 3.3V to 1.8V (by 1.5V) in about 10 microseconds. To discharge 1 farad by 1 volt in 1 second takes 1 amp, so discharging a microfarad in a microsecond by a volt also takes an amp. You've got 240uF of storage, so you must have drawn 36A to drop 1.5V in 10us! Or at least 1.5A from the 10uF output capacitor. So I don't think you were just "turning it off and on again". What exactly were you doing in that test?
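The arithmetic above can be checked with a short Python sketch. It uses I = C * dV / dt for the discharge current and rearranges the same relation to estimate a hold-up capacitance. The 100 mA load current is only a placeholder assumption (the thread does not give the board's real current draw), and the function names are made up for illustration.

```python
def discharge_current(capacitance_f, delta_v, delta_t_s):
    """Average current (A) needed to change a capacitor's voltage by delta_v in delta_t_s."""
    return capacitance_f * delta_v / delta_t_s


def min_holdup_capacitance(load_current_a, allowed_drop_v, holdup_time_s):
    """Smallest capacitance (F) that keeps the droop within allowed_drop_v for holdup_time_s."""
    return load_current_a * holdup_time_s / allowed_drop_v


# Figures from the post: the rail fell by 1.5 V in about 10 us.
total_storage = discharge_current(240e-6, 1.5, 10e-6)   # all 240 uF of storage
output_cap_only = discharge_current(10e-6, 1.5, 10e-6)  # the 10 uF output capacitor alone

# ASSUMPTION: 100 mA is a placeholder load, not a measured value for this board.
ASSUMED_LOAD_A = 0.100
# Fall from a 3.1 V trip point to the 3.0 V CPU minimum must take at least 650 us.
holdup_cap = min_holdup_capacitance(ASSUMED_LOAD_A, 3.1 - 3.0, 650e-6)

print(f"{total_storage:.1f} A, {output_cap_only:.2f} A, {holdup_cap * 1e6:.0f} uF")
```

With the placeholder load, the hold-up requirement comes out in the hundreds of microfarads, which scales linearly with whatever the real load current turns out to be.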

I suggest you see if you can get a more "realistic" test by disconnecting the input power for between "as fast as you can turn it off and on or unplug and plug" to about 5 seconds, and monitor how low the 3V3 goes. See if there are any "brownout voltage levels" where the voltage monitor DOES drive Reset that the CPU doesn't recover from.

You've found a problem, but I don't think this could be happening this way in the field. I think you've found a "power sensitivity", but in the field I suspect it may be caused by ESD or leakage back through protection diodes into the 3V3 rail. Have you checked for this case yet?

There's still a problem with the CPU where you seem to be able to "brown it out without reset" and then get it in a state where Reset won't recover it. You won't get this fixed any time soon, so you have to characterize it and then design to avoid that situation. So I'd suggest you disconnect the MCP1320 by replacing R9 with a switch. Bring it up normally, disconnect the switch, and then with a bench-top supply feeding into J1, find the minimum 3V3 voltage it won't recover from. Test this with and without the Reset signal - prove that this is a repeatable problem.

I've just noticed you're resetting the watchdog from the CPU's PWM4 pin. I hope you're driving that pin as GPIO from software and not from the timer. That could cause problems in the field.

Tom

TomE
Specialist II

Try adding a Mains Surge Protector in case there's some high voltage noise on the mains causing problems.

See if you can get the CPU locked up by "browning out" the power supply. Switch the power off and on for short periods so as to get the 3.3V supply falling to 2V, to 1.5V, to 1V and so on before coming back on. See if any of these "dips" causes the CPU to lock up. I've had this happen on the MCF5329, but that was due to the external SDRAM locking up, which isn't a problem for the MCF52110. But these complicated chips can sometimes lock up on brownouts.

Make sure Production didn't make a mistake and fit any "DNP" parts. If you had R25 and R28 both fitted it might cause a problem like this.

Then check "Injection Currents". Read Notes 4 and 5 (esp. 5) in the Data Sheet (not the Ref Manual) "Table 19. Absolute Maximum Ratings". You have some signals coming in from the right hand side which have 5.6V zener clamps on them and 100R series resistors. Are any of these 5V signals? Are you doing any "level conversion" there from 5V to 3.3V? I'm guessing you're using the nc7wz14p6x parts for that purpose. SPI_CLK comes through one of these, but I can't see how MISO and MOSI are connected (maybe on another sheet). If you're relying on any chip input clamp diodes to convert a 5V signal to 3.3V through a resistor, then the excess current will be getting "injected" into the 3V3 rail via those diodes. If the CPU isn't drawing more current than that, the 3V3 rail can go high. During Reset and/or without clocks, the CPU may be drawing very little current. As well, if this device can be powered off when the other ones are on, then the injection current can stop this device from powering off, or from powering off completely. So you may have some issues with the 3V3 rail going too high and out of spec or not going off and on cleanly.
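As a rough illustration of the injection-current point, the Python sketch below estimates the current pushed into the rail when a signal reaches an MCU pin through only a series resistor and the pin's clamp diode to the 3V3 rail. The 0.6 V clamp-diode drop is an assumed typical figure, not a value from the MCF52110 data sheet, and the function name is hypothetical. The result should be compared against the injection limits in the data sheet table mentioned above.

```python
def injection_current(v_signal, v_rail, r_series, v_clamp_diode=0.6):
    """Estimate current (A) injected into the rail through an input clamp diode.

    ASSUMPTION: v_clamp_diode = 0.6 V is a generic diode drop, not a data sheet value.
    Returns 0 when the signal is too low to forward-bias the clamp diode.
    """
    excess_v = v_signal - (v_rail + v_clamp_diode)
    return max(excess_v, 0.0) / r_series


# 5 V signal through the 100R series resistor while the rail is at 3.3 V.
powered = injection_current(5.0, 3.3, 100.0)

# Same signal while this board is switched off (rail near 0 V):
# the signal tries to back-power the rail through the clamp diode.
unpowered = injection_current(5.0, 0.0, 100.0)

# A 3.3 V signal into a powered 3.3 V rail injects nothing.
matched = injection_current(3.3, 3.3, 100.0)

print(f"{powered * 1e3:.1f} mA, {unpowered * 1e3:.1f} mA, {matched * 1e3:.1f} mA")
```

The unpowered case is the one that matters for the "won't power off completely" scenario: with the rail near ground, almost the whole signal voltage appears across the series resistor.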

Tom

olivierdionne
Contributor I

> See if you can get the CPU locked up by "browning out" the power supply.

We have a Voltage Supervisor (U4 - MCP1320T-29LE) that holds RSTI pin low when the 3V3 supply is under 2.9V. The Voltage Supervisor is also a Watchdog Timer. When the MCU is stuck in reset, the Watchdog Timer pulls RSTI pin low but this does not change the state of the MCU.

> Are any of these 5V signals? Are you doing any "level conversion" there from 5V to 3.3V? I'm guessing you're using the nc7wz14p6x parts for that purpose.

SPI_CLK, URXD2, UTXD2 and DTIN3 come from an external device. Some of these devices are 3.3V and others are 5V. The nc7wz14p6x is 5V-tolerant and has been added to the design for that purpose.

Before adding the nc7wz14p6x, SPI_CLK was directly connected to the MCU. With that configuration we could produce the same "stuck in reset" state. If we had powered off the board while a 5V signal was applied on SPI_CLK (directly connected to the MCU), the MCU would not come out of reset at power-on.

> I can't see how MISO and MOSI are connected

SCLK, MOSI and CS0 are connected to a 3.3V supplied MODEM through series resistors (10k, 10k and 100k respectively). MISO is not used. There is no level conversion on these signals.

TomE
Specialist II

> > See if you can get the CPU locked up by "browning out" the power supply.

> We have a Voltage Supervisor

I know you do. It isn't working, is it? You KNOW that the Reset line won't reset the CPU when it is in this state. I'm saying that even with the Reset pin driven, "Browning out" the 3V3 supply line to "intermediate" voltages may be locking the insides of the CPU up.

Dropping the 3V3 all the way to ground fixes this condition, but dropping to 0.5V or 1V or 1.5V or some other magic level may be CAUSING this problem. So I'm suggesting you test this possibility. You can even run the unit from a bench supply and turn the voltage down and up in a controlled fashion.

> If we had powered off the board while a 5V signal was applied on SPI_CLK (directly connected to the MCU), the MCU would not come out of reset at power-on.

So it seems you've known the cause of the problem all along. When the 3V3 rail is low and the CPU is being held in Reset by the reset controller, it isn't drawing much current, so external "injection" power from a 5V signal would pull the rail high enough for it to lock up. It seems to me that the same applies to the 3V3 signals too. You should have all external "still may be powered" signals going through those buffers.

There's a very simple brutal fix for this class of problem. Put a fat resistor on the 3V3 line so it gets pulled DOWN when this happens, or when anything else tries to keep the 3V3 high. Or worse, tries to pull the 3V3 above 3.3V. And no, you can't get a 3V3 Zener to clamp the rail, these things have horrible characteristics. Zeners don't have nice sharp conduction curves below about 12V.

> An ESD test can cause the "stuck in reset" state. The board is already in a grounded metal box, maybe this is not enough.

A metal box only works if there are no holes in it and the box isn't connected to anything. Obviously you have to run wires in and out, and if they can conduct ESD into the box then it might as well not be there. You have to have all the incoming lines grounded and bypassed (Zeners and capacitors) at a SINGLE star-ground point, and the board must only be connected to the metal box at that same point. If the board is earthed to the case at more than one point, then an ESD discharge to the case will propagate through the case and through the board's multiple earth connections. If the cables ground on the board at different points to each other and the case-ground, then the voltage spikes will go through the board between these points and through the chips. If you have this problem you may be able to improve the grounding easily by isolating redundant grounds and adding big ground straps.

> But even if ESD is the main cause of the problem, I would prefer to fix the "stuck in reset" state.

So you need a "watchdog" that turns the 3V3 rail off and on again, as that's the only way you've found to reset the CPU.

Try a resistor across the 3V3 rail. Go as low as you can without it getting too hot and without making the regulator too hot.

Tom

TomE
Specialist II

If the Regulator will take it (power supply, heat dissipation, heat-sinking), put a 33 ohm resistor on the 3V3 rail and see if the problem goes away.
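To judge whether the regulator "will take it", here is a quick Ohm's-law sketch in Python for the 33 ohm load on the 3V3 rail. The 5.0 V regulator input voltage is a placeholder assumption, since the actual input voltage is not stated in this thread, and the function names are illustrative only.

```python
def bleed_resistor_load(v_rail, r_ohms):
    """Return (current in A, power in W) for a resistor placed across the rail."""
    current = v_rail / r_ohms
    return current, v_rail * current


def extra_regulator_dissipation(v_in, v_rail, extra_current_a):
    """Additional heat (W) a linear regulator dissipates when supplying extra_current_a."""
    return (v_in - v_rail) * extra_current_a


current_a, power_w = bleed_resistor_load(3.3, 33.0)

# ASSUMPTION: 5.0 V input is a placeholder; substitute the board's real input voltage.
ASSUMED_VIN = 5.0
extra_w = extra_regulator_dissipation(ASSUMED_VIN, 3.3, current_a)

print(f"{current_a * 1e3:.0f} mA, {power_w:.2f} W in the resistor, "
      f"{extra_w:.2f} W extra in the regulator")
```

So the resistor itself needs to be rated comfortably above a third of a watt, and the regulator's extra heat depends on how far its input sits above 3.3V.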

You should also read the LM1086 Data Sheet carefully, specifically the "Stability Consideration" section. The 10uF capacitor that you're using on the output is OK as long as it is a Tantalum. If you're using an Electrolytic it should be 50uF. UNLESS you've got a capacitor across the "Adjust" pin in which case it should be 50uF Tantalum or 150uF Electrolytic. And you do have a cap on the Adjust pin.

The output cap HAS to be Tantalum or Electrolytic. It has to be a "bad capacitor" as the ESR is an essential part of the equation. You don't want a large ceramic or low ESR one there.

So you may well have regulator stability problems under some conditions.

The Data Sheet also says "Capacitors other than tantalum or aluminium can be used on the adjust pin and the input pin".

But you're using the same "10uF 10V 1206" capacitor on the Input, Adjust and Output pins, so you're either breaking the rules on the Output if they're all Ceramic, or breaking the rules on Input and Adjust if they're all Tantalum!
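The capacitor rules as paraphrased in this post can be written as a small lookup in Python. The numbers below are taken from the wording above, not copied from the LM1086 data sheet, so treat them as a sketch and confirm against the "Stability Consideration" section before relying on them. The function name and capacitor-type strings are hypothetical.

```python
# Minimum output capacitance in uF, keyed by (capacitor type, adjust pin bypassed?).
# ASSUMPTION: values are as quoted in the post, not verified against the data sheet.
MIN_OUTPUT_CAP_UF = {
    ("tantalum", False): 10,
    ("electrolytic", False): 50,
    ("tantalum", True): 50,
    ("electrolytic", True): 150,
}


def output_cap_ok(kind, value_uf, adjust_bypassed):
    """Return True if the output capacitor meets the rule of thumb quoted above."""
    minimum = MIN_OUTPUT_CAP_UF.get((kind, adjust_bypassed))
    if minimum is None:
        # Any other type (e.g. ceramic) is rejected: the post says the
        # output capacitor has to be tantalum or electrolytic.
        return False
    return value_uf >= minimum


# The three cases discussed in the post, all with a 10 uF part on the output.
print(output_cap_ok("ceramic", 10, True))    # all-ceramic board, Adjust cap fitted
print(output_cap_ok("tantalum", 10, True))   # tantalum output, Adjust cap fitted
print(output_cap_ok("tantalum", 10, False))  # tantalum output, Adjust cap removed
```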

If you're not generating Audio or Video, and aren't using a precision ADC, then a bit of ripple on the 3V3 is unimportant. I'd suggest removing the cap on the Adjust pin, as it is only there for increased Ripple Rejection.

Tom

olivierdionne
Contributor I

Hi Tom,

1) You are absolutely right about the LM1086. When I took over the design, I supposed this part was well done and I should not have. C58, C59 and C60 are all X7R ceramic. As you suggest, I will remove the cap on Adjust pin (C58) and change the Output cap for a tantalum (C59).

What I do not understand is that you suggest a "bad capacitor" (with high ESR, as I understand) and in the LM1086 datasheet it is recommended to use low ESR:

"It is also desirable to provide low ESR to reduce the change in output voltage: ΔV = ΔI x ESR. It is common practice to use several tantalum and ceramic capacitors in parallel to reduce this change in the output voltage by reducing the overall ESR."

So, what ESR should I use?

2) I did some brown out tests. I changed the 390R resistor (R68) for a 1k trim pot, so I could adjust the output voltage of the regulator from 3.3V to 1.25V.


With the Voltage Supervisor

When the 3V3 supply drops to 2.9V, the supervisor pulls RSTI pin low.

Regardless of the value the 3V3 supply drops to, when it rises above 2.9V the supervisor releases RSTI pin and the MCU comes out of reset.

So, with the test I did (with the trim pot), the supervisor does its job correctly and prevents the "stuck in reset" state.

Without the Voltage Supervisor

The MCU stops working when the 3V3 supply falls under 2.2V.

When the 3V3 supply drops under 1.6V and goes back to 3.3V the MCU comes out of power-on reset.

When the 3V3 supply drops between 2.2V and 1.6V and goes back to 3.3V the MCU does not come out of reset.

As you suggested, the "stuck in reset" state is caused by a brown out of the 3V3 supply.
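The three outcomes measured without the supervisor can be summarised as a small Python sketch. The 2.2V and 1.6V thresholds are the values observed in this test, so they should be read as approximate and specific to the board tested, not as data sheet limits. The function name is made up for illustration.

```python
# Thresholds observed with the supervisor disconnected (approximate, one test setup).
STOPS_BELOW_V = 2.2  # the MCU stops working below this supply voltage
POR_BELOW_V = 1.6    # a dip below this gives a clean power-on reset on recovery


def brownout_outcome(v_min):
    """Classify what happens after the 3V3 rail dips to v_min and returns to 3.3 V."""
    if v_min >= STOPS_BELOW_V:
        return "keeps running"
    if v_min < POR_BELOW_V:
        return "recovers via power-on reset"
    return "stuck in reset"


for v in (3.0, 2.5, 2.0, 1.8, 1.5, 0.0):
    print(f"{v:.1f} V -> {brownout_outcome(v)}")
```

The dangerous region is the middle band: low enough to stop the MCU, but not low enough to trigger its power-on reset.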

My test shows the MCU can stay locked if its supply drops between 2.2V and 1.6V. But it also shows the Voltage Supervisor prevents the MCU from locking. I looked in the supervisor datasheet and found the time to reset when the supply drops is typically 650µsec. So, if the 3V3 supply drops from 2.9V to 2.2V in less than 650µsec, the MCU stops working before the supervisor can pull RSTI pin low and that would cause the MCU to lock. I could not verify that supposition with my trim pot, but I think it is likely to happen.
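That supposition can be expressed as a fall rate. The Python sketch below computes the fastest rail fall the supervisor could keep up with, using the 2.9V trip point, the 2.2V level where the MCU stops, and the typical 650µs delay. The 1 V/s figure for the trim-pot ramp is only a guessed order of magnitude for a hand-turned pot, and the function names are hypothetical.

```python
SUPERVISOR_DELAY_S = 650e-6  # typical time from trip to reset asserted
TRIP_V = 2.9                 # supervisor trip point
MCU_STOPS_V = 2.2            # supply level where the MCU was seen to stop


def max_safe_slew_v_per_s(trip_v=TRIP_V, stop_v=MCU_STOPS_V, delay_s=SUPERVISOR_DELAY_S):
    """Fastest rail fall (V/s) for which reset still asserts before the MCU stops."""
    return (trip_v - stop_v) / delay_s


def supervisor_wins(slew_v_per_s):
    """True if RSTI goes low before the rail reaches the MCU stop voltage."""
    return slew_v_per_s <= max_safe_slew_v_per_s()


limit = max_safe_slew_v_per_s()
print(f"limit: {limit / 1e3:.2f} V/ms")

# ASSUMPTION: roughly 1 V/s for a hand-turned trim pot (a guess, not measured).
print(supervisor_wins(1.0))

# A fall of 1.5 V in 10 us, as seen on the scope capture discussed in this thread.
print(supervisor_wins(1.5 / 10e-6))
```

The limit works out to roughly 1 V per millisecond, so a slow trim-pot ramp cannot exercise the failure while a microsecond-scale drop easily can.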

Do you think that improving the regulator stability as said in 1 will be enough?

Would it help to change the regulator Output cap for a bigger one?

Olivier

TomE
Specialist II

> What I do not understand is that you suggest a "bad capacitor" (with high ESR, as I understand)

I meant that if you read the data sheet it says that the capacitor must have a certain amount of ESR in order for the filters inside the chip to work. I was using "bad capacitor" to mean "one not as good as a very low ESR ceramic", that's all.

> 2) I did some brown out tests.

That's a very SLOW test, ramping the voltage down and up again. That's useful (but you could have done the same more easily with an external bench supply), but I really meant for you to "brown it out" the way it happens in real life. Turn the mains power off and on again rapidly. Pull the power plug out of that socket on the board and plug it in again. Really go to town trying to make it fail. If you make it fail like this you've found a REAL case possibly more related to the failures in the field. Note the "ramp rate" of how fast the 3V3 is applied and goes away is also important (and hard to test).

I suspect that HIGH voltages rather than low ones might be causing the lockups. You might be getting the regulator unstable (with the wrong caps) and it may sometimes be overshooting in some circumstances. The resulting high voltage might be the cause of the problems. Then there's "leakage" from the other circuits that could be related to this.

The main thing is to get a repeatable failure, and then investigate and fix that.

> So, with the test I did (with the trim pot), the supervisor does its

> job correctly and prevent the "stuck in reset" state.

Yes, it works with the tests you've done, but it is not working in the failure case. It is probably "trying to work" and meeting its specifications, but when it goes wrong, the reset pin won't fix the problem.

> So, if the 3V3 supply drops from 2.9V to 2.2V in less than 650µsec, the MCU

> stops working before the supervisor can pull RSTI pin low and that would

> cause the MCU to lock.

Yes the CPU should stop working at a low voltage, but it shouldn't LOCK UP. A Reset signal should always bring it out of that. Can you test that? Disconnect the supervisor, brown it out, bring it back up and THEN give it a reset. If it resets properly this isn't your problem. Only if it doesn't reset after this have you found a problem.

> Do you think that improving the regulator stability as said in 1 will be enough?

No. You still don't know what the problem is. Until you have a repeatable and explainable situation you can't fix it.

Tom

olivierdionne
Contributor I

Hi Tom,

I found a repeatable way to lock the MCU even if the Voltage Supervisor is connected.

If I brown out the 3V3 supply to 1.8V before the supervisor can pull RSTI pin low, when the supply goes back to 3.3V and the supervisor releases RSTI pin, the MCU is locked up and it cannot be reset except by a power cycle.

Here are the 3V3 supply and the RSTI pin on the scope. In this image we do not see the RSTI pin going back high because the supervisor keeps it low for about 200 ms.

[Attached image: 3V3_MCU_freeze_zoom.PNG.png]

Do you know the best way to fix this problem?

Olivier
