SM_019 - continuous resets: implication of not detecting continuous resets

Matthias_LEHMANN · ‎06-22-2021

SM_019 in the S32K1xx Safety manual is fairly brief: it says that continuous resets need to be identified and signalled, because the system is not assumed to be in a safe state.

What exactly is the underlying concern or, the other way round, what's the implication if continuous resets are not detected or signalled? Would the controller fail to meet ASIL-B safety integrity if this function isn't present? Being in Reset is considered a safe state according to the safety manual, so why is a continuous Reset cycle not safe? (This assumes that the Reset is triggered within the FTTI after a dangerous fault occurs, or that the fault might not be dangerous at all.)

The assumption also talks about "signalling", so is an accommodation not required? From NXP's point of view, what are the options to signal continuous resets when the controller and its interfaces can't be relied upon anymore?

Our customer is concerned that having such a function (today it is still an accommodation, for lack of a reliable signal path) might result in a less robust / reliable system.

Best regards

Matthias

\\// Matthias

Matthias_LEHMANN · ‎06-23-2021

Thanks Aarul. A counter is indeed what we've implemented, but it is probably still a bit too sensitive: for example, it also counts power cuts, which didn't work out too well. I think we're going to change from monitoring the number of resets via an external SBC to monitoring the controller-internal RCM status registers. As an action, we prevent normal program execution and put the system into a minimum-functionality loop without any safety-related functions.
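A minimal sketch of that idea, to make the discussion concrete. It is not taken from the safety manual or from our actual code: the reset-cause flags, the threshold `RESET_LIMIT`, and the function names are all hypothetical, and the mapping from the real RCM status register bits to these flags is deliberately left out. The counter is assumed to live in memory that survives a warm reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, hardware-independent reset-cause flags. On a real device
 * they would be derived from the reset status register; that mapping is
 * not shown here and has to come from the reference manual. */
#define RESET_CAUSE_POWER_ON  (1u << 0)  /* power cut / cold start */
#define RESET_CAUSE_LOW_VOLT  (1u << 1)  /* supply dip */
#define RESET_CAUSE_WATCHDOG  (1u << 2)
#define RESET_CAUSE_LOCKUP    (1u << 3)
#define RESET_CAUSE_SOFTWARE  (1u << 4)

#define RESET_CAUSE_FAULT_MASK \
    (RESET_CAUSE_WATCHDOG | RESET_CAUSE_LOCKUP | RESET_CAUSE_SOFTWARE)

#define RESET_LIMIT 3u  /* assumed threshold, to be chosen per system */

typedef enum { MODE_NORMAL, MODE_MINIMUM_FUNCTION } boot_mode_t;

/* Called once early in boot. 'fault_resets' points to a counter that is
 * assumed to be retained across warm resets. */
boot_mode_t evaluate_reset(uint32_t cause_flags, uint8_t *fault_resets)
{
    assert(fault_resets != NULL);

    if (cause_flags & RESET_CAUSE_POWER_ON) {
        /* Cold start: retained data is assumed invalid, so start over.
         * Power cuts are therefore never counted as fault resets. */
        *fault_resets = 0u;
    } else if (cause_flags & RESET_CAUSE_FAULT_MASK) {
        /* Fault-related reset: count it, saturating at the type limit. */
        if (*fault_resets < UINT8_MAX) {
            (*fault_resets)++;
        }
    }
    /* Low-voltage resets leave the counter unchanged in this sketch. */

    return (*fault_resets >= RESET_LIMIT) ? MODE_MINIMUM_FUNCTION
                                          : MODE_NORMAL;
}

/* Called after a period of stable operation to forgive earlier resets. */
void mark_stable_operation(uint8_t *fault_resets)
{
    assert(fault_resets != NULL);
    *fault_resets = 0u;
}
```

With these assumed values, three consecutive watchdog resets would select `MODE_MINIMUM_FUNCTION`, while any number of power cuts would not.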

What's still not entirely clear to me is how this assumption is justified, or at least supported, from a quantitative failure rate point of view. My understanding is that credit is taken for certain assumptions in the FMEDA, i.e. if the assumption isn't implemented, DC will go down. However, SM_019 doesn't appear to be referenced in any of the FMEDAs. So, strictly speaking, from a quantitative point of view this assumption shouldn't make a difference.

Best regards

\\// Matthias

aarul · ‎06-27-2021

Hello Matthias

You are right. The FMEDA is a verification procedure, and not all assumptions are modeled in the FMEDA in a way that makes their impact visible in the metrics. Assumptions are derived as part of safety concept development and are an input to the FMEDA.

Regards

-Aarul

aarul · ‎06-23-2021

Hi

An MCU in reset is considered to be in a safe state. However, in the case of continuous resets, the MCU transitions from reset to boot and then to execution. So, depending on the application and the system safety concept, the system moves from the safe state to normal operation repeatedly. We think that this reduces the availability of normal operation and can be a safety risk. However, this risk needs to be evaluated at the system level. For example, if the system sees the MCU reset and keeps the system in the safe state forever, then there is no risk due to MCU reset cycling.

Usually, such continuous resets happen due to a permanent fault which manifests itself as a failure only when the faulty circuitry is enabled/triggered. To detect continuous resets, note that the RESET pin of the MCU will assert and de-assert repeatedly. The system can potentially implement a counter to observe whether the MCU is entering reset (safe state) repeatedly and take an action, such as not trying to recover the MCU again.
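As an illustration only, such a counter on the system side could look like the sketch below. It assumes an external supervisor that is notified on each assertion of the RESET pin and has a free-running millisecond tick; the window length, the limit, and all names are hypothetical and not taken from the safety manual. It uses a simple fixed time window rather than a true sliding window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MONITOR_WINDOW_MS   1000u  /* assumed observation window */
#define MONITOR_MAX_RESETS  3u     /* assumed limit within one window */

typedef struct {
    uint32_t window_start_ms;  /* tick at which the current window began */
    uint8_t  count;            /* RESET assertions seen in this window */
    bool     latched;          /* true once continuous reset is declared */
} reset_monitor_t;

/* Call on every observed assertion of the MCU RESET pin.
 * Returns true once the MCU should no longer be recovered. */
bool reset_monitor_on_assert(reset_monitor_t *m, uint32_t now_ms)
{
    assert(m != NULL);

    if (m->latched) {
        return true;  /* decision is sticky until the monitor is cleared */
    }

    /* Unsigned subtraction keeps the comparison valid across tick wrap. */
    if (m->count == 0u ||
        (uint32_t)(now_ms - m->window_start_ms) > MONITOR_WINDOW_MS) {
        m->window_start_ms = now_ms;  /* start a new window */
        m->count = 1u;
    } else {
        m->count++;
    }

    if (m->count >= MONITOR_MAX_RESETS) {
        m->latched = true;  /* keep the system in its safe state */
    }
    return m->latched;
}
```

Isolated resets that are further apart than the window never accumulate, which is one way to keep the monitor from being overly sensitive. Whether latching is the right reaction remains a system-level decision.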

I hope this information helps you to understand the assumption better and to analyze this case in your concept.

Regards

-Aarul