According to the <S32G2_Fault_Map> file and the FMEDA for S32G, we have collated the CMU faults below. These faults are collected and reacted to by the FCCU, and the recommended fault reaction is R1 (local SW recovery).
I want to find out what R1 means exactly.
For example, if CMU_REPORT_0 (which monitors the clocks of the safety cores and peripherals) is reported to the FCCU, what should the application do? Also, what is the relation between R1 and the <software reset domains>?
Channel number | Source Module | Description | Cause of fault | Fault impact on ECU | Recommended Fault Recovery Type | Recommended Recovery Mechanism | Vehicle Safety State | SW reaction |
NCF[28] | CMU_REPORT_0 | CMU error for clock sources safety cores peripherals | 1. Wrong PLL output clock or wrong status due to Soft Error Upset 2. No Clock: Output is stuck at or floating 3. Wrong Clock: Output is out of range | Application in M-core and peripherals | SW | R1 | TBD | 1. Alarm state enabled 2. Alarm interrupt enabled 3. How should the application react? Functional reset? |
NCF[29] | CMU_REPORT_1 | CMU error for performance cores A53-core | Wrong Clock: jitter too high in the output signal | Application in A-core | SW | R1 | TBD | Reset RD1 |
NCF[31] | CMU_REPORT_3 | CMU error for accelerator peripheral clocks | 1. Wrong/No FIRC clock (faults: trim value, stable clock, mode) caused by d.c. faults 2. No Clock: Output is stuck at or floating 3. Wrong Clock: Output is out of range | Application in Ethernet | SW | R1 | TBD | Reset RD2&3 |
Hi Zeyu,
As you state, each of the FCCU fault sources is configurable to allow you to choose a reaction that is appropriate to your application.
The R1 reaction means that an alarm interrupt to the master safety core is triggered. This allows you to take action in software to either attempt to recover the fault, or save status information about the fault prior to triggering a reset.
The decision on what action to take in SW, if any, is application dependent, and will depend on the safety concept in which S32G is used.
The software resettable domains you highlight could be a valid reaction if the fault is isolated within that reset domain. So for example if a fault is triggered from the PFE, you could choose to reset only the PFE SW resettable domain, rather than a full chip functional reset.
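As a rough illustration of choosing the reset scope per fault source, here is a minimal C sketch. Everything in it is hypothetical: the type names, the `domain_mask` encoding, and the functions are not an NXP driver API. The RD1 and RD2&3 entries simply mirror the proposals in the table above and have not been validated, and the functional reset for NCF[28] is an assumption chosen only for the example, since that entry is still an open question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reset scopes; not taken from any NXP header. */
typedef enum {
    SCOPE_SW_DOMAIN,   /* reset only the affected software resettable domain(s) */
    SCOPE_FUNCTIONAL   /* full chip functional reset */
} reset_scope_t;

typedef struct {
    uint8_t       ncf_channel;  /* FCCU NCF channel number */
    reset_scope_t scope;        /* reaction chosen by the safety concept */
    uint8_t       domain_mask;  /* assumed encoding: bit n set = reset domain RDn */
} fault_policy_t;

/* Example policy only. NCF[28] is an assumption; the others follow the table. */
static const fault_policy_t policy_table[] = {
    { 28u, SCOPE_FUNCTIONAL, 0x00u },
    { 29u, SCOPE_SW_DOMAIN,  1u << 1 },
    { 31u, SCOPE_SW_DOMAIN,  (1u << 2) | (1u << 3) },
};

/* Return the policy entry for a channel, or NULL if none is configured. */
const fault_policy_t *lookup_policy(uint8_t ncf_channel)
{
    for (size_t i = 0; i < sizeof policy_table / sizeof policy_table[0]; ++i) {
        if (policy_table[i].ncf_channel == ncf_channel) {
            return &policy_table[i];
        }
    }
    return NULL;
}

/* Unknown channels fall back to the widest reset as a conservative default. */
reset_scope_t select_scope(uint8_t ncf_channel)
{
    const fault_policy_t *p = lookup_policy(ncf_channel);
    return (p != NULL) ? p->scope : SCOPE_FUNCTIONAL;
}
```

Keeping the policy in one table makes it easy to review against the safety concept. The fallback to a functional reset for unlisted channels is a design choice in this sketch, not a device requirement.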
Many thanks,
Alison
So R1 is just the alarm state in this fault signal flow:
And for an NCF (non-critical fault), we should record the fault information, for example as a DTC in AUTOSAR, because for an NCF the S32G still has the opportunity to store the fault status, whereas for a critical fault the S32G will go to a functional reset or destructive reset.
However, if an NCF occurs persistently and the application wants to recover from it <like eDMA Lockstep Error Indication or lock-step error (A53 GIC clusters)>, the application has no choice but to reset the S32G through the FCCU pins.
For a fault that occurs only occasionally, such as a memory ECC error, could the application trigger a system reset, or just disable the outputs of the S32G (such as CAN and Ethernet)?
Am I right?
We do not have much experience with application-specific reactions to faults, other than storing the fault status and resetting the system.
Thank you for your reply.
Hello,
Yes, that’s correct: all faults routed to the FCCU are called “non-critical faults”, and the reaction can be configured as shown in the diagram. “Critical faults” are those routed directly to the reset generation module, and therefore no configuration of the reaction is possible.
For each of the non-critical faults routed to the FCCU you can choose the reaction that is appropriate to your application.
For some faults there may be no chance to continue execution once the fault has occurred. In that case you could either use the fault interrupt to call fault handler SW to store data and trigger reset from software, or immediately trigger a reset from FCCU.
If there is a chance to recover from the fault and continue execution (if the fault comes from a source that is not safety related, or the faulty IP can be reset), then the alarm interrupt can call fault handler software to handle the recovery and then clear the fault, allowing execution to continue without a chip reset.
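The two paths described above (store data and reset, or recover and clear) can be sketched in C as follows. This is a simulation under stated assumptions, not FCCU driver code: `fault_ctx_t` stands in for the real status registers and fault log, `recover_fn` is a hypothetical recovery hook, and the retry limit `MAX_RECOVERY_ATTEMPTS` is an assumed way to escalate a persistently recurring fault to a reset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHANNELS          32u
#define MAX_RECOVERY_ATTEMPTS 3u   /* assumed escalation threshold */

typedef enum {
    ACTION_RECOVERED,        /* fault cleared, execution continues */
    ACTION_RESET_REQUESTED   /* status saved, reset requested */
} fault_action_t;

/* Simulated state; real code would use the FCCU registers and a fault log. */
typedef struct {
    uint32_t pending;                    /* bit n = NCF channel n pending */
    uint32_t saved_status;               /* stands in for stored fault info (e.g. DTC) */
    bool     reset_requested;            /* stands in for triggering a reset */
    uint8_t  retry_count[NUM_CHANNELS];  /* recovery attempts per channel */
} fault_ctx_t;

/* Hypothetical recovery hook: returns true if the faulty IP was recovered. */
typedef bool (*recover_fn)(uint8_t channel);

/* Demo hooks used only to exercise the sketch. */
bool demo_recover_ok(uint8_t channel)   { (void)channel; return true;  }
bool demo_recover_fail(uint8_t channel) { (void)channel; return false; }

fault_action_t handle_alarm(fault_ctx_t *ctx, uint8_t channel,
                            bool recoverable, recover_fn try_recover)
{
    if (channel >= NUM_CHANNELS) {
        ctx->reset_requested = true;     /* unknown source: be conservative */
        return ACTION_RESET_REQUESTED;
    }

    uint32_t bit = (uint32_t)1u << channel;
    ctx->saved_status |= bit;            /* always record the fault first */

    if (recoverable && ctx->retry_count[channel] < MAX_RECOVERY_ATTEMPTS) {
        ctx->retry_count[channel]++;
        if (try_recover != NULL && try_recover(channel)) {
            ctx->pending &= ~bit;        /* clear the fault and continue */
            return ACTION_RECOVERED;
        }
    }

    ctx->reset_requested = true;         /* no recovery possible: request reset */
    return ACTION_RESET_REQUESTED;
}
```

For example, calling `handle_alarm(&ctx, 29, true, demo_recover_ok)` on a zero-initialized context records channel 29 and returns `ACTION_RECOVERED`, while a non-recoverable source goes straight to `ACTION_RESET_REQUESTED`. Whether a given source counts as recoverable is a safety-concept decision that the sketch leaves to the caller.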