The FMEDA for the Core module of the S32K142 (FMEDA SEooC Static - S32K142 / FMEDA_S32K142_Core_CUSTOMER.xlsx, version 1.0) shows a fairly high failure rate for soft errors / SEU in the cache (80 FIT out of a total failure rate of 85 FIT for the core module). "Parity" is claimed as a safety mechanism. Parity appears on the sheet that lists all the safety mechanisms relevant to the core module, but it is not linked to any assumption in the safety manual. Can I assume that parity is enabled by default? By "enabled" I mean that errors are both detected and handled. Or does either of these two things need to be manually enabled and/or configured?
Best regards
Matthias
Hi Matthias,
Apologies for the delay.
As mentioned earlier, the cache parity mechanism is enabled by default for detection but not for reporting. However, a parity error results in a cache miss, which causes the affected cache contents to be reloaded. The respective fields of MCM_LMDR2 have default values that ensure both cache parity and parity miss are enabled by default. So the end user does not have to worry about the "handling" of the parity fault, as the cache contents will be reloaded anyway.
The respective flag bits in MCM_LMPEIR are raised only if reporting has been enabled by writing to MCM_LMPECR.
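As a rough illustration of how the application could evaluate those flag bits once reporting is enabled, here is a minimal sketch. It works on a plain 32-bit value read from the flag register, so it can be tried off-target. The function name `parity_error_flags` is hypothetical, and the field position (an 8-bit parity error field at bits 23:16) is an assumption that must be checked against the Reference Manual before use.

```c
#include <stdint.h>

/* ASSUMPTION: the parity error flags occupy bits 23:16 of the flag register.
 * Verify the exact field position in the Reference Manual. */
#define PARITY_FLAGS_SHIFT  16u
#define PARITY_FLAGS_MASK   (UINT32_C(0xFF) << PARITY_FLAGS_SHIFT)

/* Hypothetical helper: extracts the parity error flag field from a
 * register value that the caller has already read. A non-zero result
 * means at least one parity error flag is set. */
uint32_t parity_error_flags(uint32_t flag_reg_value)
{
    return (flag_reg_value & PARITY_FLAGS_MASK) >> PARITY_FLAGS_SHIFT;
}
```

On the target, the argument would be the value read from the flag register; a non-zero return value could then feed the application's fault reaction.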
I hope this helps!
Thanks!
-Yashwant
Hi Matthias,
Sorry to keep you waiting.
The cache parity mechanism is enabled by default for detection but not for reporting. The reporting needs to be enabled manually.
The following registers are related to enabling the error reporting:
Kindly refer to the Reference Manual for the exact bit fields to be modified.
Do let us know if you have any follow-up questions.
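Enabling the reporting could look roughly like the sketch below. It takes a pointer to the control register so that it can be exercised against an ordinary variable as well as the real register. The function names are hypothetical, and the bit position of the cache parity reporting enable (bit 20 here) is an assumption to be confirmed in the Reference Manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: bit 20 of the control register enables cache parity
 * reporting. Confirm the position in the Reference Manual. */
#define CACHE_PARITY_REPORT_EN  (UINT32_C(1) << 20)

/* Hypothetical helper: sets the reporting enable bit with a
 * read-modify-write so that all other control bits are preserved. */
void enable_cache_parity_reporting(volatile uint32_t *ctrl_reg)
{
    assert(ctrl_reg != NULL);
    *ctrl_reg |= CACHE_PARITY_REPORT_EN;
}

/* Hypothetical helper: returns 1 if reporting is enabled, 0 otherwise. */
int cache_parity_reporting_enabled(const volatile uint32_t *ctrl_reg)
{
    assert(ctrl_reg != NULL);
    return (*ctrl_reg & CACHE_PARITY_REPORT_EN) != 0u;
}
```

On the target, `ctrl_reg` would point at the control register address given in the Reference Manual; reading the bit back after the write is a cheap way to confirm that the setting took effect.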
Thanks!
-Yashwant
Hi Matthias,
Yes, it seems the latter is the case. The safety manual should have an explicit assumption regarding this.
We're looking into why it's not there and we'll get back to you shortly with an answer.
Thanks!
-Yashwant
Hi Yashwant,
Is there any update on this yet? Since this currently looks like a safety-relevant problem with the potential to completely spoil the HW metrics for the controller, I'd appreciate more timely feedback.
Our current understanding is the following: the best thing the application could do upon detection of a parity fault in the cache is to reset. The most effective solution, however, would be to simply reload the affected cache contents from the Flash memory, which according to our understanding can only be triggered by the core itself. So there is still a faint hope on our side that there already is some "hidden accommodation" in place in the S32. But neither the reference manual nor the safety manual has much to say about this, and even the public Arm documentation doesn't help us very much here. (But for the latter, I think it should be NXP browsing through the Arm architecture documents...)
Hi Matthias
We appreciate your patience. We are checking with the design and verification teams whether a parity error automatically leads to a cache miss. We will update you as soon as we get a confirmation.
Thanks and regards
Yashwant
Hi Matthias,
Apologies for the delay in response. We just got word from the applications team that they are still working on it. We'll post the answer to your query as soon as possible.
Thanks!
-Yashwant