Are cache parity checks and handling enabled by default in S32K142?

Matthias_LEHMANN · ‎12-12-2022

The FMEDA for the Core module of the S32K142 (FMEDA SEooC Static - S32K142 / FMEDA_S32K142_Core_CUSTOMER.xlsx, version 1.0) shows a fairly high failure rate for soft errors / SEU in the cache (80 FIT out of a total failure rate of 85 FIT for the core module). "Parity" is claimed as a safety mechanism. It appears on the sheet that lists all the safety mechanisms relevant for the core module, but it is not linked to any assumption in the safety manual. Can I assume that parity is enabled by default? By "enabled" I mean that errors are both detected and handled. Or does either of these two things need to be manually enabled and/or configured?

Best regards

Matthias

\\// Matthias

Yashwant_Singh · ‎02-20-2023

Hi Matthias,

Apologies for the delay.

As mentioned earlier, the cache parity mechanism is enabled by default for detection but not for reporting. However, a parity error results in a cache miss, which causes the affected cache contents to be reloaded. The relevant fields of MCM_LMDR2 have default values that ensure both cache parity and parity miss are enabled by default. So the end user doesn't have to worry about the "handling" of the parity fault, as the contents of the cache will be reloaded anyway.

The respective flag bits in MCM_MPEIR are raised only if reporting has been enabled by writing to MCM_LMPECR.

I hope this helps!

Thanks!

-Yashwant

Yashwant_Singh · ‎01-13-2023

Hi Matthias,

Sorry to keep you waiting.

The cache parity mechanism is enabled by default for detection but not for reporting. The reporting needs to be enabled manually.

The following registers are related to enabling the error reporting:

MCM_LMDR2
MCM_LMPECR
MCM_MPEIR

Please refer to the Reference Manual to find the exact bit fields to configure.

Do let us know if you have any follow-up questions.
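As a rough illustration of what enabling reporting could look like in code, here is a minimal sketch. It assumes a simplified view of the three registers named above; the struct layout, the function names and the bit masks are hypothetical placeholders and not taken from the Reference Manual, so the real offsets and field positions must be looked up there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the three MCM registers named in this thread.
 * This is NOT a memory-accurate layout: the real offsets inside the
 * MCM block must be taken from the Reference Manual. */
typedef struct {
    volatile uint32_t LMDR2;  /* cache descriptor (parity / parity-miss enables) */
    volatile uint32_t LMPECR; /* control register used to enable reporting       */
    volatile uint32_t MPEIR;  /* error flags, named as in this thread            */
} mcm_parity_regs_t;

/* Set the reporting-enable bit(s) with a read-modify-write so that other
 * fields of the control register keep their values.
 * report_en_mask is a placeholder: take the real field from the manual. */
static void mcm_enable_cache_parity_reporting(mcm_parity_regs_t *regs,
                                              uint32_t report_en_mask)
{
    assert(regs != NULL);
    regs->LMPECR |= report_en_mask;
}

/* Return true if any of the given flag bit(s) is currently set.
 * flag_mask is a placeholder as well. */
static bool mcm_cache_parity_error_flagged(const mcm_parity_regs_t *regs,
                                           uint32_t flag_mask)
{
    assert(regs != NULL);
    return (regs->MPEIR & flag_mask) != 0u;
}
```

On the target, `regs` would point at the MCM block instead of a RAM structure, and an application-specific handler would then decide what to do once a flag is seen.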

Thanks!

-Yashwant

Matthias_LEHMANN · ‎01-13-2023

Sorry to insist on this issue, Yashwant, but I'd like to have clarity on this: My question was about "detection" and "handling" of the cache parity faults. Your answer suggests that errors are "detected" in the default configuration, but not "reported". Now how does "report" relate to "handle" (or "accommodate"), which is what I'm really after?

The FMEDA for the S32K142 shows quite a high FIT value for soft errors in the cache (80 FIT out of a total of 93 FIT for the overall Core module), which are assumed to be prevented with 99% effectiveness by the parity mechanism. I assume that this 99% figure is only valid if parity errors are detected AND accommodated in some way.
My question is: in the default configuration of the controller, does that 99% figure hold (i.e., in your words, is the "detection" sufficient to ensure the claimed prevention of dangerous faults), or are additional configuration steps necessary (i.e., does the end user need to enable error reporting and establish an application-specific fault handler routine)? If the latter is the case, then there should be an explicit assumption in the safety manual about this, which currently is not the case.
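To put numbers on why this matters, here is a rough sketch. It assumes the usual simplification that the residual failure rate is the raw rate multiplied by (1 - diagnostic coverage); the function name is made up for illustration, and the inputs are the FMEDA figures quoted above.

```c
#include <assert.h>

/* Residual failure rate (in FIT) that remains after a safety mechanism
 * with the given diagnostic coverage (0.0 .. 1.0) is applied.
 * Simplified model: residual = raw * (1 - coverage). */
static double residual_fit(double raw_fit, double coverage)
{
    assert(coverage >= 0.0 && coverage <= 1.0);
    return raw_fit * (1.0 - coverage);
}
```

With the FMEDA figures, `residual_fit(80.0, 0.99)` comes out at roughly 0.8 FIT, whereas `residual_fit(80.0, 0.0)`, i.e. parity detection without any accommodation, leaves the full 80 FIT.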

Best regards, Matthias

\\// Matthias

Yashwant_Singh · ‎01-15-2023

Hi Matthias,

Yes, it seems the latter is the case. The safety manual should have an explicit assumption regarding this.

We're looking into why it's not there and we'll get back to you shortly with an answer.

Thanks!

-Yashwant

Matthias_LEHMANN · ‎01-27-2023

Hi Yashwant,

Is there any update on this yet? Since this currently looks like a safety-relevant problem with the potential to completely spoil the HW metrics for the controller, I'd appreciate more timely feedback.

Our current understanding is the following: the best thing the application could do upon detection of a parity fault in the cache is to reset. The most effective solution, however, would be to simply reload the affected cache contents from the flash memory, which according to our understanding can only be triggered by the core itself. So there is still a faint hope on our side that there already is some "hidden accommodation" in place in the S32. But neither the reference manual nor the safety manual has much to say about this, and even the public Arm documentation doesn't help us very much here. (But for the latter, I think it should be NXP browsing through the Arm Architecture documents...)

\\// Matthias

Yashwant_Singh · ‎01-30-2023

Hi Matthias

We appreciate your patience. We are checking with the design and verification teams whether a parity error automatically leads to a cache miss. We will update you as soon as we get confirmation.

Thanks and regards

Yashwant

aarul · ‎02-20-2023

Dear Matthias
An update was posted by Yashwant Singh yesterday. Let us know if this resolves your query.
Appreciate your feedback.
Thanks and regards
-Aarul

Yashwant_Singh · ‎12-21-2022

Hi Matthias,

Apologies for the delay in response. We are currently awaiting some feedback from the applications team. We'll get back to you as soon as possible.

Thanks!

-Yashwant

Matthias_LEHMANN · ‎01-09-2023

Hi Yashwant. Is there any update so far on this question?

Best regards

\\// Matthias

Yashwant_Singh · ‎01-09-2023

Hi Matthias,

Apologies for the delay in response. We just got word from the applications team that they are still working on it. We'll post the answer to your query as soon as possible.

Thanks!

-Yashwant

Matthias_LEHMANN · ‎02-21-2023

Thank you, Yashwant. That's what I hoped for, but from the reference manual that behaviour wasn't 100% evident to me and our SW team.

\\// Matthias

Matthias_LEHMANN · ‎12-22-2022

Thank you, Yashwant. Looking forward to a reply. Happy holidays!

\\// Matthias