watchdog failed to trigger reboot with BSP44 image on RDB3

hittzt
Senior Contributor I

Hi,

 

I tested the watchdog with the BSP44 image on an RDB3 board, but it failed to reboot the board:

s32g399ardb3 login: root
root@s32g399ardb3:~# echo V > /dev/watchdog
root@s32g399ardb3:~# echo X > /dev/watchdog
root@s32g399ardb3:~#
root@s32g399ardb3:~#
root@s32g399ardb3:~#
root@s32g399ardb3:~#
root@s32g399ardb3:~#
root@s32g399ardb3:~#
root@s32g399ardb3:~# dmesg | grep -i watch
[ 0.355630] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[ 1.128811] s32cc-wdt 4010c000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.139008] s32cc-wdt 40200000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.148260] s32cc-wdt 40204000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.157517] s32cc-wdt 40208000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.166787] s32cc-wdt 40500000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.176040] s32cc-wdt 40504000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.185300] s32cc-wdt 40508000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 1.194554] s32cc-wdt 4050c000.watchdog: S32CC Watchdog Timer Registered. timeout=30s (nowayout=0)
[ 5.012870] s32cc-dwmac 4033c000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[ 6.407310] systemd[1]: Started Dispatch Password Requests to Console Directory Watch.
[ 6.435274] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[ 320.459240] watchdog: watchdog0: watchdog did not stop!
root@s32g399ardb3:~#
root@s32g399ardb3:~#

......

The board did not reboot, even after several minutes.

The full log is attached.

Could you please help check this issue?

 

Thanks,

Zhantao
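
For readers unfamiliar with this test: the two commands rely on the Linux watchdog "magic close" convention. Opening /dev/watchdog starts the timer, and the following close only stops it if the last write contained the character V (and nowayout is 0, as in the log above). The C sketch below is a simplified model of that check, plus a helper that mirrors `echo X > /dev/watchdog`. The function names are hypothetical, and this is not the kernel or s32cc-wdt driver source.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Simplified model of the "magic close" check: a write allows the next
 * close() to stop the timer only if it contains the character 'V'.
 */
int write_allows_stop(const char *buf, size_t len)
{
    int allow = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 'V')
            allow = 1;
    }
    return allow;
}

/*
 * Arm the watchdog at `path` and close it without the magic character,
 * like `echo X > /dev/watchdog`. Returns 0 on success, -1 on error.
 * WARNING: on real hardware the board is expected to reset after the timeout.
 */
int arm_and_abandon(const char *path)
{
    int fd = open(path, O_WRONLY);    /* opening the device starts the timer */
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, "X", 1);    /* a keepalive ping; 'X' is not the magic character */
    close(fd);                        /* no 'V' was written, so the timer keeps running */
    return n == 1 ? 0 : -1;
}
```

Under this model, `echo V` leaves the watchdog stopped, while `echo X` leaves it armed. That matches the "watchdog did not stop!" message in the log, after which a reset would be expected once the 30 s timeout expires.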

alejandro_e
NXP TechSupport

Hello @hittzt

Thanks for reaching out to us. I checked the public and internal documentation for NXP-AUTO-Linux and could not find any usage example for the watchdog.

Please share your expectations and use case for the Linux watchdog so I can help you further.

 

Thanks

hittzt
Senior Contributor I

Hi @alejandro_e,

 

The watchdog is meant to reboot the board when the system hangs or crashes. With the BSP43 image, the same commands:

echo V > /dev/watchdog

echo X > /dev/watchdog

do trigger a board reboot, so you can compare the two releases.

Is something wrong in BSP44?

 

Thanks,

Zhantao

alejandro_e
NXP TechSupport

Hello @hittzt,

I tested the behavior with BSP43 and investigated the issue further. Some commits applied to the /drivers/watchdog folder in BSP44 might have introduced the problem. I will check this internally and let you know of any relevant updates.

 

Thanks for your patience.

hittzt
Senior Contributor I

Hi @alejandro_e,

 

Thanks for your reply.

Yes, I checked the related kernel and ATF code, and it seems that some ATF commits in BSP44 may affect the watchdog behavior.

I am working on this issue too and trying to fix it.

 

Thanks,

Zhantao

alejandro_e
NXP TechSupport

Hello @hittzt,

Could you share the ATF commits that you think are related to the watchdog functionality? I have not found any commits that seem related to the watchdog, at least none that are exclusive to BSP44.

 

Thanks!

hittzt
Senior Contributor I

Hi @alejandro_e,

 

I noticed that ATF changed a lot in the BSP44 release, and some of the changes touch the power and PSCI related code. However, I could not find which patch or commit caused the issue. It may not be caused by just one or two patches.

I tried reverting some of them, but the issue still exists.

More work is needed to find a fix.

 

Thanks,

Zhantao

alejandro_e
NXP TechSupport

Hello @hittzt,

I performed some tests and can conclude that the problem is in the kernel. If you flash the whole BSP44 sdcard image and replace the kernel image with the one from BSP43, the watchdog works as expected. Is it possible for you to continue working with this workaround? I am still waiting for an official answer from the internal team.

 

Thanks

hittzt
Senior Contributor I

Hi @alejandro_e,

 

If you compare the watchdog driver in BSP43 and BSP44, there are no changes in the driver.

So it should not be a driver issue.

To be clear: are you saying that you only replaced the BSP44 kernel image with the BSP43 one and kept the BSP44 ATF and U-Boot images for the test?

 

Thanks,

Zhantao

alejandro_e
NXP TechSupport

Hello @hittzt,

I do see several changes in the drivers/watchdog directory; however, none of them seem related to NXP products.

Nevertheless, the results of my test stand. To clarify: I used the precompiled BSP44 sdcard image and replaced the "Image" file, which is the kernel, with the one from BSP43 (also taken from the precompiled binaries). The file is in the only partition that is readable when the sdcard is connected to a Windows machine:

[Image: alejandro_e_0-1753898314834.png]

I did not change the DTB or the PFE FW.

With this setup, the Linux watchdog device worked as expected with the two commands you shared previously.

Let me know if you are able to see the same behavior.

Thanks 

 

 

hittzt
Senior Contributor I

Hi @alejandro_e,

 

I found that the watchdog issue appears to be caused by the following commit:

From a565a2439fce7c9ad6c0a41d12c1353928838be4 Mon Sep 17 00:00:00 2001
From: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com>
Date: Tue, 4 Feb 2025 10:55:56 +0200
Subject: [PATCH 941/977] s32cc: fccu: Fix reactions setup and enablement

commit a565a2439fce7c9ad6c0a41d12c1353928838be4 from
https://github.com/nxp-auto-linux/linux

Reset reactions counter for each element (fault) of the reactions list.
Otherwise, when one reaction is enabled, the rest which follow would be
wrongly enabled also, even if 'S32CC_FCCU_REACTION_NONE' is used as
action.

Issue: ALB-12478
Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com>

 

Could you please check this and provide a fix for the issue?

 

Thanks,

Zhantao
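
The commit message describes a counter that was not reset between faults. A minimal sketch of that class of bug, assuming a simple per-fault loop, is shown below. It is a hypothetical illustration and not the FCCU driver code: the function names, the `enabled` array, and the loop structure are assumptions. Only the reaction macro names and values are taken from this thread.

```c
#include <stddef.h>

#define S32CC_FCCU_REACTION_NONE       0
#define S32CC_FCCU_REACTION_ALARM      (1 << 1)
#define S32CC_FCCU_REACTION_FUNC_RESET (1 << 2)

/*
 * Buggy variant (hypothetical): the counter is initialised once, so a
 * non-zero value from an earlier fault leaks into every later fault.
 * Returns the number of faults marked as enabled.
 */
size_t enable_faults_buggy(const unsigned int *actions, size_t n, int *enabled)
{
    size_t total = 0;
    unsigned int reactions = 0;          /* never reset between faults */

    for (size_t i = 0; i < n; i++) {
        if (actions[i] != S32CC_FCCU_REACTION_NONE)
            reactions++;
        enabled[i] = reactions > 0;      /* stale count enables NONE entries too */
        total += (size_t)enabled[i];
    }
    return total;
}

/* Fixed variant (hypothetical): the counter is reset for each fault. */
size_t enable_faults_fixed(const unsigned int *actions, size_t n, int *enabled)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int reactions = 0;      /* reset for every element */

        if (actions[i] != S32CC_FCCU_REACTION_NONE)
            reactions++;
        enabled[i] = reactions > 0;
        total += (size_t)enabled[i];
    }
    return total;
}
```

For an action list of ALARM, ALARM, NONE, NONE, NONE, NONE, the buggy variant marks all six faults as enabled, while the fixed variant enables only the first two.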

alejandro_e
NXP TechSupport

Hello @hittzt,

Thanks a lot for the analysis. I tested reverting that commit and can confirm that, without that change, the watchdog functionality works as expected. I will share this information with the responsible team and let you know of any relevant updates.

 

Thanks again

hittzt
Senior Contributor I

Hi @alejandro_e,

 

To fix the issue, we can update the fccu node as follows:

diff --git a/arch/arm64/boot/dts/freescale/s32cc.dtsi b/arch/arm64/boot/dts/freescale/s32cc.dtsi
index cbdd4493d25c..b154098103b4 100644
--- a/arch/arm64/boot/dts/freescale/s32cc.dtsi
+++ b/arch/arm64/boot/dts/freescale/s32cc.dtsi
@@ -873,7 +873,7 @@ fccu: fccu@4030c000 {
         nxp,ncf_fault_list = <0 10 35 36 37 38>;
         nxp,ncf_actions = <S32CC_FCCU_REACTION_ALARM
                            S32CC_FCCU_REACTION_ALARM
-                           S32CC_FCCU_REACTION_NONE
+                           S32CC_FCCU_REACTION_FUNC_RESET
                            S32CC_FCCU_REACTION_NONE
                            S32CC_FCCU_REACTION_NONE
                            S32CC_FCCU_REACTION_NONE>;

 

With this change, the issue is fixed.

I am not sure whether this change is the best way to fix the issue. Please try it and share it with your internal development team.

 

Thanks,

Zhantao
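
The patch reads as if `nxp,ncf_fault_list` and `nxp,ncf_actions` are parallel lists, so that the third action applies to fault 35. Assuming that pairing holds, the lookup can be sketched in C as follows. The helper names are hypothetical, and the macro values are copied from the reaction definitions quoted in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define S32CC_FCCU_REACTION_NONE       0
#define S32CC_FCCU_REACTION_ALARM      (1 << 1)
#define S32CC_FCCU_REACTION_FUNC_RESET (1 << 2)
#define S32CC_FCCU_REACTION_NMI        (1 << 3)
#define S32CC_FCCU_REACTION_EOUT       (1 << 4)

/*
 * Return the action paired with `fault`, assuming faults[i] and actions[i]
 * describe the same entry. Returns -1 if the fault is not in the list.
 */
long action_for_fault(const uint32_t *faults, const uint32_t *actions,
                      size_t n, uint32_t fault)
{
    for (size_t i = 0; i < n; i++) {
        if (faults[i] == fault)
            return (long)actions[i];
    }
    return -1;
}

/* Map a single reaction value to a short name for logging. */
const char *reaction_name(uint32_t action)
{
    switch (action) {
    case S32CC_FCCU_REACTION_NONE:       return "NONE";
    case S32CC_FCCU_REACTION_ALARM:      return "ALARM";
    case S32CC_FCCU_REACTION_FUNC_RESET: return "FUNC_RESET";
    case S32CC_FCCU_REACTION_NMI:        return "NMI";
    case S32CC_FCCU_REACTION_EOUT:       return "EOUT";
    default:                             return "UNKNOWN";
    }
}
```

With the fault list <0 10 35 36 37 38>, the unpatched action list gives NONE for fault 35, and the patched list gives FUNC_RESET.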

alejandro_e
NXP TechSupport
Accepted solution

Hello @hittzt,

I have received an update from the internal team. The behavior is related to what you pointed out in your latest reply. Here are the details:

"

This behavior in BSP44 is expected. By default, the watchdog fails to trigger a functional reset because the reaction is set to S32CC_FCCU_REACTION_NONE.

[Image: alejandro_e_0-1754411952829.png]

The FCCU supports the following reactions:

#define S32CC_FCCU_REACTION_NONE 0
#define S32CC_FCCU_REACTION_ALARM (1 << 1)
#define S32CC_FCCU_REACTION_FUNC_RESET (1 << 2)
#define S32CC_FCCU_REACTION_NMI (1 << 3)
#define S32CC_FCCU_REACTION_EOUT (1 << 4)

If you intend to enable functional reset, configure the reaction as S32CC_FCCU_REACTION_FUNC_RESET in the DTS. Then rebuild the DTS and deploy it to the board. The watchdog functionality will then work as expected.

--------------

Why did it work in BSP43?

In BSP43, there was a software bug in the FCCU driver (tracked as issue ALB-12478 in the commit). Due to this bug, the system could still trigger a reset on a watchdog timeout even when the reaction was set to S32CC_FCCU_REACTION_NONE. This bug has been fixed in BSP44, so the system now triggers only the reaction configured in the DTS. You can see this in the BSP44 release notes:

[Image: alejandro_e_1-1754411953007.png]

"

 

In conclusion, the correct approach is to update the device tree node to use the required reaction type.

Please let me know if this information solves your problem.

I apologize for the time this took, and I hope it did not cause you any problems.

 

Thanks
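
After rebuilding and deploying the DTB, one way to confirm what the running kernel actually received is to read the `nxp,ncf_actions` property back from the live device tree under /sys/firmware/devicetree/base and decode it. Device tree cells are stored as big-endian 32-bit values. The sketch below decodes such a buffer. The function name is hypothetical, and the exact path of the fccu node on the board is not given in this thread, so reading the file is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode big-endian 32-bit device tree cells from `raw` into `out`.
 * Returns the number of cells decoded, or 0 if `len` is not a multiple
 * of 4 or `out` is too small to hold the result.
 */
size_t decode_dt_cells(const unsigned char *raw, size_t len,
                       uint32_t *out, size_t out_cap)
{
    size_t cells = len / 4;

    if (len % 4 != 0 || cells > out_cap)
        return 0;

    for (size_t i = 0; i < cells; i++) {
        const unsigned char *p = raw + 4 * i;

        out[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
    return cells;
}
```

If the patched device tree is in use, the third decoded cell would be expected to equal 4, which is the value of S32CC_FCCU_REACTION_FUNC_RESET (1 << 2).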
