IMX93 watchdog stopped during probe causing unrecoverable errors


588 views
Bence
Contributor II

We are using the i.MX93 processor, and I came across an issue where a crash/error occurred during boot and the system got stuck. I thought this was not possible, because the watchdog is enabled in the bootloader and should restart the board if the boot process is interrupted before it reaches the watchdog daemon that starts feeding it.

Looking at the driver (imx7ulp_wdt.c), it seems to me that the probe function calls the init function, which unlocks the watchdog and writes the CS register without setting the EN bit, essentially disabling the watchdog. In the boot logs I can see that the probe runs before I hit the problem described above, so I think this is the reason my board won't restart.

The expected behaviour is that there is no point in the boot process at all where an issue could halt the board permanently. If my assessment is correct, then this is exactly the case after the probe and before the watchdog daemon starts. In the imx2_wdt driver for the older processors I can see that the WDOG_HW_RUNNING bit is set, so the watchdog framework starts and feeds the watchdog (if I'm correct; I'm not 100% sure). I can't see anything similar in this newer driver.
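To make the idea concrete, here is a minimal userspace model of the CS value that init writes, and of a hypothetical variant that keeps EN when the bootloader left the watchdog running. It is only a sketch under assumptions: the bit positions are the ones I read in imx7ulp_wdt.c and should be checked against your kernel tree and the reference manual, and `cs_init_value()` and `cs_init_value_keep_en()` are illustrative names that do not exist in the driver. In a real driver, the `hw_running` result would correspond to calling `set_bit(WDOG_HW_RUNNING, &wdd->status)` before registering the device, so that the watchdog core keeps feeding it until userspace takes over.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CS bit positions (as read from imx7ulp_wdt.c; verify in your tree). */
#define WDOG_CS_CMD32EN (1u << 13)
#define WDOG_CS_CLK     (1u << 8)  /* LPO clock selected */
#define WDOG_CS_EN      (1u << 7)
#define WDOG_CS_UPDATE  (1u << 5)
#define WDOG_CS_WAIT    (1u << 1)
#define WDOG_CS_STOP    (1u << 0)

/* Value written by init as quoted in this thread: EN is not part of it,
 * so a watchdog started by the bootloader ends up disabled. */
static uint32_t cs_init_value(void)
{
    return WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
           WDOG_CS_WAIT | WDOG_CS_STOP;
}

/* Hypothetical variant: keep EN only if it was already set before probe,
 * and report that so the caller can mark the hardware as running. */
static uint32_t cs_init_value_keep_en(uint32_t cs_before, bool *hw_running)
{
    uint32_t val = cs_init_value();

    *hw_running = (cs_before & WDOG_CS_EN) != 0;
    if (*hw_running)
        val |= WDOG_CS_EN;

    return val;
}
```

Compared with unconditionally setting EN in init, this keeps the current behaviour on boards whose bootloader never starts the watchdog.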

After the kernel has booted and the watchdog daemon (systemd wdctl in our case) starts, the watchdog behaves as expected. I can trigger a reset by intentionally inducing a kernel panic, so I assume the device tree and everything else is correct.

For testing purposes I hacked a call to start the watchdog into the driver's init function, and with that change the board resets when an error occurs. To me this proves that the issue is the driver not starting the watchdog after initializing it.

Is there something I missed? To me, this seems like a significant bug in the driver. What do you think?

4 Replies

535 views
Bence
Contributor II

It happens after the watchdog driver is loaded.

I can see in the logs the line:

 

[    1.185698] imx7ulp-wdt 42490000.watchdog: imx7ulp wdt probe

 

A few seconds later the error occurs and the board does not restart.
I don't think the error itself matters; what matters is that the watchdog is not running at this stage of the boot process even though it was started in the bootloader.
 
To verify that the problem is in the probe function (more precisely in the init called by probe) I tried setting the enable bit in init.

 

static int imx7ulp_wdt_init(struct imx7ulp_wdt_device *wdt, unsigned int timeout)
{
        /* enable 32bit command sequence and reconfigure */
        u32 val = WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
                  WDOG_CS_WAIT | WDOG_CS_STOP | WDOG_CS_EN; // WDOG_CS_EN added as test

 

As expected, the watchdog runs and resets the board if there is a problem during the boot process. This is not a good solution, just proof that the problem is that probe clears the enable bit.

515 views
Zhiming_Liu
NXP TechSupport

Hi @Bence 

The watchdog enable bit is set in imx7ulp_wdt_start when the user operates the watchdog timer through the sysfs interface. So the watchdog driver is not used for resetting on a kernel panic; it is used for the watchdog timer interface. Of course, you can enable it in probe, depending on your application.
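For reference, a minimal sketch of what "operating the watchdog from userspace" usually looks like on Linux: the start path is normally reached by opening the watchdog character device and then pinging it. The ioctls are from `<linux/watchdog.h>`; the helper name `wdt_arm()` and the device path passed to it are assumptions for illustration, not something taken from the driver or from this thread.

```c
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open the watchdog device, request a timeout in seconds, and ping once.
 * Returns the open file descriptor on success, or -1 on error.
 * The caller must keep the descriptor open and keep pinging it. */
static int wdt_arm(const char *path, int timeout_s)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;

    /* Opening the device normally starts the timer; set the timeout
     * and send a first keepalive right away. */
    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_s) < 0 ||
        ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
        /* Depending on driver settings, close() may not stop the timer. */
        close(fd);
        return -1;
    }

    return fd;
}
```

A caller would typically do something like `int fd = wdt_arm("/dev/watchdog0", 10);` as early as possible and then issue `WDIOC_KEEPALIVE` periodically. Until some process does this, nothing in userspace has started the watchdog.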


505 views
Bence
Contributor II

I understand how it works. That is exactly the problem. What you describe means that the watchdog does not operate until the sysfs interface is used late in userspace. There is a 5-10 second window during the boot process in which this very basic safety function is not operational. That is a big problem.


546 views
Zhiming_Liu
NXP TechSupport

Hi @Bence 

Can you share where the unrecoverable error occurs? Is it before or after the WDT driver is loaded?

 

 
