IMX93 watchdog stopped during probe causing unrecoverable errors


Bence
Contributor II

We are using the i.MX93 processor, and I came across an issue where a crash during boot left the system stuck. I thought this was not possible, because the watchdog is enabled in the bootloader and should restart the board if the boot process is interrupted before it reaches the watchdog daemon that starts feeding it.

Looking at the driver (imx7ulp_wdt.c), it seems to me that the probe function calls the init function, which unlocks the watchdog and writes the CS register without setting the EN bit, effectively disabling the watchdog. In the boot logs I can see that the probe runs before I hit the problem described above, so I think this is the reason my board won't restart.

The expected behaviour is that there is no point in the boot process where an issue could halt the board permanently. If my assessment is correct, such a window exists between the probe and the start of the watchdog daemon. In the imx2_wdt driver for the older processors I can see that the WDOG_HW_RUNNING bit is set, so the watchdog framework starts and feeds the watchdog (if I'm correct, not 100% sure). I can't see anything similar in this newer driver.
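The hand-over described above for imx2_wdt can be modelled in plain userspace C. This is only a sketch of the decision made at probe time, not kernel code: `struct model_wdt`, `model_wdt_probe` and `MODEL_HW_RUNNING` are hypothetical names standing in for the driver state and the framework's WDOG_HW_RUNNING status bit, and the position of the EN bit (bit 7) is an assumption that should be checked against imx7ulp_wdt.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the enable flag in the WDOG CS register. */
#define WDOG_CS_EN (1u << 7)

/* Hypothetical stand-in for the framework's WDOG_HW_RUNNING status bit. */
#define MODEL_HW_RUNNING (1u << 0)

struct model_wdt {
        uint32_t cs;     /* simulated CS register contents */
        uint32_t status; /* simulated watchdog_device status flags */
};

/* True if the bootloader (or anyone else) left the watchdog enabled. */
static bool model_wdt_is_running(const struct model_wdt *wdt)
{
        return (wdt->cs & WDOG_CS_EN) != 0;
}

/*
 * Probe-time decision: if the hardware is already counting, mark it as
 * running so that the core would keep feeding it until userspace opens
 * the device. If it is stopped, leave the status untouched.
 */
static void model_wdt_probe(struct model_wdt *wdt)
{
        assert(wdt != NULL);

        if (model_wdt_is_running(wdt))
                wdt->status |= MODEL_HW_RUNNING;
}
```

In a real driver the equivalent step would presumably also reprogram the timeout before setting the status bit, so that the period the core feeds against matches what the hardware is actually counting.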

After the kernel has booted and the watchdog daemon (systemd wdctl in our case) starts, the watchdog behaves as expected. I can trigger a reset by intentionally inducing a kernel panic, so I assume the device tree and everything else are correct.

For testing purposes I hacked a call to start the watchdog into the init function, and with that change the board resets when an error occurs. To me this proves that the issue is the driver not starting the watchdog after initializing it.

Is there something I missed? To me, this seems like a significant bug in the driver. What do you think?

4 Replies

Bence
Contributor II

The error happens after the watchdog driver is loaded.

I can see this line in the logs:

[    1.185698] imx7ulp-wdt 42490000.watchdog: imx7ulp wdt probe

A few seconds later the error occurs and the board does not restart. I don't think the error itself matters; what matters is that the watchdog is not running at this stage of the boot process even though it was started in the bootloader.

To verify that the problem is in the probe function (more precisely, in the init function called by probe), I tried setting the enable bit in init:

static int imx7ulp_wdt_init(struct imx7ulp_wdt_device *wdt, unsigned int timeout)
{
        /* enable 32bit command sequence and reconfigure */
        u32 val = WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
                  WDOG_CS_WAIT | WDOG_CS_STOP | WDOG_CS_EN; // WDOG_CS_EN added as test

 

As expected, the watchdog runs and resets the board if there is a problem during the boot process. This is not a good solution, just proof that the problem is that probe clears the enable bit.
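A less blunt variant of the test above would keep the enable bit only when it was already set before init ran, instead of forcing it on. The following is a minimal userspace sketch of that register calculation, not a patch: `wdt_init_cs_value` is a hypothetical helper, and the bit positions are assumptions written from memory of imx7ulp_wdt.c that should be verified against the kernel tree in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CS register bit positions; verify against imx7ulp_wdt.c. */
#define WDOG_CS_CMD32EN (1u << 13)
#define WDOG_CS_CLK     (1u << 8)  /* LPO clock select */
#define WDOG_CS_EN      (1u << 7)
#define WDOG_CS_UPDATE  (1u << 5)
#define WDOG_CS_WAIT    (1u << 1)
#define WDOG_CS_STOP    (1u << 0)

/*
 * Hypothetical helper: build the value that init would write to CS,
 * carrying over EN from the value read back before reconfiguration.
 */
static uint32_t wdt_init_cs_value(uint32_t current_cs)
{
        uint32_t val = WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
                       WDOG_CS_WAIT | WDOG_CS_STOP;

        /* Preserve the enable state left by the bootloader. */
        val |= current_cs & WDOG_CS_EN;

        /* Postcondition: EN in the result mirrors EN in the input. */
        assert((val & WDOG_CS_EN) == (current_cs & WDOG_CS_EN));
        return val;
}
```

Keeping EN set is only half of the picture. Something still has to feed the watchdog between probe and the start of the daemon, which is what the WDOG_HW_RUNNING mechanism mentioned for imx2_wdt is meant to provide; otherwise a healthy boot that takes longer than the timeout would also be reset.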

Zhiming_Liu
NXP TechSupport

Hi @Bence 

The watchdog enable bit is set in imx7ulp_wdt_start when the user operates the watchdog timer through the sysfs interface. So the watchdog driver is not used for resetting on a kernel panic; it is used for the watchdog timer interface. Of course, you can enable it in probe, depending on your application.


Bence
Contributor II

I understand how it works. That is exactly the problem. What you describe means that the watchdog does not operate until the sysfs interface is used late in userspace. There is a window of 5-10 seconds during the boot process in which this very basic safety function is not operational. That is a big problem.


Zhiming_Liu
NXP TechSupport

Hi @Bence 

Can you share where the unrecoverable errors occur? Before or after the WDT driver is loaded?
