We are using the i.MX93 processor and I came across an issue where a crash/error occurred during boot and the system got stuck. I thought this was not possible, because the watchdog is enabled in the bootloader and it should restart the board if the boot process is interrupted before it gets to the watchdog daemon that starts feeding it.
Looking at the driver (imx7ulp_wdt.c), it seems to me that the probe function calls the init function, which unlocks the watchdog and writes the CS register without setting the EN bit, essentially disabling the watchdog. In the boot logs I can see that the probe runs before I hit the problem described above, so I think this is the reason my board won't restart.
The expected behaviour is that there is no point in the boot process at all where an issue could halt the board permanently. If my assessment is correct, the window after the probe and before the watchdog daemon starts is exactly such a point. In the imx2_wdt driver for the older processors I can see that the WDOG_HW_RUNNING bit is set, so the watchdog framework starts and feeds the watchdog (if I'm correct, not 100% sure). I can't see anything similar in this newer driver.
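To make the idea concrete, here is a minimal user-space sketch of the behaviour I would expect from the init path: if the bootloader left the watchdog enabled, keep EN set when rewriting CS and remember that the hardware is running. This is not the driver's code. The helper wdt_build_cs and the struct wdt_init_result are hypothetical names, and the bit positions are assumptions modelled on my reading of imx7ulp_wdt.c, so please check them against your kernel tree and the reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, modelled on imx7ulp_wdt.c; verify before use. */
#define WDOG_CS_CMD32EN (1u << 13)  /* 32-bit command sequence */
#define WDOG_CS_CLK     (1u << 8)   /* LPO clock select */
#define WDOG_CS_EN      (1u << 7)   /* watchdog enable */
#define WDOG_CS_UPDATE  (1u << 5)   /* allow later reconfiguration */
#define WDOG_CS_WAIT    (1u << 1)   /* keep counting in wait mode */
#define WDOG_CS_STOP    (1u << 0)   /* keep counting in stop mode */

/* Hypothetical result type, not part of the real driver. */
struct wdt_init_result {
	uint32_t cs;      /* value that would be written back to CS */
	bool hw_running;  /* true if the watchdog was already enabled */
};

/*
 * Hypothetical helper: build the reconfigured CS value from the value
 * the bootloader left in the register, preserving the EN bit.
 */
static struct wdt_init_result wdt_build_cs(uint32_t cs_from_bootloader)
{
	struct wdt_init_result r;

	r.cs = WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
	       WDOG_CS_WAIT | WDOG_CS_STOP;
	r.hw_running = (cs_from_bootloader & WDOG_CS_EN) != 0;
	if (r.hw_running)
		r.cs |= WDOG_CS_EN;  /* do not silently disable a running watchdog */

	return r;
}
```

In a real driver, the hw_running case would presumably map to setting WDOG_HW_RUNNING in the watchdog device's status before registering it, so that the watchdog core keeps pinging the hardware until userspace opens the device. I have not verified how this interacts with the unlock sequence on this IP, so treat it as a sketch only.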
After the kernel has booted and the watchdog daemon (systemd wdctl in our case) starts, the watchdog behaves as expected. I can trigger a reset by intentionally inducing a kernel panic, so I assume the device tree and everything else is correct.
For testing purposes I hacked a call to start the watchdog into the driver's init function, and with that change the board resets in case of an error. To me this proves that the issue is the driver not starting the watchdog after initializing it.
Is there something I missed? To me, this seems like a significant bug in the driver. What do you think?
It happens after the watchdog driver is loaded.
I can see in the logs the line:
[ 1.185698] imx7ulp-wdt 42490000.watchdog: imx7ulp wdt probe
static int imx7ulp_wdt_init(struct imx7ulp_wdt_device *wdt, unsigned int timeout)
{
	/* enable 32bit command sequence and reconfigure */
	u32 val = WDOG_CS_CMD32EN | WDOG_CS_CLK | WDOG_CS_UPDATE |
		  WDOG_CS_WAIT | WDOG_CS_STOP | WDOG_CS_EN; // WDOG_CS_EN added as test
Hi @Bence
The watchdog enable bit is set in imx7ulp_wdt_start when the user operates the watchdog timer through the sysfs interface. So the watchdog driver is not meant for resetting the board on a kernel panic; it provides the watchdog timer interface. Of course, you can enable it at probe time, depending on your application.
I understand how it works. That is exactly the problem. What you describe means that the watchdog does not operate until the sysfs interface is used late in userspace. There is a window of 5-10 seconds during the boot process in which this very basic safety function is not operational. That is a big problem.
Hi @Bence
Can you share where the unrecoverable error occurs? Before or after the WDT driver is loaded?