Kernel prints warnings/errors when temperature goes below zero degrees Celsius


433 Views
roshanmohammed
Contributor I

Board: i.MX6ULL

Version: Yocto 5.4.15

When booting at cold temperatures (below zero degrees Celsius), I get the following kernel stack trace, but the module continues to boot:

[    1.873145] ------------[ cut here ]------------
[    1.877807] WARNING: CPU: 0 PID: 63 at kernel/irq/chip.c:242 __irq_startup+0xa8/0xac
[    1.885554] Modules linked in:
[    1.888622] CPU: 0 PID: 63 Comm: kworker/0:2 Not tainted 5.4.70 #2
[    1.894805] Hardware name: Freescale i.MX6 UL/ULL (Device Tree)
[    1.900738] Workqueue: events deferred_probe_work_func
[    1.905906] [<8010efc0>] (unwind_backtrace) from [<8010c170>] (show_stack+0x10/0x14)
[    1.913666] [<8010c170>] (show_stack) from [<8012686c>] (__warn+0xcc/0xe4)
[    1.920557] [<8012686c>] (__warn) from [<801268fc>] (warn_slowpath_fmt+0x78/0xa8)
[    1.928055] [<801268fc>] (warn_slowpath_fmt) from [<801698f8>] (__irq_startup+0xa8/0xac)
[    1.936159] [<801698f8>] (__irq_startup) from [<8016993c>] (irq_startup+0x40/0x5c)
[    1.943742] [<8016993c>] (irq_startup) from [<8016667c>] (enable_irq+0x44/0xa0)
[    1.951066] [<8016667c>] (enable_irq) from [<80575288>] (imx_get_temp+0x188/0x2bc)
[    1.958654] [<80575288>] (imx_get_temp) from [<80572a44>] (thermal_zone_get_temp+0x48/0x68)
[    1.967020] [<80572a44>] (thermal_zone_get_temp) from [<8056f8ac>] (thermal_zone_device_update.part.0+0x28/0x168)
[    1.977293] [<8056f8ac>] (thermal_zone_device_update.part.0) from [<80570330>] (thermal_zone_device_register+0x4d8/0x600)
[    1.988260] [<80570330>] (thermal_zone_device_register) from [<80575828>] (imx_thermal_probe+0x330/0x5d4)
[    1.997840] [<80575828>] (imx_thermal_probe) from [<8049244c>] (platform_drv_probe+0x48/0x98)
[    2.006378] [<8049244c>] (platform_drv_probe) from [<8048ff38>] (really_probe+0x258/0x4bc)
[    2.014654] [<8048ff38>] (really_probe) from [<804905a4>] (driver_probe_device+0x78/0x1c4)
[    2.022929] [<804905a4>] (driver_probe_device) from [<8048db40>] (bus_for_each_drv+0x80/0xd0)
[    2.031465] [<8048db40>] (bus_for_each_drv) from [<8049026c>] (__device_attach+0xd0/0x1d4)
[    2.039739] [<8049026c>] (__device_attach) from [<8048ee58>] (bus_probe_device+0x84/0x8c)
[    2.047928] [<8048ee58>] (bus_probe_device) from [<8048f31c>] (deferred_probe_work_func+0x74/0xb4)
[    2.056899] [<8048f31c>] (deferred_probe_work_func) from [<8013f8cc>] (process_one_work+0x1b0/0x43c)
[    2.066044] [<8013f8cc>] (process_one_work) from [<8013fde0>] (worker_thread+0x288/0x60c)
[    2.074233] [<8013fde0>] (worker_thread) from [<80145508>] (kthread+0x174/0x17c)
[    2.081641] [<80145508>] (kthread) from [<801010e8>] (ret_from_fork+0x14/0x2c)
[    2.088871] Exception stack(0x9c4fdfb0 to 0x9c4fdff8)
[    2.093930] dfa0:                                     00000000 00000000 00000000 00000000
[    2.102118] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.110302] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    2.116921] ---[ end trace 0e110773c6f3345c ]---

 

What do these prints mean? Is there a patch available that resolves this issue?

I want to add a few more points.

I found the register 'Tempsensor Control Register 2', in which 'LOW_ALARM_VALUE' can be set. Please see the register details below.

Address: 20C_8000h base + 290h offset + (4d × i), where i = 0d to 3d.

roshanmohammed_1-1675689602411.png

 

I saw in the Linux driver that the register offset written is 290h + 4, which means the variable i is 1, and that the value written to this register is 0xfff. From our experiments in a temperature chamber, this setting appears to correspond to zero degrees Celsius: we see the kernel warning/error prints on boot whenever the temperature is at or below zero degrees Celsius.
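The address arithmetic above can be sketched as a small helper. This is only an illustration: the base (20C_8000h) and offset (290h) are the values quoted in this post, while the macro names and the function `tempsense2_addr` are hypothetical and do not come from the driver.

```c
#include <stdint.h>

/* Values quoted in the post: base 20C_8000h, register offset 290h. */
#define TEMPMON_BASE_ADDR   0x020C8000u  /* hypothetical macro name */
#define TEMPSENSE2_OFFSET   0x290u       /* hypothetical macro name */

/*
 * Hypothetical helper: address of the i-th 32-bit slot (i = 0..3) of
 * 'Tempsensor Control Register 2', following the formula
 * base + offset + (4 * i) from the register description.
 */
uint32_t tempsense2_addr(unsigned int i)
{
    return TEMPMON_BASE_ADDR + TEMPSENSE2_OFFSET + 4u * i;
}
```

With these numbers, i = 0 gives 0x020C8290 and i = 1 gives 0x020C8294, which is the 290h + 4 location mentioned above.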

What is the mechanism for setting the 'LOW_ALARM_VALUE' to -35 degrees?

2 Replies

387 Views
Zhiming_Liu
NXP TechSupport

Hello @roshanmohammed 

1. Please refer to the imx_set_alarm_temp function and its call trace.

2. Please check the thermal-zones node in your dts.


370 Views
roshanmohammed
Contributor I

Hi @Zhiming_Liu 

The imx_set_alarm_temp function is below:

static void imx_set_alarm_temp(struct imx_thermal_data *data,
			       int alarm_temp)
{
	struct regmap *map = data->tempmon;
	const struct thermal_soc_data *soc_data = data->socdata;
	int alarm_value;

	data->alarm_temp = alarm_temp;

	/* Convert the alarm temperature into a raw alarm value. */
	if (data->socdata->version == TEMPMON_IMX7D)
		alarm_value = alarm_temp / 1000 + data->c1 - 25;
	else
		alarm_value = (data->c2 - alarm_temp) / data->c1;

	/* Clear the high-alarm field, then set the new value. */
	regmap_write(map, soc_data->high_alarm_ctrl + REG_CLR,
		     soc_data->high_alarm_mask);
	regmap_write(map, soc_data->high_alarm_ctrl + REG_SET,
		     alarm_value << soc_data->high_alarm_shift);
}
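To make the non-i.MX7D branch of this function easier to reason about, here is a minimal standalone sketch of the same arithmetic. The helper names are hypothetical, and it assumes temperatures are in millidegrees Celsius (suggested by the `alarm_temp / 1000` in the i.MX7D branch) and that the alarm field is 12 bits wide (suggested by the 0xfff value discussed in this thread). The real c1 and c2 values come from the driver's calibration data and are not shown in this thread.

```c
/* 12-bit alarm field, an assumption based on the 0xfff value in this thread. */
#define ALARM_FIELD_MAX 0xfff

/*
 * Same arithmetic as the non-i.MX7D branch of imx_set_alarm_temp():
 * raw value = (c2 - alarm_temp) / c1, with alarm_temp assumed to be
 * in millidegrees Celsius.
 */
int imx6_alarm_count(int c1, int c2, int alarm_temp_mc)
{
    return (c2 - alarm_temp_mc) / c1;
}

/* Inverse mapping: the temperature that a given raw value stands for. */
int imx6_count_to_temp_mc(int c1, int c2, int count)
{
    return c2 - count * c1;
}

/* Returns 1 if the raw value fits into the 12-bit alarm field, else 0. */
int alarm_count_fits(int count)
{
    return count >= 0 && count <= ALARM_FIELD_MAX;
}
```

As a worked example with made-up constants c1 = 500 and c2 = 800000 (purely illustrative, not real calibration data), an alarm temperature of -35000 gives (800000 + 35000) / 500 = 1670. For a positive c1, a larger raw value corresponds to a lower temperature under this formula, so 0xfff would be the lowest temperature the field can express.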

I do not see any low-temperature setting in this function. It writes to the address 20C_8000h base + 180h offset + (4d × i), where i = 0d to 3d.

I think the low-temperature alarm is set in the following part of the driver:

	/* make sure the IRQ flag is clear before enabling irq on i.MX6SX */
	if (data->socdata->version == TEMPMON_IMX6SX) {
		regmap_write(map, IMX6_MISC1 + REG_CLR,
			     IMX6_MISC1_IRQ_TEMPHIGH | IMX6_MISC1_IRQ_TEMPLOW
			     | IMX6_MISC1_IRQ_TEMPPANIC);
		/*
		 * reset value of LOW ALARM is incorrect, set it to lowest
		 * value to avoid false trigger of low alarm.
		 */
		regmap_write(map, data->socdata->low_alarm_ctrl + REG_SET,
			     data->socdata->low_alarm_mask);
	}

This writes to the address 20C_8000h base + 290h offset + (4d × i), where i = 0d to 3d.

Reading that address returns 0xfff. The REG_SET value is 0x4.
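The 0xfff readback is what one would expect if a write to the REG_SET location ORs the written bits into the register. The sketch below simulates that behaviour in plain C and shows a clear-then-set sequence, mirroring what imx_set_alarm_temp does for the high alarm. Everything here is an assumption for illustration: `fake_reg`, `fake_write` and `set_low_alarm` are hypothetical names, REG_CLR = 0x8 and a field shift of 0 are assumed, and nothing in this thread confirms that the real driver or hardware offers such a low-alarm setter.

```c
#include <stdint.h>

#define REG_SET          0x4     /* value stated in this thread */
#define REG_CLR          0x8     /* assumption: clear alias follows the set alias */
#define LOW_ALARM_MASK   0xfffu  /* value written by the driver excerpt above */
#define LOW_ALARM_SHIFT  0       /* assumption: field starts at bit 0 */

/* Simulated register; stands in for the real memory-mapped register. */
struct fake_reg {
    uint32_t value;
};

/* Simulated write: SET ORs bits in, CLR clears bits, offset 0 overwrites. */
void fake_write(struct fake_reg *reg, unsigned int alias, uint32_t bits)
{
    if (alias == REG_SET)
        reg->value |= bits;
    else if (alias == REG_CLR)
        reg->value &= ~bits;
    else
        reg->value = bits;
}

/*
 * Hypothetical low-alarm setter: clear the whole field first, then set
 * the new raw value, the same pattern imx_set_alarm_temp() uses for the
 * high alarm.
 */
void set_low_alarm(struct fake_reg *reg, uint32_t count)
{
    fake_write(reg, REG_CLR, LOW_ALARM_MASK << LOW_ALARM_SHIFT);
    fake_write(reg, REG_SET, (count & LOW_ALARM_MASK) << LOW_ALARM_SHIFT);
}
```

Starting from a zeroed register, writing LOW_ALARM_MASK through REG_SET leaves 0xfff in the simulated register. Calling `set_low_alarm` afterwards replaces the field with the requested raw value. The clear step matters because a SET-only write could never turn existing 1 bits back into 0.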

With this setting, the kernel warnings appear below zero degrees Celsius.

The driver does not read any thermal zone values from the dtsi file, and the dtsi file does not contain any thermal zones either.
