i.MX8M crashes on its own when idle (doing nothing)


dylan_lin
Contributor III

Hello experts,

I found that my i.MX8M board crashes on its own when it is idle and doing nothing

(it stops printing logs and does not accept any commands).

It tends to happen after about 1 to 2 hours.

However, after I issue the suspend command, the i.MX8M does not crash.

The crash does not happen under heavy load, and it does not happen while any thread is running.

Device information is as follows:

CPU : MIMX8MQ6CVAHZAB

LPDDR4 : MT53B768M32D4NQ-062 WT:B

PMIC : PF4210

25 Replies

Tomas555
Contributor I

We are running into the same problem. Our device, an i.MX8MP running Android 11.2.6.0 with kernel 5.10.72, locks up randomly after a few hours or a few days. The only thing that shows up in the log is the following, with no prior warning and nothing else in dmesg:
02-07 04:45:06.035 0 0 W : [T17870] **************************
02-07 04:45:06.040 0 0 W : [T17870] *** GPU DRV CONFIG ***
02-07 04:45:06.045 0 0 W : [T17870] **************************


followed by an insane amount of dumps and register values.

After that, the Android layer is completely locked until hard reboot.

However, I can still connect through adb and gather logs, ANRs, tombstones, etc. But it happens randomly. We stress the device as much as we can and get it up to 85 deg C, then we leave it and go home. In the morning we sometimes come back to a completely frozen device; other times it works for 2-3 days and then freezes. The image on the display is just stuck, as if you had paused a video.

I have tried 50+ things to "unfreeze" the device with no actual luck. The only thing that works is a power cycle or reboot. 

 

Any help would be highly appreciated.

Bashcroft1
Contributor I

Dear @igorigor  (and anybody else that reads this)

We are experiencing what appear to be 'random' reboots, similar to what has been described in this thread. We have exhausted all manner of options trying to debug, so I have resorted to checking online to see if anybody else has experienced the same problem. I found this thread; the symptoms that have been described sound similar to what we are experiencing. Those being:

- What appear to be random reboots; however, there is some correlation with the load on the iMX8. Units under less load appear to reboot more often than those under heavier load.

- The reboot is being instigated by a watchdog timer, because of some kind of lockup of the CPU. We have disabled the watchdog timer and observed the processor just stopping (and becoming unresponsive indefinitely).

Hardware details are as follows:

NXP MIMX8MQ6DVAJZAB

RAM = (1GB to 4GB) LPDDR4. Kingston Q3222PM1WDGTK-U 

Does anybody have any advice?

Thanks for your attention.

robin_stridh
Contributor I

Hi Dylan and Igor,

We have a very similar problem with random freezes on i.MX8MQ and 4GB LPDDR4.

I didn't really understand what actually solved the problem in your case - was it the settings of DDRC.PWRCTL and DDRC.DERATEEN or just using RPA_v24 or something else (there is a note about some additional documents sent by email)?

If it was just the change to RPA_v24 - did you analyze what caused the change?

Best Regards,

Robin

dylan_lin
Contributor III

Hi Robin,

Could you describe the issue you are seeing on your device?

My device crashed when the system went idle, and I found that it was a DDR refresh issue.

RPA_V24 adds an option to enable or disable LPDDR auto derating, but RPA_V23 does not.

RPA_V23 uses auto derating by default, depending on the DDR MR4 value.

robin_stridh
Contributor I

Hi Dylan,

Thank you very much for answering!

Our problem seems very similar to yours. The system only freezes when doing nothing - when heavily loaded it never crashes. Normally it takes between a couple of hours and a week between freezes so it's quite hard to debug. 

We have initially used RPA_V23 with NXP BSP 4.14.78 and later RPA_v24 with BSP 4.19.35 but we didn't change the setting about DRAM derating. We will certainly test this!

What confuses me a bit is the note in RPA_v24 that says this is only relevant when booting the system at over 85 deg C but I'm certain our system is cooler than that when started. Was this the same for you?

Did you only change the "LPDDR4 MR4 manual de-rate workaround" from "Automatic" to "Option 1" or "Option 2" or did you also change some of the parameters for MR4?

Best Regards,

Robin

dylan_lin
Contributor III

Dear Robin,

OK, I will do my best to describe these options.

Automatic means that the i.MX8M DRAM controller adjusts refresh automatically based on the LPDDR4 mode register MR4. If the LPDDR4's MR4 value does not change, the DRAM controller does not change the refresh timing, which may cause data loss at higher temperatures.

Option 1 forces the i.MX8M DRAM controller to run 2x refresh timing at <= 105 deg C (industrial temperature grade LPDDR4 --> 105 deg C).

Option 2 forces the i.MX8M DRAM controller to run 4x refresh timing at <= 125 deg C (automotive temperature grade LPDDR4 --> 125 deg C).

With Option 1 or Option 2, the i.MX8M DRAM controller refreshes faster and ignores the MR4 value. Either option fixed this issue for me.

From what I observed, the LPDDR MR4 value may be abnormal when this issue occurs.
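
As a rough illustration of the three settings described above, the following Python sketch models which refresh-rate multiplier each mode would select. It is a simplified model under stated assumptions, not the controller's actual logic: the names `DerateMode`, `refresh_rate_multiplier`, `refresh_interval_ns` and the `mr4_multiplier` parameter are hypothetical, and "2x/4x refresh timing" is assumed to mean refreshing 2 or 4 times as often.

```python
from enum import Enum


class DerateMode(Enum):
    """The three choices of the RPA MR4 de-rate setting, as described in this thread."""
    AUTOMATIC = "automatic"
    OPTION_1 = "option1"
    OPTION_2 = "option2"


def refresh_rate_multiplier(mode, mr4_multiplier=1.0):
    """Return how many times faster than nominal the DRAM is refreshed.

    mr4_multiplier is a hypothetical stand-in for whatever rate the DRAM
    requests through MR4; it is only honoured in AUTOMATIC mode.
    """
    if mode is DerateMode.OPTION_1:
        return 2.0  # forced 2x, MR4 ignored
    if mode is DerateMode.OPTION_2:
        return 4.0  # forced 4x, MR4 ignored
    return mr4_multiplier  # AUTOMATIC: follow MR4


def refresh_interval_ns(base_interval_ns, mode, mr4_multiplier=1.0):
    """Refreshing N times as often divides the refresh interval by N."""
    return base_interval_ns / refresh_rate_multiplier(mode, mr4_multiplier)


# If MR4 never changes from its nominal value, AUTOMATIC never speeds up refresh,
# while the forced options always do.
print(refresh_rate_multiplier(DerateMode.AUTOMATIC, mr4_multiplier=1.0))  # 1.0
print(refresh_rate_multiplier(DerateMode.OPTION_1, mr4_multiplier=1.0))   # 2.0
```

The point of the model is only that a stuck or abnormal MR4 reading affects Automatic mode but not the two forced options.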

robin_stridh
Contributor I

Hi Dylan,

We have been running the system with "Option 1" now for 24 hours without a crash, so at least it is starting to look hopeful. What concerns me is that the note in RPA_24 suggests that this fix is only relevant at temperatures over 85 deg C, but our system is never close to that temperature.

Was your system running that hot or did the fix also work for you at lower temperatures?

Best regards,

Robin

dylan_lin
Contributor III

Hi Robin,

That is good news for you. I understand your concern, so you can check two things:

1 --> Check whether the die temperature (not the junction temperature) in the i.MX8M goes over 85 deg C under heavy load.

2 --> Observe whether the device power consumption has increased.

I had the same concern as you. My device has an aluminum heat sink on it, so I believed it never went over 85 deg C, but I think this issue may be a bug in the i.MX8M.

94393400
Contributor III

Hi Dylan,

Which code did you change to configure the 4GB LPDDR4? I changed the code, but system_server always hits a native crash in normal use or during a monkey test.

What I changed is described in the thread below; please help check it. Many thanks.

native crash and low mem when take monkey test with LPDDR 4GB in imx8Mq evk 

dylan_lin
Contributor III

Hi Zhulin,

Does your device have 4GB of LPDDR4 per controller, or in total?

1. I checked your RPA2_V24 document and the parameter is wrong. The i.MX8M supports a maximum of 24Gb; see section 2.1.2 "Cortex-A53 Memory Map" in the RM.

2. When you modify LPDDR4 parameters, you should calibrate and stress-test them before you use the monkey app.

3. In addition, the LPDDR4 MR4 option in RPA_V24 should be set to Option 1, not Automatic.

94393400
Contributor III

Hi Dylan,

If I configure 4GB of DDR, how should I change the code in U-Boot and the kernel? I configured it as 3+1; is that correct?

      Many thanks.

dylan_lin
Contributor III

Hi Zhulin,

The i.MX8M supports only 24Gb of total density, which means it supports only 3GB. If you use a higher-density DRAM, only 3GB can be used and the remaining capacity is not usable. You can check the "Dram Memory Map" in the RM to double-check the memory addresses.

So your code (bootloader and kernel) should be modified to use the correct DRAM density.
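
The density arithmetic above (gigabits versus gigabytes) can be checked with a short sketch. The 24Gb limit is the figure quoted in this thread, and the function names are hypothetical, chosen only for illustration.

```python
BITS_PER_BYTE = 8
MAX_SUPPORTED_GBIT = 24  # total density limit as quoted in this thread


def gbit_to_gbyte(gbit):
    """Convert a density in gigabits (Gb) to gigabytes (GB)."""
    return gbit / BITS_PER_BYTE


def usable_gbyte(installed_gbyte, max_gbit=MAX_SUPPORTED_GBIT):
    """Capacity that can actually be used, capped at the supported maximum."""
    return min(installed_gbyte, gbit_to_gbyte(max_gbit))


print(gbit_to_gbyte(24))  # 3.0 -> 24Gb is 3GB, not 24GB
print(usable_gbyte(4))    # 3.0 -> a 4GB part is capped at 3GB
```

This is why the RPA density field is expressed in Gb: a value of 24 there corresponds to 3GB of addressable DRAM.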

94393400
Contributor III

Hi Dylan,

       

1. Should RPA2_V24 be configured to 24GB?

2. You wrote "so your code ( bootloader & kernal ) should modify correct dram density." How do I change the bootloader and kernel code?

94393400
Contributor III

Hi Dylan,

Thanks for your reply. The 4GB of LPDDR4 on our device is the total. Of the 3 points you mentioned, we have already done the 2nd and 3rd: we ran calibration and the stress test, and we also chose Option 1.

But I cannot follow the 1st point. You said "the I.MX8M maximum supports 24Gb , check RM file 2.1.2 Cortex-A53 Memory Map".

Does it mean that the i.MX8M can't support 3GB? Regarding RM section 2.1.2 Cortex-A53 Memory Map, how should I configure the parameters in the RPA2_V24 document? Can "Total DRAM density (Gb)" only be set to 24GB?

pastedImage_1.png

dylan_lin
Contributor III

Hello Igor,

I am back to say thank you.

I am sure it was a DDR refresh issue, and using RPA_v24 fixed it.

Thanks a lot.

dylan_lin
Contributor III

Dear Igor,

I tried to reproduce the issue on the i.MX8M EVK, but it does not reproduce on the EVK.

I used this LPDDR4 4GB and 512 MB to build the image. Before that, I had already used DDR Tool v2.10 to complete calibration and the stress test.

So are you suggesting that I recalibrate the LPDDR4 parameters?

igorpadykov
NXP Employee

Yes, please recalibrate using the latest MX8M_LPDDR4_RPA_v24.xlsx

and L4.14.98  Linux 4.14.98_2.2.0 Documentation

from linux-imx - i.MX Linux kernel 

Best regards
igor

dylan_lin
Contributor III

Dear Igor,

I used your RPA_v24 to recalibrate the LPDDR4,

but the result was the same.

However, I analysed the EVK's DDR parameters after using the suspend command. The EVK's parameter is as follows:

DDRC.PWRCTL=0x00000000

But when I checked my board, the parameter was as follows:

DDRC.PWRCTL=0x00000203 

 

In the RM, under 9.3.3.1.108.3 Fields, I found the related description for the register at address 3D4F_0020:

derate_enable :

Enables derating. Present only in designs configured to support LPDDR2/LPDDR3/LPDDR4. This field
must be set to '0' for non-LPDDR2/LPDDR3/LPDDR4 mode.
0b - Timing parameter derating is disabled
1b - Timing parameter derating is enabled using MR4 read value

So I modified:

DDRC.PWRCTL=0x00000203 ---> DDRC.PWRCTL=0x00000088

DDRC.DERATEEN.DERATE_ENABLE=0x01

So far, the system has not crashed after 24 hours. Can you explain this,

or do you have more information?
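
To see exactly which bits differ between the two DDRC.PWRCTL values quoted above, a small sketch can list the set bit positions of each value. It deliberately does not name the bit fields; those should be looked up in the PWRCTL description in the RM. The function names are illustrative only.

```python
def set_bits(value, width=32):
    """Return the positions of all bits that are 1 in a register value."""
    return [bit for bit in range(width) if value & (1 << bit)]


def changed_bits(old, new, width=32):
    """Return the bit positions whose value differs between old and new."""
    return set_bits(old ^ new, width)


BOARD_PWRCTL = 0x00000203     # original value read on the custom board
MODIFIED_PWRCTL = 0x00000088  # value after the modification

print(set_bits(BOARD_PWRCTL))                       # [0, 1, 9]
print(set_bits(MODIFIED_PWRCTL))                    # [3, 7]
print(changed_bits(BOARD_PWRCTL, MODIFIED_PWRCTL))  # [0, 1, 3, 7, 9]
```

Every bit that was set in the original value is cleared in the new one, and two different bits are set, so the change touches five fields' worth of bit positions rather than a single one.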

hthiery
Contributor I

Hi Dylan,

Was the issue solved in the end?

BR,

Heiko

igorpadykov
NXP Employee

Hi Dylan

 

I sent an additional document via mail.

 

Best regards
igor