We are currently porting Android 13 to an i.MX8MQ-based EVK.
I downloaded the Android BSP from the NXP website and followed the Android user guide as-is to compile and flash the BSP.
The boot process stops at "Starting kernel ..." and, after some time, the board reboots. The U-Boot log is attached.
Debugging further, we verified the DTB by comparing the hex dump at 0x426f0400 (printed in U-Boot) with dtbo.img (opened in a hex editor); they match. U-Boot reports moving the kernel image from 0x40480000 to 0x40600000 (the debug messages are in the U-Boot log). The kernel Image is present at 0x40480000 but not at 0x40600000. We suspect the image was not moved properly and that U-Boot is stuck in the armv8_switch_to_el2 function (bootm.c), i.e. the exception level transition is not happening properly.
For testing, we hardcoded the kernel address as 0x40480000 and got a "Synchronous Abort". The crash log is attached.
Kindly share your thoughts on this in order to debug further and resolve it. Thanks in advance.
Are you using the EVK or your own custom board? Can you share the i.MX8MQ part number, like MIMX8MQXXXX?
If you are using a custom board, have you changed the DDR part compared with the i.MX8MQ EVK?
Best Regards
Zhiming
Hi @Zhiming_Liu
iMX part number - MIMX8MQ6CVAHZAB
DDR - mt53e1g32d2fw-046 wt:c
Yes. It is our customized board.
For a different DDR part, you need to complete the DDR test by following the guide below. Once the DDR calibration is finished, replace the DDR timing in U-Boot with the values generated for your DDR.
Best Regards
Zhiming
Hi, we have done our DDR calibration and training, and it was successful. We are getting the error below during boot:
Starting kernel ...
"Error" handler, esr 0xbf000002
elr: ffffffff81123ea8 lr : ffffffff81129058 (reloc)
elr: 00000000405fbea8 lr : 0000000040601058
x0 : 00000000fd6c9208 x1 : 00000000fd6c92d0
x2 : 00000000ff7f2478 x3 : 0000000000000000
x4 : 0000000040600000 x5 : 0000000000000001
x6 : 0000000000000008 x7 : 000000000000000f
x8 : ffffffffffffffff x9 : 00000000fd6c94d8
x10: 0000000000000010 x11: 0000000000000002
x12: 0000000000000002 x13: 000000000000a1dc
x14: 00000000fd6c9128 x15: 00000000ffffffff
x16: 00000000f8000000 x17: 0000000006b3ac22
x18: 00000000fd6d7d70 x19: 0000000000000000
x20: 0000000000000000 x21: 00000000fd6c9208
x22: 00000000fd6c92d0 x23: 00000000fd6c94d8
x24: 00000000fd6c94d8 x25: 00000000ff6dad2c
x26: 0000000000000000 x27: 00000000ff7b00ab
x28: 0000000042760000 x29: 00000000fd6c91b0
Code: 910083fd d5384108 d5384114 f9431d08 (52800c89)
Resetting CPU ...
resetting ...
Debugging further, we found that the kernel image produced by the build ($ ./imx-make.sh kernel -c -j4) and the kernel image at offset 0x1000 in boot.img ($ make bootimage -j4) look different (the Image and boot.img are attached; we opened both in a hex viewer and compared them). It looks like no encryption/decryption is enabled in U-Boot. Do you have any input on this?
Can you test whether you can read and write DDR memory from the U-Boot terminal?
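As a side note on the comparison above, a byte-for-byte mismatch does not necessarily point to encryption: the two files may simply hold different kernels, or the kernel inside boot.img may be compressed. The following is a minimal sketch, not an NXP tool, that pulls the kernel payload out of a boot image and guesses its type. It assumes the standard AOSP boot image header layout (kernel size at byte 8, header version at byte 40, fixed 4096-byte pages for header version 3 and later); the function names are made up for illustration.

```python
import struct

BOOT_MAGIC = b"ANDROID!"
ARM64_IMAGE_MAGIC = b"ARM\x64"  # found at offset 0x38 of an uncompressed arm64 Image


def kernel_from_boot_img(boot: bytes) -> bytes:
    """Return the kernel payload stored in an Android boot image (hypothetical helper)."""
    if boot[:8] != BOOT_MAGIC:
        raise ValueError("not an Android boot image")
    kernel_size = struct.unpack_from("<I", boot, 8)[0]
    header_version = struct.unpack_from("<I", boot, 40)[0]
    if header_version >= 3:
        page_size = 4096  # fixed for header v3 and later
    else:
        page_size = struct.unpack_from("<I", boot, 36)[0]
    # The kernel starts on the first page after the header.
    return boot[page_size:page_size + kernel_size]


def describe_kernel(kernel: bytes) -> str:
    """Guess the payload type from well-known magic numbers."""
    if kernel[0x38:0x3C] == ARM64_IMAGE_MAGIC:
        return "uncompressed arm64 Image"
    if kernel[:2] == b"\x1f\x8b":
        return "gzip-compressed"
    if kernel[:4] == b"\x02\x21\x4c\x18":
        return "lz4 (legacy frame) compressed"
    return "unknown"
```

Running `describe_kernel(kernel_from_boot_img(open("boot.img", "rb").read()))` and `describe_kernel(open("Image", "rb").read())` on the two files would show whether they are even in the same format before comparing contents.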
Best Regards
Zhiming
Hi @Zhiming_Liu,
Yes. It is fine.
This is the log from an EVK without encryption/decryption. The correct kernel load address is 0x40600000. Have you removed your hardcoded 0x40480000 and tested with 0x40600000?
Kernel load addr 0x40480000 size 33035 KiB
kernel @ 40480000 (34537472)
ramdisk @ 44680000 (21557536)
fdt @ 426f0400 (49451)
Moving Image from 0x40480000 to 0x40600000, end=426f0000
## Flattened Device Tree blob at 426f0400
Booting using the fdt blob at 0x426f0400
Working FDT set to 426f0400
Using Device Tree in place at 00000000426f0400, end 00000000426ff52a
Working FDT set to 426f0400
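The addresses in this EVK log can be sanity-checked with a little arithmetic. Here is a minimal sketch that takes the start addresses and sizes printed above and checks which regions overlap. It assumes the moved kernel occupies exactly the loaded size (which agrees with the `end=426f0000` in the log); the helper names are illustrative only.

```python
def region(start, size):
    """Half-open address range [start, start + size)."""
    return (start, start + size)


def overlaps(a, b):
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


# Values copied from the EVK log above.
kernel_src = region(0x40480000, 34537472)  # where boot.img's kernel was loaded
kernel_dst = region(0x40600000, 34537472)  # where U-Boot moves it before booting
fdt = region(0x426F0400, 49451)
ramdisk = region(0x44680000, 21557536)

print(hex(kernel_dst[1]))                # 0x426f0000, matches "end=426f0000"
print(overlaps(kernel_src, kernel_dst))  # True: the move is an overlapping copy
print(overlaps(kernel_dst, fdt))         # False: the FDT sits just past the kernel
print(overlaps(kernel_dst, ramdisk))     # False
```

Since source and destination overlap, the data at 0x40480000 is partly overwritten by the move, so only the contents at 0x40600000 are meaningful to inspect afterwards.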
Hi @Zhiming_Liu,
Yes. We removed the hardcoded address and tested with the actual address @40460000. We still get the same sync abort error.
Starting kernel ...
"Error" handler, esr 0xbf000002
elr: ffffffff81123ea8 lr : ffffffff81129058 (reloc)
elr: 00000000405fbea8 lr : 0000000040601058
x0 : 00000000fd6c9208 x1 : 00000000fd6c92d0
x2 : 00000000ff7f2478 x3 : 0000000000000000
x4 : 0000000040600000 x5 : 0000000000000001
x6 : 0000000000000008 x7 : 000000000000000f
x8 : ffffffffffffffff x9 : 00000000fd6c94d8
x10: 0000000000000010 x11: 0000000000000002
x12: 0000000000000002 x13: 000000000000a1dc
x14: 00000000fd6c9128 x15: 00000000ffffffff
x16: 00000000f8000000 x17: 0000000006b3ac22
x18: 00000000fd6d7d70 x19: 0000000000000000
x20: 0000000000000000 x21: 00000000fd6c9208
x22: 00000000fd6c92d0 x23: 00000000fd6c94d8
x24: 00000000fd6c94d8 x25: 00000000ff6dad2c
x26: 0000000000000000 x27: 00000000ff7b00ab
x28: 0000000042760000 x29: 00000000fd6c91b0
Code: 910083fd d5384108 d5384114 f9431d08 (52800c89)
Resetting CPU ...
resetting ...
Try enabling earlyprintk (earlycon) in the kernel config and adding earlycon to the bootargs.
The bootargs should look like this, where 0x30860000 is your debug UART base address:
"console=ttymxc0,115200 earlycon=ec_imx6q,0x30860000,115200"
Hi @Zhiming_Liu
We have enabled earlyprintk in the kernel config (gki_defconfig: CONFIG_SERIAL_IMX_EARLYCON=y) and added it to the bootargs (imx8mq-evk.dts: bootargs = "console=ttymxc0,115200 earlycon=ec_imx6q,0x30860000,115200"). The output logs are attached.
It's hard to debug an SError. Typically this error is caused by hardware: either a CPU layout problem or a problem with the external DDR. Do you have any other test boards with this problem? If all of them have it, please check your hardware design.
Best Regards
Zhiming
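For readers following the crash logs, the `esr 0xbf000002` value can be decoded by hand. Below is a minimal sketch that splits an AArch64 ESR value into its architectural fields (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The table lists only a few exception classes and the function name is made up for illustration.

```python
# A few AArch64 exception classes (EC field); not a complete table.
EC_NAMES = {
    0x20: "Instruction Abort from a lower EL",
    0x21: "Instruction Abort taken without a change in EL",
    0x24: "Data Abort from a lower EL",
    0x25: "Data Abort taken without a change in EL",
    0x2F: "SError interrupt",
}


def decode_esr(esr):
    """Split an ESR_ELx value into EC, IL and ISS."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": il,
        "iss": iss,
        # For an SError, ISS bit 24 (IDS) set means the rest of the
        # syndrome is implementation defined.
        "ids": (iss >> 24) & 0x1,
    }


info = decode_esr(0xBF000002)
print(hex(info["ec"]), info["ec_name"], hex(info["iss"]))
```

For the value in the logs this gives EC 0x2f (SError interrupt) with the IDS bit set, so the fault is an asynchronous error rather than a synchronous abort, and the remaining syndrome bits are implementation defined.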
I traced my problem further down the line. It appears to get stuck just before the kernel is entered. The blocking point appears to be the switch to EL2 in arch/arm/lib/bootm.c. The code runs up to armv8_switch_to_el2((u64)images->ft_addr, 0, 0, 0, images->ep, ES_TO_AARCH64);, but it does not return from this function, as I discovered by adding debug messages. The switch function itself is in transition.S.
I have attempted to comment out this section, but doing so causes the board to reset.
I would be grateful for any assistance with this, thanks!
Maybe this is a hardware issue? Have you checked the power sequence against the hardware design guide and the datasheet?
Best Regards
Zhiming
Hi @Zhiming_Liu
Yes. We verified the power sequence and it is good. We verified the complete DDR by writing data and reading it back. We also verified the kernel and DTB in DDR by printing the data from U-Boot; both are present. It is still stuck in the armv8_switch_to_el2 function. We tried to debug the ASM code by writing debug messages to the UART data buffer, but that did not help. Is there a way to debug the ASM code further, and what happens inside armv8_switch_to_el2?
Thanks in advance.
Hi
armv8_switch_to_el2 jumps to the kernel entry address (x4 = images->ep). I don't think this address is wrong.
Have you tried the same image on a different board?
Hi @Zhiming_Liu,
Sorry for the late reply. We dug deeper and started debugging the ASM code. The Image and DTB are loaded properly, but the kernel does not start booting with boot.img. For testing purposes, we flashed boot-imx.img and it started booting. What is the major difference between the two? Also, it looks like Android is not up and running: with boot-imx.img we get some error messages in $ logcat. The error log is attached for your reference. Please share your comments on it.
boot.img: AOSP GKI kernel + vendor kernel modules
boot-imx.img: NXP linux-imx kernel image, not GKI kernel.
Have you modified the /device folder in AOSP, based on the EVK configuration?
Maybe your modifications introduced these errors.
Hi @Zhiming_Liu
No. We didn't change anything under /device. Our setup is: (1) Android 13 NXP kernel source plus our DDR and PMIC patch in U-Boot; (2) booting with boot-imx.img.
BTW, we are getting the following error in the dmesg log.
[ 22.727060][ T221] galcore 38000000.gpu3d: deferred probe timeout, ignoring dependency
[ 22.741628][ T221] galcore: probe of 38000000.gpu3d failed with error -110
No /dev/galcore device is detected. This is the root cause of the previous logcat error. What is the difference in galcore between Android 11 and Android 13? We would really appreciate your input on this.
There are lots of differences between Android 13 and Android 11 in the galcore source code.
From this log, it seems that the galcore driver probe has been deferred because of its resources. These resources could relate to hardware or device tree settings.
The gpu3d difference is in the last clock, from 800000000 to 400000000, but I don't think this is the root cause.
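For reference, the `-110` in the galcore log above can be read as a Linux errno. Here is a minimal sketch that extracts the code from such a dmesg line and names it. The table is a small hand-picked subset of Linux error numbers (EPROBE_DEFER is kernel-internal) and the function name is illustrative.

```python
import re

# Small subset of Linux errno values relevant to driver probing.
LINUX_ERRNO = {
    2: "ENOENT",
    12: "ENOMEM",
    16: "EBUSY",
    19: "ENODEV",
    22: "EINVAL",
    110: "ETIMEDOUT",
    517: "EPROBE_DEFER",
}


def probe_error(line):
    """Return (code, name) for a 'failed with error -N' dmesg line, or None."""
    m = re.search(r"failed with error (-?\d+)", line)
    if m is None:
        return None
    code = int(m.group(1))
    return code, LINUX_ERRNO.get(abs(code), "unknown")


line = "[ 22.741628][ T221] galcore: probe of 38000000.gpu3d failed with error -110"
print(probe_error(line))
```

ETIMEDOUT here fits the preceding "deferred probe timeout" message: the driver kept waiting for a dependency (for example a clock, power domain, or regulator described in the device tree) that never became available.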