Does the source address passed to bootaux have to be 8-byte aligned?

xiaokaoy
Contributor I

Hello

I'm working with imx8dual and trying to start the M4 core from U-Boot using the bootaux command. I've found that bootaux only succeeds if the source address passed to it is 8-byte aligned.

For example:

With bootaux 0x90000000, the M4 starts and runs successfully.

With bootaux 0x90000004, it does not.

Reading the source code of bootaux, we can see that basically, bootaux calls arch_auxiliary_core_up, which, in turn, calls memcpy((void *)aux_core_ram, (void *)addr, size)

 

In the case of bootaux 0x90000004, however, the M4 can still be started by any of the following changes:

(1) call memset((void *)aux_core_ram, 0, size) before

memcpy((void *)aux_core_ram, (void *)addr, size)

(2) call memset((void *)aux_core_ram, 0xff, size) before

memcpy((void *)aux_core_ram, (void *)addr, size)

(3) replace memcpy((void *)aux_core_ram, (void *)addr, size) with

for (int i=0; i<size/4; i++)
    *((uint32_t*)aux_core_ram+i) = *((uint32_t*)addr+i);

 

I must add that 
for (int i=0; i<size; i++)
*((uint8_t*)aux_core_ram+i) = *((uint8_t*)addr+i);

doesn't help.
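
A slightly more defensive form of workaround (3) might look like the sketch below. It is not U-Boot code: copy_words32 is a hypothetical helper name. It writes the destination only as whole 32-bit words, reads the source through memcpy so that an unaligned source is harmless, and zero-pads a trailing partial word, which the size/4 loop above would silently drop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of U-Boot): copy `size` bytes to `dst`
 * using only whole 32-bit stores. `dst` must be 4-byte aligned and have
 * room for `size` rounded up to a multiple of 4.
 * Returns the number of 32-bit words written.
 */
static size_t copy_words32(volatile uint32_t *dst, const void *src, size_t size)
{
    const uint8_t *s = src;
    size_t words = (size + 3) / 4;

    assert(((uintptr_t)dst & 3u) == 0);

    for (size_t i = 0; i < words; i++) {
        uint32_t w = 0;                 /* zero padding for a partial tail */
        size_t left = size - i * 4;

        /* Byte-wise read into a local word: safe for an unaligned source. */
        memcpy(&w, s + i * 4, left < 4 ? left : 4);
        dst[i] = w;                     /* one full-word store */
    }
    return words;
}
```

Whether the hardware actually sees one 32-bit store per iteration still depends on the compiler and the memory attributes of the mapping, so this only illustrates the idea.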

 

Can anyone explain this?

 

1 Solution
jimmychan
NXP TechSupport

Hello,

 

I got the reply from the AE

----------------

I think the reason is ECC. In iMX8, ECC is enabled in TCM.

When ECC is enabled, the user needs to do an ECC clean first (write 0 to the whole TCML) and then write the customized image to the TCML area.

----------------
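
The two steps in that reply (clean, then load) could be sketched roughly as follows. This is an illustration under assumptions, not the U-Boot implementation: tcm_scrub_and_load is a hypothetical name, the caller supplies the A-core address and size of TCML, and the scrub uses 64-bit stores because the thread reports that word-sized writes were needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch only: tcm_scrub_and_load is a hypothetical name, not a U-Boot API.
 * `tcm` is the A-core view of TCML and `tcm_size` its size in bytes
 * (assumed 8-byte aligned and a multiple of 8).
 * Returns 0 on success, -1 if the image does not fit.
 */
static int tcm_scrub_and_load(volatile void *tcm, size_t tcm_size,
                              const void *image, size_t image_size)
{
    volatile uint64_t *words = tcm;
    volatile uint8_t *bytes = tcm;
    const uint8_t *src = image;

    assert(((uintptr_t)tcm & 7u) == 0 && (tcm_size & 7u) == 0);
    if (image_size > tcm_size)
        return -1;

    /* Step 1: "ECC clean" - write 0 to the whole TCML, one word at a time. */
    for (size_t i = 0; i < tcm_size / 8; i++)
        words[i] = 0;

    /* Step 2: write the image. After a full scrub, a byte-wise copy was
     * reported to work in this thread. */
    for (size_t i = 0; i < image_size; i++)
        bytes[i] = src[i];

    return 0;
}
```

The scrub deliberately covers the whole region rather than just the image length, matching the observation in the thread that the entire TCML had to be written even for a much smaller image.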

 

Best regards,

Jimmy

14 Replies
jimmychan
NXP TechSupport

I will check this for you.

xiaokaoy
Contributor I

Thanks.

 

Our tests show that at least 128K must be written into the TCM, and that it must be written word by word (a 32-bit or 64-bit value at a time).

Otherwise the M4 core fails to start (at least, it produces none of the expected output).

 

The memcpy provided by U-Boot copies byte by byte unless both the source and destination start addresses are multiples of 8

(See https://source.codeaurora.org/external/imx/uboot-imx/tree/lib/string.c?h=imx_v2020.04_5.4.70_2.3.0&i...).

Thus, if the M4 image is not placed at an 8-byte aligned address in DDR memory, the memcpy in arch_auxiliary_core_up

(at https://source.codeaurora.org/external/imx/uboot-imx/tree/arch/arm/mach-imx/imx8/cpu.c?h=imx_v2020.0...)

will copy byte by byte.

In that case, the M4 core won’t be able to start successfully.
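
The behaviour described above can be modelled with a simplified generic memcpy. The sketch below follows the structure described in this post, but it is not the U-Boot source; sketch_memcpy and the two counters are made up for illustration. On a 64-bit A-core build, unsigned long is 8 bytes, so a source such as 0x90000004 fails the alignment check and the whole copy falls through to the byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: record which path the last copy took. */
static size_t word_stores, byte_stores;

/*
 * Simplified model of a generic C memcpy (NOT the U-Boot source):
 * word-sized copies are used only if both pointers are word aligned.
 */
static void *sketch_memcpy(void *dest, const void *src, size_t count)
{
    unsigned long *dl = dest;
    const unsigned long *sl = src;
    unsigned char *d8;
    const unsigned char *s8;

    assert(dest && src);
    word_stores = byte_stores = 0;

    /* Fast path: both addresses are multiples of sizeof(unsigned long). */
    if ((((uintptr_t)dest | (uintptr_t)src) & (sizeof(*dl) - 1)) == 0) {
        while (count >= sizeof(*dl)) {
            *dl++ = *sl++;
            count -= sizeof(*dl);
            word_stores++;
        }
    }

    /* Unaligned case, or the remaining tail: one byte at a time. */
    d8 = (unsigned char *)dl;
    s8 = (const unsigned char *)sl;
    while (count--) {
        *d8++ = *s8++;
        byte_stores++;
    }
    return dest;
}
```

Copying between two aligned buffers increments only word_stores, while shifting the source by one byte sends every byte through the slow path, which is the case that fails on the TCM.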

 

jimmychan
NXP TechSupport

I got this reply:

Has the customer changed the load address in the M4 app linker script?

In the command "bootaux <addr>", the <addr> needs to be aligned with the entry address defined in M4 app linker script.

xiaokaoy
Contributor I

Thanks.

I didn't change the load address in M4 app linker script.

The address in "bootaux <addr>" command is an address in the DDR RAM. 

bootaux will copy the M4 app bin file from there to M4's TCM before kicking off M4.

What does "aligned with the entry address defined in M4 app linker script" mean?

jimmychan
NXP TechSupport

The load address of the M4 image should be the same as the entry address defined in the linker script.

xiaokaoy
Contributor I

Thanks. But I guess that requirement is for imx7. I'm using imx8dual.

jimmychan
NXP TechSupport

What do you mean by "But I guess that requirement is for imx7. I'm using imx8dual."?

xiaokaoy
Contributor I

https://source.codeaurora.org/external/imx/uboot-imx/tree/arch/arm/mach-imx/imx_bootaux.c?h=imx_v202...

I think that this function is for imx7 and that the addr parameter for it must be the start address of M4 TCML from the view of A core. 

 

However, https://source.codeaurora.org/external/imx/uboot-imx/tree/arch/arm/mach-imx/imx8/cpu.c?h=imx_v2020.0...

This function is for imx8qxp, and the boot_private_data parameter (i.e. the addr for bootaux) doesn't have to be the same as what the linker script specifies. Actually, they mustn't be the same; see:

https://source.codeaurora.org/external/imx/uboot-imx/tree/arch/arm/mach-imx/imx8/cpu.c?h=imx_v2020.0...

 

 

jimmychan
NXP TechSupport

Do you mean that for M4 image on imx8qxp, the entry address defined in linker script can be different from the load address in memory?

On imx8qxp, for example, the TCML address is 0x1FFE0000 from the CM4 local view and 0x34FE0000 from the AP view.

So in the linker script of the M4 image, the entry address is defined as 0x1FFE0000.

But in U-Boot, this image will be loaded to 0x34FE0000, which is 0x1FFE0000 from the CM4 local view.
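
The relation between the two views is a fixed offset, which a small helper can make explicit. The two base addresses come from this post; the helper name tcml_m4_to_ap is hypothetical, and the 128 KiB window size is an assumption taken from the figure mentioned earlier in the thread.

```c
#include <stdint.h>

/* Base addresses quoted in the post above for imx8qxp. */
#define TCML_BASE_M4_VIEW 0x1FFE0000u /* CM4 local view */
#define TCML_BASE_AP_VIEW 0x34FE0000u /* A-core (AP) view */
/* Assumption: 128 KiB window, the size mentioned earlier in this thread. */
#define TCML_SIZE         0x20000u

/*
 * Hypothetical helper: translate a CM4-local TCML address to the AP view.
 * Returns 0 if the address lies outside the TCML window.
 */
static uint32_t tcml_m4_to_ap(uint32_t m4_addr)
{
    if (m4_addr < TCML_BASE_M4_VIEW ||
        m4_addr - TCML_BASE_M4_VIEW >= TCML_SIZE)
        return 0;
    return m4_addr - TCML_BASE_M4_VIEW + TCML_BASE_AP_VIEW;
}
```

So a linker-script entry address of 0x1FFE0000 corresponds to a copy destination of 0x34FE0000 on the A-core side, while the source address given to bootaux is unrelated to either.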

xiaokaoy
Contributor I

Thanks, jimmychan. I knew that.

What bootaux does is copy the M4 image from somewhere in the DDR RAM to 0x34FE0000 and then kick off M4.

I've found that if it copies one 32/64-bit integer after another, it's OK. Otherwise (e.g. copies 8/16-bit integer after another) M4 would be unable to start/run successfully.

I was wondering if you could confirm that it is really required to copy the M4 image to 0x34FE0000 (i.e. the TCM of M4) that way.

jimmychan
NXP TechSupport

I got a reply from the internal team.

 

About "if it copies one 32/64-bit integer after another, it's OK. Otherwise (e.g. copies 8/16-bit integer after another) M4 would be unable to start/run successfully.", it seems to be related to alignment.

It is said in RM that:

"

Because AHB-Lite does not support write data strobes when accessing AHB-Lite slaves from an AXI master, care must be taken not to generate transactions that have partial strobes. Make sure to not have unaligned accessing to TCM from an AXI master. For example, when writing data to TCM from A53, ensure every write strobe address is 64bit aligned. When the MMU is enabled, the TCM memory range must have the MT_DEVICE_NGNRNE type attribute set. This will avoid A53 sparse writes to the TCM memory region.

"

  I don't know if this explanation is related to your findings.

  Could you do more tests on it?
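
One possible reading of the RM excerpt above is that a write to the TCM from the A-core should consist only of naturally aligned 64-bit stores. Under that assumption, a pre-check might look like the sketch below; is_whole_64bit_write is a hypothetical name and this interpretation has not been confirmed against the RM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: a write of `len` bytes starting at `addr` can be
 * issued purely as naturally aligned 64-bit stores only if both the
 * start address and the length are non-zero multiples of 8.
 */
static bool is_whole_64bit_write(uintptr_t addr, size_t len)
{
    return len != 0 && (addr & 7u) == 0 && (len & 7u) == 0;
}
```

A caller could use such a check to decide whether a region can be written directly or whether the data first needs padding to a multiple of 8 bytes.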

 

The M4 image can also run from DDR, so copying it to the TCM at 0x34FE0000 is not strictly required.

 

xiaokaoy
Contributor I

I've read the excerpt of the RM before. Thanks, anyway.

I've also found that writing 64/32-bit word by word is not enough; I must write the whole TCML, even if my M4 image is actually far smaller than TCML in size.

Furthermore, if I first zero the whole TCML with 64/32-bit word writes, then copying the M4 image to the TCML byte by byte also works.

xiaokaoy
Contributor I

Thank you, jimmychan.
