First instruction in normal mode gives an external abort.

billpringlemeir · ‎02-17-2015

The first instruction I execute in the normal world on the Vybrid tower Cortex-A5, with external aborts trapping to the monitor, gives a fault. I looked at a CP15 register and I believe that the 'External Abort' was clear on boot, but the first instruction I execute gives an external abort. I read the Cortex-A5 TRM and have the following initialization code for the normal world CP15:

@ Clear fault registers.

mov r1, #0

mcr p15, 0, r1, c6, c0, 0   @ dfar

mcr p15, 0, r1, c5, c0, 0   @ dfsr

mcr p15, 0, r1, c6, c0, 2   @ ifar

mcr p15, 0, r1, c5, c0, 1   @ ifsr

mcr p15, 2, r1, c0, c0, 0   @ csselr (cache size select)

mcr      p15, 0, r1, c12, c0, 0 @ VBAR

mcr p15, 0, r1, c2, c0, 0   @ TTBR0

mcr p15, 0, r1, c2, c0, 1   @ TTBR1

mcr p15, 0, r1, c2, c0, 2   @ TTBCR (ttb control)

mcr p15, 0, r1, c13, c0, 1 @ CONTEXTIDR

movw r1, #0x5555

movt r1, #0x5555

mcr p15, 0, r1, c3, c0, 0   @ NS-DACR set to all access.

@ Zero non-reserved bits in SCTLR, system control register.

mrc p15, 0, r1, c1, c0, 0 @ Read Control Register configuration data

bic r1, r1, #(7<<28)|(1<<25) @ TE, AFE, TRE, EE

bic r1, r1, #(3<<12)|(1<<10) @ V, I, SW

bic r1, r1, #7    @ C, A, M

mcr p15, 0, r1, c1, c0, 0 @ Write Control Register configuration data

@@ Cortex-A Series programmers guide pg15.-3, example 15-3...

@ Invalidate L1 Caches (normal world)

@ Invalidate Instruction cache

mov r1, #0

mcr p15, 0, r1, c7, c5, 0

@ Invalidate Data cache

@ to make the code general purpose, we calculate the

@ cache size first and loop through each set + way

    mrc p15, 1, r0, c0, c0, 0 @ Read Cache Size ID

    ldr r3, =0x1ff

    and r0, r3, r0, lsr #13   @ r0 = no. of sets - 1

    mov r1, #0     @ r1 = way counter

2:    mov r3, #0     @ way_loop: r3 = set counter

1:    mov r2, r1, lsl #30 @ set_loop: way index in bits [31:30]

    orr r2, r2, r3, lsl #5 @ r2 = set-way cache operation format

    mcr p15, 0, r2, c7, c6, 2 @ Invalidate line described by r2

    add r3, r3, #1 @ Increment set counter

    cmp r0, r3     @ Last set reached yet?

    bne 1b         @ if not, iterate set_loop

    add r1, r1, #1 @ else, next

    cmp r1, #4     @ Last way reached yet?

    bne 2b         @ if not, iterate way_loop

    @ Invalidate TLB

    mcr p15, 0, r1, c8, c7, 0

isb

    @ Branch Prediction invalidate

mcr p15, 0, r1, c7, c5, 6
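The set/way arithmetic in the loop above can be checked on a host machine. The following is a minimal sketch in C, assuming the ARMv7-A CCSIDR field layout and a 4-way L1 data cache with 32-byte lines; the helper names are hypothetical and the example register value in the note below is illustrative, not read from a Vybrid. Note also that the loop as posted exits when the set counter equals "sets - 1", so it appears to stop one set short of the last one.

```c
#include <assert.h>
#include <stdint.h>

/* Cache Size ID Register (CCSIDR) fields, ARMv7-A layout (assumed). */
static inline uint32_t ccsidr_num_sets(uint32_t ccsidr)
{
    return ((ccsidr >> 13) & 0x7fffu) + 1u;   /* NumSets field is sets - 1 */
}

static inline uint32_t ccsidr_num_ways(uint32_t ccsidr)
{
    return ((ccsidr >> 3) & 0x3ffu) + 1u;     /* Associativity field is ways - 1 */
}

static inline uint32_t ccsidr_line_bytes(uint32_t ccsidr)
{
    return 1u << ((ccsidr & 0x7u) + 4u);      /* LineSize is log2(words) - 2 */
}

/* Operand for the set/way invalidate (c7, c6, 2), hypothetical helper.
 * Assumes 4 ways and 32-byte lines: way in bits [31:30], set from bit 5 up,
 * cache level 0 in bits [3:1]. */
static inline uint32_t dcisw_operand(uint32_t way, uint32_t set)
{
    assert(way < 4u);
    return (way << 30) | (set << 5);
}
```

For an illustrative CCSIDR value of 0x001FE019, the helpers report 256 sets, 4 ways and 32-byte lines, and the operand for way 3, set 255 is 0xC0001FE0.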

Are there other CP15 registers that may cause the ExtAbt? I have no clue as to which peripheral may cause this, as the 'FSR/FAR' are not relevant. I only look at the PC (dabt lr), and it gives the first instruction + 8. The exception is taken in monitor mode at vector 10h (data abort). It seems to happen whether the code runs from OCRAM or DDR3. I am guessing that it is due to the CP15 environment of the first instruction to execute. The initial monitor code for the world switch is,

/* The 'Secure Configuration Register' or SCR in CP15 c1, c1, 0

configures TrustZone; monitor mode and 'normal' capabilities.

*/

/* Set 'A' (external abort) flag from 'normal' mode. */

#define SCR_AW (1<<5)

/* Mask FIQ/secure interrupts from 'normal' mode. */

#define SCR_FW (1<<4)

/* Branch to monitor on external abort. */

#define SCR_EA (1<<3)

/* Branch to monitor on FIQ. */

#define SCR_FIQ (1<<2)

/* Branch to monitor on IRQ. */

#define SCR_IRQ (1<<1)

/* 1 is non-secure/normal and 0 is secure! */

#define SCR_NS (1<<0)

/* Switch to non-secure world, route FIQ to monitor,

no alignment traps. */

mov    r1, #(SCR_XX|SCR_FIQ|SCR_NS)

mcr    p15, 0, r1, c1, c1, 0

isb

/* load gen-regs */

msr spsr_fsxc, lr

ldm    sp, {r0 - r12, pc}^

This is executing in monitor mode. After the initial 'external abort', the secure world to normal world switching is fine without any abort exceptions. SCR_XX is set to SCR_EA to trap external aborts to the monitor; after the initial fault I use the SCR_AW flag instead, as there are no more (spurious?) external aborts. Is there any way to know if it is a Freescale IP which causes the ExtAbt vs. a Cortex-A5 internal (Freescale vs ARM CPU issue)? I read the L2 cache document and nothing jumped out. Initially, the caches are off when the normal world executes, but I guess data may pass through the caches and some access check happens? Dumping the L2 registers didn't seem to show anything significant. The 'TZASC' fuse is off and not configured (although defaults indicate all access). Can I only start the normal world from TZASC peripherals?
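The two SCR values described above can be written out explicitly. This is a minimal sketch in C that reuses the bit definitions from the post; the helper `scr_for_normal_world()` and its `first_entry` parameter are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* SCR bit positions, as defined in the post above. */
#define SCR_AW  (1u << 5)  /* normal world may change CPSR.A */
#define SCR_FW  (1u << 4)  /* normal world may change CPSR.F */
#define SCR_EA  (1u << 3)  /* external aborts branch to monitor */
#define SCR_FIQ (1u << 2)  /* FIQ branches to monitor */
#define SCR_IRQ (1u << 1)  /* IRQ branches to monitor */
#define SCR_NS  (1u << 0)  /* 1 = non-secure/normal world */

/* Hypothetical helper: SCR value to load before entering the normal world.
 * On the first entry, external aborts are trapped to the monitor (SCR_EA);
 * on later entries the normal world owns the A mask instead (SCR_AW). */
static inline uint32_t scr_for_normal_world(int first_entry)
{
    uint32_t scr = SCR_FIQ | SCR_NS;

    scr |= first_entry ? SCR_EA : SCR_AW;
    /* Exactly one of EA/AW is expected to be set by this helper. */
    assert(((scr & SCR_EA) != 0) != ((scr & SCR_AW) != 0));
    return scr;
}
```

With these definitions the first entry loads 0x0D and later entries load 0x25.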

CommunityBot · ‎09-03-2020

This is an automatic process.

We are marking this post as solved, due to either low activity or a reply being marked as correct.

If you have additional questions, please create a new post and reference this closed post.

NXP Community!

VilemZ · ‎02-24-2015

Hi Bill,

Are you using any sample code?

And do you handle all exceptions?

Best regards

Vilem

billpringlemeir · ‎03-04-2015

Attached is an MQX project that is a test case. There are several source files,

monitor.c - an MQX application with 'shell' support.
Monitor.S - a monitor handling file.
uart.S - some PC relative code that prints to the UART.

In the 'Monitor.S' there are a few '@' lines.

      @orr r1, r1, #0x100 @ mask aborts WJP: disable for now.

Enable this and the ISR will be set as the monitor will never take an abort.

      b    monitor_exit
      @ subs pc,lr,#8

If you enable the 'subs pc,lr,#8' and disable the 'b monitor_exit', then the uart.S hello_poll() will run to completion. When it is not set, the dabt_lr is set to the first instruction + 8.

The attached project compiles with MQX 4.1 with the gcc tools and a Linux host. I run the command with 'tftp 0x3f000000 monitor.bin; go 0x3f000000' on a Tower board.

Run the shell command 'test' to start the monitor. Then call 'yield' to let the normal world run. You can also change the 'Dead' task's priority. With the stock MQX, I need to keep interrupts masked.

The code in 'uart.S' is completely PIC and you can change the line,

+   monitor_run(0, (void (*)(void))hello_poll);
-   monitor_run(0, (void (*)(void))COPY_ADDR);

The code in the normal world then executes from OCRAM; but everything seems the same.

karina_valencia · ‎03-10-2015

timesyssupport can you help to review this case?

karina_valencia · ‎03-13-2015

reminder

Timesys Support can you help to review this case?

timesyssupport · ‎03-17-2015

Hello karinavalencia and billpringlemeir

Our Vybrid maintainers are reviewing this issue - Bill - can you clarify what Linux kernel you are using? Is this the Timesys Vybrid 3.0 or 3.13 kernel?

Thank you,

Timesys Support

billpringlemeir · ‎03-26-2015

Sorry, I was on vacation last week. I have an SPR open and I have attached a 'bare metal normal world' example using MQX on the A5 (see above). It doesn't matter what software is running in the normal world, so Linux is not needed to replicate the issue. Alejandro is currently trying to replicate the issue in SPR# 1-3667992351.

karina_valencia · ‎03-20-2015

billpringlemeir please provide the information requested.

billpringlemeir · ‎02-24-2015

Are you using any sample code?

I have various set-ups and they all behave the same. The above code would allow a 'bare metal' run of the normal world. I prefer not to post the whole monitor mode code. However, you need to set MVBAR and handle the 'data abort' exception. Here is a sample,

ldr r0, =monitor_vector_base

mcr p15, 0, r0, c12, c0, 1   @ write MVBAR.

...

/****************************/

/* The monitor vector table */

/****************************/

.balign 32

monitor_vector_base:

        b    .            /* 00:RESET    */

        b    .            /* 04:UNDEF    */

        b    mon_swi_entry        /* 08:SWI    */

        b    pabort_tz        /* 0C:IABORT    */ /* WJP */

        b    dabort_tz              /* 10:DABORT    */ /* WJP */

        b    .            /* 14:reserved    */

        b    mon_irq_entry        /* 18:IRQ    */

        /* b    mon_fiq_entry */    /* 1C:FIQ    */

The 'pabort_tz' and 'dabort_tz' will only be used if you trap the normal world external aborts. The first instruction to execute in the normal world will cause an external abort. I believe I have already given enough of a sample. In my case, I have both Linux and a 'bare metal' normal world program working. The 'bare metal' would normally keep EA masked. However, Linux will unmask them. If I unmask them in the bare metal, then I immediately get a fault. I think I set up the 'normal world' start above to enable them. I have MQX hosting the monitor. I cannot post my modifications to the Internet, both because of MQX licensing and because they are my company's IP. If I make a 'sample', it would be better to tell me what you want. I will do this work if you promise that it can be fixed and/or will be attended to by Freescale in a timely manner. I don't want to wait a year for information.

And do you handle all exceptions?

I handle the exceptions as in the vector table above. Note that the 'dabort_tz' is only taken in the MVBAR when an external abort happens in the normal world (some of these vectors (WJP) should not be needed in a real system). It seems that there is some external abort when running the first instruction in the normal world. How can I tell what is causing this?
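One way to narrow down the cause is to decode the fault status recorded for the abort. The following is a minimal sketch in C, assuming the ARMv7-A short-descriptor DFSR encoding (FS is bit 10 followed by bits 3:0) and assuming the monitor reads whichever DFSR copy the core updated for the abort. The names `classify_dfsr()` and `aborted_pc()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum abort_kind {
    ABORT_OTHER,      /* any other fault status */
    ABORT_EXT_SYNC,   /* synchronous external abort, FS = 0b01000 */
    ABORT_EXT_ASYNC   /* asynchronous external abort, FS = 0b10110 */
};

/* Assemble the 5-bit FS field: DFSR bit 10 becomes FS[4], bits 3:0 are FS[3:0]. */
static inline uint32_t dfsr_fs(uint32_t dfsr)
{
    return ((dfsr >> 6) & 0x10u) | (dfsr & 0xfu);
}

/* Hypothetical helper: classify a DFSR value (short-descriptor format assumed). */
static inline enum abort_kind classify_dfsr(uint32_t dfsr)
{
    switch (dfsr_fs(dfsr)) {
    case 0x08: return ABORT_EXT_SYNC;
    case 0x16: return ABORT_EXT_ASYNC;
    default:   return ABORT_OTHER;
    }
}

/* Address at which execution resumes after 'subs pc, lr, #8' in the
 * data abort handler. */
static inline uint32_t aborted_pc(uint32_t lr_abt)
{
    return lr_abt - 8u;
}
```

An asynchronous result would be consistent with an abort that was already pending before the world switch and is only taken once the normal world runs with the A mask clear; a synchronous result would point at the first instruction's own access. This is an interpretation aid, not a diagnosis.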

VilemZ · ‎02-26-2015

Hi Bill,

I'm sorry, I don't understand everything in your application. Can you explain this to me?

1. You are using two applications? One of them is Linux? What kind of Linux? What kernel? Second of them is MQX? Is it true?

2. Both applications are running simultaneously and working with some exceptions? OK?

3. When you unmask EA in Linux, then everything is OK? But when you unmask EA in MQX, then fault?

Do I understand correctly?

Best regards

Vilem

billpringlemeir · ‎02-26-2015

1. You are using two applications? One of them is Linux? What kind of Linux? What kernel? Second of them is MQX? Is it true?

I am running two OS's. One is a stock Linux. The 2nd is MQX with modified interrupt handling, no DMA, no NEON, etc. However, that really doesn't matter to the problem.

2. Both applications are running simultaneously and working with some exceptions? OK?

Yes. Trustzone works with three sets of vector tables. Again, it really doesn't matter.

3. When you unmask EA in Linux, then everything is OK? But when you unmask EA in MQX, then fault?

No. This is incorrect. If EA is masked/unmasked in the secure world (MQX), then there is no issue; there is no issue with the secure world (or a regular Vybrid boot). When the 'normal' world first executes, there is some 'EA' signaled to the CPU. If the normal world is allowed to mask EA (configured via TrustZone) and it is always masked, then there is no issue. However, Linux wishes to unmask the EA; normally it is quite far into the boot process. I modified the Monitor/TrustZone so that the normal world (Linux OR bare metal) is not able to mask EA and these aborts trap to the monitor. [Aside: After this, I disable the trap to monitor on EA and allow Linux (normal world) to mask/unmask and everything is fine]. In all cases, the monitor gets the EA with the 'lr' set to the first instruction + 8. I.e., there seems to be some EA signaled as soon as the first instruction executes. The only way to clear it is to take the exception in the monitor and return. After doing this, everything can work as normal. The MQX exception table has nothing to do with it. There are three exception tables,

Linux (normal world)
MQX (secure world)
Monitor

Which mode takes the exception when an abort happens is configurable. I have tried several permutations. I don't have trouble handling the exception. I want to know WHY the first normal world instruction causes an external abort. It certainly could be something that I need to set up before executing in the 'normal world', but I have no way to figure out what it is. There are many sub-systems in the Vybrid and I can't really know what device caused the exception. The normal world banked FAR is bogus and set to whatever I initially write to it before starting the 'normal world'. The symptoms are the same whether Linux runs or a bare metal application. It is not necessary to have two OS's running. Just set up an environment to run a small routine in the 'normal world'; interrupts do not need to be handled in the 'normal world', as a pure polling system also exhibits the problem.

So perhaps the CAAM, the L1/L2 cache, TZASC, etc. are triggering an EA (I haven't explicitly configured every peripheral in the system). Or maybe there is some spurious EA signaled when the normal world first starts. I want to know what peripheral/sub-system it is and if it can be prevented or if I must trap the first instruction to execute. Perhaps I will execute some 'trusted' code in the normal world to get rid of the spurious EA (if that is the case). I believe that I read the ISR,

asm(" mrc p15, 0, %0, c12, c1, 0\n" : "=r"(tmp)); /* ISR */

The pending abort (bit 8) does not appear until the normal world executes; it stays set until someone takes an EA exception, after which it is cleared and does not recur.
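The ISR check can be factored into a small decoder so the monitor can log the pending state before and after the world switch. This is a minimal sketch in C, assuming the ARMv7-A Interrupt Status Register layout (A in bit 8, I in bit 7, F in bit 6); the macro and function names are hypothetical, and the value passed in would come from the `mrc` shown above.

```c
#include <assert.h>
#include <stdint.h>

/* ISR pending bits (ARMv7-A layout, assumed). */
#define ISR_A (1u << 8)  /* external abort pending */
#define ISR_I (1u << 7)  /* IRQ pending */
#define ISR_F (1u << 6)  /* FIQ pending */

/* Hypothetical helper: returns 1 if an external abort is pending. */
static inline int isr_abort_pending(uint32_t isr)
{
    return (isr & ISR_A) != 0u;
}
```

For example, a raw value of 0x100 reports a pending abort, while 0x0C0 (IRQ and FIQ pending only) does not.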


VF5xx

VF6xx
