Linux in non-secure world, Missing prefetch aborts and page faults

harald_walter3
Contributor II

Dear all,

I am running Linux in the non-secure world with TrustZone enabled.

To start with: this setup already runs on an i.MX6 Quad processor, but on the DualLite I have an issue during kernel startup.

Startup fails after the kernel has initialized properly, when it tries to jump to user space for the first time (init process, see function ret_to_user).

The address the kernel wants to jump to is valid; I compared it with the memory areas of the process (via the mm pointer of the process descriptor). A scan of the MMU page table shows that the memory is not mapped yet. On the reference system (i.MX6 Quad) a prefetch abort occurs, which calls the page fault handler. After the page fault handler has run, the memory area is mapped and ret_to_user is called again. This time no exception occurs when the kernel jumps to the address of the init process, and init starts.

On the i.MX6 DL, however, the prefetch abort never occurs and the page fault handler is never called. When ret_to_user is called the second time, it hangs in the work_pending subroutine.

I can see that the do_page_fault handler is called several times during kernel startup, but only from the data abort exception. So it seems that faulting works for data accesses but not for instruction fetches.

If I disable TrustZone by running everything in the secure world, the issue is gone.

In summary, the issue seems to be related to TrustZone AND the DualLite, but not to Quad processors.

Now let me describe our system:

- We run a 3.14.51 Linux kernel.

- The Linux kernel runs on the second core; another operating system runs asynchronously on the first core (AMP).

- The boot manager is self-written. Within the boot manager, the core on which Linux runs is set to non-secure. The NSACR and the Secure Configuration Register are set as in U-Boot. After this, the GIC is set to non-secure and the switch to the normal world takes place. Then the GIC distributor is enabled in the non-secure world. Interrupts work in general.

- The boot manager also takes care of a number of ARM errata regarding the L2 cache. In particular, the diagnostic control register can no longer be written from the non-secure world, so the boot manager writes it before the switch to non-secure takes place.
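The secure-side preparation described in the list above can be sketched in C as follows. This is only an illustration under assumptions: the post does not quote the values the boot manager (or U-Boot) actually writes, so the bit selections and the helper names scr_for_normal_world, nsacr_for_normal_world and gic_make_all_nonsecure are hypothetical. The bit positions follow the ARMv7-A Security Extensions and the Cortex-A9 NSACR layout; the real code would write the values with MCR instructions and to the GIC distributor's GICD_IGROUPRn registers instead of a plain array.

```c
#include <stdint.h>

/* Secure Configuration Register (SCR) bits, ARMv7-A Security Extensions. */
#define SCR_NS  (1u << 0)   /* core executes in non-secure state */
#define SCR_FW  (1u << 4)   /* non-secure world may modify CPSR.F */
#define SCR_AW  (1u << 5)   /* non-secure world may modify CPSR.A */

/* Non-Secure Access Control Register (NSACR) bits on a Cortex-A9. */
#define NSACR_CP10    (1u << 10)  /* non-secure access to coprocessor 10 (VFP/NEON) */
#define NSACR_CP11    (1u << 11)  /* non-secure access to coprocessor 11 (VFP/NEON) */
#define NSACR_NS_SMP  (1u << 18)  /* non-secure world may write ACTLR.SMP */

/* Assumed SCR value for the switch to the normal world (hypothetical choice). */
uint32_t scr_for_normal_world(void)
{
    return SCR_NS | SCR_FW | SCR_AW;
}

/* Assumed NSACR value: FPU access plus SMP bit control (hypothetical choice). */
uint32_t nsacr_for_normal_world(void)
{
    return NSACR_CP10 | NSACR_CP11 | NSACR_NS_SMP;
}

/*
 * Mark every interrupt as Group 1 (non-secure). igroupr stands in for the
 * GICD_IGROUPRn register bank; each bit covers one interrupt ID.
 */
void gic_make_all_nonsecure(volatile uint32_t *igroupr, unsigned int n_regs)
{
    unsigned int i;

    for (i = 0; i < n_regs; i++)
        igroupr[i] = 0xffffffffu;
}
```

All of this has to run before the NS bit takes effect, because SCR, NSACR and the interrupt group registers can only be changed from the secure world.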

1 Solution
harald_walter3
Contributor II

Hi,

We finally found the issue. It is related to an erratum workaround for the Level 2 cache that is not applicable when starting in the non-secure world. We moved the erratum workaround to our boot manager and removed it from the kernel code. That fixes it.

In system.c there is a function imx_init_l2cache. It writes the L2 prefetch control and power control registers. In the non-secure world, however, those registers are read-only. Writing them from the non-secure world produces no immediate error; the writes simply seem to have no effect. Nevertheless, a couple of seconds later, when the kernel has nearly finished starting up, this leads to an imprecise external abort.

When we remove the code, the kernel starts up.

This is the function:

#ifdef CONFIG_CACHE_L2X0
void __init imx_init_l2cache(void)
{
    void __iomem *l2x0_base;
    struct device_node *np;
    unsigned int val;

    np = of_find_compatible_node(NULL, NULL, "arm,pl310-cache");
    if (!np)
        goto out;

    l2x0_base = of_iomap(np, 0);
    if (!l2x0_base) {
        of_node_put(np);
        goto out;
    }

    /* Configure the L2 PREFETCH and POWER registers */
    val = readl_relaxed(l2x0_base + L2X0_PREFETCH_CTRL);
    val |= 0x70800000;
    /*
     * The L2 cache controller(PL310) version on the i.MX6D/Q is r3p1-50rel0
     * The L2 cache controller(PL310) version on the i.MX6DL/SOLO/SL is r3p2
     * But according to ARM PL310 errata: 752271
     * ID: 752271: Double linefill feature can cause data corruption
     * Fault Status: Present in: r3p0, r3p1, r3p1-50rel0. Fixed in r3p2
     * Workaround: The only workaround to this erratum is to disable the
     * double linefill feature. This is the default behavior.
     */
    if (cpu_is_imx6q())
        val &= ~(1 << 30 | 1 << 23);
    writel_relaxed(val, l2x0_base + L2X0_PREFETCH_CTRL);
    val = L2X0_DYNAMIC_CLK_GATING_EN | L2X0_STNDBY_MODE_EN;
    writel_relaxed(val, l2x0_base + L2X0_POWER_CTRL);

    iounmap(l2x0_base);
    of_node_put(np);

out:
    l2x0_of_init(0, ~0UL);
}
#endif
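To make the bit arithmetic in the function above easier to follow, here is a small stand-alone sketch of how the prefetch control value is derived. The helper name prefetch_ctrl_value and the macro names are made up for illustration. The reading of the bits (30 and 23 as the double linefill enables, 29 and 28 as instruction and data prefetch enables) assumes the PL310 TRM bit assignment and is consistent with the erratum comment in the quoted code.

```c
#include <stdint.h>

/* Bits set unconditionally by imx_init_l2cache: 30, 29, 28 and 23. */
#define PREFETCH_SET_MASK      0x70800000u
/* Bits cleared again on i.MX6Q because of PL310 erratum 752271. */
#define PREFETCH_CLEAR_IMX6Q   ((1u << 30) | (1u << 23))

/*
 * Compute the value written to L2X0_PREFETCH_CTRL, given the value read
 * back from the register and whether the SoC is an i.MX6Q.
 */
uint32_t prefetch_ctrl_value(uint32_t current, int is_imx6q)
{
    uint32_t val = current | PREFETCH_SET_MASK;

    if (is_imx6q)
        val &= ~PREFETCH_CLEAR_IMX6Q;

    return val;
}
```

Starting from a register value of 0, this yields 0x70800000 on the DualLite and 0x30000000 on the Quad, so the Quad keeps prefetching enabled but leaves double linefill off.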

Our workaround is to check in system.c whether the Level 2 cache is already enabled. As in cache-l2x0.c, this is used as an indication that Linux is running in the non-secure world and that the Level 2 cache was already enabled by the boot manager:

/*
 * Check if l2x0 controller is already enabled.
 * If you are booting from non-secure mode
 * accessing the below registers will fault.
 */
if (!(readl_relaxed(l2x0_base + L2X0_CTRL) & L2X0_CTRL_EN)) {
    // do the rest
}
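The guard can be tried out on a host machine with a plain array standing in for the PL310 register window. The sketch below is not the merged kernel patch; the function name l2_setup_if_disabled and the array-based register access are assumptions made so that the logic can run outside the kernel. The register offsets and bit definitions mirror those in the kernel's cache-l2x0.h.

```c
#include <stdint.h>

/* PL310 register offsets and bits, as defined in the kernel's cache-l2x0.h. */
#define L2X0_CTRL                   0x100
#define L2X0_PREFETCH_CTRL          0xF60
#define L2X0_POWER_CTRL             0xF80
#define L2X0_CTRL_EN                (1u << 0)
#define L2X0_STNDBY_MODE_EN         (1u << 0)
#define L2X0_DYNAMIC_CLK_GATING_EN  (1u << 1)

/*
 * l2x0_base points at a 4 KiB register window (here: an array of 32-bit
 * words). Returns 1 if the prefetch and power registers were written,
 * 0 if the cache was already enabled and the writes were skipped.
 */
int l2_setup_if_disabled(volatile uint32_t *l2x0_base, int is_imx6q)
{
    uint32_t val;

    /* An enabled cache is taken to mean: set up earlier by the secure world. */
    if (l2x0_base[L2X0_CTRL / 4] & L2X0_CTRL_EN)
        return 0;

    val = l2x0_base[L2X0_PREFETCH_CTRL / 4];
    val |= 0x70800000u;
    if (is_imx6q)
        val &= ~((1u << 30) | (1u << 23));  /* erratum 752271 */
    l2x0_base[L2X0_PREFETCH_CTRL / 4] = val;

    l2x0_base[L2X0_POWER_CTRL / 4] =
        L2X0_DYNAMIC_CLK_GATING_EN | L2X0_STNDBY_MODE_EN;

    return 1;
}
```

With the cache disabled the function writes both registers; once bit 0 of the control register is set it leaves them untouched, which is the behavior needed when the boot manager has already enabled the cache.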

jamesbone
NXP TechSupport

Hello Harald,

Your request is being discussed internally; we will provide a response as soon as possible.

gary_bisson
Senior Contributor III

Hi harald.walter3@de.bosch.com,

Sorry to revive an old thread, but I was wondering whether your modifications to get the i.MX6 Linux kernel running in the non-secure state are available in a repository (GitHub or other)? Or would you be willing to share them?

Thanks in advance.

Regards,

Gary

harald_walter3
Contributor II

Hi Gary,

Yes, sure. It went into the Linux ARM kernel:

https://patchwork.kernel.org/patch/8356811/

Regards,

Harald

ailtonlopes
Contributor III

Hey harald.walter3@de.bosch.com, do you have a patch that changes the kernel to run in the non-secure world? I'm having a problem running Linux in the non-secure world.
Thanks in advance, kind regards

harald_walter3
Contributor II

Hi Ailton,

The patch mentioned above is the only one that was necessary to run Linux in the non-secure world. Since it should already be merged, the kernel is in a shape to handle this. Just avoid enabling the L2 cache in the boot manager of your normal world if it was already enabled by the secure world.

ailtonlopes
Contributor III

Thanks for the quick response. I saw that you mentioned above that you used a self-made boot loader. I'm running Linux in a hypervisor: I created a binary of Linux and the device tree and wrote a small boot loader that passes Linux the machine ID and the addresses of the dtb and the zImage, but I couldn't get Linux running. Do you have any advice? I'm using the imx6q.

Thanks again for the help

harald_walter3
Contributor II

The only hint that comes to mind is that you should already enable the MMU and the L1 cache. This is basically what U-Boot does as well. The other hint is: try U-Boot as a reference. Without knowing where Linux stops running, the question is difficult to answer.
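For the register hand-off that the small boot loader in the question has to perform, the ARM Linux boot protocol expects r0 = 0, r1 = the machine type number (or all ones on device-tree-only platforms) and r2 = the physical address of the dtb, which should sit on an 8-byte boundary. The sketch below only models that convention on a host machine; struct boot_args, make_boot_args, dtb_is_aligned and the example address are hypothetical and not taken from anyone's boot loader in this thread.

```c
#include <stdint.h>

/* Machine type to pass in r1 on device-tree-only platforms. */
#define MACH_TYPE_DT_ONLY  0xffffffffu

/* Register values the kernel entry point expects on 32-bit ARM. */
struct boot_args {
    uint32_t r0;  /* must be 0 */
    uint32_t r1;  /* machine type number, or all ones for DT-only boots */
    uint32_t r2;  /* physical address of the dtb (or ATAGS list) */
};

/* Signature of the zImage entry point under the AAPCS: r0, r1, r2. */
typedef void (*kernel_entry_t)(uint32_t zero, uint32_t machine_id,
                               uint32_t dtb_phys);

/* The dtb should be placed on a 64-bit (8-byte) boundary. */
int dtb_is_aligned(uint32_t dtb_phys)
{
    return (dtb_phys & 0x7u) == 0;
}

struct boot_args make_boot_args(uint32_t machine_id, uint32_t dtb_phys)
{
    struct boot_args args = { 0u, machine_id, dtb_phys };
    return args;
}

/* Hand over control; on real hardware this call does not return. */
void jump_to_kernel(kernel_entry_t entry, struct boot_args args)
{
    entry(args.r0, args.r1, args.r2);
}
```

A boot loader would build the arguments with something like make_boot_args(MACH_TYPE_DT_ONLY, 0x18000000) (the address being a made-up example in i.MX6 DRAM) and then call jump_to_kernel with the zImage load address cast to kernel_entry_t.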

ailtonlopes
Contributor III

Hey harald.walter3@de.bosch.com, sorry for bothering you, but I tried all the patches above with Linux 3.11.52 and I get no response from the kernel, whether it runs in the secure or the non-secure world. Do you have any suggestions?

I also wanted to ask whether you know of any patches for Linux 4.9?

Kind regards, 
Ailton Lopes

gary_bisson
Senior Contributor III

Hi Harald,

Thank you very much for this.

Regards,

Gary

vsiles
Senior Contributor I

Hi! I'm in a situation similar to the one you described: I have a secure OS in TrustZone and want to move Linux to the normal world on an imx6q lite (Boundary Devices).

For the moment, I can boot with only one core, but booting with SMP fails. We have patched access to the L2 cache (by using an SMC and implementing the code in the monitor), the NSACR, the generic timers and the diagnostic register. The CSU/AIPSTZ are wide open. But the boot fails during init: I am trying to pinpoint the exact location, but each time I add a printk, the kernel gets a bit further, so I suspect a cache issue.

Is this patch the only modification you made to Linux? Could you give me more details of your modifications, if any, such as git revision, .config or errata configuration?

Thank you !

Vincent
