Hello,
We want to run motor control loops with minimal jitter on our quad-core i.MX6 platform, alongside our main application. To implement this, we use the private watchdog timer of one of the Cortex-A9 cores to generate a periodic IRQ at 1 kHz. (We cannot use the private timer of one of the cores, because that timer is already in use by the Linux scheduler.)
The watchdog timer is used in timer mode with auto-reload enabled. The core is running at 998 MHz, so we program the load register with the value 0x79950 to generate the 1 kHz period. In the interrupt handler, the current value of the countdown register is read and compared with the value in the load register to determine the latency between the interrupt being raised and the handler being scheduled. The interrupt is configured to be handled on CPU3 only, and the kernel is configured to isolate that CPU (using the isolcpus kernel option).
We measured the latencies when the interrupt is handled as a normal IRQ and when it is handled as an FIQ, on four of our own i.MX6 boards:
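The latency calculation described above can be sketched as follows. This is only an illustration, not the code we actually run: the function names are hypothetical, and the 498 MHz timer clock is an assumption (the Cortex-A9 private timers count at PERIPHCLK, and a load value of 0x79950 = 498000 ticks per millisecond implies that rate).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the watchdog counts down at PERIPHCLK = 498 MHz, which is
 * the rate implied by 0x79950 (498000) ticks per 1 ms period. */
#define PERIPHCLK_HZ 498000000ULL

/* Ticks elapsed since the auto-reload. The counter counts down from the
 * load value, so the elapsed time is the difference between the two. */
uint32_t elapsed_ticks(uint32_t load, uint32_t counter)
{
    assert(counter <= load);
    return load - counter;
}

/* Convert timer ticks to nanoseconds, rounding down. */
uint64_t ticks_to_ns(uint32_t ticks, uint64_t clk_hz)
{
    assert(clk_hz != 0);
    return (uint64_t)ticks * 1000000000ULL / clk_hz;
}
```

Under these assumptions one tick is roughly 2 nsec, so a counter value 42 ticks below the load value corresponds to a latency of about 84 nsec.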
Board 1 + 2
----------
IRQ when idle:
Maximum latency: 8731 nsec
Average latency: 1239 nsec
IRQ under stress (using ‘hackbench -g 20’):
Maximum latency: 47116 nsec
Average latency: 2289 nsec
FIQ when idle:
Maximum latency: 4695 nsec
Average latency: 84 nsec
FIQ under stress:
Maximum latency: 5558 nsec
Average latency: 265 nsec
Board 3 + 4
-----------
IRQ when idle:
Maximum latency: 28151 nsec
Average latency: 1249 nsec
IRQ under stress (using ‘hackbench -g 20’):
Maximum latency: 137964 nsec
Average latency: 3707 nsec
FIQ when idle:
Maximum latency: 6243 nsec
Average latency: 83 nsec
FIQ under stress:
Maximum latency: 40219 nsec
Average latency: 436 nsec
(All latencies were measured over approximately 250,000 interrupts.)
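For reference, maximum and average figures of this kind can be accumulated per interrupt with very little state. The sketch below shows one way to do it; the struct and function names are hypothetical and this is not the exact code used for the measurements above.

```c
#include <stdint.h>

/* Running statistics over all latency samples seen so far. */
struct latency_stats {
    uint64_t count;   /* number of samples */
    uint64_t sum_ns;  /* sum of all samples, for the average */
    uint64_t max_ns;  /* largest sample seen */
};

/* Record one latency sample (in nanoseconds). */
void latency_stats_add(struct latency_stats *s, uint64_t latency_ns)
{
    s->count++;
    s->sum_ns += latency_ns;
    if (latency_ns > s->max_ns)
        s->max_ns = latency_ns;
}

/* Average latency in nanoseconds, rounded down; 0 if there are no samples. */
uint64_t latency_stats_avg(const struct latency_stats *s)
{
    return s->count ? s->sum_ns / s->count : 0;
}
```

Note that a single outlier determines the maximum, which is why the maximum can differ widely between boards while the averages stay close.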
These numbers raise two questions:
- An FIQ has the highest priority of all interrupts and is not disabled by the Linux kernel, yet we still see latencies of more than 40 microseconds. How can this happen?
- There is a huge difference between the maximum latencies measured on different boards.
All boards run exactly the same software (Linux 3.12.19 with the PREEMPT_RT patches).
We have also tested the same software in FIQ mode on the SabreSD evaluation board and on our own i.MX6 board assembled with the i.MX6 Solo. In those cases we measure the same latencies as on boards 1 and 2.
To explain the differences between the maximum latencies, we have been looking for differences in the CPUs. But all our own quad-core boards carry the same processor (MCIMX6Q5EYM10AC - SBAQ1247 – KOREA – XUAQSBP).
So our question is: how can we explain the large FIQ latencies and the latency difference between the boards?
Regards,
Jaccon
We have a problem with the FIQ occasionally falling back to a GIC interrupt (some kind of a race with the INTACK register??).
Maybe this is your problem too.
I just ran across a discussion on the LKML of two workarounds:
LKML: Marek Vasut: Re: [PATCH v8 0/4] arm: KGDB NMI/FIQ support
Hi,
Even if you use FIQs, they are still handled by the kernel, and that is where the latency comes from. Have you tried using the Enhanced Periodic Interrupt Timer (EPIT) to generate those interrupts?
I also believe there must be a way to "bypass" the kernel so that the CPU handles the ISR as soon as possible. Let me delve into this.
I will get back to you as soon as possible.
Best Regards,
Alejandro
Hello Alejandro,
We know that the FIQs are handled by the kernel and that we can therefore expect some latency. But we don’t understand why we see such huge latency differences between boards. The behaviour is consistent: boot after boot, the same boards show much higher latencies than the others.
Note that we use the private watchdog timer of one of the Cortex-A9 cores to generate these IRQs and FIQs.
Regards,
Jaccon
Hello Alejandro,
We found out that using the Linux 'nohlt' kernel parameter makes a difference. With this parameter, the processor does not execute a WFI when it is idle. With it set, the Phantoms that showed huge latencies now show the expected latencies (with a huge latency occurring only once or twice). Does this observation help in finding the root cause?
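When comparing boards, it can help to confirm that each one really booted with the intended parameters. The sketch below checks a kernel command line string (for example, the contents of /proc/cmdline) for a given parameter. The function name is hypothetical and this is only a minimal illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `param` appears in `cmdline` as a whole word, either
 * bare ("nohlt") or followed by "=value" ("isolcpus=3"). */
bool cmdline_has_param(const char *cmdline, const char *param)
{
    size_t len = strlen(param);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, param)) != NULL) {
        /* The match must start at the beginning or after a space ... */
        bool starts_word = (p == cmdline) || (p[-1] == ' ');
        /* ... and must end at a separator, not inside a longer name. */
        char end = p[len];
        bool ends_word = (end == '\0' || end == ' ' ||
                          end == '\n' || end == '=');
        if (starts_word && ends_word)
            return true;
        p += len;
    }
    return false;
}
```

For instance, given an illustrative command line such as "console=ttymxc0 isolcpus=3 nohlt", the function reports both 'isolcpus' and 'nohlt' as present.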
Regards,
Jaccon
alejandrolozano, Can you please check?