Possible SDK Deficiency for ISR Exit Barrier

4,721 Views
nicholas2000
Contributor I

Hello, I am using the MKS20FN128VLL12 microcontroller for a new project. I am porting an existing MCU (MK64FN1M0VLL12) project to this MCU. I have installed the same drivers and made the corresponding changes to the pin configs.

I noticed that the fsl_common.h driver in the MKS20 SDK (version 2.4.1) has no definition for an ISR exit barrier. The fsl_common.h driver in the MK64FN SDK has the following definition:

/*! @name ISR exit barrier
* @{
*
* ARM errata 838869, affects Cortex-M4, Cortex-M4F Store immediate overlapping
* exception return operation might vector to incorrect interrupt.
* For Cortex-M7, if core speed much faster than peripheral register write speed,
* the peripheral interrupt flags may be still set after exiting ISR, this results to
* the same error similar with errata 83869.
*/
#if (defined __CORTEX_M) && ((__CORTEX_M == 4U) || (__CORTEX_M == 7U))
#define SDK_ISR_EXIT_BARRIER __DSB()
#else
#define SDK_ISR_EXIT_BARRIER
#endif

 

In the source code for my older project I used SDK_ISR_EXIT_BARRIER at the end of some IRQ handlers. Is there a workaround for this? I thought of adding the definition to one of my own header files for now. Or is there a substitute in the driver files?
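
One way the header workaround could look, as a minimal sketch: define the macro only when the SDK has not already done so, so the code keeps building if a later SDK version brings it back. This assumes __CORTEX_M and __DSB() are provided by the CMSIS device header on the target. The handler name EXAMPLE_IRQHandler and the counter g_example_irq_count are hypothetical and only there for illustration.

```c
#include <stdint.h>

/* On the target, include the device header (or fsl_common.h) before this
 * point so that __CORTEX_M and __DSB() are visible. */

#ifndef SDK_ISR_EXIT_BARRIER
#if (defined __CORTEX_M) && ((__CORTEX_M == 4U) || (__CORTEX_M == 7U))
#define SDK_ISR_EXIT_BARRIER __DSB()
#else
#define SDK_ISR_EXIT_BARRIER
#endif
#endif

/* Hypothetical handler, for illustration only. */
static volatile uint32_t g_example_irq_count;

void EXAMPLE_IRQHandler(void)
{
    g_example_irq_count++;   /* stand-in for clearing flags and doing the real work */
    SDK_ISR_EXIT_BARRIER;    /* expands to __DSB() on Cortex-M4/M7, otherwise to nothing */
}
```

Because of the #ifndef guard, this should not clash with an SDK that already defines the macro, provided the SDK header is included first.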

9 Replies

4,714 Views
ErichStyger
Specialist I

What I do is simply add __DSB(); to my ISR routines directly, rather than relying on the macro in the SDK.


4,710 Views
bobpaddock
Senior Contributor III

What is sad about being forced to use these types of instructions is that they can consume a lot of cycles, wasting time and power, which makes "fast ISRs" an oxymoron.

4,707 Views
ErichStyger
Specialist I

Agreed. But ARM has screwed this up, and not having this barrier can cause an application crash. It would be possible to avoid it (at least on M4) by carefully checking the generated assembly code (or writing the ISR in assembly). I had an application in the field which was affected by this ARM bug, or at least the issue went away after adding the DSB to the ISRs.

4,701 Views
myke_predko
Senior Contributor III

@ErichStyger @bobpaddock 

Could you elaborate a bit more on the issue of the DSB instruction?  

My understanding is that it waits for all pending IO operations to complete and that can be up to 2,000 clock cycles.  BUT, if you aren't doing any IO operations in your ISR, the number of clock cycles taken by DSB is in the single digits.  

Now, the definition of "IO operations" is probably somewhat fluid, but I'm thinking that the biggest concern would be GPIO reads and writes; simple register accesses do not result in long DSB execution times, right?

I can't find a definitive answer on this when I do a search.  
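
One way to get a number for a specific part is to measure it. The sketch below times an operation with a free-running cycle counter. On a Cortex-M4 that counter would typically be DWT->CYCCNT from CMSIS (assuming the DWT cycle counter is implemented and has been enabled), but here it is passed in as a function pointer so the arithmetic can be checked off-target. The names cycle_source_t, timed_op_t, measure_cycles, and the fake_* helpers are made up for this example.

```c
#include <stdint.h>

typedef uint32_t (*cycle_source_t)(void);  /* returns a free-running cycle count */
typedef void (*timed_op_t)(void);          /* operation under test, e.g. a DSB wrapper */

/* Returns the elapsed counter ticks across op(), including call overhead.
 * Unsigned subtraction keeps the result correct across one counter wrap. */
uint32_t measure_cycles(cycle_source_t now, timed_op_t op)
{
    uint32_t start = now();
    op();
    uint32_t end = now();
    return end - start;
}

/* Off-target stand-ins (hypothetical): a fake counter that advances by
 * 7 ticks per read and starts just below the 32-bit wrap point. */
static uint32_t fake_count = 0xFFFFFFFCu;

static uint32_t fake_now(void)
{
    uint32_t value = fake_count;
    fake_count += 7u;
    return value;
}

static void fake_op(void)
{
    /* nothing to do in the off-target check */
}
```

On the target, now would read the hardware counter and op would execute the DSB. Timing an empty op first gives the overhead to subtract.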


4,693 Views
bobpaddock
Senior Contributor III

"I can't find a definitive answer on this when I do a search."

Yes, that is a problem.  They expect you to dig into the core implementation files.
ARM's once excellent online documentation was also destroyed in one of their "modernization efforts".
Now you have to dig through github to find stuff that once easily popped up on the ARM site by using the 'search' box.  See bottom of message for example.

It is my understanding that the IO being waited for does not have to be something that was just done in the current ISR.  It could be IO set off prior to the ISR.  It is more about what is going on, on the internal busses.  Sometimes even these instructions are not enough and a dummy read of the last written register is required before leaving the ISR.  I've noted this mostly with the timers.
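
The dummy-read idea can be sketched like this. The register is passed as a pointer so the example is not tied to any particular peripheral. write_then_read_back is a made-up name, and whether the read-back is needed at all depends on the part and on the bus bridge in front of the peripheral.

```c
#include <stdint.h>

/* Write a peripheral register, then read the same register back before the
 * ISR returns. The volatile qualifier stops the compiler from dropping the
 * read, even though the caller may ignore the returned value. */
static inline uint32_t write_then_read_back(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;   /* e.g. the write that clears the interrupt flag */
    return *reg;    /* dummy read of the last written register */
}
```

In an ISR this would be the last register access before the barrier and the return, with the result cast to void if it is not needed.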

You had mentioned issues with FreeRTOS at some point.  That could require both DSB and ISB for a context switch, especially if done inside of an ISR.

My comment was about how these instruction times are not deterministic.
They may take one cycle, they may take thousands.

In something running on very small batteries, it matters.

If using both DSB and ISB how do they interact cycle wise?
Never found an answer.

Below is what I have in my source code, copied from the once easy to find documentation.

#define ATTR_NO_INSTRUMENT_FUNCTION __attribute__( ( no_instrument_function ) )

/*
* Data Memory Barrier (DMB): Ensures that all explicit data memory
* transfers before the DMB are completed before any subsequent data
* memory transfers after the DMB starts.
*
* The use of DMB is rarely needed in Cortex-M processors because they
* do not reorder memory transactions. However, it is needed if the
* software is to be reused on other ARM processors, especially
* multi-master systems.
*
* Semaphores in a multi-master system:
* Two accesses to differing addresses in normal memory, or to
* different devices in Device memory, or to two different memory
* types, with the exception of those that Table 2 shows, are not
* guaranteed to be ordered. If the order of any of these
* transactions is required for the purpose of semaphore
* communication in a multi-master system then a DMB instruction
* must be inserted, for example, between a payload and
* spin-lock.
*/
static inline ATTR_NO_INSTRUMENT_FUNCTION void sync_barrier_memory( void )
{
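    /* "memory" clobber keeps the compiler from moving memory accesses across the barrier */
    __asm__ __volatile__ ("dmb" ::: "memory");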
}

/*
* Data Synchronization Barrier (DSB): Ensures that all explicit data
* memory transfer before the DSB are complete before any instruction
* after the DSB is executed.
*
* Ensure effects of an access to SCS take place before the next
* operation
*
* Ensure memory is updated before the next operation, for
* example, SVC, WFI, WFE.
*
* Vector table changes:
* If the program changes an entry in the vector table,
* and then enables the corresponding exception, a DSB
* instruction should be used between these two
* operations. This ensures that if the exception is
* taken after being enabled the processor uses the new
* exception vector. If the updated vector table is
* required immediately, for example if an SVC
* immediately follows an update to the SVC table entry
* via a store, then a DSB is also required.
*
* Memory Map modifications:
* If the system contains a memory map switching
* mechanism then use a DSB instruction after switching
* the memory map in the program. This ensures subsequent
* instruction execution uses the updated memory map, if
* the memory system makes the updated memory map visible
* to all subsequent memory accesses.
*
* Note:
* An ISB or an exception entry/return is required
* to ensure that the subsequent instructions are
* fetched using the new memory map.
*
* The memory barrier instructions, DMB and DSB, can be used to ensure
* that the write buffer on the processor has completed its operation
* before subsequent operations can be started. However, it does not
* check the status of the bus level write buffers. In such cases, if
* the system is based on AHB or AHB Lite, you might need to perform a
* dummy read through the bus bridge to ensure that the bus bridge has
* completed its operation.
*
* The Cortex-M0 processor (r0p0) and the Cortex-M0+ processor (r0p0)
* do not include a write buffer in their processor bus interface.
*
* Architecturally, a DSB instruction should be used after changing
* the VTOR if an exception is to be generated immediately and should
* use the latest vector table setting.
*
* In Cortex-M3, Cortex-M4 and Cortex-M0+ processors, accesses to the
* SCS have the DSB behavior, so there is no need to insert the DSB
* instruction. [Which is where the errata says this is broken.]
*
* A DSB is required before generating self-reset to ensure all
* outstanding transfers are completed. The use of the CPSID I
* instruction is optional.
*/
static inline ATTR_NO_INSTRUMENT_FUNCTION void sync_barrier_data( void )
{
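    /* "memory" clobber keeps the compiler from moving memory accesses across the barrier */
    __asm__ __volatile__ ("dsb" ::: "memory");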
}

/*
* Instruction Synchronization Barrier (ISB): Ensures that the effects
* of all context altering operations prior to the ISB are recognized
* by subsequent instructions. This results in a flushing of the
* instruction pipeline, with the instruction following the ISB being
* re-fetched.
*
* In addition to the ISB instruction, the architecture defines
* exception entry and exception return to also have ISB semantics,
* causing a fresh fetch of instructions and a re-evaluation of
* interrupts.
*
* http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0552a/CHDEBIEG.html
*
*/
static inline ATTR_NO_INSTRUMENT_FUNCTION void sync_barrier_instruction( void )
{
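    /* "memory" clobber keeps the compiler from moving memory accesses across the barrier */
    __asm__ __volatile__ ("isb" ::: "memory");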
}

/*
* Wait for Interrupt:
*/
static inline ATTR_NO_INSTRUMENT_FUNCTION void wait4irq( void )
{
    sync_barrier_data();
    sync_barrier_instruction();
    __asm__ __volatile__ ("wfi");
}

===

For example, searching for 'eight byte stack alignment' on the ARM site once returned useful information.
Now you have to do this:


The latest version of the Procedure Call Standard for the Arm Architecture is available on GitHub: https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst. Hence all the links below are to GitHub.

If this is for the 32-bit PCS the constraints on the stack are given in https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#6211universal-stack-constraints
 
With the specific sentence in:
 
The stack must also conform to the following constraint at a public interface https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#6212stack-constraints-at-a-publ...
 
  *   SP mod 8 = 0. The stack must be double-word aligned.
The definition of va_list is given in https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#814additional-types

 

4,684 Views
ErichStyger
Specialist I

"You had mentioned issues with FreeRTOS at some point. That could require both DSB and ISB for a context switch, especially if done inside of an ISR."

FreeRTOS has DSB/ISB added in the context switch ISR, at least in the port I'm using:

/*-----------------------------------------------------------*/
void vPortYieldFromISR(void) {
  /* Set a PendSV to request a context switch. */
  *(portNVIC_INT_CTRL) = portNVIC_PENDSVSET_BIT;
  /* Barriers are normally not required but do ensure the code is completely
  within the specified behavior for the architecture. */
  __asm volatile("dsb");
  __asm volatile("isb");
}

dsb/isb are used when entering critical sections too (vPortEnterCritical).

Erich

4,678 Views
myke_predko
Senior Contributor III

@ErichStyger @bobpaddock 

Thank you for your excellent posts.

This post (ARM errata 838869) summarizes the possible issue quite well, but it doesn't jibe with the comments above about the DSB instruction taking an indeterminate and unreasonably long time to complete. When I look at what Erich said at the time, I would think that running DSB would only add a few (8 or so?) extra clock cycles and not the 2,000 that I found when I searched on how the DSB instruction operates.  So I would think that DSB should just be put at the end of every ISR and be done with it, without worrying about a significant performance/response hit (i.e. nothing happening for a long period of time while DSB is running).

Comments?  

One more comeback.  @ErichStyger - you said that you see the ISB/DSB instructions executing in vPortYieldFromISR(), but I can't find this function, nor is it listed in the FreeRTOS 10.0.0 manual.  I did find this reference to it: Updating to FreeRTOS 8.2.2. What version/"port" are you running?


4,626 Views
bobpaddock
Senior Contributor III

"Barrier Litmus Tests and Cookbook"

https://developer.arm.com/documentation/genc007826/latest

From page six:


"A DSB completes when both:

• all explicit memory accesses that are observed by Pe before the DSB is executed, are of the required
access types, and are from observers in the same required shareability domain as Pe, are complete for
the set of observers in the required shareability domain

• all cache, branch predictor, and TLB maintenance operations issued by Pe before the DSB are complete for the required shareability domain."

The indeterminate nature comes down to the specific core,  what is happening on the busses etc.

The document gives other examples besides ending ISRs, such as setting up DMA registers where DSB should (shall?) be used.
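
As a rough illustration of the DMA case: the buffer contents are written, a DSB makes sure those writes have completed, and only then is the channel started. Everything here is hypothetical (dma_buffer, dma_start_reg, DMA_START_BIT, send_with_dma, DATA_SYNC_BARRIER), and a real DMA controller needs more setup than this. Off-target the barrier falls back to a plain compiler barrier so the sketch still builds with GCC or Clang.

```c
#include <stdint.h>
#include <string.h>

#if defined(__arm__)
/* Assumes a core that has the DSB instruction (e.g. any Cortex-M). */
#define DATA_SYNC_BARRIER() __asm__ __volatile__ ("dsb" ::: "memory")
#else
/* Off-target: compiler barrier only, no hardware barrier. */
#define DATA_SYNC_BARRIER() __asm__ __volatile__ ("" ::: "memory")
#endif

#define DMA_START_BIT 0x1u               /* hypothetical bit position */

static uint8_t dma_buffer[16];           /* memory the DMA engine would read */
static volatile uint32_t dma_start_reg;  /* stand-in for a real control register */

void send_with_dma(const uint8_t *data, size_t len)
{
    if (len > sizeof dma_buffer) {
        len = sizeof dma_buffer;         /* clamp to the buffer size */
    }
    memcpy(dma_buffer, data, len);       /* 1. fill the buffer */
    DATA_SYNC_BARRIER();                 /* 2. buffer writes complete first */
    dma_start_reg |= DMA_START_BIT;      /* 3. then start the transfer */
}
```

The point is only the ordering: without the barrier between steps 1 and 3, the transfer could in principle start before the last buffer write has landed in memory.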


4,659 Views
ErichStyger
Specialist I

"What version/"port" are you running?"

I'm using 10.4.1 with the following port: https://github.com/ErichStyger/McuOnEclipseLibrary/blob/master/lib/FreeRTOS/Source/portable/GCC/ARM_...

Erich