Why SPRF is never being set on SPI0 read for MKL25Z128VFT4?

cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Why SPRF is never being set on SPI0 read for MKL25Z128VFT4?

Jump to solution
4,793 Views
bobpaddock
Senior Contributor III

I am having a problem getting SPI0, on PORTC, to work on a MKL25Z128VFT4 part.

The SPRF data ready flag is never getting set, causing my SPI read/write routine to lock up.

I'm tiring to use the simplest code possible, no DMA, Interrupts or frameworks.

The problem is that the SPRF data bit is never getting set in the status register.
This makes me fell I've missed something in clock initialization.

I have no external crystals, I want to run on the internal clocks (cost issue).

I have tried code produced by Code Warrior and also done no clock
initialization at all, just going with the defaults out of reset, and get
the same results.

SPI0 is clocked by the Bus Clock.  I do not see, or am constantly
overlooking, in the data sheet anything that indicates I need to
specifically enable the bus clock after a hardware reset.

There is no debugger running that would corrupt the status registers.

Code below, what am I missing?:

static void spi0_init( void );
static void spi0_init( void )
{
  SIM_SCGC5 |= SIM_SCGC5_PORTC_MASK;    /* Enable PORTC clock */
  SIM_SCGC4 |= SIM_SCGC4_SPI0_MASK;     /* Enable SPI0 clock  */

#if 0 /* Slave Select control is not shown in this posting */
    PORTC_PCR4 = PORT_PCR_MUX( 2U );    /* Set PTC4 Pin Mux Control Alternative 2 [SPI0_PCS0] */
#endif

    PORTC_PCR5 = PORT_PCR_MUX( 2U );    /* Set PTC5 Pin Mux Control Alternative 2 [SPI0_SCK]  */
    PORTC_PCR6 = PORT_PCR_MUX( 2U );    /* Set PTC6 Pin Mux Control Alternative 2 [SPI0_MOSI] */
    PORTC_PCR7 = PORT_PCR_MUX( 2U );    /* Set PTC7 Pin Mux Control Alternative 2 [SPIO_MISO] */

    /* SPIE=0, SPE=0, SPTIE=0, MSTR=1, CPOL=0, CPHA=0, SSOE=0, LSBFE=0: */
    SPI0_C1 = SPI_C1_MSTR_MASK;                           /* Set SPI0 to Master */

    SPI0_BR = (SPI_BR_SPPR(0x07) | SPI_BR_SPR(0x08));       /* Set baud rate prescale divisor and baud rate divisor to the slowest frequency for testing */

    /* SPIE=0, SPE=1, SPTIE=0, MSTR=1, CPOL=0, CPHA=0, SSOE=0, LSBFE=0: */
    SPI0_C1 |= SPI_C1_SPE_MASK;                             /* Enable SPI0 */
}

static uint8_t spi0_tx_rx( uint8_t const spi_write_data_u8 );
static uint8_t spi0_tx_rx( uint8_t const spi_write_data_u8 )
{
  while( 0U == (SPI0_S & SPI_S_SPTEF_MASK) )            /* Wait for SPI transmit empty flag to set */
    {
      NOP();
    }

  SPI0_D = spi_write_data_u8;                           /* Write data to SPI */
  /* By toggling a pin it is known we *DO* get here */

  while( 0U == (SPI0_S & SPI_S_SPRF_MASK) )             /* Wait for receive flag to set */
    {
      NOP();
    }
  /* By toggling a pin it is known we do *NOT* get here */

  uint8_t const spi_read_data_u8 = SPI0_D; /* Read the SPI data register, with SPRF set, above, this will clear SPRF */
  return( spi_read_data_u8 );
}

Labels (1)
Tags (4)
1 Solution
3,672 Views
adriancano
NXP Employee
NXP Employee

Hi Bob,

I copied and pasted your code in a new project in CodeWarrior 10.6. I am testing it on a FRDM-KL25 board in the default FEI mode. I run the code normally and it does not lock up. I wrioe in the infinite loop the spi0_tx_rx function and I let the program run for a time and was ok. I really think that is a problem with your compiler.

I hope this information can help you.

Regards,

-----------------------------------------------------------------------------------------------------------------------

Note: If this post answers your question, please click the Correct Answer button. It would be nice!

-----------------------------------------------------------------------------------------------------------------------

View solution in original post

0 Kudos
Reply
23 Replies
3,563 Views
bobpaddock
Senior Contributor III

If you'd like to increase your level of worry about instruction order read up on the ARM Memory Barrier Instructions:

http://infocenter.arm.com/help/topic/com.arm.doc.dai0321a/DAI0321A_programming_guide_memory_barriers...

Not to much of a concern with M0+ parts.  I do wonder with the large parts how may bugs lurk in peoples code due to not understanding the issue.

0 Kudos
Reply
3,673 Views
adriancano
NXP Employee
NXP Employee

Hi Bob,

I copied and pasted your code in a new project in CodeWarrior 10.6. I am testing it on a FRDM-KL25 board in the default FEI mode. I run the code normally and it does not lock up. I wrioe in the infinite loop the spi0_tx_rx function and I let the program run for a time and was ok. I really think that is a problem with your compiler.

I hope this information can help you.

Regards,

-----------------------------------------------------------------------------------------------------------------------

Note: If this post answers your question, please click the Correct Answer button. It would be nice!

-----------------------------------------------------------------------------------------------------------------------

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

> I copied and pasted your code in a new project in CodeWarrior 10.6.

Yeah, I tried that early on.  When I went to download the code with the PE Mulitilink FX I was told my 300 byte program was to large and I needed a CodeWarrior license. I can load much large code if the project was produced from CW from the start.  Gave up on CW at that point.  If you start a project with CW it seems to work well, my colleagues and I have had zero success in pasting foreign code into CW and have it do anything but waste our time.

> I really think that is a problem with your compiler.

That got me digging as nothing else was making sense, and you are correct, it is a compiler issue.

These two version work:

gcc version 4.7.3 20121207 (FSL release r785) [ARM/embedded-4_7-branch revision 194305] Shipped with CW 10.6
gcc version 4.7.4 20130613 (release) [ARM/embedded-4_7-branch revision 200083]

This one is producing hardware bus fault on read of SPI0_D.
It does not produce an error on write of SPI0_D.

Doesn't seem to be an issue with any other register, which I do not understand?:

gcc version 4.8.3 20131129 (release) [ARM/embedded-4_8-branch revision 205641]

I was using 4.8 because it has better support of Pedantic error message and suppression there of, as Freescale headers are not C99 compliant.  That issue shows up when you enable all of the GCC warning/error directives, as is policy here.

The original problem of SPRF not being set seems to be what Mark describes above.  It needs to be in an errata.

Thank you, and Mark, and everyone else for your help.

3,563 Views
heliosfa
Contributor II

Hi Bob,

Just to say thank you for your efforts - I hadn't got as far as as debugging this properly due to other work.

I am also having issues with arm-none-eabi-gcc 4.8.3 whereas 4.6.2 works fine with Codewarrior 10.3.

As I have mentioned in my thread, I am wondering if this is specific to the non-CMSIS Freescale headers so if I get a chance, I may try it this week.

Thanks,

Graeme

0 Kudos
Reply
3,563 Views
heliosfa
Contributor II

well, it looks like there are more issues than just with SPI0_D with arm-none-eabi-gcc 4.8.3 - reading LPTMR0_CNR also causes a hard fault with both 4.8.2 and 4.7.

As an observation, reading from SPI0_D works in the latest 4.7 toolchain from launch pad.


I am now going to go back to 4.6.2 on my linux build system...

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

Yesterday I found issues with optimization.

What do you have the optimization level set for? -O?

I'd been using a legacy Makefile from a AVR project that was using -Os (Optimize code for size).

Neither 4.7.4 nor 4.8.4 20140516 would produce working code.  "ldr r3,[sp, #4]" to get a function parameter that was not on the stack.

Using any other optimization level 0/1/2/3 produced working code.

With 4.8.4, I went with -O2 and moved on...

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Bib

As reference, I use exclusively "-Os"

- with 4.8.0 (as used by KDS V1.0.1)

- with 4.7.3 (as used by CW10.5)

- some others that the same code has been used with (can't say which versions off the top-of-my-head though)

and haven't experienced optimisation "errors" with these.

I did have a "problem" due to agressive optimisation but which was not purely due to the compiler but could be explained by code (although initially not obvious):

fnEnterInterrupt(irq_ETH_RX_ID, PRIORITY_EMAC, _fec_rx_frame_isr); // configure and enter the Ethernet rx handling interrupt routine in the vector table

fnEnterInterrupt(irq_ETH_TX_ID, PRIORITY_EMAC, _fec_txf_isr); // configure and enter the Ethernet tx handling interrupt routine in the vector table

fnEnterInterrupt(irq_ETH_ERR_MISC_ID, PRIORITY_EMAC, _fec_misc); // configure and enter the Ethernet error and miscellaneous handling interrupt routine in the

One of the GCC version started inlining the functions that were enabling the interrupts. The routine uses a pointer which is set to the core register that sets the interrupt enable (IRQ0_31_SER_ADD, IRQ32_63_SER_ADD, etc.) and writes a single bit (write '1' to enable) [as well as some other work]. This was then effectively resulting in in-lined code writing to this same register three times - each time with just 1 bit set. So the optimiser decided that it only needed to do the final write and so the first two interrupt were not enabled (just the third).

In the subroutine the pointer was only used once [unsigned long *ptrIntReg)] and so caused no problems when it was a subroutine.

The fact that there was an in-lining step involved meant that the pointer had to be modified to volatile unsigned long * although this didn't make any real sense to the code "viewed as a pure subroutine".

Therefore, although there are of course some 'real' bugs too, it is often useful to anaylse the disassembled code to see whether there "may" be some 'hidden logic' behind the behaviour.

Some projects don't allow developers to use more than -O2 (eg. safety related systems) - presumably to reduce the likehood of optimisation errors causing unexpected effects, but maybe also to reduce the chances of a developer's 'sloppiness' (for example, my "mistake" above of not expecting a routine to be handled as in-lined code) to result in a failure. Of course in my case (above) the Ethernet interface just stopped working and this could be solved after specific anaysis but in other cases it may be more subtle and so cause critical system failure in a very specific situation (that was not caught during testing).

Regards

Mark

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

"One of the GCC version started inlining the functions that were enabling the interrupts."

-O2 sets -finline-small-functions.  -Os inherits -O2 with -freorder-blocks  -freorder-blocks-and-partition.

I can't find the number at the moment.  Some place around 200 bytes is the decision point for when a function is 'small' and will be automatically inlined, unless the attribute noinline is used on the function.

I did note an issue of code motion where my code initialized the hardware driving the LEDs then turned on the LED, where the compiler reordered to turn the LED on first then initialize, which turned the LED back off.

That -Os does encourage code motion to make the code smaller does support your conjecture that it is an issue of code motion.  My reading of the compiler generated assembly had the compiler getting a function parameter from the wrong place in RAM, rather than actual code execution order.  If the reordering of the code is causing function parameters to change it is still an issue with the compiler and using -Os.

The IRQ instructions need to marked volatile to prevent code motion:

/*
* This is the only method available in the ARMv6-M architecture
* (Cortex-M0/M0+)
*
* In normal applications there is no need to add any barrier
* instruction after using a CPS instruction to enable an interrupt.
*
* If an interrupt was already in the pending state, the processor
* accepts the interrupt after the “CPSIE I” is executed. However,
* additional instructions can be executed before the processor enters
* the exception handler:
*
*  For Cortex-M3 or Cortex-M4, the processor can execute up to TWO
*  additional instructions before entering the interrupt service
*  routine.
*
*  For Cortex-M0, the processor can execute up to ONE additional
*  instruction before entering the interrupt service routine.
*
*/
static inline void irq_enable( void )
{
  __asm__ __volatile__ ("cpsie i"); /* Clear PRIMASK */
}

/*
* The CPSID instruction is self-synchronized to the instruction
* stream and there is no requirement to insert memory barrier
* instructions after CPSID.
*/
static inline void irq_disable( void )
{
  __asm__ __volatile__ ("cpsid i");
}

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

" -Os inherits -O2 with -freorder-blocks  -freorder-blocks-and-partition."

The above should read:

--freorder-blocks  -freorder-blocks-and-partition disabled, along with auto alignment disabled.

Is there a way to edit these messages after they've been posted?

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Hi Bob

Good to hear that you are past his problem and that you could prove it to be compiler dependent. As mentioned previously that is what I was expecting for the other thread.

It is a fact of life that embedded programmers need to always remain open to the possibilities of such effects even if it is difficult to show other people the code that you are convinced is basically correct but does "funnies" without them expaining to you how your coding style must be to blame or explaining how it must be some pre-emption effect of the operating system. In one forum I was trying to prove a GCC error and was told by someone that they didn't understand the code so it must be my own fault - i.e. if they didn't understand the code how could I expect the compiler to do it right...(!!) Then you have to send off complete projects with exact instructions of how to reproduce it and usually you never hear anything back - sometimes you do and it is often a recommendation to use a different version because this was solved some time ago and is Ok in that one.

Since I work with a number of different IDEs and compilers I estimate that I invest about one week a year working on proving such problems and working on workarounds.

Pure coincidence but a collegue and myself had to work around a Texas Instruments compiler issue today which cost 8 hours (4 each) before we could actually locate where it was actually coming from and rewrite a single line of code to do the same thing but avoiding the error.

There are a couple of "funnies" that have been discussed here too:

Problem started with CW10.4 (Kinetis)

Compiler/Optimisation error CW10.2 (Kinetis)

By the way I am not absolutely sure whether you could confirm the SPI start-up workaround or not (?)

Regards

Mark

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

I've helped develop the AVR GCC/avr-libc compiler for years, so I'm well aware of the treatment when you report a bug upstream. :-(

Well aware of the standard 'FAQ' "your compiler is broken", "no you forgot to use volatile in your code", kind of crap that goes on.

I'm tired of the bleeding edge, I've lost to much blood over the years. :-)

Your SPI fix did solve my original problem with SPFR.

The SPI issues where compounded by me switching between compiler versions.  I've got a product to ship so was not inclined to waste yet more time on this issue, and hope it is fixed in a future 4.8.4 or later release.  There are still some issues with volatile in the ARM GCC per the release notes and I wonder if could be related?

This link on memory barriers may fit into the SPI issue somehow:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/index.html

I've seen the SPI code set master, then baud, then enable SPI,

and I've seen SPI code set master with enabling SPI at the same time then set the baud.

Not sure which one is correct?

I expect there is some issue with getting the baud rate counter synced to some internal clock domain, is the real issue inside the part, which is why the dummy reads are required.

Hopefully someone at Freescale will be putting this in a errata HINT HINT!

I'll check out the other 'funnies'.  Before my wife's suicide, see http://www.kpaddock.com and http://wiki.kpaddock.com that I'm trying to move to, I was a prolific blogger on software safety issues at http://blog.softwaresafety.net .  Hopefully I'll find a new 'normal' and get back to that.  I was always documenting 'funnies' there.

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Bob

Thanks for the feedback.

I checked your links and wish you strength.

Regards

Mark

0 Kudos
Reply
3,563 Views
martynhunt
NXP Employee
NXP Employee

Hi Bob,

This document may help you: L-Series SPI GPIO CS/SS

Best regards,

Martyn

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

Sadly no, it was of no help.

It only reinforces my frustration by saying:

"When the SPRF bit is set,"

No one is telling what causes it never be set.

0 Kudos
Reply
3,563 Views
martynhunt
NXP Employee
NXP Employee

Would it be possible for you to share the code that is calling the SPI functions you shared in your original post?

Which version of CW are you using? Also, what system & bus clock frequencies are you running at?

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III

> Would it be possible for you to share the code that is calling the SPI functions you shared in your original post?

static void tps65290_write( uint8_t const adr_u8, uint8_t const data_u8 );
static void tps65290_write( uint8_t const adr_u8, uint8_t const data_u8 )
{
  PWR_CTRL_ASSERT();

  (void) spi0_tx_rx( adr_u8 );
  (void) spi0_tx_rx( TPS65290_WRITE );

  (void) spi0_tx_rx( data_u8 );

  PWR_CTRL_DEASSERT();
}

static uint8_t tps65290_read( uint8_t const adr_u8 );
static uint8_t tps65290_read( uint8_t const adr_u8 )
{
  PWR_CTRL_ASSERT();

  (void) spi0_tx_rx(  adr_u8 );
  (void) spi0_tx_rx( TPS65290_READ );

  uint8_t const data_u8 = spi0_tx_rx( 0U );

  PWR_CTRL_DEASSERT();

  return( data_u8 );
}

> Which version of CW are you using?

10.6.  Other than to generate code for comparison I'm not using it.

Doing a bare metal project using GCC 4.7.3, have also tried GCC 4.8.2.

> Also, what system & bus clock frequencies are you running at?

I've tried both doing nothing at all and run with the CW generated code giving ~21 MHz from the FLL for the system clock, putting bus clock at FLL/2.

What I'm now wondering is why does writing to SPI0_D not cause a hard fault, when reading SPI0_D does cause a hard fault?

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Bob

To your question about the need for the sequence in the uTasker code:

- the first time the SPI was used in the KL project it hung waiting for the first SPI transfer to complete.

- working with a debugger displaying the SPI registers it was found that it wouldn't hang if the code was stepped but it would if left to run.

- in both cases it was only ever an issue for the very first character - following ones never had this problem.

- experimenting showed that the sequence of reading the status register and then reading the data register always allowed the first byte to be successfully transferred (i.e. its completion to be correctly detected).

- the 'workaroud' was tested on all other KL parts (from KL02..KL46) and found to have the same behaviour (i.e. to be a workaround for it otherwise hanging for the reason that the status flag would not otherwise react)

In other forum thread there is a discussion about why there are waits in the Freescale generated KL SPI code - I understand that without the waits the code won't work but no one knows why (I never looked into that code and can't be sure whether it is the exact same issue). I never needed any waits (with the two byte read sequence workaround).

So the state is that there are no erratas but there are supposedly 2 workarouds in practice for the non-errata. In your case you are seemingly also encountering a non-errata problem. I don't know whether it is a situation that could result in an errata being generated (erratas are not created with the chips but are developed after experience with them is collected) but I do know that non of the KL boards will get past sending the first byte if the sequence is not performed in the uTasker code - if it is performed then utFAT works with SD card and a file or parameter system works in SPI Flash. Therefore that is the reason why it is considered (by me) to be standard initialisation code becaus it works around an otherwise non-explained behaviour.

The uTasker setup for the KL25 will be using the PLL (the FLL is only used with the devices without USB and with 32kHz crystals) so there may be a difference in that area.

If you post your binary (if it can start on a KL25 board with 8MHz or 32MHz crystal) I can load it to see whether I can see why there is an access problem. It will help if you tell me the code address where it take splace.

Regards

Mark

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Bob

I saw Graeme Bragg's post but since it was only an issue due to one compliler I put it down to either a compiler 'funny' or an optimisation issue, rather than a coding/SPI usage issue (although coding/optimisation issues may go hand-in-hand).

In such cases one needs to study what is happening in the disassemble view. In the uTasker project there are a few workarounds that are restricted to certain compilers (the CW7.2 version for Coldfire had some nasties that could stop the TCP/IP stack from operating correctly and could be seen to be incorrect translation when studied in disassembled code - usually a workaroud is very simple - although not necessarily logical).

Regards

Mark

0 Kudos
Reply
3,563 Views
mjbcswitzerland
Specialist V

Hi Bob

After the initialisation of the SPI try adding

(volatile unsigned char)SPI0_S;

(volatile unsigned char)SPI0_D;

After that the flag should behave as expected.

Regards

Mark

0 Kudos
Reply
3,563 Views
bobpaddock
Senior Contributor III


> (volatile unsigned char)SPI0_D;

I installed a Hard Fault routine and found that my lockup was actually jumping to the unrecoverable hard fault vector.

What is that telling me?  The address of SPI0_D is bad in the header file or I need to turn something on/off someplace?

I see that Greme Bragg ran into this too, his thread "SPI0_D read causes heardfault".

He received no replies.

0 Kudos
Reply