
MQX 4.1 SPI with interrupt problem?

Question asked by Tsvetan Mudrov on May 30, 2014

Hello. I'm using MQX 4.1 with a K20_120 microcontroller. DMA usage in the SPI drivers is clearly not stable, so I decided to switch the driver to interrupt-driven transfers.

 

In my project I have an ADS1298 connected to the SPI0 port. This chip asserts a pin to signal the microcontroller every 1 ms, and I need to read 27 bytes shortly after the event (before the next 1 ms event comes). The SPI baud rate is 1 MHz. At the beginning everything works fine, but after some time the communication goes out of synchronization.
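For reference, the timing budget can be checked with a few lines of C. This is only a rough sketch: the helper name frame_time_us is made up for illustration, and it ignores CS setup/hold times and any delay between bytes.

```c
#include <stdint.h>

/* Hypothetical helper (not part of MQX): time in microseconds needed to
 * clock 'bytes' 8-bit words at 'baud_hz', ignoring CS setup/hold times
 * and any inter-byte delays added by the driver. */
uint32_t frame_time_us(uint32_t bytes, uint32_t baud_hz)
{
    uint64_t clocks = (uint64_t)bytes * 8u;            /* 27 bytes -> 216 clocks */
    return (uint32_t)((clocks * 1000000u) / baud_hz);  /* clocks -> microseconds */
}
```

With 27 bytes at 1 MHz this gives 216 us, so the raw transfer uses only a small part of the 1 ms period and the bus speed itself should not be the limit.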

 

See scope_1.png: D0 - nCS, D1 - SCLK, D2 - MISO, D3 - MOSI. I'm using regular _io_read() access to the driver. CS is driven active by the driver, and I return it to inactive myself with _io_flush() after _io_read() finishes.

 

Note that the CS signal returns to the inactive state very shortly after the start of the frame. The data transfer is not interrupted, as you can see - there are 216 clocks, which is exactly 27 bytes. But the CS signal breaks the communication. In fact CS is returned inactive by my code, because the _io_read() function has finished. It reports 27 bytes read, even though it returns before the first byte is finished.

 

After 2 days of digging in the spi_dspi.c file I found the following: there is a semaphore which is used to synchronize the state machine of the driver. The logic is as follows - first write 4 bytes into the SPI TX FIFO, enable the RX interrupt and sleep on the semaphore. The SPI RX interrupt transfers the rest of the frame, and at the end it disables the SPI interrupts and posts the semaphore to wake up the thread when communication ends. All this works fine, but only for a short time. At some moment the nCS signal goes crazy and returns inactive before the end of the frame. I found that after some time the sleep on the semaphore no longer works, because the semaphore is already available (there is already a semaphore posted in the queue just before _lwsem_post(&dspi_info_ptr->EVENT_IO_FINISHED)). So the thread does not sleep and gets out of the _io_read() function, after which I call _io_flush(), which deactivates CS. The ISR continues to work, so the frame is transmitted in parallel with my thread's operation.
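To illustrate why a single extra post is enough to break every following frame, here is a small toy model of the behaviour described above. It is a sketch under my own assumptions, not the MQX lwsem code: toy_sem_t, toy_sem_post and count_early_returns are invented names, and the thread/ISR interaction is reduced to a simple loop.

```c
#include <stdbool.h>

/* Toy counting semaphore; NOT the MQX lightweight semaphore. */
typedef struct { int count; } toy_sem_t;

static void toy_sem_post(toy_sem_t *s) { s->count++; }

/* Simulates 'transfers' frames. On frame index 'spurious_on' the ISR
 * posts the semaphore a second time (pass -1 for no spurious post).
 * Returns how many frames saw the wait return immediately, i.e. the
 * thread left the read before the ISR had finished the frame. */
int count_early_returns(int transfers, int spurious_on)
{
    toy_sem_t sem = { 0 };
    int early = 0;

    for (int i = 0; i < transfers; i++) {
        /* Thread side: start the frame, then wait on the semaphore. */
        bool blocked = (sem.count == 0);
        if (!blocked) {
            sem.count--;   /* stale token consumed, wait returns at once */
            early++;
        }

        /* ISR side: end of frame, post once. */
        toy_sem_post(&sem);
        if (blocked) {
            sem.count--;   /* the sleeping thread wakes and takes the token */
        }

        /* Second, unexpected ISR entry posts again. */
        if (i == spurious_on) {
            toy_sem_post(&sem);
        }
    }
    return early;
}
```

In this model count_early_returns(5, -1) is 0, while count_early_returns(5, 1) is 3: after one double post on frame 1, frames 2, 3 and 4 all return early, because each early return still leaves the token from its own ISR behind for the next frame.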

 

So after 2 more days in front of the monitor, I tried to catch the problem with pin toggling. I put a simple ON/OFF toggle on test pins around the semaphore post and the semaphore wait in the spi_dspi.c file. See scope_2.png - this is while everything is still working.

 

D4 - set before _lwsem_wait() call / cleared after _lwsem_wait() finished.

D5 - set before _lwsem_post() call / cleared after _lwsem_post() finished.

 

All seems normal - the thread goes to sleep for the entire transfer (D4 high), the semaphore is posted at the end of the transfer (peak on D5), and after that the thread wakes up (D4 goes low). But look at the next picture, scope_3.png. This is the "last" frame before the communication goes bad. There are two short spikes on D5. This means that at the end of the transfer the semaphore is posted twice in the ISR for some reason.

 

When I checked the timing on the scope, I found that these spikes are very close to each other, so it seems that for some reason the interrupt is serviced twice. Normally this should not be possible, because after the last byte the SPI_RSER register is cleared. To check this I added the following test code to the _dspi_isr() function (right at the beginning):

 

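/* Test code at the very beginning of _dspi_isr(). */
SR_value = dspi_ptr->SR;       /* snapshot of the status register */

RSER_value = dspi_ptr->RSER;   /* snapshot of the interrupt request enables */

if( RSER_value == 0 ){

     _ASM_NOP();               /* breakpoint here: in the ISR with all requests disabled */

}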

 

SR_value and RSER_value are global variables declared for the test. Normally _ASM_NOP() should never be reached - reaching it means that we are in the ISR while the interrupts are disabled, so normally we cannot be there. But surprise - when I put a breakpoint at this point, after some time it stops there. I checked the RSER register - it's 0! If I continue, further communication breaks down, since two semaphores are posted and the thread can no longer go into the wait-for-semaphore state. The effect continues: after some time the debugger stops on the NOP again and again. So the semaphore count grows more and more, and in fact the wait for the semaphore never sleeps again.

 

So the question is: how can we enter the _dspi_isr() function with the RSER register cleared? Is it a bug in the silicon? Or does something happen in the int_kernel_isr() function which wraps the spi_dspi interrupt function? The problem is that this kernel interrupt function is written in assembler, so it's really difficult to check what happens there... for the moment, I don't have time to dig into it.

 

For now I have put a "return" in place of _ASM_NOP(). So I don't service the interrupt if it is disabled in RSER - everything works fine now. But that is not a solution, it only hides the problem.
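The idea of this workaround can be sketched on a mock of the driver. Everything below is a simplified stand-in and not the real spi_dspi.c code: the mock_ types, MOCK_RX_IRQ_ENABLE and the function names are invented for illustration, and the real ISR does much more per entry.

```c
#include <stdint.h>

/* Simplified stand-ins for the DSPI registers and driver state. */
typedef struct {
    volatile uint32_t SR;
    volatile uint32_t RSER;
} mock_dspi_t;

typedef struct {
    mock_dspi_t *regs;
    int bytes_left;   /* bytes still to be received in this frame */
    int posts;        /* how many times the "semaphore" was posted */
} mock_dspi_info_t;

#define MOCK_RX_IRQ_ENABLE 0x00020000u  /* placeholder for an RX request enable bit */

/* Mock ISR with the workaround: do nothing if no request is enabled. */
void mock_dspi_isr(mock_dspi_info_t *info)
{
    mock_dspi_t *regs = info->regs;

    if (regs->RSER == 0) {
        return;                 /* entered with requests disabled: ignore */
    }
    if (info->bytes_left > 0) {
        info->bytes_left--;     /* "read" one byte from the RX FIFO */
    }
    if (info->bytes_left == 0) {
        regs->RSER = 0;         /* frame done: disable requests */
        info->posts++;          /* stands in for the semaphore post */
    }
}

/* Runs one frame followed by one unexpected extra ISR entry and
 * returns the number of posts that resulted. */
int posts_after_spurious_entry(int frame_bytes)
{
    mock_dspi_t regs = { 0u, MOCK_RX_IRQ_ENABLE };
    mock_dspi_info_t info = { &regs, frame_bytes, 0 };

    for (int i = 0; i < frame_bytes; i++) {
        mock_dspi_isr(&info);
    }
    mock_dspi_isr(&info);       /* the extra entry seen on the scope */
    return info.posts;
}
```

With the guard in place posts_after_spurious_entry(27) returns 1; without it the extra entry would post a second time. This only shows why the symptom disappears, it does not explain where the extra entry comes from.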

 

I'm attaching the pictures as follows:

1 - the bad communication frame (CS is not correct).

2 - a normal frame with the two pins toggling on semaphore post/wait.

3 - the last frame before the communication breaks down (with pin toggling again).

 

I'm also sending you the modified dspi driver file which I'm using for the tests. The changes are marked with //TM.

 

I would appreciate any help or ideas from people who are familiar with the MQX structure, since I'm under pressure from the project schedule!

Original Attachment has been moved to: spi_dspi.c.zip
