UART is dropping bytes

rdeml
Contributor I

I have 2 boards that talk to each other via UART.  I have test programs in each board so that one board sends bytes and the other board receives and checks the data.  The data is 0x00 through 0xFF.

 

When I am running at a high baud rate (~1 Mbps), the ColdFire 5272 is dropping bytes. To help debug this I added a 10 ms pause after sending the data. Now the ColdFire is reading all the bytes. The ColdFire UART interrupts are set to fire when the FIFO is 25% full (6 bytes) and on a receive timeout. If the UART can keep up with 200 bytes, why does it need a pause before the next block of data?

TomE
Specialist II

Kenny's suggestions are all good.

> If the UART can keep up with 200 bytes, why does it need a pause before the next block of data?

I'm interpreting that statement to say that the sender is sending 200 bytes as a "packet" of some sort, and the receiver is also expecting 200 bytes.

200/6 = 33.333

So you'll get 33 "25% full" interrupts (33 x 6 = 198 bytes), and then the last two bytes will sit in the FIFO until the receive timeout fires, or until the next message comes in and fills the FIFO up to the trigger point.
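The split between threshold interrupts and leftover bytes can be sketched as below. This is only an illustration: `split_block` and `struct rx_split` are made-up names, and it assumes the ISR drains exactly one trigger level (6 bytes) per interrupt, which a real ISR may not do if more bytes arrive while it runs.

```c
/* Hypothetical helper for illustration; not part of any ColdFire library. */
struct rx_split {
    unsigned threshold_irqs;  /* interrupts raised by the FIFO level trigger */
    unsigned leftover_bytes;  /* bytes left waiting for the receive timeout  */
};

/* Split a block of block_len bytes into whole trigger-level chunks
 * plus the remainder that never reaches the trigger level. */
static struct rx_split split_block(unsigned block_len, unsigned trigger_level)
{
    struct rx_split s;
    s.threshold_irqs = block_len / trigger_level;
    s.leftover_bytes = block_len % trigger_level;
    return s;
}
```

Calling `split_block(200, 6)` gives 33 threshold interrupts and 2 leftover bytes, matching the numbers above.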

So if the sender waits long enough for the receive timer to go off it all works, but if it "pushes" them through it doesn't.

That suggests that whatever your receiving code does after receiving that "200 byte message" is the problem.

Somehow it is locking out the receiver interrupts. Maybe it is written to somehow reprogram or reset the UART every message. Maybe the receive buffer is "locked" or "full" and the receiver interrupt can't write the data (and doesn't count this as an error condition). Maybe the receiver is "wrapping the ring" without checking.

WHICH bytes go missing? Are they the first ones after the receiver does something (like process a message/packet)? How many does it drop at a time? The number (dropped) will tell you how long the receiver was locked out or getting errors.

Are you getting any UART receiver errors like overruns? Are you checking or counting them, or ignoring all errors? You should add error checking to the receiver.

Do you have a circular buffer, or a single buffer that needs to be "processed" at some point? Do you have "ping-pong" receive buffers? Do you handle/count/report software buffer overflow? Does your "end of circular buffer" code work properly?

Do you disable interrupts around ring pointer access in the mainline? Are all your shared variables marked "volatile"? If you've got any shared variables they may be suffering interrupt hazards.
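To make the buffer questions concrete, here is a minimal sketch of a receive ring that counts software overflows instead of silently wrapping, and that guards the mainline pointer access. It is not the poster's code. `irq_save_disable` and `irq_restore` are placeholder stubs standing in for whatever the platform or RTOS provides, and `RING_SIZE` is an arbitrary choice.

```c
#include <stdint.h>

#define RING_SIZE 256u  /* power of two so the index wrap is a cheap mask */

/* Placeholders (assumptions): on real hardware these would raise and
 * restore the CPU interrupt mask. Here they are no-op stubs so the
 * sketch builds on a host. */
static unsigned irq_save_disable(void) { return 0; }
static void irq_restore(unsigned saved) { (void)saved; }

static volatile uint8_t  ring[RING_SIZE];
static volatile uint32_t head;       /* written only by the ISR      */
static volatile uint32_t tail;       /* written only by the mainline */
static volatile uint32_t overflows;  /* bytes dropped because full   */

/* Called from the receive ISR for each byte read from the FIFO. */
static void ring_put_from_isr(uint8_t byte)
{
    uint32_t next = (head + 1u) & (RING_SIZE - 1u);
    if (next == tail) {   /* full: count the loss instead of wrapping */
        overflows++;
        return;
    }
    ring[head] = byte;
    head = next;
}

/* Called from the mainline. Returns 1 and stores a byte, or 0 if empty. */
static int ring_get(uint8_t *out)
{
    int got = 0;
    unsigned saved = irq_save_disable();  /* keep this lockout short */
    if (tail != head) {
        *out = ring[tail];
        tail = (tail + 1u) & (RING_SIZE - 1u);
        got = 1;
    }
    irq_restore(saved);
    return got;
}
```

One slot is kept empty to tell "full" from "empty", so the ring holds at most `RING_SIZE - 1` bytes. A non-zero `overflows` count means the mainline is not draining the ring fast enough, which is a different fault from a hardware overrun.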

Tom

rdeml
Contributor I

I've also done an experiment where I only sent 5 bytes and waited for the timeout interrupt. This worked. I also disabled the timeout interrupt and sent 5 bytes, waited, then sent another 5 bytes. The receive interrupt worked and the timeout interrupt (that was disabled) did not fire.

Right now the receive code is looking for 200 bytes and checks that each byte is correct (0x00 to 0xFF).

The buffer is an RTOS Queue in SRAM.

TomE
Specialist II

If you are suffering from an interrupt latency or lockout problem, then there is a very easy way to find out what is causing it.

First, add a test in the interrupt code for the overrun condition by checking the error bits in the UART status. Then add some code there to increment a counter (or similar) whenever the overrun bit is set.

Then you run your code with a debug pod connected and a breakpoint on that code.

When you hit the breakpoint, the interrupt will have been delayed, but here's the neat trick. Look back on the stack to see what code the interrupt service routine interrupted. It should be the code that re-enabled interrupts (after they had been disabled for too long), or the code that re-enabled the UART interrupts after somehow disabling them for too long.


The interrupt routine will point the finger of blame back to precisely the code that caused the problem, and then it should be easy to fix.
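A minimal sketch of the error test described above, written so it can be compiled on a host. The bit masks are an assumption about the UART status register layout and should be checked against the MCF5272 User's Manual. All function and struct names are hypothetical. Clearing the error condition in the UART afterwards is deliberately left out.

```c
#include <stdint.h>

/* Assumed status bit masks; verify against the MCF5272 User's Manual. */
#define STAT_OVERRUN    0x10u
#define STAT_PARITY_ERR 0x20u
#define STAT_FRAME_ERR  0x40u

struct uart_err_counts {
    uint32_t overrun;
    uint32_t parity;
    uint32_t framing;
};
static volatile struct uart_err_counts uart_errs;

/* Set the debugger breakpoint on this function. When it hits, the stack
 * shows which code the receive interrupt has just interrupted. */
static void uart_overrun_hook(void)
{
    uart_errs.overrun++;
}

/* Call with the status value read at the top of the receive ISR.
 * Returns non-zero if any receive error was flagged. */
static int uart_check_status(uint8_t status)
{
    int err = 0;
    if (status & STAT_OVERRUN)    { uart_overrun_hook(); err = 1; }
    if (status & STAT_PARITY_ERR) { uart_errs.parity++;  err = 1; }
    if (status & STAT_FRAME_ERR)  { uart_errs.framing++; err = 1; }
    return err;
}
```

Keeping the overrun case in its own function gives the debugger a single place to break, and the counters stay useful even when no debug pod is attached.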


If that doesn't find the problem you've got some other software bug.

Tom

TomE
Specialist II

> The buffer is an RTOS Queue in SRAM

So maybe that's the problem. What sort of queue is it? Are you putting bytes on the OS queue one by one, or does it provide a buffer?

How about the other questions I asked?

Tom

razed11
Contributor V

Assuming 6 bytes per interrupt, 10 bits per byte, and 1 Mbps, your interrupt should be firing about every 60 µs. You might toggle a pin in your interrupt and check it on a scope to get some idea of what is going on.
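The 60 µs figure follows from bytes x bits per byte / baud rate. A small sketch of that calculation, with `rx_time_us` as a made-up helper name and 10 bits per character assumed (start bit, 8 data bits, stop bit):

```c
#include <stdint.h>

/* Time in microseconds to receive `bytes` characters at `baud` bits/s,
 * with `bits_per_char` bits on the wire per character. */
static uint32_t rx_time_us(uint32_t bytes, uint32_t bits_per_char, uint32_t baud)
{
    return (uint32_t)(((uint64_t)bytes * bits_per_char * 1000000u) / baud);
}
```

`rx_time_us(6, 10, 1000000)` is 60, so a threshold interrupt arrives every 60 µs while data is streaming, and `rx_time_us(200, 10, 1000000)` is 2000, so a full 200-byte block takes 2 ms on the wire.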

Have you tried interrupts at 50% FIFO?

What does your interrupt do with the data? Write to SRAM or DDR?

What other things are going on in your system? Can you set the priority of this interrupt to the highest and shut down other items, just to see if you can sustain this rate and the storage to memory?

Perhaps something is adding latency to your interrupt response.

Have you checked the signal integrity of the UART signals? If the chip provides the option perhaps reduce the drive strength if you have some excessive ringing.

Good luck!

K

rdeml
Contributor I

The FIFO interrupt was originally set for 75% full, but I was getting overrun errors. So I set it to 25% full, with the thought that if other things in the system take too long, it would still have 18 bytes of FIFO left before an overrun error happened.
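As a rough sketch of what that headroom means in time: the function below converts the spare FIFO space after the trigger point into microseconds. The 24-byte depth is inferred from "25% full (6 bytes)", 10 bits per character is an assumption, and `fifo_headroom_us` is a made-up name. It ignores the extra character time the receive shift register may provide, so treat the result as approximate.

```c
#include <stdint.h>

/* Approximate time, in microseconds, between the FIFO-level interrupt
 * being raised and the FIFO becoming completely full. */
static uint32_t fifo_headroom_us(uint32_t fifo_depth, uint32_t trigger_percent,
                                 uint32_t bits_per_char, uint32_t baud)
{
    uint32_t trigger_bytes = fifo_depth * trigger_percent / 100u;
    uint32_t spare_bytes   = fifo_depth - trigger_bytes;
    return (uint32_t)(((uint64_t)spare_bytes * bits_per_char * 1000000u) / baud);
}
```

At 1 Mbps, `fifo_headroom_us(24, 25, 10, 1000000)` is 180 and `fifo_headroom_us(24, 75, 10, 1000000)` is 60. So the 25% setting tolerates roughly 180 µs of interrupt latency, while the 75% setting only tolerated about 60 µs.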

The interrupt copies the data to an SRAM RTOS queue.
