Hi,
I have a somewhat rare issue with my UART RX FIFO. The K64 UART is connected to a half duplex RS485. I haven't pinned down exactly how to cause the issue yet, but it seems to occur more frequently when I send lots of contiguous serial data. If I had to guess, it would be caused by a bit or framing error on the UART bus.
Setup:
- K64 & MQX 4.1
- UART1 interrupt mode (8N1 @ 115200)
- RTS pin controls RS485 flow control (flag: IO_SERIAL_HW_485_FLOW_CONTROL)
The symptom:
Once the issue occurs, it never seems to go away until I reset the K64. As an example: the master will send a packet of say 10 bytes to the k64 slaves on the rs485 bus. Then the k64 will read one byte at a time from the UART driver. The following function is used to determine if data is ready to be read:
_io_ioctl(port, IO_IOCTL_CHAR_AVAIL, &ready);
When the issue occurs, I can read, say, 6 bytes from the UART buffer, and then ready becomes FALSE, even though there are still 4 more bytes that should be ready. This function will continue to indicate that no data is ready until an additional byte is sent by the master, at which point I am able to read 1 more byte from the UART driver.
It seems as if the data is stuck in the FIFO, but is not accessible. I have put a sniffer on the bus to confirm that the data wasn't being cached in the master side UART and can see all the data on the bus. I've also had multiple k64's on the same bus all receiving the same data and some of the K64s will receive all the data properly and others will get into this erroneous state.
Seems to be related to: K22 MXQ4.02 serial port will miss characters if paste a lot of commands together(cause overflow and ...
Any help would be appreciated.
Thanks,
Tim
Hello, I had a similar problem with other Kinetis parts (K70/60/10), and after some heavy debugging I think there's a hardware bug in the Kinetis UART module. The Kinetis UART has an internal FIFO (deeper on some chips, shallower on others). After an overrun of this FIFO occurs (a character isn't fetched by the IRQ while the FIFO is already full and a new one arrives), the FIFO stops functioning correctly: the write and read pointers of the hardware FIFO are offset afterwards, which means you get the following characters with a delay. The interrupt for character n+1 lets you read character n, the interrupt for character n+2 lets you read character n+1, and so on.
After messing around with that problem I found the following solution (more or less empirically):
Locate all occurrences of
if(stat & UART_S1_OR_MASK) {
    ++sci_info_ptr->RX_OVERRUNS;
}
and replace it with
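if (stat & UART_S1_OR_MASK) {
    sci_ptr->CFIFO |= UART_CFIFO_RXFLUSH_MASK;  /* flush the RX FIFO */
    sci_ptr->PFIFO &= ~UART_PFIFO_RXFE_MASK;    /* disable the RX FIFO */
    c = sci_ptr->D;                             /* read the data register */
    sci_ptr->PFIFO |= UART_PFIFO_RXFE_MASK;     /* re-enable the RX FIFO */
    ++sci_info_ptr->RX_OVERRUNS;
}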
The locations are in
serl_pol_kuart.c and in serl_int_kuart.c
Hope it helps.
tmeyer did you ever try your application at lower baud rates? While I agree that hardware could be at the root of this UART problem, I don't understand why I get an RX overrun in the first place. Although I have the FIFO disabled, I would still expect the K64F to be able to keep up with pulling a byte of data from the UART at 115200. However, I've lowered my speed to 38400 and still get overrun errors. The funny thing is that I have two other identical board configurations talking to identical slave devices at 115200, and neither of them exhibits these RX overrun problems. So I'm just wondering if you've tried your test at lower speeds, as I have to wonder if there's also something else at play here.
Hey dave408,
It has been a while since I dealt with this issue, but I think I was able to see it at both 9600 and 115200. What I don't remember is what the code was busy doing, other than handling a UART interrupt. At 100MHz, the K64 should be able to keep up with 115200 baud. This works out to an incoming byte roughly every 87us (10 bits per byte at 8N1), so as long as your system is set up to allow the UART ISR (or polling loop) to run at least that frequently, you should be okay!?
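As a rough check on the byte timing above, the time per character follows from the baud rate and the frame length. This is a minimal sketch, assuming 8N1 framing (10 bits on the wire per byte); the function name is made up for illustration and is not part of MQX or the KSDK.

```c
#include <stdint.h>

/* Time to receive one character, in nanoseconds (rounded down).
 * bits_per_frame counts start + data + parity + stop bits (10 for 8N1). */
uint32_t uart_char_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    /* 1e9 ns per second; the 64-bit intermediate avoids overflow. */
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000000ULL) / baud);
}
```

At 115200 baud this gives about 86.8 us per character (roughly 8,700 core cycles at 100 MHz), and about 1.04 ms at 9600 baud.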
I am not confident that this issue is caused by the overrun; perhaps an overrun is signalled when the issue occurs?! The reason I think this is that I have in the past reproduced this issue on a 115200 command line shell with my slow typing. I suppose I cannot rule out that I may have copy/pasted a command into the shell.
Sorry I couldn’t be more help.
Tim
I have detected the same problem in KDS v3.0.0 with SDK v1.2.0 using MQX.
The UART driver is totally different, but the same solution applies :-)
I find it very interesting that Freescale does not confirm whether this is a hardware-related issue or not. And I find it even more interesting that Freescale has not solved the problem, as it has existed for a very, very long time...
Thank you, Johannes, for your answer :-)
EDIT: oh, I forgot to say that I found the problem using a K64. The same problem happened in a K10 using CodeWarrior and MQX 4.1
Hi johannesschock, is there any chance you have used the KSDK and have seen something similar? I'm using MQX for KSDK 1.2 and have a very similar sort of behavior where, over time, something in the RX interrupt causes me to miss a byte, but rather than ending up with one bad packet, every packet is bad. If I look at my buffer, it seems like data from the previous partially received packet is read as part of the next packet.
For example, let's say I normally get something like this:
0B 04 08 00 00 00 00 00 04 FA 3F 46 A4
But at the failure point, the packet could look like this:
15 0B 04 08 00 00 00 00 00 04 FA 3F 46
so the 15 comes from a previous packet's CRC, and therefore the last byte of the CRC for the present packet gets cut off. And of course, this just completely messes up everything.
I know I'm asking for a miracle, but I am hoping that you or someone else has seen this with the KSDK as well. I'm going to take the information you have provided to tmeyer and see if it helps me find anything that can be fixed in the KSDK, if applicable.
Hi dave408,
I was forced to switch to KDS and KSDK, and as I could see, the same issue happens in KSDK 1.2 and the newest KSDK 1.3.
To solve this problem you will have to apply Johannes' solution. In KSDK the UART driver is different, but the same concept applies. When you detect a RX overrun, you will have to perform the same steps shown in Johannes' solution. This worked for me, almost magically :-)
I'm not allowed to share my code, but it's not difficult to do.
I don't understand how Freescale has not commented on this issue yet... Everybody uses a UART in almost every project, and it has to be robust...
EDIT: Please, note that when you fetch a character from UART, status flags are cleared. You will have to check for kUartRxOverrun before UART_HAL_Getchar is called.
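The ordering point in the EDIT above can be illustrated with a small host-side model. This is only a sketch: `mock_uart_t`, `mock_getchar` and the `MOCK_S1_OR` bit are hypothetical stand-ins rather than the KSDK HAL, and the model simply assumes that fetching a character clears the overrun flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the UART status and data registers. */
typedef struct {
    uint8_t s1;  /* status flags */
    uint8_t d;   /* received data */
} mock_uart_t;

#define MOCK_S1_OR 0x08u  /* overrun flag in this model (illustrative) */

/* Simplified model: fetching a character clears the overrun flag. */
uint8_t mock_getchar(mock_uart_t *u)
{
    uint8_t c = u->d;
    u->s1 &= (uint8_t)~MOCK_S1_OR;
    return c;
}

/* Wrong order: by the time the flag is tested, it is already gone. */
bool overrun_seen_after_read(mock_uart_t *u, uint8_t *out)
{
    *out = mock_getchar(u);
    return (u->s1 & MOCK_S1_OR) != 0;
}

/* Right order: latch the status first, then fetch the character. */
bool overrun_seen_before_read(mock_uart_t *u, uint8_t *out)
{
    bool overrun = (u->s1 & MOCK_S1_OR) != 0;
    *out = mock_getchar(u);
    return overrun;
}
```

In a real handler the overrun branch would go on to do the flush sequence from Johannes' solution; the model only shows why the check has to come first.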
cgarcia thanks for the edit. At least now I'm seeing some problems I hadn't caught before. I can see that I am getting RX overruns and framing errors as well. Now I just have to solve the issue. I took johannesschock's solution and applied it to my application like this:
Modification to fsl_uart_driver.c in UART_DRV_IRQHandler:
bool detected_overrun = false;
if (UART_HAL_GetStatusFlag(base, kUartRxOverrun)) {
    // Current test doesn't have this commented out, but it really should be because this function reads the D register.
    //UART_HAL_ClearStatusFlag(base, kUartRxOverrun); // Clear the flag, OR the rxDataRegFull will not be set any more
    base->CFIFO |= UART_CFIFO_RXFLUSH_MASK;
    base->PFIFO &= ~UART_PFIFO_RXFE_MASK;
    uartState->rxBuff[0] = base->D;
    base->PFIFO |= UART_PFIFO_RXFE_MASK;
    detected_overrun = true;
}
if (UART_HAL_GetStatusFlag(base, kUartFrameErr)) {
    // TODO: I have verified framing errors, need to address after RX overrun
    UART_HAL_ClearStatusFlag(base, kUartFrameErr);
}
if (UART_HAL_GetStatusFlag(base, kUartNoiseDetect)) {
    UART_HAL_ClearStatusFlag(base, kUartNoiseDetect);
}
/* Get data and put into receive buffer */
if (!detected_overrun) {
    UART_HAL_Getchar(base, uartState->rxBuff);
}
/* Invoke callback if there is one */
if (uartState->rxCallback != NULL)
{
    uartState->rxCallback(instance, uartState);
}
else...
If I understand correctly, the main idea here is to correctly detect the overrun condition, save the data in the RX buffer, then flush it out so that incoming bytes no longer get offset.
Hi Dave,
I'm on KSDK 1.2 (KDS v3.0.0 - same as Carlos). My situation is a bit different. When I'm sending large amounts of data via ethernet, the UART will timeout. The UART interrupt wasn't getting triggered and a character would be lost. I tried your solution, and indeed during the losses, the detected_overrun flag is set to true. However, I still lose the data and the packet I transmitted to UART is invalid.
Is that your expectation? Or was your solution to avoid the loss of data to UART?
The code you posted is a bit different than what I saw in fsl_uart_driver.c - so I have posted that to see if I misinterpreted something.
Thanks in advance!
void UART_DRV_IRQHandler(uint32_t instance)
{
    uart_state_t * uartState = (uart_state_t *)g_uartStatePtr[instance];
    UART_Type * base = g_uartBase[instance];
    /* Exit the ISR if no transfer is happening for this instance. */
    if ((!uartState->isTxBusy) && (!uartState->isRxBusy))
    {
        return;
    }
    /* Handle receive data register full interrupt, if rx data register full
     * interrupt is enabled AND there is data available. */
    if((UART_BRD_C2_RIE(base)) && (UART_BRD_S1_RDRF(base)))
    {
#if FSL_FEATURE_UART_HAS_FIFO
        /* Read out all data from RX FIFO */
        while(UART_HAL_GetRxDatawordCountInFifo(base))
        {
#endif
            /* Dave's changes from https://community.freescale.com/thread/341862 */
            bool detected_overrun = false;
            if (UART_HAL_GetStatusFlag(base, kUartRxOverrun)) {
                base->CFIFO |= UART_CFIFO_RXFLUSH_MASK;
                base->PFIFO &= ~UART_PFIFO_RXFE_MASK;
                uartState->rxBuff[0] = base->D;
                base->PFIFO |= UART_PFIFO_RXFE_MASK;
                detected_overrun = true;
                _rbatra_count++;
            }
            if (UART_HAL_GetStatusFlag(base, kUartFrameErr)) {
                // TODO: I have verified framing errors, need to address after RX overrun
                UART_HAL_ClearStatusFlag(base, kUartFrameErr);
            }
            if (UART_HAL_GetStatusFlag(base, kUartNoiseDetect)) {
                UART_HAL_ClearStatusFlag(base, kUartNoiseDetect);
            }
            /* Get data and put into receive buffer */
            if (!detected_overrun) {
                // Originally wasn't wrapped with detected_overrun but added per forum.
                UART_HAL_Getchar(base, uartState->rxBuff);
            }
            /* End of changes - but see below for commented-out section */
            /* Invoke callback if there is one */
            if (uartState->rxCallback != NULL)
            {
                uartState->rxCallback(instance, uartState);
            }
            else
            {
                ++uartState->rxBuff;
                --uartState->rxSize;
                /* Check and see if this was the last byte */
                if (uartState->rxSize == 0U)
                {
                    UART_DRV_CompleteReceiveData(instance);
#if FSL_FEATURE_UART_HAS_FIFO
                    break;
#endif
                }
            }
#if FSL_FEATURE_UART_HAS_FIFO
        }
#endif
    }
    /* Handle transmit data register empty interrupt, if tx data register empty
     * interrupt is enabled AND tx data register is currently empty. */
    ... DIDN'T POST THIS CODE AS NO CHANGES TILL END....
    /* Handle receive overrun interrupt */
    /* COMMENTED THIS OUT AS DAVE HAD CHECK ABOVE */
    //if (UART_HAL_GetStatusFlag(base, kUartRxOverrun))
    //{
    //    /* Clear the flag, OR the rxDataRegFull will not be set any more */
    //    UART_HAL_ClearStatusFlag(base, kUartRxOverrun);
    //}
}
Hi Raj,
In the buffer overrun condition, I do expect you to lose data. It is telling you that it had an overrun condition, so all you can do is continue and handle the error. Ideally, whatever protocol you are using that uses the UART is going to have some sort of integrity check. In my case, I am using Modbus RTU, so I have plenty of opportunities to catch downstream problems.
The point of this fix is to acknowledge the buffer overrun condition before potentially calling UART_HAL_Getchar(), because once you do that with the UART in the buffer overrun state, the UART will not behave properly. I have seen cases (at least via the KSDK) where it always looks like an old byte is stuck in the UART buffer.
Hope this helps!
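For the integrity check mentioned above, a Modbus RTU receiver can verify the trailing CRC before accepting a frame. The sketch below uses the standard CRC-16/MODBUS parameters; the function names are illustrative and not taken from any particular Modbus stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16/MODBUS: reflected polynomial 0xA001, initial value 0xFFFF. */
uint16_t modbus_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 1u) {
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            } else {
                crc = (uint16_t)(crc >> 1);
            }
        }
    }
    return crc;
}

/* A Modbus RTU frame ends with the CRC, low byte first.
 * The shortest possible frame is address + function + 2 CRC bytes. */
bool modbus_frame_ok(const uint8_t *frame, size_t len)
{
    if (len < 4) {
        return false;
    }
    uint16_t crc = modbus_crc16(frame, len - 2);
    return frame[len - 2] == (uint8_t)(crc & 0xFFu) &&
           frame[len - 1] == (uint8_t)(crc >> 8);
}
```

A frame that has picked up a stale leading byte and lost its final CRC byte will almost always fail `modbus_frame_ok()`, so it can be dropped and the parser resynchronized instead of processing bad data.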
Hi Dave,
It does help. I'm surprised that I'd run into a case where we'd hit an overflow. I wonder if there is too much going on in the interrupts. On some less luxurious uPs that don't come with a rich SDK, I've hand written interrupts that get bytes as quickly as possible to a circular buffer never running into this case.
Thanks,
-Raj
I apologize for the long-winded post but...
I've been spending some time looking into this further and haven't come up with any conclusions. Perhaps the community has ideas.
I am using UART3, which I discovered has a RX FIFO buffer size of 1. So I switched to UART0 which has a size of 8. When using ethernet concurrently with RS-232, I still would get RXOF errors but much less frequently than with UART3. So I figure perhaps the ethernet interrupt is taking too much time not allowing the UART IRQ to empty the buffer. I'm sending 10 bytes continuously at 115,200 bps to the UART. Then I tried something else that has me a bit puzzled.
(1) I went back to UART3 but unplugged ethernet, so no communication that way. I would expect that I won't have any communication problems as it's just MQX (running a few threads plus RTCS and USB driver) and the UART responding back with data.
(2) I then continuously sent 10 bytes at 115.2kbps from PC to UART (and send back a response packet) which I consume - so fully hand shaken. I also use SEGGER_RTT_printf to write to the console (in a thread) the bytes that I received via UART3.
(3) After some time, UART_DRV_ReceiveDataBlocking() fails to return the packet I transmitted. Strangely, when transmitting from the Mac, the failure happens much faster than from the PC. I am using USB-to-serial dongles and have a scope attached to one of them (that hasn't failed yet), to see if the dongle fails to transmit.
(4) I set a memory watchpoint on 0x4006 D012 (SFIFO - for the K64 uP). When the timeout occurs, the watchpoint isn't triggered immediately.
I first get:
C 0 0 0 0 0 1 0 0 0 0 0 (PACKET TRANSMIT OK)
C 0 0 0 0 0 1 0 0 0 0 0 (PACKET TRANSMIT OK)
C (WHERE'S THE REST OF THE PACKET??)
(No watchpoint break occurred yet).
Then I transmit another packet and get 3 more bytes (the last line below) and the RXOF bit is triggered (i.e. watchpoint hit!)
C 0 0 0 0 0 1 0 0 0 0 0
C 0 0 0 0 0 1 0 0 0 0 0
C 0 0 0 (3 more bytes sent - appears to be from previous packet, and new packet didn't show up)
I implemented the protocol over USB-CDC and have run it for several million cycles without ever seeing an error (likewise with ethernet); only RS-232 seems to be giving me problems.
I initialize my UART3 as follows (using KSDK 1.2 + MQX):
/* External method declaration */
extern void UART_DRV_IRQHandler(uint32_t instance);
/* UART IRQ handler */
void MQX_UART3_RX_TX_IRQHandler(void)
{
    UART_DRV_IRQHandler(UART3_IDX);
}
// Initialize variable uartState of type uart_state_t
static uart_state_t uartState;
// Fill in uart config data
uart_user_config_t uartConfig = {
    .bitCountPerChar = kUart8BitsPerChar,
    .parityMode = kUartParityDisabled,
    .stopBitCount = kUartOneStopBit,
    .baudRate = 115200
};
Error uartInitialize(void)
{
    NVIC_SetPriority(UART3_RX_TX_IRQn, 6U);
    OSA_InstallIntHandler(UART3_RX_TX_IRQn, MQX_UART3_RX_TX_IRQHandler);
    // Initialize the uart module with base address and config structure
    if (kStatus_UART_Success !=
        UART_DRV_Init(UART3_IDX, &uartState, &uartConfig))
    {
        DPRINT("UART Failed to initialize! Exiting.\n");
        return(ERROR);
    }
    DPRINT("UART initialized.\n");
    return(NO_ERROR);
}
Thanks,
-Raj
Hi Raj, I'm using UART0 in my application with the FIFO buffer disabled, so it should be similar to yours with UART3. I actually neglected to change it back to 8 bytes after addressing the bug because everything is working quite nicely at this point. The only difference I see between my code and yours is that I also use UART_DRV_InstallRxCallback to disable interrupts, read data out of the UART, mess with a couple of timers, and then re-enable interrupts. Where in your code are you reading data out of the UART?
Hi Dave,
Thanks for taking the time to reply.
Basically, I have a thread (MQX) that feeds a small state machine that builds the packet. Here's a simpler version of the code:
void uartTask(U32 arg)
{
    U8 byte; // U8 is unsigned char
    uart_status_t status;
    while(1)
    {
        // read 1 byte, have a 5 second timeout.
        status = UART_DRV_ReceiveDataBlocking(UART3_IDX, &byte, 1, 5000);
        if (status == kStatus_UART_Success)
        {
            // process byte and last state is to transmit it through UART
            buildProcessPacket(byte);
        }
        else
        {
            // Timeout or something bad happened
            // reset the packet state machine
        }
    }
}
I was able to capture on the scope the bits that were transferred to the UART when the failure occurred. They were just fine, so the USB-to-RS-232 dongle is not the culprit. However, one thing I did notice is that if I spread out the timing between RX and TX packets (i.e. transmit packet to K64, receive packet on PC, [PAUSE], transmit packet to K64...), I haven't seen the issue (yet).
Basically, if PAUSE is very small (2.4 ms) I would get a timeout, but 5ms didn't seem to trigger it. Not sure what your timing between bytes or packets is...
I wonder if the MQX scheduler or something else needs time. However, I don't see this on USB or Ethernet, but there may be some sort of flow control that occurs with those protocols. If I had RTS/CTS, perhaps this wouldn't be a problem for me.
Thanks,
-Raj
I don't process my Modbus RTU packets in MQX because I could never get it to work, it just seems too slow and unpredictable. That's why I changed my design to use an ISR to pull data out of the UART and stick it into a temporary buffer, and then when the packet is complete, I copy the entire packet into a message queue. Then all my MQX task does is wait until it gets a signal that the message queue has a message ready, and it pulls it out for processing. It has been extremely robust.
Hi Dave,
Yes, I get your rationale. Reading through the fsl_uart_driver.c code, I realized that when I call UART_DRV_ReceiveDataBlocking() is when RX interrupts get enabled and once it returns my data, interrupts are disabled (See call to CompleteReceiveData()). Since I'm calling this method in a low priority thread, data is being transmitted to my device before I call the next ReceiveDataBlocking leading to missed data. If I implement it via the rx callback, interrupt is always running and I can collect in a buffer and notify thread of awaiting data as you described.
I'll try this to confirm the theory and follow up.
Thank you for the discussion and being a sounding board too!
-Raj
I am not an RTOS expert, but I was curious if you have multiple tasks running, maybe even with the same priority? MQX has a 5ms tick, so perhaps you have another task that the scheduler executed and "stole" the time that your packet assembly task would have otherwise used, and due to the duration of one OS tick, it just missed the UART data. Good luck with the IRQ-driven + message queue approach, I hope that works out for you.
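To put a number on the tick-length concern above: the helper below estimates how many characters arrive while a task is held off. It is a back-of-the-envelope sketch that assumes 10 bits per character (8N1) and uses the 5 ms tick quoted above; the function name is hypothetical.

```c
#include <stdint.h>

/* Number of complete characters that arrive within window_us microseconds
 * on a continuously busy line. bits_per_frame is 10 for 8N1. */
uint32_t uart_chars_in_window(uint32_t baud, uint32_t bits_per_frame,
                              uint32_t window_us)
{
    /* chars = (baud / bits_per_frame) * (window_us / 1e6), in integer math */
    return (uint32_t)(((uint64_t)baud * window_us) /
                      ((uint64_t)bits_per_frame * 1000000ULL));
}
```

`uart_chars_in_window(115200, 10, 5000)` gives 57 complete characters, far more than a FIFO of 1 or 8 entries can absorb if nothing is draining the UART during that time.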
Hi Dave,
Just a follow-up on my working solution. Yes, as you indicated, I have multiple tasks and would miss packets if I didn't call UART_DRV_ReceiveDataBlocking() in time. I could have hooked into the rx callback and think it would have worked out OK. However, I took the route of not using fsl_uart_driver and wrote my own instead. Basically, the Rx interrupt is always running and pushes data into a circular buffer. My packet assembly task waits on an event signalling that data is available in the RX queue and then processes it. To transmit, I copy to a transmit buffer and activate the Tx Empty IRQ. It transmits and deactivates the Tx Empty IRQ once the data has been sent (and an event is sent back to notify the sender that the data has been transmitted).
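The always-on RX interrupt plus circular buffer described above might look roughly like this. It is a minimal sketch, not Raj's actual driver: the names, the 64-byte size and the drop-newest policy are all assumptions, and it relies on a single producer (the ISR) and a single consumer (the task) on a single core.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE 64u  /* assumed size; must be a power of two */

typedef struct {
    volatile uint16_t head;     /* next write index, updated only by the ISR */
    volatile uint16_t tail;     /* next read index, updated only by the task */
    volatile uint32_t dropped;  /* bytes discarded because the ring was full */
    uint8_t data[RX_RING_SIZE];
} rx_ring_t;

/* Producer side: call from the RX interrupt with the byte just read.
 * One slot is kept free, so the ring holds RX_RING_SIZE - 1 bytes. */
bool rx_ring_put(rx_ring_t *r, uint8_t byte)
{
    uint16_t next = (uint16_t)((r->head + 1u) & (RX_RING_SIZE - 1u));
    if (next == r->tail) {
        r->dropped++;  /* full: keep the old data, count the loss */
        return false;
    }
    r->data[r->head] = byte;
    r->head = next;
    return true;
}

/* Consumer side: call from the task. Returns false if the ring is empty. */
bool rx_ring_get(rx_ring_t *r, uint8_t *out)
{
    if (r->tail == r->head) {
        return false;
    }
    *out = r->data[r->tail];
    r->tail = (uint16_t)((r->tail + 1u) & (RX_RING_SIZE - 1u));
    return true;
}
```

In the ISR, the byte read from the UART goes to `rx_ring_put()`; the task drains with `rx_ring_get()` after being woken by an event. The `dropped` counter gives visibility into software-side overflow, separate from the hardware overrun flags.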
I stressed the system by flooding it with data and simultaneously doing CDC and TCP/IP communication and would occasionally hit an overrun. In fact, I listen on both SFIFO and S1 RX Overflow Interrupts as I ran into one case where I believe I received a FIFO RX overrun and not a RX overrun. But since an overrun can exist, switching to UART0 which has a FIFO buffer of 8 (vs UART3 of 1) fixed the OR issue. I've run it for a few days straight and haven't seen an issue. I also integrated the fix you had above as well that started the thread.
Thanks,
-Raj
Many thanks dave408 and johannesschock for taking the time to post on this.
We made the mistake of assuming the FIFOs on some of the K22 parts were 16C550 style and belatedly found they are only 2 bytes deep.
I've been fighting weird NULLs coming out of the fsl_ drivers for some time now and your suggestions have enabled me to fix them.
The only comment I'd make is that the (really helpful!) example snippet probably doesn't want to be calling the callback on an error path.