MCF5282 FlexCAN receive masks corrupted

pas · ‎01-29-2008

I am using the FlexCAN controller on the MCF5282 configured as follows:

MB 0-7 = Tx Buffers
MB 8-13 = Rx Buffers each configured with different ID_HIGH, ID_LOW
MB14-15 are set to NOT ACTIVE

Each Message Buffer has a different interrupt priority (MB 0 is the highest and MB 13 is the lowest). The Wakeup, Error, and BusOff interrupts (none of which I have seen) all have different priorities as well.

When I configure RXGMASK to be 0x00000000 (i.e., listen for all messages), I do not have any problems transmitting and receiving any message. However, when I configure RXGMASK to only look at the bits I care about, I run into certain problems.

For example, traffic coming into MB 11 will work fine for a while and then suddenly switch to MB 8. Meanwhile, the traffic that was coming into MB 8 is no longer received until a reboot. However, if I configure the software to never transmit any messages, this problem never occurs. The transmits always work no matter what happens to the receive MBs.

So it seems to be some interaction between transmitting and receiving where the ID_H and ID_L masks for the input message buffers are being corrupted.

Here is some pseudo code of what we are doing:

input_interrupt()
{
    disable interrupts
    read the MB CONTROL register to lock the MB
    read ID_H (16 bit)
    read ID_L (16 bit)
    read the data (byte access)
    clear the IFLAG bit
    read the TIMER to unlock the MB
    reenable interrupts

    do the work on the data
}

tx()
{
    write NOT_ACTIVE (0x0080) to CONTROL register
    write ID_H (16 bit)
    write ID_L (16 bit)
    write data (byte access)
    write (TRANSMIT_ONCE | data_length) to CONTROL register
}

Has anyone seen any problems like this or have any suggestion on how to prevent this from happening?

PaoloRenzo · ‎05-13-2009

Tx MBs, baud rate, and bus loading are mentioned in the errata description because the problem is very hard to reproduce. If you force a low baud rate with high bus loading, have two or more MBs receiving, and try to transmit, you will see it after a while, even though these factors are not strictly necessary to trigger the bug.

The errata was reproduced with a 5282lite board and CANalyzer running a script.

Have fun!


san_ · ‎05-14-2009

Hello Paolo:

Thanks for your reply. Yes, it seems a low baud rate may catch it under heavy loading conditions. I will go down a few kbps and see if I see the error. However, the root cause of this problem remains unclear. Even if the errata is caught, it would be nice to understand the root cause so that a mitigation plan can be developed. In other words, a safe alternative could be developed that avoids pushing the Freescale part into that state.

The errata says that when one Rx mailbox gets locked, both SMBs fill up, and when the mailbox is released, the SMBs force a message into the incorrect mailbox, resulting in corruption. The latest errata asks customers not to use all bit masks for Rx (and does not talk about transmitting). If we do in fact change our bit masks, can we be assured that Tx will not cause any issue?

Thanks in advance.

san.

PaoloRenzo · ‎05-14-2009

The latest errata (SECF125) says not to use ID bits 14 to 0 as filter criteria for extended IDs.

The prior FlexCAN errata (SECF123) is about writing to a receive MB, which may corrupt MB contents.

Regarding SECF123:

- Please follow both workarounds to get rid of this.

- The errata only happens with 2 or more Rx MBs. If you use only a single Rx MB, the errata will never happen. (Safest alternative.)

As I said, Tx MBs, baud rate, and bus loading are not part of the errata, but they are needed to force this condition and see the bug. Some customers have been using this uC for a while and have never seen the errata. So if you follow the workarounds, you won't see the errata.

Hope this clarifies things

san_ · ‎05-14-2009

Hello Paolo:

Thanks for the clarification. Yes, I agree with what you said: catching the bug is one thing, and coming up with a safe alternative is another. Using a single mailbox sounds like the only alternative to the problem. In other words, I can only have one Rx mailbox for all CAN communications. However, my only concern is: what if the single mailbox gets corrupted, and I lose each and every message until a POR? Can I be assured, with 100% confidence, that the errata will not show up again?

Thank you once again for your continued support on this issue.

san.

PaoloRenzo · ‎05-14-2009

The errata will not happen if you use a single Rx MB or if you follow both workarounds stated in the errata document.

san_ · ‎04-29-2009

Hello pas/Charlie2:

I am trying to catch the errata and was wondering if I can get the bus loading conditions (transmission rate, receive rate, etc.). Have you seen this happen with Rx traffic alone, or with simultaneous (asynchronous) Rx/Tx traffic?

-ss


Charlie2 · ‎04-29-2009

In my experience, only two devices are required to see the problem. It seems independent of baudrate (I saw it from 250k to 820k). The key is to have the devices doing something at close to the same time (e.g. one preparing to send just as a message is being received).

san_ · ‎05-12-2009

Hello Charlie:

Thanks for getting back. I am writing an app that would catch this specific errata, but it seems I never catch it. I am not changing the existing driver, although I know doing so might catch the errata faster. Say App 1 catches the errata: I would then modify the driver and come up with a driver 2 that essentially does not let the Freescale part get into the errata state, then run App 1 with the changed driver and see how it behaves. The goal is that driver 2 'fixes' the errata for my application.

So it is getting trickier to catch the errata, as I am trying to come up with a suitable application without changing the existing driver file. Also, I am not trying to do a Tx inside the ISR/driver, which makes doing a Tx/Rx tougher.

Any ideas or help would be appreciated. Thanks in advance.

san

Charlie2 · ‎12-16-2008

pas,

I'm using a 5216 and am seeing the same issue, i.e. ID_H and ID_L being corrupted after working for a while. Did you ever discover anything?

pas · ‎12-16-2008

After quite a bit of investigation, Freescale issued a new errata for this processor. It turns out that the hardware mailbox Rx masks can become corrupted when using extended arbitration IDs. We now avoid this problem by setting the global mask to 0x0 and performing all message ID masking in software.
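A minimal sketch of the software masking described above, assuming RXGMASK is left at 0x0 and the ID has already been copied out of the MB and assembled into a logical 29-bit value. The `sw_filter_t` table and `sw_filter_match()` are hypothetical names for illustration only; they are not part of any Freescale driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAN_EXT_ID_MASK 0x1FFFFFFFu /* 29-bit extended identifier */

/* Hypothetical software filter entry: a frame is accepted when its ID
 * equals `id` on every bit that is set in `mask`. */
typedef struct {
    uint32_t id;
    uint32_t mask;
} sw_filter_t;

/* Returns the index of the first matching entry, or -1 if no entry
 * matches and the frame should be dropped. */
int sw_filter_match(const sw_filter_t *table, size_t count, uint32_t rx_id)
{
    assert(table != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        /* Only the 29 identifier bits take part in the comparison. */
        uint32_t mask = table[i].mask & CAN_EXT_ID_MASK;
        if (((rx_id ^ table[i].id) & mask) == 0u) {
            return (int)i;
        }
    }
    return -1;
}
```

The returned index can stand in for the mailbox number the hardware filter would otherwise have selected, so the rest of the application does not need to change.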

The errata:

Title: FlexCAN Writing to an Active Receive MB May Corrupt MB Contents

Description:
Deactivating a FlexCAN receive message buffer (MB) may cause corruption of another active receive
MB, including the ID field, if the following sequence occurs.
1. A receive MB is locked via reading the control/status word, and has a pending frame in the
temporary receive serial message buffer (SMB).
2. A second frame is received that matches a second receive MB, and is queued in the second SMB.
3. The first MB is unlocked during the time between receiving the CRC field and the sixth bit of end
of frame (EOF) of the second frame.
4. The second MB is deactivated within nine bus clock cycles of the sixth bit of EOF, resulting in
corruption of the first MB.
During standard use of the FlexCAN hardware, the errata can appear during heavy communications with
several Rx MBs at a low baudrate and while using Rx extended MB’s IDs. This can be easily observed by
checking ID value overwrite. In all cases, CAN transmissions from the processor are not affected at any
moment.

Workaround:
1. Do not write to the control/status word after initializing a receive MB. If a write (deactivation) is
required to the control/status field of an active receive MB, either freeze the FlexCAN module or
insert a delay of at least 27 CAN bit times plus 10 bus clock cycles between unlocking one MB and
deactivating another MB. This avoids MB corruption; however, frames may still be lost.
2. The FlexCAN software driver ensures IDs are not changed during each reception. As soon as it
has changed, return to original value.
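The delay in workaround 1 can be turned into a cycle count. This is a sketch only: `mb_guard_delay_cycles()` is a hypothetical helper, and it reads "bus clock" as the internal clock that drives the FlexCAN module, which is an interpretation of the errata text rather than something it states.

```c
#include <assert.h>
#include <stdint.h>

#define GUARD_CAN_BIT_TIMES 27u /* from workaround 1 */
#define GUARD_BUS_CLOCKS    10u /* from workaround 1 */

/* Minimum number of bus clock cycles to wait between unlocking one MB
 * and deactivating another. The bit-time part is rounded up to a whole
 * cycle so the result never undershoots. */
uint32_t mb_guard_delay_cycles(uint32_t bus_clock_hz, uint32_t can_bit_rate)
{
    assert(bus_clock_hz > 0u && can_bit_rate > 0u);

    /* 27 CAN bit times in bus clocks: 27 * f_bus / bit_rate. */
    uint64_t numerator = (uint64_t)GUARD_CAN_BIT_TIMES * bus_clock_hz;
    uint64_t bit_part  = (numerator + can_bit_rate - 1u) / can_bit_rate;

    return (uint32_t)bit_part + GUARD_BUS_CLOCKS;
}
```

For example, with an assumed 32 MHz bus clock and 250 kbit/s, one bit time is 128 bus clocks, so the delay is 27 x 128 + 10 = 3466 cycles, or roughly 108 microseconds. The lower the baud rate, the longer the required delay.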

san_ · ‎05-12-2009

Hello pas:

The errata issued by Freescale talks ONLY about Rx processes. However, this thread started off with doing Tx and Rx simultaneously to catch the errata. Not sure why Freescale is hesitant to add Tx information to their errata. Any ideas?

Thanks in advance.

san.

Charlie2 · ‎05-12-2009

San,

Hard to say as I don't know the details of your driver or app. Do you have a CAN bus analyzer? In my case, both devices transmitted asynchronously and we could monitor the activity on an analyzer. As the tx and rx windows moved within a message length amount of time, the error would occur.

FYI, the new errata was just published on the Freescale site 5/5/09.

Charlie

san_ · ‎07-22-2009

Hello :

After 'slightly longer than expected' work, I was able to come up with an application without changing the driver for ColdFire.

The observation I made was that as long as a few of the higher nibbles in the message ID used for hardware filtering match up, we should see the errata irrespective of whether the node in question is doing a Tx/Rx. I was able to see the mailbox filters getting corrupted again and again, indefinitely, to accept a different message. The rate at which this happens is also the rate at which messages are received by a ColdFire node.

For example, if mailboxes 1, 2, and 3 are configured at startup to accept msg 1, msg 2, and msg 3, sometimes mailbox 1 gets corrupted to accept msg 2; after a while mailbox 2 could get corrupted to accept msg 1; sometimes all three mailboxes get corrupted to accept ONLY msg 1; and so on. I was able to accelerate this with a faster Tx rate of the messages coming from a different node (a CAN analyzer, etc.).

This blog helped me in understanding the errata, and if anyone wants an application, I can definitely send pseudocode or maybe the code that does exactly that.

san.

PaoloRenzo · ‎07-22-2009

Hi all

As an additional comment:

- Rx MB corruption will NEVER happen if a single MB is used for Rx.

- To get rid of the second errata (extended IDs), use software filtering (do not use the general ID mask or a particular MB mask) with a single Rx MB. Even if the rest of the MBs are used for Tx, neither errata will ever happen, with 100% confidence. Just be aware of the self-received CAN message.

By the way, the use of a single Rx MB is an additional workaround for both errata. We need to update the errata manual soon with this information.

Errata details concerning the shadow MBs and internal connections do not need to be stated, since there is no workaround at that level.

Hope this closes the issue

Regards


san_ · ‎10-15-2009

Hello Paolo:

Last week or so I was observing a case where a message was getting through a mailbox that it is not registered to. However, when it does this, it doesn't reprogram the mailbox. Is this case possible?

Pseudo code:

Mailbox 1 configured to receive MSG 1

Mailbox 2 configured to receive MSG2

mailbox 3 configured to receive MSG3.

Earlier, when MSG1, MSG2, and MSG3 had matching MSG_ID high, MSG3 got through Mailbox 1, and because of this Mailbox 1 was reprogrammed to receive MSG3, and hence MSG1 failed to be received.

Now I made sure that MSG_ID high was unique in all three messages. In this case, I sometimes find MSG3 getting through Mailbox 1, but IT DOESN'T reprogram Mailbox 1. This way, MSG3 found an alternate path through Mailbox 1 without corrupting Mailbox 1. Thus MSG1 is also still received.

Inside my ISR, every time I receive a CAN message, I copy it into local memory. However, instead of copying all the data bytes as received in the CAN message, I rewrite CAN byte 0 with the mailbox number. This way, when the application gets the message, it knows which mailbox it came from.

Thanks in advance.

sandeep

PaoloRenzo · ‎10-18-2009

Hi Sandeep

This part of the message seems unclear to me:

Inside my ISR, every time I receive a CAN message, I copy it into local memory. However, instead of copying all the data bytes as received in the CAN message, I rewrite CAN byte 0 with the mailbox number. This way, when the application gets the message, it knows which mailbox it came from.

Some questions:

1. Can you explain more if you are writing to MB directly? Are you unlocking/locking MB during this operation? More details about the quote.

2. Is there a Tx MB?

san_ · ‎10-20-2009

Hello Paolo:

Thanks for your response. I can explain this in another way:

1. I have three mailboxes configured to receive MSG1, MSG2, and MSG3 respectively.

2. The bus is heavily loaded with MSG1, MSG2, and MSG3 sent back to back at about 2 ms.

3. There is NO match in MSG ID high.

4. Upon reception of a CAN message, CAN ISR gets called.

5. I loop through the Mailboxes and see which mailbox had an RX message.

6. When a mailbox is found to have a message, I read the contents of the mailbox and copy them into a local memory structure (CAN structure).

7. At this time, I tag the mailbox number into the CAN message structure. (The first data byte in the local CAN message structure is the message buffer #.)

8. The application then gets this CAN message structure and looks at which mailbox the message actually came from. For example, for MSG1, intended to come from Mailbox 7, an error would be the application receiving MSG1 from Mailbox 8 instead of Mailbox 7. Mailbox 8 is intended for MSG2, not MSG1.

9. I see some errors, but what surprises me is that although MSG1 comes from Mailbox 8, it doesn't corrupt Mailbox 8. Messages (MSG2) initially intended to come via Mailbox 8 still make it through Mailbox 8.

10. There is a Tx mailbox, but I am not transmitting anything. I am just receiving. Thus I am NOT writing anything to the mailboxes. I am just reading them.

So I wasn't sure if there could be a case where messages come through different mailboxes without corrupting them. If I recall correctly, the earlier errata showed that when a message comes from a 'wrong' mailbox, it corrupts that mailbox.

Hope this helps.
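The tagging and checking in steps 6 to 8 above could be sketched roughly as follows. All names here (`can_frame_t`, `mb_route_t`, `can_frame_tag()`, `can_frame_misrouted()`) are hypothetical. Unlike the scheme described in the post, this variant stores the mailbox number in its own field instead of overwriting data byte 0, so the payload stays intact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of a received frame, tagged with its source mailbox. */
typedef struct {
    uint32_t id;
    uint8_t  mailbox; /* MB the ISR read this frame from */
    uint8_t  length;
    uint8_t  data[8];
} can_frame_t;

/* Expected routing: which mailbox each message ID should arrive in. */
typedef struct {
    uint32_t id;
    uint8_t  mailbox;
} mb_route_t;

/* Called from the ISR after the MB contents have been read. */
void can_frame_tag(can_frame_t *out, uint8_t mailbox, uint32_t id,
                   const uint8_t *data, uint8_t length)
{
    assert(out != NULL && data != NULL && length <= 8u);
    out->id = id;
    out->mailbox = mailbox;
    out->length = length;
    memset(out->data, 0, sizeof out->data);
    memcpy(out->data, data, length);
}

/* Called from the application: true if the frame's ID has an expected
 * mailbox and the frame arrived through a different one. */
bool can_frame_misrouted(const can_frame_t *frame,
                         const mb_route_t *routes, size_t count)
{
    assert(frame != NULL && (routes != NULL || count == 0));

    for (size_t i = 0; i < count; i++) {
        if (routes[i].id == frame->id) {
            return routes[i].mailbox != frame->mailbox;
        }
    }
    return false; /* unknown ID: no expectation to violate */
}
```

Counting the misrouted frames per mailbox, and separately re-reading each mailbox's configured ID, would distinguish the two cases discussed here: a frame delivered through the wrong mailbox with the filter left intact, versus a filter that was actually overwritten.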

Charlie2 · ‎12-16-2008

Awesome, thanks a ton!

Charlie2 · ‎04-03-2009

Here is a new (2/2009) erratum for certain CF processors (at least the 5216 and 5282):

Description:

Access to any field of a FlexCAN Message Buffer (MB) during reception or transmission of an extended ID frame's CRC and EOF may cause unwanted message reception.

Impact:

With extended ID frames, if the ID_HIGH received matches the ID_HIGH configured in a receive MB, the frame will be received to this MB irrespective of the ID_LOW and the mask. So unwanted messages which should be filtered by the hardware mask register may be received.

This issue only happens to the messages with the extended ID when ID_HIGH of the receive MB is equal to that of the current receiving frame.

Messages with standard ID have no such issue.

Workaround:

Perform one of these actions, either a or b:

a. Use only the Standard ID format for all messages, not the extended format.

b. In case extended IDs are used, make sure that only ID bits 28 to 15 are used as the filter criteria, so that other ID bits (ID bits 14 to 0) are not used to filter messages. ID bits 14 to 0 may contain information not used for message filtering purposes.
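Workaround b can be expressed as a mask over the logical 29-bit identifier. The sketch below uses hypothetical helper names and works on logical ID bit numbers only; how those bits map onto ID_HIGH, ID_LOW, and the mask registers is not shown here and should be taken from the reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT_ID_BITS 29u

/* Mask with ones on logical ID bits hi..lo, inclusive. */
uint32_t ext_id_mask(unsigned hi, unsigned lo)
{
    assert(hi < EXT_ID_BITS && lo <= hi);
    uint32_t width = hi - lo + 1u;
    return ((UINT32_C(1) << width) - 1u) << lo;
}

/* True if the mask leaves ID bits 14 to 0 out of the comparison,
 * as workaround b requires. */
bool mask_follows_workaround_b(uint32_t mask)
{
    return (mask & ext_id_mask(14u, 0u)) == 0u;
}

/* Software model of a masked compare: accept when the received ID
 * equals the filter ID on every bit set in the mask. */
bool ext_id_accepts(uint32_t rx_id, uint32_t filter_id, uint32_t mask)
{
    return ((rx_id ^ filter_id) & mask) == 0u;
}
```

With this layout, `ext_id_mask(28, 15)` evaluates to 0x1FFF8000. Any discrimination that depends on bits 14 to 0 then has to be done in software after reception.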

krsihna1234 · ‎02-02-2008

Hi,

Are you writing CAN drivers for the MCF5282? If so, can I get the source, or any sample applications? Please help me with this.

Regards

krishna
