MCF548X FlexCAN RX MessageQueue

andreaswehrmann · ‎10-04-2015

Hi! Did anyone ever get message buffer queueing for the FlexCAN on the MCF548X working? If so, how?

I'm using the latest BSP from 2011 (LTIB with Linux 2.6.29) and am dealing with lost CAN frames here so I was trying to set up an NMI,

but that will crash the Linux kernel/hang the system randomly...

The reference manual says that message queueing is possible, but I couldn't find out how to set that up exactly.

Setting up multiple RX buffers with the same ID doesn't really work,

because FlexCAN tries to put new incoming frames into the _first_ matching buffer always... even if that buffer is FULL and another matching buffer exists...

TomE · ‎10-05-2015

> Hi! Did anyone ever get message buffer queueing for the FlexCAN on the MCF548X working?

Not possible. Where did you read that it might work?

There are lots of different FlexCan modules.

The earliest and least capable ones only had 16 buffers which overwrote, meaning you had to read a message within microseconds. Only usable with "extreme bare metal code".

The next generation had the "FEN" ("FIFO Enable") bit, bit 29 in the MCR, which added a 6-deep FIFO.

The next generation added the "IRMQ" (i.MX6) AKA "BCC" bit in the MCR (i.MX53 and various ColdFire chips) which allowed the not very useful "Message Queueing" that allowed all receive buffers to be used as a "Randomly ordered FIFO". You have to use the timestamps to get the reception order back unfortunately.
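To illustrate the timestamp trick, here is a minimal Python model (not driver code). It assumes a 16-bit free-running timestamp and that every frame being sorted is less than one timer period old. The tuple layout and the names `age` and `in_reception_order` are made up for the example.

```python
# Model only: each frame is a (timestamp, can_id, data) tuple as read from an MB.
TIMER_BITS = 16                      # assumption: 16-bit free-running timestamp
TIMER_MASK = (1 << TIMER_BITS) - 1

def age(timestamp, now):
    """Ticks elapsed between a frame's timestamp and 'now', modulo the timer width."""
    return (now - timestamp) & TIMER_MASK

def in_reception_order(frames, now):
    """Return frames oldest first; only valid if no frame is older than one timer period."""
    return sorted(frames, key=lambda f: age(f[0], now), reverse=True)

# The timer wrapped between the second and third frame, so a plain sort
# on the raw timestamp would put the newest frame first.
frames = [(0x0005, 0x123, b"c"), (0xFFF0, 0x123, b"a"), (0xFFFE, 0x123, b"b")]
ordered = in_reception_order(frames, now=0x0010)
print([f[2] for f in ordered])       # [b'a', b'b', b'c']
```

Sorting on the age relative to "now" rather than on the raw timestamp is what makes the wraparound harmless.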

I'm running Linux on an 800MHz i.MX53 in "FIFO Mode" and it STILL overflows! You can't get any "real time performance" from Linux, so you really do need FIFOs that have up to 30 MILLISECONDS of buffering in hardware, because every now and then Linux will sleep for that long on you. I was able to get it reliable by getting rid of the code in the driver that had "NAPI" unload the FIFO. Rewriting the driver so it unloaded the FIFO in the initial interrupt solved my problem.

So MAKE SURE your FlexCAN driver isn't offloading to NAPI.
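A rough worked example of what 30 milliseconds means in frames. It assumes standard (11-bit ID) data frames sent back to back, ignores stuff bits, and counts 47 overhead bits per frame including the 3-bit intermission. The function names are illustrative only.

```python
def frame_bits(data_bytes):
    """Bits on the wire for one standard data frame plus intermission, without stuff bits."""
    return 47 + 8 * data_bytes

def frames_during_stall(bitrate, stall_ms, data_bytes=8):
    """Upper bound on back-to-back frames arriving while nothing services the CAN module."""
    bits = bitrate * stall_ms // 1000
    return bits // frame_bits(data_bytes)

print(frames_during_stall(250_000, 30))      # 67 frames of 8 data bytes
print(frames_during_stall(1_000_000, 30))    # 270 frames of 8 data bytes
```

Compare either figure with a 6-deep hardware FIFO. Stuff bits make real frames a little longer, so these are upper bounds for a fully loaded bus.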

The FlexCan module in the MCF548x is the extremely ancient one. It doesn't have IRMQ/BCC and it doesn't even have FEN. It is similar to the one in the MCF5282, which dates from 2003, and is documented here as being derived from a previous Motorola "TouCAN" module:

http://cache.freescale.com/files/32bit/doc/app_note/AN2393.pdf

Which was in the M68376 in about 1997:

http://cache.freescale.com/files/microcontrollers/doc/user_guide/MC68336376UM13.pdf

http://cache.freescale.com/files/microcontrollers/doc/user_guide/MC68336376UMD.pdf

The slightly more modern MCF5235 and MCF5329 chips (which we use) have FEN but not BCC.

The one in the MCF5441x has FEN and BCC. Maybe that's where you learned about it from.

> am dealing with lost CAN frames

At what baud rate? You probably won't be able to solve this with Linux on that hardware. You need a CPU with a FlexCAN module that at least has the FEN bit. You need a more capable CPU or you need to offload the CAN function to a separate CPU and talk to it over SPI, I2C or Ethernet. Or maybe over CAN with a hardware handshake flow control line between them. No, really.

The "Message Queue" wouldn't help you much even if you were using a more modern chip that supported it. Freescale documents the "Message Queue" mode in the i.MX6 manual as being the preferred way to use the hardware (to have it work at all with Linux), but they've never provided any drivers that actually use that mode.

There has just recently been work on the driver in the latest version of Linux that (at last) does support the Message Queue mode.

http://thread.gmane.org/gmane.linux.can/8431

> so I was trying to set up an NMI

Probably doomed. Interrupting something that is playing with the MMU is a bad idea. You may not have memory or even a stack.

Using the six different interrupt priorities might help a bit, but traditionally Linux hasn't supported multi-level interrupts as it has to work on the "lowest common denominator" CPUs and they don't have that. It also turns off all interrupts to do critical things.

Tom

andreaswehrmann · ‎10-06-2015

Hello and thank you for your extensive answer!

In the reference manual of the MCF548X "21.4 Functional Overview" it says:

"A reception queue can be implemented by programming the same ID on more than one receiving MB."

http://cache.freescale.com/files/32bit/doc/ref_manual/MCF5485RM.pdf

That's why I posted the question, because I could never figure out how to make this work.

FlexCAN would (just as you wrote) only take the first matching buffer on RX.

----------------

Just as a side-note, I rewrote the driver that came with the BSP, because the write performance was horrible.

I changed the driver from being socket based to a character device driver which allows me to write multiple frames in one go

(which is important for our application; IIRC SocketCAN allows only one frame per write from userspace).

----------------

The idea with the NMI comes from a colleague who had to deal with the same problem on some MCF52xx CPU (no MMU).

I think it might work on the MCF548X, even with an MMU in the background.

Our idea goes something like this:

- Make the CAN driver a built-in of the kernel (no module)

- Make the MBOR interrupt non-maskable

- In the MBOR irq:

-- Only use CPU registers (avoid use of driver local data (or anything "virtual"))

-- get/put data from/to FIFO in SRAM

-- force a "regular" irq on some unused vector

-- done

In the softirq (one for RX, one for TX) we would then do the transfer between SRAM and the driver local FIFO

used for exchanging data with userspace.

I'm currently in the process of testing this out.

Unfortunately a hardware change is not possible;

I have to deal with what I have in front of me which is this *bad word* implementation of CAN...

----------------

When I make any progress on my experiment, I'll get back and post the results.

Andreas

TomE · ‎10-06-2015

> In the reference manual of the MCF548X "21.4 Functional Overview" it says:

> "A reception queue can be implemented by programming the same ID

> on more than one receiving MB."

That may be simply wrong, or it may be a cut and paste from a different manual that does allow that function (but I think you'd need a functioning Tardis to paste that from a future manual for that to happen here).

But then again, the Reference Manual Addendum Revision 5 for that manual also says:

Section 1.4.6.7/Page 1-10 Change last sentence to: “The two CAN controllers can interface to two separate 16 message

buffer CAN networks or a single 32 message buffer CAN network.”

Which would need you to set MAXMB to 32 in one of the controllers, but then the Addendum goes on to say:

Table 21-2/Page 21-8 Change MAXMB description from “This 6-bit field...” to “This 4-bit field...”

Also nowhere in the FlexCAN chapter does it say how to get 32 channels. It looks like it was an idea that didn't get implemented.

I also dare you to make any sense of this. It "isn't even wrong":

21.4.5.1 Self-Received Frames

Note that FlexCAN does not receive frames transmitted by itself if another device on the CAN bus

has an ID that matches the FlexCAN Rx MB ID.

(Edit). Completely wrong. There's another bogus interpretation of the above from Freescale here, followed by another, this time correct interpretation:

https://community.freescale.com/message/73272#73272

Also addressed in this one:

https://community.freescale.com/message/95926#95926

Tom

TomE · ‎10-06-2015

Update.

The "Receive Queue" may actually work for you.

First, some background.

When a CAN message is received, the chip has to perform some sort of "Matching" against all of the Receive buffers. Simple chips may do this in parallel hardware. But once you have 16 (or 64) message buffers, that would take a lot of silicon. So in those ones there is usually a programmed sequencer that compares the buffers sequentially.

In the latest chips this is insanely complicated. There are 64 MBs, multiple passes, multiple "potential winners", options of checking the FIFO or MBs first and so on. The matching process starts when the data field starts to be received (to allow enough time for it to execute). You should read "26.6.5 Matching Process" in "IMX6DQRM.pdf" to see what this is like. "Table 26-15. Matching Possibilities and Resulting Reception Structures" has 26 different possibilities and seven "Notes".

The above matching takes SO LONG that "Table 26-18. Minimum Ratio Between Peripheral Clock Frequency and CAN Bit Rate" details that if you want to use 64 MBs then you have to program a minimum of 25 peripheral clocks per Quanta. If that doesn't work (for your baud rate) then you have to drop the number of MBs.

I suggest you read the above to work out which SUBSET of those operations your chip is performing.

I don't think you're going to be able to believe the MCF548x manual as the FlexCAN chapter isn't well written.

But the FlexCAN module has been used in a lot of other parts, and in some of them the Reference Manual is better written.

I searched the Freescale site for a match on ("reception queue" can be implemented) and got some interesting matches, specifically:

http://cache.freescale.com/files/32bit/doc/prod_brief/MPC5566PB.pdf

That manual is well written, has a very good "Matching Process" section. Damn - this is a newer chip with 64 MBs and the IRMQ/BCC bit, but renamed "MBFEN". Bizarrely this version doesn't have the FIFO Enable bit. This "mix and match" of features makes writing general purpose FlexCAN drivers really, really difficult.

So does a particular FlexCAN module match by searching for the FIRST matching buffer (full or empty), or the first EMPTY matching buffer? Surprisingly the second is a lot harder, as if it doesn't find any empty matching ones it then has to run a second pass to "blame" the first (or last) FULL one to flag the overflow.

The MC68376 manual's TouCAN module is probably close to the FlexCAN module in your chip. It doesn't explicitly state that it only matches the first one, but this is implied.

The MCF5235 and MCF5329 have FlexCAN modules without a FIFO (FEN bit) or the BCC bit. The "Matching Process" sections are subtly different, which means that one is probably wrong:

MCF5235

An MB with a matching ID is free to receive a new frame if the MB is not locked (see Section 23.3.15.3,
“Locking and Releasing Message Buffers”). The CODE field is EMPTY, FULL, or OVERRUN but the
CPU has already serviced the MB (read the C/S word and then unlocked the MB).

The grammar in the above is bad. There shouldn't be a full-stop in the middle, as the second sentence is hanging and has no meaning. This is better written in the other manual. Also, it implies that no MB will receive a frame unless the CPU "services" it first.

MCF5329

An MB with a matching ID is “free to receive” a new frame if the MB is not locked (see
Section 21.4.5.3, “Locking and Releasing Message Buffers”) and the CODE field is either
EMPTY or else it is FULL or OVERRUN but the CPU has already serviced the MB (read the C/S
word and then unlocked the MB).

If the first MB with a matching ID is not “free to receive” the new frame, then the matching
algorithm will overwrite the matching MB (unless it is locked) and set the CODE field to
OVERRUN (refer to Table 21-13). If the last matching MB is locked, then the new message
remains in the SMB, waiting for the MB to be unlocked (see Section 21.4.5.3, “Locking and
Releasing Message Buffers”).

That one makes more sense. It says an unlocked empty, OR an unlocked FULL or OVERRUN MB that has been "serviced" is "free to receive". That would allow a "Message Queue". Except you've then got to somehow signal an overrun when it happens.

But then the second paragraph says that even if it isn't "free to receive" the FIRST ONE will be overwritten and set to OVERRUN, so what was the point if it won't then look for a better match?
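The two possible behaviours can be compared with a small Python model. This is only a sketch of the question being asked, not of any real FlexCAN revision: it ignores locking and the "serviced" state, and the `queue=True` policy (prefer the first matching MB that is EMPTY) is a hypothetical one that would make a reception queue work.

```python
EMPTY, FULL, OVERRUN = "EMPTY", "FULL", "OVERRUN"

class MB:
    def __init__(self, can_id):
        self.can_id = can_id
        self.code = EMPTY
        self.data = None

def receive(mbs, can_id, data, queue=False):
    """Store one frame; return the index of the MB used, or None if no ID matches."""
    matching = [i for i, mb in enumerate(mbs) if mb.can_id == can_id]
    if not matching:
        return None
    target = matching[0]                 # "first match" behaviour
    if queue:
        free = [i for i in matching if mbs[i].code == EMPTY]
        if free:
            target = free[0]             # hypothetical "first EMPTY match" behaviour
    mb = mbs[target]
    mb.code = FULL if mb.code == EMPTY else OVERRUN
    mb.data = data
    return target

def run(queue):
    """Receive three frames with the same ID while the CPU services nothing."""
    mbs = [MB(0x100), MB(0x100), MB(0x100)]
    for n in range(3):
        receive(mbs, 0x100, n, queue=queue)
    return [(mb.code, mb.data) for mb in mbs]

print(run(queue=False))  # [('OVERRUN', 2), ('EMPTY', None), ('EMPTY', None)]
print(run(queue=True))   # [('FULL', 0), ('FULL', 1), ('FULL', 2)]
```

The first result is the behaviour reported at the start of this thread; only reverse-engineering will tell which one a given chip implements.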

All this confusion may be the technical writers editing chapters to say what they think it should do, rather than finding out what it actually does, which may or may not be sensible.

Your only option in these cases is a severe bout of reverse-engineering. You may still find out it doesn't work (that it can't implement a "message queue").

You didn't say what baud rate you're trying to use. What's the Application here? Vehicle, Industrial or Custom?

Do you have control of "the other end"? Can you program the other end to either rate-limit the transmissions or to use different IDs so you can spread the reception across multiple matching MBs? Or do you have to drop into an existing vehicle and work with what it has?

In a properly designed CAN system in a vehicle, specific messages (with specific IDs) are rate limited. They shouldn't repeat at more than 100Hz or so, or maybe a bit faster on the high-speed or engine/transmission bus. That means you should be able to program up to 15 Receive MBs with properly matching filters. The only time when you'd expect to get "back to back messages with the same IDs" would be during diagnostics or a firmware upgrade. If your device only has to receive specific messages, then programming the filters should have it working fine. It is only when you need to receive diagnostic streams, act as an "all messages bridge" or you have to receive more IDs than there are filters available that you should have the hardware receive everything and use software filtering.
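For reference, ID/mask filtering comes down to one comparison. The sketch below assumes the usual convention that a 1 bit in the mask means "this ID bit must match" and a 0 bit means "don't care"; check the mask register description for your part. The function name and the example IDs are invented.

```python
def accepts(rx_id, filter_id, mask):
    """True if rx_id equals filter_id in every bit position where mask is 1."""
    return ((rx_id ^ filter_id) & mask) == 0

# One MB with filter 0x120 and mask 0x7F0 takes the sixteen IDs 0x120..0x12F.
print(accepts(0x125, 0x120, 0x7F0))   # True
print(accepts(0x135, 0x120, 0x7F0))   # False

# "Receive everything" is a mask of zero, followed by filtering in software.
wanted = {0x125, 0x300}
received = [0x100, 0x125, 0x300, 0x7FF]
print([hex(i) for i in received if accepts(i, 0, 0) and i in wanted])  # ['0x125', '0x300']
```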

So you have to receive all IDs? You may not be able to change the hardware, but you could add a two-port CAN "bridge" between your device and the network. It could rate-limit transmissions to your device, or you could implement a "software handshake" so it only sends a CAN message (to a specific filtered receive buffer) to your device after it has sent a CAN message back saying it has emptied that MB.

Tom

andreaswehrmann · ‎10-09-2015

Hey again!

I actually got it to work with the NMI.

I remember trying to implement a queue by programming multiple RX MB with the same ID...

just to find out that it's not working like I expected.

We're operating a custom system (inside tunnels/subway stations) where all components come from us.

However, there'll be only one node with the MCF548X on the bus,

which is the one I'm currently working on.

While a software change in the other components would theoretically be possible,

my colleagues and superiors are strongly against this because all components (except "my" board)

are already in use in other projects without any problems (no FlexCAN there...).

We're usually operating at 250 kBaud.

I'm currently running tests to see if 1MBaud is possible as well.

Back to my solution:

I've been running stress-tests on the CAN bus overnight and had no crashes/hangs or data losses so far,

so it looks like it's stable...

I did the following:

- Compile driver as a built-in of the kernel (not as a module)

- Set up the MBOR irq at level 7 (make it an NMI)

- Adapt global interrupt handler of Linux to call the MBor ISR directly from there (inside ints.c)

(I did this because I wasn't sure whether the IRQ counters (see /proc/interrupts) reside in memory managed by the MMU...)

- Adapt ISR to only use CPU registers(!!)

- Implement RX FIFO in SRAM

(we can safely access SRAM from the NMI because it lies inside the MBAR area, which shouldn't interfere with the MMU)

- Use two unused interrupt vectors to signal RX/TX event via the INTFRC registers

- The "RX softirq" reads from the FIFO in SRAM and forwards it to another RX FIFO

which resides in "virtual" memory (normal kfifo allocated with kmalloc) from which the userspace process reads.

- The "TX softirq" is simply responsible for transmitting the next frame (if any)
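The data path in the steps above can be sketched as a single-producer/single-consumer ring. This is a Python model for illustration, not the actual driver: `SramFifo` stands in for the FIFO in SRAM, a `deque` stands in for the kfifo, and all names and sizes are assumptions. The point of the layout is that the NMI side only ever writes `head` and the softirq side only ever writes `tail`, so neither needs a lock.

```python
from collections import deque

class SramFifo:
    """Fixed-size ring; one slot stays unused so that full and empty look different."""
    def __init__(self, slots):
        self.buf = [None] * slots
        self.head = 0        # written only by the producer (the NMI)
        self.tail = 0        # written only by the consumer (the RX softirq)
        self.dropped = 0     # written only by the producer

    def put(self, frame):
        nxt = (self.head + 1) % len(self.buf)
        if nxt == self.tail:             # full: count the loss instead of overwriting
            self.dropped += 1
            return False
        self.buf[self.head] = frame
        self.head = nxt                  # publish the slot only after it is filled
        return True

    def get(self):
        if self.tail == self.head:       # empty
            return None
        frame = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        return frame

def rx_softirq(sram_fifo, driver_fifo):
    """Move everything the NMI has queued into the driver-side FIFO."""
    moved = 0
    frame = sram_fifo.get()
    while frame is not None:
        driver_fifo.append(frame)
        moved += 1
        frame = sram_fifo.get()
    return moved

sram, driver = SramFifo(4), deque()
for n in range(5):                       # five frames arrive before the softirq runs
    sram.put(n)
print(rx_softirq(sram, driver), list(driver), sram.dropped)   # 3 [0, 1, 2] 2
```

Counting drops in the producer gives a cheap way to see during stress tests whether the SRAM FIFO is large enough.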

I hope I could make my solution somewhat clear.

TomE · ‎10-09-2015

> I actually got it to work with the NMI.

Congratulations.

The only risk I can see is if the hardware generates multiple NMI interrupts, or if the interrupt isn't masked off in the FlexCAN module and reliably deasserted. If you get a second NMI while you're processing the first one you'll re-enter the service routine. Because it is running with static RAM as its work area it can't be written to be re-entrant. The risk is that you might get a Transmit Interrupt while servicing a Receive Interrupt or vice-versa.

Because the 68k Coldfire interrupt scheme sets the IPL to the level of an interrupt on entry, they can employ a simple, reliable and understandable LEVEL-TRIGGERED interrupt scheme for IPL1 to IPL6. That protects absolutely against interrupt reentry as the IPL is only restored at the RTI at the end of the interrupt service routine. Because you can't mask the NMI, it is the outlier in this scheme and is EDGE-TRIGGERED.

So, tricky and Risky.

There are 64 interrupt sources (approx) and each FlexCAN module has three - the Error, Bus Off and Message Buffer interrupts. The latter one is the "OR" of all 16 MBs.

I assume you have ICR51 (FlexCAN0 MBOR) and/or ICR57 (FlexCAN1 MBOR) set with the Level of "7".

That gives you the following flexible masking capabilities:

1 - The appropriate bits in the IMRH register

2 - The 16 MB bits in the FlexCAN IMASK register.

So your ISR has to disable the NMI at the start of the interrupt by hitting one of the above registers.

Then, pretty much the last thing it has to do before hitting the RTI is to enable the interrupts again. But that guarantees a re-entrant interrupt on the instruction after the one that performed the write (if there's another interrupt pending), which is going to be BEFORE the routine restores the registers and executes the RTI. So you'd better restore the registers from the STACK and not from SRAM.

But on a 68k you don't have to burn a register to write to memory like you do with a RISC CPU (ARM, PPC etc). You can execute a move from memory to memory. So after restoring the registers you can perform a write from a SRAM location to one of the above interrupt control registers, and then execute the RTI. Guaranteed safe against reentry.

But since this is a ColdFire and not a real 68k, the "move (xxx).L (yyy).L" form is disallowed. You can't even write an immediate value to a memory location. So you have to use the stack. Maybe you can write from the stack to an absolute destination address, like "move (sp)+, (IMRH).L". That should work if you want to stay off the stack.

> We're operating a custom system (inside tunnels/subway stations)

> I'm currently running tests to see if 1MBaud is possible as well.

Only if it is a model railway, and then only a small layout.

The speed of light gets in the way. Do you know the complete and complicated formula for trading off system clock accuracy, bus length and the division of the time quanta? The best you can do (if EVERY device on the bus is programmed properly and not left at "default") is to have double the maximum bus length being less than about 80% of a bit time. At 1MBaud a bit lasts 1us, in which light goes 300m. In cable at 60% of c (about 5ns/meter) that's 180m round trip or 90m end-to-end. But we have to sample the bit at 80%, so make that 72m.

But the transceivers delay the signal, as do the controllers. You have setup and hold (and maybe clocking delays) on the MCF5485 plus two transmit and two receive delays in the transceivers at each end. For something like the MCP2551 that's a specified "loop delay" of 225ns. So that's 450ns gone or 45% of the bit. It is worse at higher temperatures, and also worse if "slope limiting" is enabled to reduce EMC.

So recalculate with pushing the sampling at 80% of the bit time (with 45% gone in the transceivers) and the bus can only be 350ns long. That's 35 meters MAXIMUM. And I haven't factored the controller setup and hold delays into that.

The same calculation at 250kHz (4us bit time, 80% sample point, 450ns gone in the transceivers, 5ns/meter) gets you about 275 metres.
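The arithmetic above can be written out as a small helper. It uses only the simplifications stated here (sample point at 80% of the bit, 450ns total transceiver loop delay, 5ns/meter cable) and ignores controller setup and hold, clock tolerance and temperature, so treat the result as an upper bound. The function name and parameters are made up for the example.

```python
def max_bus_length_m(bitrate, sample_point_pct=80, loop_delay_ns=450, prop_ns_per_m=5):
    """End-to-end bus length whose round trip still fits before the sample point."""
    bit_ns = 1_000_000_000 // bitrate
    budget_ns = bit_ns * sample_point_pct // 100 - loop_delay_ns   # left for the cable
    return max(budget_ns, 0) / prop_ns_per_m / 2                   # round trip -> one way

print(max_bus_length_m(1_000_000))                    # 35.0
print(max_bus_length_m(1_000_000, loop_delay_ns=0))   # 80.0, cable only, no transceivers
```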

The following, referencing ISO-11898 gives 30m for 1MHz and 250m for 250kHz:

Untitled 1

This article gives an excellent explanation of the timing involved:

Signaling rate versus cable length: the CAN-bus timing trade-off | EE Times

Tom

andreaswehrmann · ‎10-09-2015

Thanks a lot for all the insights! Really, I appreciate it!

I should mention that I'm using FlexCAN 0 only and that there is no other NMI besides the FlexCAN 0 MBor interrupt.

Inside the NMI I use a loop to handle RX requests for as long as one of the IFLAGs is set, so I'm basically doing this:

1. Read IFLAGS

2. If any IFLAG set then continue else done

3. ACK IFLAGS immediately

4. handle RX (if set) then TX (if set)

5. goto 1.

Now back to the speed tests: The CAN bus itself is only local to the devices that sit inside a niche or a station.

So the actual length of the CAN bus won't be more than a couple of meters (if any).

In order to communicate over longer distances we use optical fiber and special media bridges.

Anyway... I'm just doing this test because my superior wants to see "if we can do it"...

Have a good one!

Andreas

TomE · ‎10-09-2015

> so I'm basically doing this:

Guaranteed to fail eventually.

The situation is that you have received a message from a message buffer and ACKed the interrupt. There are no interrupts pending, MBOR isn't asserted and the NMI has gone away.

Then you spend a rather long time in that code reading the message buffer.

THEN you get another message received while you're in that routine. NMI asserts again and you can't stop it from taking another NMI while you're in the NMI service routine! You are now re-entrant. You will corrupt all sorts of things.

You could read the MB and then ACK it (to make the NMI go away) but that exposes another race condition and you can lose interrupts that way.

As I detailed, when you enter the NMI you have to completely MASK all of the MBOR interrupts out, so you can service the interrupt without the risk of getting another one. There are two ways to do that. You have to do one of them.

You may be getting reentrant interrupts and haven't noticed. They may or may not be corrupting things, but you wouldn't know unless it crashes or unless you're testing for corrupted data. You should add a test for this condition. Test a static memory location as the first thing entering the NMI service routine, and throw an error (set a flag byte) if it isn't ZERO. Then increment it. Decrement it on exit. That should find any reentrancy. Make sure you're receiving and transmitting, and transmitting often and asynchronously relative to the receive data. You should detect this condition. Then you have to make sure you can stop it from happening.
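Here is a toy Python model of that test, only to show the logic. The real thing would be a few instructions at the top and bottom of the NMI routine. `NmiModel` and its fields are invented, `during` simulates a frame arriving in the middle of servicing, and the model does not cover the window between unmasking and the RTI discussed earlier.

```python
class NmiModel:
    def __init__(self, mask_on_entry):
        self.mask_on_entry = mask_on_entry
        self.masked = False
        self.pending = False
        self.depth = 0               # the "static memory location"
        self.reentry_flag = False    # the "flag byte"
        self.serviced = 0

    def raise_nmi(self, during=None):
        """MBOR asserts: the handler runs at once unless the source is masked."""
        if self.masked:
            self.pending = True      # held off until the source is unmasked
            return
        self.handler(during)

    def handler(self, during=None):
        if self.depth != 0:          # first thing on entry: test, then increment
            self.reentry_flag = True
        self.depth += 1
        if self.mask_on_entry:
            self.masked = True
        if during is not None:
            during()                 # another frame arrives while servicing
        self.serviced += 1
        self.depth -= 1              # decrement on exit
        if self.mask_on_entry:
            self.masked = False
            if self.pending:         # the held-off interrupt is taken after exit
                self.pending = False
                self.handler()

unmasked = NmiModel(mask_on_entry=False)
unmasked.raise_nmi(during=unmasked.raise_nmi)
print(unmasked.reentry_flag, unmasked.serviced)   # True 2

masked = NmiModel(mask_on_entry=True)
masked.raise_nmi(during=masked.raise_nmi)
print(masked.reentry_flag, masked.serviced)       # False 2
```

Both runs service two interrupts, but only the unmasked one nests them, and the counter catches it.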

> Anyway... I'm just doing this test because my superior wants to see "if we can do it"...

All devices on the bus would have to be programmed to the higher speed. The most likely things to go wrong are that one device has a crystal that can't divide down to an even multiple (like 16) of the new baud rate, but could divide down to 250kHz. That's a show stopper. The other thing is if any transmitters are slope-limited or have EMC/ESD filters on them that slow the slew rates. I remember helping someone who had a very bad waveform that wouldn't work at high baud rates on one unit. He posted the circuit and the oscilloscope traces. I said "if those 220pF caps are really 220nF it would explain the problem". It did.
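The crystal problem is easy to check up front. This sketch assumes the bit time must be built from 8 to 25 time quanta and that the prescaler can divide by 1 to 256; both limits are assumptions to verify against each controller on the bus, and it only looks for exact divisions. The function name and the example clock frequencies are invented.

```python
def timing_options(clock_hz, bitrate, tq_range=range(8, 26), max_prescaler=256):
    """Return (prescaler, quanta_per_bit) pairs that give the bit rate exactly."""
    options = []
    for tq in tq_range:
        divisor = bitrate * tq
        if clock_hz % divisor == 0 and 1 <= clock_hz // divisor <= max_prescaler:
            options.append((clock_hz // divisor, tq))
    return options

print(timing_options(16_000_000, 1_000_000))   # [(2, 8), (1, 16)]
print(timing_options(6_000_000, 250_000))      # [(3, 8), (2, 12), (1, 24)]
print(timing_options(6_000_000, 1_000_000))    # [] : fine at 250k, impossible at 1M
```

A device whose CAN clock looks like the 6MHz case would work today at 250kBaud and be the show stopper at 1MBaud.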

Tom
