mcf54418 canbus question

_angelo_ · ‎05-19-2021

Hi all,

still working on this, enjoying having linux flexcan as working driver
together with mmu and elf binaries.

I am at a good point, but still have some issues.
I am testing with peak_usb on the remote side and both sides
are 120ohm terminated. Using a tja1050 as a driver on coldfire side.

~/scripts # ip link set can0 type can bitrate 500000
~/scripts # ip link set can0 up

[   13.570000] flexcan flexcan.0: can_rx_offload_init_queue: skb_queue_len_max=128
[   13.580000] flexcan_chip_start() entering
[   13.580000] flexcan_chip_enable() entering
[   13.580000] flexcan_set_bittiming() entering
[   13.590000] flexcan_set_bittiming() priv->can.ctrlmode 00000000
[   13.600000] flexcan_set_bittiming() reg 00002000
[   13.600000] flexcan flexcan.0 can0: writing ctrl=0x00002000
[   13.610000] flexcan_set_bittiming_ctrl(): entering
[   13.610000] flexcan_set_bittiming_ctrl() writing ctrl=0x0e312005
[   13.620000] flexcan flexcan.0 can0: writing ctrl=0x0e312005
[   13.620000] flexcan flexcan.0 can0: flexcan_set_bittiming_ctrl: mcr=0x5980000f ctrl=0x0e312005
[   13.630000] flexcan_chip_freeze() entering
[   13.640000] flexcan flexcan.0 can0: flexcan_chip_start: writing mcr=0x79a3023f
[   13.640000] flexcan flexcan.0 can0: flexcan_chip_start: writing ctrl=0x0e312055
[   13.650000] flexcan_chip_unfreeze() entering
[   13.660000] flexcan flexcan.0 can0: flexcan_chip_start: reading mcr=0x60a3023f ctrl=0x0e312055
[   13.660000] flexcan_chip_start() ok
[   13.670000] flexcan_irq(): entering, 00024036
[   13.670000] flexcan_irq() reg_esr 00020036
[   13.670000] flexcan flexcan.0 can0: Controller changed from Error Active State (0) into Bus Off State (3).
[   13.670000] flexcan flexcan.0 can0: bus-off
[   13.670000] flexcan_error_irq_disable() entering
[   13.680000] flexcan_chip_interrupts_enable() setting imask1 as 000000a0
[   13.680000] flexcan_open(): ok

So I am getting a bus error interrupt just after bringing the interface up.
The ColdFire never sends any data on TX. I clearly see clean RX data coming from peak_usb.
What are the common causes of a bus error?

From a hardware point of view, i replicated a diagram used
on some other nxp dev boards using TJA1050.

Thanks

angelo

TomE · ‎05-20-2021

So something between the "ip link" program and the driver doesn't know how to perform an integer division. That looks like a rounding problem to me.

I'd first try "ip link set can0 type can bitrate 500001" and see if that gets it right. If it doesn't, keep trying larger numbers until it does. That might be all the fix you need.

I don't think the rounding bug is in the FlexCAN driver. I'm looking at some (old) sources here and the segment lengths look to be attributes that the driver makes visible. But these things change a lot over time, so YMMV. So the bug is probably in "the next thing up". Which is a bug that was probably fixed decades ago, but the linux sources you're using there are probably over 12 years old and haven't been updated. What version of Linux (with what build date) are you running there?

If you type "flexcan baud rate" into the search bar you might find that someone else had this problem in one of the 94 search results a decade or more ago. Please let us know what you find.

I found an old reply of mine mentioning "AN1798", which might be worth reading if you haven't already.

https://community.nxp.com/t5/ColdFire-68K-Microcontrollers/MCF54418-flexCAN-porblems/m-p/704512/

Tom

_angelo_ · ‎05-25-2021

Hi Tom !

thanks, this was a short weekend and I had no time for Linux, but at least I can report something.

setting speed to 500001 seems not helping.

kernel is mainline:

[ 0.000000] Linux version 5.12.0-rc3stmark2-001-00010-g072b9a688f92-dirty (angelo@dfj) (m68k-linux-gcc (GCC) 10.1.0, GNU ld (GNU Binutils) 2.34) #570 Thu May 20 23:44:17 CEST 2021

The error I see after "can up" at 500 kbps, with or without peak_usb connected, is BIT0ERR.

There is a bus-off interrupt too, which may be a consequence.

[ 17.180000] flexcan_irq(): entering, esr = 0x00024036
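
A minimal sketch for decoding that esr value by hand. The bit positions are an assumption taken from the FLEXCAN_ESR_* definitions in the mainline flexcan driver, so check them against the reference manual; the function name decode_esr is made up for illustration.

```sh
#!/bin/sh
# Decode a FlexCAN ESR value into flag names.
# ASSUMPTION: bit positions follow the mainline flexcan driver's FLEXCAN_ESR_* macros.
decode_esr() {
    esr=$(( $1 ))
    out=""
    for pair in 0x20000:TWRN_INT 0x10000:RWRN_INT 0x8000:BIT1_ERR 0x4000:BIT0_ERR \
                0x2000:ACK_ERR 0x1000:CRC_ERR 0x800:FRM_ERR 0x400:STF_ERR \
                0x200:TX_WRN 0x100:RX_WRN 0x80:IDLE 0x40:TXRX \
                0x4:BOFF_INT 0x2:ERR_INT; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $(( esr & $mask )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # Bits 5:4 hold the fault confinement state
    case $(( (esr >> 4) & 3 )) in
        0) out="$out FLT_CONF=error-active" ;;
        1) out="$out FLT_CONF=error-passive" ;;
        *) out="$out FLT_CONF=bus-off" ;;
    esac
    echo "${out# }"
}

decode_esr 0x00024036
decode_esr 0x00000080
```

Under that assumed layout, 0x00024036 carries BIT0_ERR together with the bus-off state and interrupt, and 0x00000080 is just the IDLE flag.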

On peak_usb side i am just setting it up and using "candump can0"

Will be back on this later, hopefully.

TomE · ‎05-25-2021

> setting speed to 500001 seems not helping.

That was just to check for an off-by-one rounding error. You can easily enter other rates to see what input generates 500kHz on the oscilloscope. That should then be able to communicate. 50010. Or (most likely) 526,000 or so.

It may be the master clock is wrong, but the error you're getting is close to 19/20 or 20/21. Which is how many time quanta you have.

You should not get BUSERR when there's nothing else on the bus. CAN standards require a lone device on the bus to retry "forever" without giving errors. If you are getting errors then the bus isn't terminated or the transceiver isn't enabled or something like that.

Tom

_angelo_ · ‎05-31-2021

Hi Tom,

thanks a lot, great, you solved at least the major bus-off issue. It was due to the fact that the TJA1050 cannot be powered at 3.3 V, it needs 5 V. I replaced it with an SN65HVD232 and now bus-off is mostly gone. It still happens sometimes, but that may be due to issues with my home-made driver circuit here.

Measured better the bitrate, at 500000 it is now very near:

bps set   bit time (meas.)   bus clock    bps (meas.)
500000    1.98us             120000000    505,050
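
A small sketch of the conversion from a bit width measured on the scope to a bit rate, using integer shell arithmetic. The function names are hypothetical, and the deviation is rounded down to whole per-mille.

```sh
#!/bin/sh
# Convert a measured bit time (in ns) to bits per second.
bitrate_from_ns() {
    echo $(( 1000000000 / $1 ))
}

# Deviation of a measured rate from the nominal one, in per-mille (integer).
deviation_permille() {
    echo $(( ($1 - $2) * 1000 / $2 ))
}

meas=$(bitrate_from_ns 1980)
echo "measured: $meas bps"
echo "deviation: $(deviation_permille "$meas" 500000) per-mille"
```

A 1.98us bit works out to 505050 bps, about 1% above the 500000 that was requested.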

Actually, receiving frames from peak_usb by cansend works properly. Coldfire receives frames properly.
The only remaining issue is that I cannot send.

On pc side i am setting peak_can usb as "candump can0"

When sending a frame from the ColdFire, I see that flexcan_start_xmit() is entered and completes, and the packet seems to be sent, but nothing is generated on the TX line. Also, if I retry, the driver's flexcan_start_xmit() is not called again.

~/scripts # ./cansend.sh
seding single packet ...
send_frame() iface:can0 data:01a#11223344AABBCCDD

[   68.920000] flexcan_start_xmit(): entering
[   68.920000] flexcan_start_xmit(): can_id   = 00680000
[   68.930000] flexcan_start_xmit(): can_ctrl = 0c080000

esr before flexcan_start_xmit() return
[ 68.930000] flexcan_start_xmit(): esr = 00000080
done

Regards,
angelo

TomE · ‎05-31-2021

> bus off is mostly gone, even if it still happens sometime

"Bus off" is a very serious error. CAN retries transmissions in hardware when it detects errors. It has to get 15 or 32 hard errors in a row before it goes "Bus Off". That's not something you should ignore.

CAN is a three-wire bus. Make sure you have a common ground connection between all of your CAN devices. A lot of people get this wrong and it can cause problems from "random" up to "exploding chips".

You're getting reception working. I can't see your setup from here, but you're implying there's only two devices on the CAN bus, the "Peak" whatever and the MCF. Is that correct? If that IS correct then the MCF CAN module must be transmitting in order to receive. That's because when one device transmits a CAN message, there's a one-bit "ack slot" in the message. At least one other device on the bus has to transmit a "Dominant bit" (meaning it has to drive the bus) during that bit time to tell the transmitter that the message has been received.

So you should set up continuous transmission from the "Peak" device and monitor the RX and TX pins on the MCF CAN transceiver. You should see it transmitting the ACK bits. If that is happening then the CAN module, the transceiver and the MCF pin programming are all set up properly.

If you have three (or more) devices on the bus then the OTHER device might be driving the ACK bit, and so this test doesn't show much. But you should still look for that ACK bit on the MCF.

So please do all that first and report back on it. And please detail fully what your test setup is and how many CAN devices are on the bus.

If it isn't sending the ACK, then maybe you have the port setup programming wrong and the CAN TX pin on the MCF isn't set up properly - maybe it is set up as a GPIO and not connected to the CAN module.

The next thing that goes wrong is that the FlexCAN port won't be able to finish initialising (specifically can't get out of the config state) if the CAN bus isn't idle. This can happen if it isn't terminated. Make sure you're using the same test setup that can receive properly when you're transmitting.

You're probably seeing the second attempted transmit not return because the first one hasn't completed.

> [   68.920000] flexcan_start_xmit(): entering
> [   68.920000] flexcan_start_xmit(): can_id   = 00680000
> [   68.930000] flexcan_start_xmit(): can_ctrl = 0c080000

I can't see your code, so I have no idea what "can_id" is. There are no registers in the MCF54418 Reference Manual with that name. I'm guessing that's the second word in the message buffer.

How have you got CANMCR programmed?

That CANCTRL register has PRESDIV=12, PSEG1=1, PSEG2=0, PROPSEG=0. That's illegal. Nobody sets it up like that. That takes Fsys/2 and divides it by 13 and then has 5 Quanta in a bit, or one bit is the system clock divided by 65. Do you have a 32MHz clock perhaps?
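
The field extraction above can be sketched in shell. This assumes the CANCTRL layout used by the mainline flexcan driver (PRESDIV in bits 31:24, RJW in 23:22, PSEG1 in 21:19, PSEG2 in 18:16, PROPSEG in 2:0); decode_ctrl is a made-up helper name.

```sh
#!/bin/sh
# Split a FlexCAN CANCTRL value into its bit-timing fields.
# ASSUMPTION: field positions as in the mainline flexcan driver.
decode_ctrl() {
    ctrl=$(( $1 ))
    presdiv=$(( (ctrl >> 24) & 0xff ))
    rjw=$(( (ctrl >> 22) & 0x3 ))
    pseg1=$(( (ctrl >> 19) & 0x7 ))
    pseg2=$(( (ctrl >> 16) & 0x7 ))
    propseg=$(( ctrl & 0x7 ))
    # Each field is stored minus one; the sync segment is always one quantum
    quanta=$(( 1 + (propseg + 1) + (pseg1 + 1) + (pseg2 + 1) ))
    clocks=$(( (presdiv + 1) * quanta ))
    echo "PRESDIV=$presdiv RJW=$rjw PSEG1=$pseg1 PSEG2=$pseg2 PROPSEG=$propseg quanta=$quanta clocks_per_bit=$clocks"
}

decode_ctrl 0x0c080000
```

The same helper can be pointed at any ctrl= value printed by the driver's debug output.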

If you read "33.3.21 Protocol Timing", and specifically "Figure 33-19. Segments within the Bit Time" it says the minimum value for the above sums is EIGHT and not FIVE as you have. If you're running 32MHz then you should have 8 or 16 quanta and have PRESDIV set to 7 or 3 as appropriate to divide by 8 or 4. That gets you an exact 500000 bits/second.

That might explain why the poor thing can't receive properly and can't transmit. It can't get around to all the work it has to do within a bit with too few clocks. It needs at least 8 clocks per bit to get everything done.

How are you getting those PSEG values? I thought the driver or the "ip" program was meant to select a valid set of numbers given the system clock and the requested baud rate.

> esr before flexcan_start_xmit() return
> [ 68.930000] flexcan_start_xmit(): esr = 00000080

That shows the CAN bus is idle then, and without any errors (yet). What does it show later?

This is how I'm setting the PSEG values up - this is Linux 2.6, so it has probably changed ... where have the advanced editing options gone? I can't insert code any more???

Tom

TomE · ‎05-31-2021

This forum editor wouldn't show me the Advanced Toolbar while editing, so I'm starting a new post to get that.

Here's how I set up the PSEGs with Linux 2.6, which is the latest one NXP released for the i.MX53. This is from "/etc/init.d/can.sh" which is run at startup in the usual way:

# Linux 2.6 CAN sysfs files
SDF0=/sys/devices/platform/FlexCAN.0

        # Have to set all the bits as bitrate=1000000 defaults to /1 and 7:8:8:2
        echo 4 > $SDF1/br_propseg
        echo 4 > $SDF1/br_pseg1
        echo 3 > $SDF1/br_pseg2
        echo 3 > $SDF1/br_rjw
        echo 2 > $SDF1/br_presdiv
        echo 0 > $SDF1/boff_rec
        echo 1 > $SDF1/fifo
        echo 9 > $SDF1/maxmb
        echo 8 > $SDF1/xmit_maxmb

For Linux 3.4 the sequence we were using was:

# linux 3.4, so canconfig is the configuration interface
/usr/local/sbin/canconfig can1 stop > /dev/null
/usr/local/sbin/canconfig can1 bitrate 1000000 > /dev/null
/usr/local/sbin/canconfig can1 restart-ms 1000 > /dev/null
/usr/local/sbin/canconfig can1 clockfreq | grep -q 66666666 > /dev/null
RC=$?
if [ $RC -eq 0 ]
then
   # 66.6MHz (old), divide by 6 for 90ns (11MHz) and 11 quanta/bit
   PARMS="bittiming tq 90 prop-seg 3 phase-seg1 4 phase-seg2 3 sjw 3"
else
   # 24MHz (new), divide by 2 for 83ns (12MHz) and 12 quanta/bit.
   PARMS="bittiming tq 83 prop-seg 4 phase-seg1 4 phase-seg2 3 sjw 3"
fi
/usr/local/sbin/canconfig can1 $PARMS > /dev/null
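
As a sanity check on the two parameter sets above, here is a short sketch that recomputes the quantum time and the resulting bit rate. It assumes the prop-seg/phase-seg values are given in whole quanta (not minus one); quanta_check is a hypothetical helper, not part of canconfig.

```sh
#!/bin/sh
# Recompute tq and bit rate from: clock_hz divider prop_seg phase_seg1 phase_seg2
# ASSUMPTION: segment values are real quanta counts, plus one sync quantum.
quanta_check() {
    tq_ns=$(( 1000000000 / ($1 / $2) ))
    quanta=$(( 1 + $3 + $4 + $5 ))
    bitrate=$(( $1 / $2 / quanta ))
    echo "tq=${tq_ns}ns quanta=$quanta bitrate=$bitrate"
}

quanta_check 66666666 6 3 4 3   # old 66.6MHz clock
quanta_check 24000000 2 4 4 3   # new 24MHz clock
```

The 24MHz case lands exactly on 1000000 bits/second, while the 66.6MHz case comes out about 1% fast with 11 quanta.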

Tom

_angelo_ · ‎06-02-2021

Hi Tom,

really thanks a lot. Now all works.

From peak_usb to pc

angelo@dfj ~/dev-sysam/canbus $ candump can0
can0 01A   [8] 11 22 33 44 AA BB CC DD
can0 01A   [8] 11 22 33 44 AA BB CC DD
can0 01A   [8] 11 22 33 44 AA BB CC DD
can0 01A   [8] 11 22 33 44 AA BB CC DD

From MCF to peak_usb

~/scripts # candump can0
candump listening ...
[   89.750000] flexcan_irq(): iflag1 00000020
[   89.750000] flexcan_irq(): esr     00000080
01A#11223344AABBCCDD
[   90.930000] flexcan_irq(): iflag1 00000020
[   90.930000] flexcan_irq(): esr     00000080
01A#11223344AABBCCDD

Now the issue history and solutions:

1) initial BUS-OFF was due to a TJA1050 powered at 3.3 V;
replaced it with a SN65HVD232 powered at 3.3 V

2) BUS-OFF still happening sometime
fixed by cleaning the pads on my home-made PCB

3) the last issue: receive worked properly, including the ACK bit generated
by the MCF, but when transmitting nothing was sent.
This was due to the mainline flexcan driver that I am adjusting.
It assumes 64 message buffers, while the MCF has 16, so I was writing a
buffer in an invalid memory area. Fixed the driver.

So, a mainline patch is coming shortly.

Still a great thanks Tom !

Hui_Ma · ‎05-20-2021

Hi,

I checked the <RELEASE NOTES M54418 Tower Kit Linux BSP Released: Jun. 18, 2010> with below description:

KNOWN ISSUES

6) To make CAN work, it is needed to blue wire J4 pin2 to GND on TWR-SER2 board
workaround: No
Revision plan: hardware dependent.

Hope it helps.

best regards,

Mike

_angelo_ · ‎05-20-2021

Thanks ! did it.

TomE · ‎05-20-2021

"peak_usb on the remote side". Is the Peak programmed to be a bus monitor or is it programmed to fully participate in the network?

If it is programmed to be a device (that can receive and send), then the most likely problem you have is a mismatching baud rate between the Peak and the MCF. CAN baud rates are a pain. Getting this wrong takes you into "Bus Off" territory very quickly.

Search for "can baud rate" in THIS forum and also in the i.MXRT forums. Follow links in there to any good off-site documentation.

To calculate the register values to use, you have to find the Flexcan clock rate the peripheral is using. Sometimes that is very hard to work out which clock it is really using. Then you have to divide that down to a "Quanta Clock". Then that is divided by "1 + propseg + pseg1 + pseg2" and THAT is the bit rate. That's made more complicated as the registers are programmed with those values minus one.
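
A rough sketch of that calculation in shell, taking the register values (already minus one) as input. The clock and field values in the example are hypothetical, chosen only so the numbers come out round; flexcan_bitrate is a made-up name.

```sh
#!/bin/sh
# Bit rate from: flexcan_clock_hz PRESDIV PROPSEG PSEG1 PSEG2 (all register values).
flexcan_bitrate() {
    # The prescaler register holds the divider minus one
    tq_clk=$(( $1 / ($2 + 1) ))
    # One sync quantum plus three segments, each stored minus one
    quanta=$(( 1 + ($3 + 1) + ($4 + 1) + ($5 + 1) ))
    echo $(( tq_clk / quanta ))
}

# Hypothetical example: 32MHz clock, PRESDIV=3, PROPSEG=4, PSEG1=5, PSEG2=3
flexcan_bitrate 32000000 3 4 5 3
```

With those example values the quanta clock is 8MHz and there are 16 quanta per bit, so the result is 500000.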

You can also (as you seem to be doing) print the relevant registers, then decode them and work backwards to the bit rate that it should be with those register contents.

You normally set the Linux CAN bit rate with a call to a program that you pass the required baud rate into, and it is meant to do all of those calculations for you.

The best way to find what is going on is to put an oscilloscope on the CAN bus, make your device try and transmit something (not the Peak) and then measure the minimum bit width and see if that's what you expect it to be. To make that easier, remove the Peak so the MCF is the only device on the CAN bus. Then send something. It will transmit that "for ever" as nothing is answering, but it shouldn't get errors and you'll be able to measure the width of the bits. Then do the same for the Peak and see if its bits are the same width.

Tom

_angelo_ · ‎05-20-2021

Hi Tom, thanks a lot, as always.

So the board has

CPU: Freescale MCF54415 (Mask:a0 Version:2)
CPU CLK 240 MHz BUS CLK 120 MHz FLB CLK 60 MHz
INP CLK 30 MHz VCO CLK 480 MHz

Yes, as you said, Linux passes the "quanta" divisors and
(maybe) proper bit intervals:

S clock frequency (time quanta) = (fsys/2) / (PRESDIV + 1)
120000000 / 15 = 8000000 Hz = 125nsec per quantum

Total bit time must be composed as:
1 SYNC (fixed) | PROP_SEG + PSEG1 + 2 | PSEG2 + 1

Decoding MCF register contents i get
(as time quanta values)
1 | 6 + 6 + 2 | 1 + 1

1 | 14 | 2 = tot 17
17 * 125ns = 2.125us, i.e. a speed of 470,588 bps
(while i set up for 500 by:)
ip link set can0 type can bitrate 500000
ip link set can0 up

Also, NXP datasheet says intervals should be

So it seems I am a bit off in both the bitrate and the time segments.

Setting can up on MCF i see

and

So yes, it looks like I am near 475 kbps. Will try to adjust the
clock now, let's see.

Thanks,
angelo