<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic in i.MX Processors: Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
    <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459736#M72076</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN class="j-post-author"&gt;&lt;STRONG&gt;&lt;A href="https://community.nxp.com/people/alejandrolozano"&gt;alejandrolozano&lt;/A&gt;&lt;/STRONG&gt;&lt;/SPAN&gt; wrote:&lt;/P&gt;&lt;P&gt;&amp;gt; Sorry, the thread was marked as answered. Let me delve into this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Yes, that was me that "Answered" it and also me that "Unanswered" it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;The driver can't send CAN messages in order. This has been essential for a long time.&lt;/LI&gt;&lt;LI&gt;I found that the simplest way to fix this is to set the number of transmit Message Buffers to "1". That problem fixed.&lt;/LI&gt;&lt;LI&gt;Then I found it goes horridly slow when pushed - it drops to one message/second.&lt;/LI&gt;&lt;LI&gt;I found that was because it calls "netif_start_queue()" when it should be calling "netif_wake_queue()". That problem fixed. So I posted a patch and marked it "Answered". That was a bit premature.&lt;/LI&gt;&lt;LI&gt;Then it locks up solid when pushed, so I marked it "Unanswered".&lt;/LI&gt;&lt;LI&gt;I then found that was due to a very basic interrupt hazard bug. When I fix that, test it and post a Patch for that one I'll mark this "Answered" again.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 16 Dec 2015 22:19:26 GMT</pubDate>
    <dc:creator>TomE</dc:creator>
    <dc:date>2015-12-16T22:19:26Z</dc:date>
    <item>
      <title>i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459732#M72072</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The Linux 2.6.35 FlexCAN driver has all sorts of problems. The latest one I've run into is that it can't send data properly.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;CAN on Linux has problems caused by it being bolted into the Network code. Normally when sending data you'd prefer flow control to work by blocking the call before you get ENOBUFS. The way networking is set up this is what normally happens, as you run into the SO_SNDBUF limit (which blocks) before you hit the "/sys/class/net/eth0/tx_queue_len" limit, which gives ENOBUFS. Because the CAN buffers are small, and the "tx_queue_len" for CAN defaults to FIVE you get ENOBUFS all the time unless you push "tx_queue_len" up over 380 or so.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So if you run "cansequence" without any options you'll get this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;# /usr/local/bin/cansequence can1&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;interface = can1, family = 29, type = 3, proto = 1&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;write: No buffer space available&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;/P&gt;&lt;P&gt;The writers of that program thought of that, so you can also do this, which has it polling for a POLLOUT condition:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;# /usr/local/bin/cansequence -p can1&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;interface = can1, family = 29, type = 3, proto = 1&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I can then use "cansequence -r -v" to have it print one line for every 256 incoming messages.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And nothing happens. And nothing KEEPS happening. 
So instead:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;# /usr/local/bin/cansequence -r -v -v&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;interface = can0, family = 29, type = 3, proto = 1&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;received frame. sequence number: 146&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;received frame. sequence number: 147&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;received frame. sequence number: 148&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;received frame. sequence number: 149&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;received frame. sequence number: 150&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What I can see is that the above is printing ONE LINE PER SECOND.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;That is because "cansequence -p" on top of the supplied drivers is only able to send one CAN message per second on a 1 MBit/second CAN bus.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;An oscilloscope bears this out. Running "cansequence" has it sending messages in a burst up until the transmit buffer size, and then reverting to sending one per second.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;One second is the timeout passed to the "poll()" call by cansequence.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;That tells me the driver can't be properly signalling when a buffer comes free. Here's the code that does that:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="c++" __jive_macro_name="code" class="jive_macro_code _jivemacro_uid_14491206970571721 jive_text_macro" data-renderedposition="827_8_1234_128" jivemacro_uid="_14491206970571721"&gt;&lt;P&gt;static void flexcan_mb_bottom(struct net_device *dev, int index)
{
...
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (hwmb-&amp;gt;mb_cs &amp;amp; (CAN_MB_TX_INACTIVE &amp;lt;&amp;lt; MB_CS_CODE_OFFSET)) {
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (netif_queue_stopped(dev))
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; netif_start_queue(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return;

&lt;/P&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And the above would be wrong because the definition of "netif_start_queue()" basically comes down to:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="c++" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_14491212359502623" data-renderedposition="1018_8_1234_32" jivemacro_uid="_14491212359502623"&gt;&lt;P&gt;include/linux/netdevice.h:&lt;/P&gt;&lt;P&gt;clear_bit(__QUEUE_STATE_XOFF, &amp;amp;dev_queue-&amp;gt;state);&lt;/P&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Every other driver calls "netif_wake_queue()" in that place, and that function does more than the above as it actually causes a reschedule:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="c++" __jive_macro_name="code" class="jive_macro_code _jivemacro_uid_1449121454357514 jive_text_macro" data-renderedposition="1113_8_1234_96" jivemacro_uid="_1449121454357514"&gt;&lt;P&gt;static inline void netif_tx_wake_queue(struct netdev_queue *dev_queue)
{
&amp;nbsp;&amp;nbsp;&amp;nbsp; if (test_and_clear_bit(__QUEUE_STATE_XOFF, &amp;amp;dev_queue-&amp;gt;state))
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; __netif_schedule(dev_queue-&amp;gt;qdisc);
}

&lt;/P&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Changing the call in flexcan_mb_bottom() to netif_wake_queue() fixes this serious bug.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Isn't anyone else using CAN on the i.MX28 and i.MX53? I know of one other, and he ported the mainstream driver back to his project to get it working:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A _jive_internal="true" data-containerid="2004" data-containertype="14" data-objectid="272930" data-objecttype="1" href="https://community.nxp.com/thread/272930" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.freescale.com/thread/272930&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;p.s.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;While I'm here I might as well list all of the other problems I've found with it so far:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;FIFO code doesn't work: &lt;/SPAN&gt;&lt;A _jive_internal="true" data-containerid="2004" data-containertype="14" data-objectid="381075" data-objecttype="1" href="https://community.nxp.com/thread/381075" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.freescale.com/thread/381075&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;sysfs code forgets to add 1 when printing /sys/devices/platform/FlexCAN.0/rjw.&lt;/LI&gt;&lt;LI&gt;Code attempting to limit xmit_maxmb to less than maxmb does the reverse.&lt;/LI&gt;&lt;LI&gt;Code doesn't force DLC values above "8" to "8" to stop netif panics.&lt;/LI&gt;&lt;LI&gt;dump_rx_mb and dump_xmit_mb functions kill everything when the interface is down.&lt;/LI&gt;&lt;LI&gt;dump_*_mb() functions don't handle fifo mode.&lt;/LI&gt;&lt;LI&gt;Sysfs code printing rjw should add one to it.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here's some more problems with it:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Sysfs supports "Clock Selection" but that only applies for i.MX35 and not any of the other ones.&lt;/LI&gt;&lt;LI&gt;The 
"flexcan_set_bitrate()" function starts with the misleading comment "&lt;SPAN&gt;&lt;SPAN class="comment"&gt;TODO:: implement in future&lt;/SPAN&gt;&lt;/SPAN&gt;", and then implements the "future". This matches the provided Documentation ("mx53_linux.pdf"), which says that the bitrate setting doesn't work when it does and did prior to that documentation being written.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;These latter ones are documented here:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A _jive_internal="true" data-containerid="2004" data-containertype="14" data-objectid="590070" data-objecttype="2" href="https://community.nxp.com/message/590070#590070" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.freescale.com/message/590070#590070&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Tom Evans, adding some more bugs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here's some more problems with it, detailed later on in this thread:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;There's a seriously bad interrupt hazard in the transmit code. The transmit interrupt attempts to enable the queue, but if it interrupts the mainline transmit code, that disables the queue after it has been enabled by the interrupt. This gives a solid lockup, but you normally have to have it set to one TX MB to have this happen.&lt;/LI&gt;&lt;LI&gt;The lockup should be able to be cleared by taking the port down and up again, but the open() and stop() functions don't operate on the queue.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Tom Evans, adding some more bugs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The Bus Off Recovery doesn't work at all. 
There are 5 or more separate bugs involved in this one.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A class="jive-link-message-small" data-containerid="2004" data-containertype="14" data-objectid="599099" data-objecttype="2" href="https://community.freescale.com/message/599099#599099" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.freescale.com/message/599099#599099&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Tom Evans to add the Bus Off Recovery problem pointer.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 03 Dec 2015 05:51:29 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459732#M72072</guid>
      <dc:creator>TomE</dc:creator>
      <dc:date>2015-12-03T05:51:29Z</dc:date>
    </item>
    <item>
      <title>Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459733#M72073</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I've attached patches for the Receive, Transmit and all the other bugs on this thread:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.nxp.com/thread/303271"&gt;Submit i.MX53 &amp;amp; i.MX28 Linux kernel patches&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And to help others use this driver...&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The reference for this driver is "Chapter 32" in the "i.MX53 EVK Linux Reference Manual". It lists the variables in the sysfs area without saying how to use them. Most map directly through to bits in the FlexCAN registers, but some don't.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In order to get transmissions in the right order it is necessary to reduce the number of transmit buffers to ONE.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The parameters that control this are "fifo", "maxmb" and "xmit_maxmb". You might think that "xmit_maxmb" controls this, but it controls the number of RECEIVE MBs and not the number of transmits. I'm guessing the driver suffered a redesign at some point (to reverse the Receive/Transmit allocation) and the control variable didn't get renamed.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;"maxmb" does what it says on the tin. It is the total number of MBs in use. So it ranges from 1 to 64. The value of this MINUS ONE is written to FLEXCANx_MCR[MAXMB].&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The buffers are divided up into the ones to be used for reception first, then the rest are used for transmission. If "fifo" is "1" then the FIFO is enabled, 8 MBs are used for reception and "maxmb - 8" are used for transmission. 
"xmit_maxmb" is ignored by the code (except for the "dump" code, which is another set of bugs I've fixed).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So if you've enabled the FIFO and want to transmit in order, set "maxmb = 9" and ignore "xmit_maxmb" as long as you've added my patches.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If not using the FIFO, the number of RECEIVE MBs is "xmit_maxmb", and the number of TRANSMIT MBs is "maxmb - xmit_maxmb". Yes, very confusing, the reverse of what you'd expect from the names.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The code sets up the receive filters on the receive MBs so that odd ones receive Extended IDs and even ones receive Standard IDs. So you can have a minimum of 2 receive buffers if you need to receive both types.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you're using Kernel 2.6.35 then all data is received under interrupts. If you're using 2.6.38, 3.x or 4.x then they use a completely different driver, and none of the above applies, but then you'll run into worse problems if you're using the MMC/eMMC/ESDHC driver:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.nxp.com/message/533214"&gt; frames&lt;/A&gt; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 10 Dec 2015 06:18:11 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459733#M72073</guid>
      <dc:creator>TomE</dc:creator>
      <dc:date>2015-12-10T06:18:11Z</dc:date>
    </item>
    <item>
      <title>Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459734#M72074</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I wrote:&lt;/P&gt;&lt;P&gt;&amp;gt; I've attached patches for the Receive, Transmit and all the other bugs on this thread:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And even with all those patches the driver &lt;STRONG&gt;STILL&lt;/STRONG&gt; doesn't work!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It still locks up when transmitting, and it locks up so badly you have to power cycle to recover it. Taking the port down and up again doesn't fix it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Simply looking at the code shows it has a serious interrupt hazard. There's one mutex in there, and no spinlocks. And the mutex is only used to protect the sysfs variable. There is NO protection of the mainline code against the interrupt service routine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Other drivers (for other chips and the mainline one for this chip) avoid the interrupt hazard by only having a single transmit MB and calling "netif_stop_queue(dev)" at the start of every transmit. The transmit interrupt routine then calls "netif_wake_queue(dev)" to start it again.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Instead this code searches for a spare MB, sends and returns with the queue still running. If the queue is now full, the transmit function is called again, finds no spare MBs, stops the queue and returns "NETDEV_TX_BUSY". This is a bad idea as this is documented in linux/Documentation/networking/netdevices.txt as:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;NETDEV_TX_BUSY Cannot transmit packet, try later &lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;Usually a bug, means queue start/stop flow control is broken in&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN style="font-family: courier new,courier;"&gt;the driver. 
Note: the driver must NOT put the skb in its DMA ring.&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;&lt;/P&gt;&lt;P&gt;It is also stunningly inefficient. When the queue is full, every single MB send requires two calls to the transmit function. The first sends the MB and the second finds there are no MBs available. Worse, the code SEARCHES through all transmit MBs to see if it can find one. Since the FlexCAN is on a slow bus on the other side of a bridge, this takes about 200ns (EDIT: just wrote code to measure this at 180ns/read) for each read, which wastes 180 CPU instruction cycles at 1GHz. That means it takes about 6000ns in theory (actually 7000ns measured) to scan through the 32 transmit MBs once. And it does it TWICE so it takes between 900ns and 14000ns depending on whether the first scan found one quickly or not, just for the transmit to send one MB. If you've selected 56 transmit MBs for some reason it'll take a maximum of about 20us to transmit one MB. If you're running on a 1 Mbit/s CAN bus which can transmit one message every 50us, this means the CPU is busy for between 20% and 40% of the time just looping through the MBs twice, looking for a free one. The take-home message is to use as FEW transmit MBs as you can to avoid this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As well as being slow, the interrupt hazard means that the transmit interrupt sometimes happens (I've got print statements showing this case) just prior to it returning "NETDEV_TX_BUSY". It then never interrupts again and the transmitter is totally wedged.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;That's another thing I'm going to have to fix.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Dec 2015 06:22:36 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459734#M72074</guid>
      <dc:creator>TomE</dc:creator>
      <dc:date>2015-12-16T06:22:36Z</dc:date>
    </item>
    <item>
      <title>Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459735#M72075</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Sorry, the thread was marked as answered. Let me delve into this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/Alejandro&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Dec 2015 16:30:59 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459735#M72075</guid>
      <dc:creator>alejandrolozan1</dc:creator>
      <dc:date>2015-12-16T16:30:59Z</dc:date>
    </item>
    <item>
      <title>Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459736#M72076</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;&lt;SPAN class="j-post-author"&gt;&lt;STRONG&gt;&lt;A href="https://community.nxp.com/people/alejandrolozano"&gt;alejandrolozano&lt;/A&gt;&lt;/STRONG&gt;&lt;/SPAN&gt; wrote:&lt;/P&gt;&lt;P&gt;&amp;gt; Sorry, the thread was marked as answered. Let me delve into this.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Yes, that was me that "Answered" it and also me that "Unanswered" it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;The driver can't send CAN messages in order. This has been essential for a long time.&lt;/LI&gt;&lt;LI&gt;I found that the simplest way to fix this is to set the number of transmit Message Buffers to "1". That problem fixed.&lt;/LI&gt;&lt;LI&gt;Then I found it goes horridly slow when pushed - it drops to one message/second.&lt;/LI&gt;&lt;LI&gt;I found that was because it calls "netif_start_queue()" when it should be calling "netif_wake_queue()". That problem fixed. So I posted a patch and marked it "Answered". That was a bit premature.&lt;/LI&gt;&lt;LI&gt;Then it locks up solid when pushed, so I marked it "Unanswered".&lt;/LI&gt;&lt;LI&gt;I then found that was due to a very basic interrupt hazard bug. When I fix that, test it and post a Patch for that one I'll mark this "Answered" again.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 16 Dec 2015 22:19:26 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459736#M72076</guid>
      <dc:creator>TomE</dc:creator>
      <dc:date>2015-12-16T22:19:26Z</dc:date>
    </item>
    <item>
      <title>Re: i.MX53 Linux FlexCAN Driver Can't Send Properly &amp; other bugs.</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459737#M72077</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I fixed this with the following basic change:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="c++" __jive_macro_name="code" class="jive_macro_code _jivemacro_uid_14503243774853047 jive_text_macro" data-renderedposition="50_8_1234_352" jivemacro_uid="_14503243774853047" modifiedtitle="true"&gt;&lt;P&gt;static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct can_frame *frame = (struct can_frame *)skb-&amp;gt;data;
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct flexcan_device *flexcan = netdev_priv(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct net_device_stats *stats = &amp;amp;dev-&amp;gt;stats;

&amp;nbsp;&amp;nbsp;&amp;nbsp; BUG_ON(!flexcan);

&amp;nbsp;&amp;nbsp;&amp;nbsp; if (frame-&amp;gt;can_dlc &amp;gt; 8)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return -EINVAL;

&amp;nbsp;&amp;nbsp;&amp;nbsp; if (!flexcan_mbm_xmit(flexcan, frame)) {
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dev_kfree_skb(skb);
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; stats-&amp;gt;tx_bytes += frame-&amp;gt;can_dlc;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; stats-&amp;gt;tx_packets++;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dev-&amp;gt;trans_start = jiffies;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return NETDEV_TX_OK;
&amp;nbsp;&amp;nbsp;&amp;nbsp; }
&amp;nbsp;&amp;nbsp;&amp;nbsp; netif_stop_queue(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp; return NETDEV_TX_BUSY;
}

&lt;/P&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The above was changed to the following. Stopping the queue at the start of the function and starting it at the end (instead of stopping it at the end) avoids the hazard.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE __default_attr="c++" __jive_macro_name="code" class="jive_macro_code jive_text_macro _jivemacro_uid_14503244047313385" data-renderedposition="465_8_1234_512" jivemacro_uid="_14503244047313385" modifiedtitle="true"&gt;&lt;P&gt;static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct can_frame *frame = (struct can_frame *)skb-&amp;gt;data;
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct flexcan_device *flexcan = netdev_priv(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp; struct net_device_stats *stats = &amp;amp;dev-&amp;gt;stats;
&amp;nbsp;&amp;nbsp;&amp;nbsp; int ret;

&amp;nbsp;&amp;nbsp;&amp;nbsp; BUG_ON(!flexcan);

&amp;nbsp;&amp;nbsp;&amp;nbsp; if (frame-&amp;gt;can_dlc &amp;gt; 8)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return -EINVAL;

&amp;nbsp;&amp;nbsp;&amp;nbsp; /* Stop queue before transmitting to avoid interrupt hazard. */
&amp;nbsp;&amp;nbsp;&amp;nbsp; netif_stop_queue(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp; /* flexcan_mbm_xmit() returns:
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; *&amp;nbsp;&amp;nbsp;&amp;nbsp; -1 if the transmit failed (all MBs full),
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; *&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0 if the transmit succeeded and there might be more free MBs,
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; *&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1 if the transmit succeeded and there are no free MBs left.
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; */
&amp;nbsp;&amp;nbsp;&amp;nbsp; ret = flexcan_mbm_xmit(flexcan, frame);
&amp;nbsp;&amp;nbsp;&amp;nbsp; if (ret &amp;gt;= 0) {
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dev_kfree_skb(skb);
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; stats-&amp;gt;tx_bytes += frame-&amp;gt;can_dlc;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; stats-&amp;gt;tx_packets++;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dev-&amp;gt;trans_start = jiffies;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (ret == 0)
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; netif_start_queue(dev);
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; return NETDEV_TX_OK;
&amp;nbsp;&amp;nbsp;&amp;nbsp; }
&amp;nbsp;&amp;nbsp;&amp;nbsp; return NETDEV_TX_BUSY;
}

&lt;/P&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The "flexcan_mbm_xmit()" function was also changed to return -1, 0&amp;nbsp; and 1 values. That isn't necessary for this fix, but halves the transmit overhead in the case of using single transmit MBs.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;People using more than one Transmit MB are less likely to see this problem, as each successive transmit interrupt has the opportunity to fix the damage caused by the one that had the interrupt hazard.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Full sources for this fix posted here:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A class="jive-link-message-small" data-containerid="2004" data-containertype="14" data-objectid="597951" data-objecttype="2" href="https://community.freescale.com/message/597951#597951" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.freescale.com/message/597951#597951&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Tom&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 17 Dec 2015 03:58:15 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/i-MX53-Linux-FlexCAN-Driver-Can-t-Send-Properly-other-bugs/m-p/459737#M72077</guid>
      <dc:creator>TomE</dc:creator>
      <dc:date>2015-12-17T03:58:15Z</dc:date>
    </item>
  </channel>
</rss>

