MCF5475 FEC transmit hang


1,561 Views
plattro_
Contributor II

We have been using the MCF5475 with Ethernet for a while, and the design is pretty stable.

The problem appears related to heavy transmit stress:

 

The Ethernet transmit stops, the HW FIFO fills, the 'ALARM' to the DMA is cleared, then naturally the DMABD ring fills and the application notices.

The Ethernet doesn't recover from this state.

 

Receive is still working.

 

A full reset of our SW (and reinitialization of the ethernet controller + PHY) of course recovers, but I would like to find a more graceful recovery.

 

I believe the DMA is working fine, it has simply stopped because the Tx HW FIFO is full.

(Steady state of the FIFO would be empty: TLRFP==TLWFP==TFRP==TFWP)
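A minimal sketch of that steady-state condition as a C check. The struct and function names are hypothetical; the fields simply mirror the four pointer registers in the register dump in this post, and how the values get read from the hardware is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Snapshot of the four Tx FIFO pointer registers (hypothetical struct;
 * the field names follow the register names in the dump). */
typedef struct {
    uint32_t tlrfp; /* FECTLRFP: last read frame pointer  */
    uint32_t tlwfp; /* FECTLWFP: last write frame pointer */
    uint32_t tfrp;  /* FECTFRP:  read pointer             */
    uint32_t tfwp;  /* FECTFWP:  write pointer            */
} fec_tx_fifo_ptrs;

/* Returns true when all four pointers coincide, i.e. the Tx FIFO is
 * in the empty steady state described above. */
static bool fec_tx_fifo_is_idle(const fec_tx_fifo_ptrs *p)
{
    return p->tlrfp == p->tlwfp &&
           p->tlwfp == p->tfrp &&
           p->tfrp  == p->tfwp;
}
```

With the values from the dump (0x15f, 0x144, 0x164, 0x144) this check reports the FIFO as not idle.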

 

EIR      00000000       
EIMR     f83e0000       
ECR      f0000002       
RCR      05ee0006       
TCR      00000000       
FECTFWR  00000003       
FECTFAR  00000100    
TX fifo:     
FECTFSR  40080000    
FECTFCR  0f240000    

FECTLRFP 0000015f    
FECTLWFP 00000144    
FECTFRP  00000164    
FECTFWP  00000144    

I can manually empty the FIFO by reading the TFDR with the debugger, so it appears the FIFO itself is working fine.

 

TXW is asserted, but the device errata indicate this is a nuisance and that the interrupt should be masked (it is), because it gets asserted by collisions and WOULD stop the whole Ethernet.

 

Date codes are all XXX0837 and newer.

(I have only seen the old datecodes affected by the device errata on eval boards that still say "Motorola" on them)

 

any hints gladly appreciated

5 Replies

1,081 Views
plattro
Contributor I

I have opened SR 1-745821173 against this issue and escalated to our distributor.


1,081 Views
TomE
Specialist II

Could this be SECF175? It says:

 

"For the FEC, FIFO data is used to update the FEC buffer descriptors, so the
data corruption could affect packet data and/or descriptor information."

 

Otherwise, is your driver code reading any registers in the FEC that are meant to have status bits written to them by the hardware? I've just found what looks like a problem with the MCF5329 LCDC (LCD Controller) where reading the LISR register prevents the hardware from setting bits in it. Something like that could be remotely possible with the FEC too.

 

Tom

 


1,081 Views
plattro_
Contributor II

Hi Tom

I've been poring over the FEC part of the databook and errata for weeks now. Just this last week I noticed that SECF175 applies only to the receive FIFO, not the transmit.

Quote "the data in the following peripherals' receive FIFO could become corrupted..."

 

About reading registers that have status bits written by HW, I have no idea or guesses. We used the sample code as a basis.

 

We use the TXF interrupt (in the EIR), and write EIR with that bit set to clear the event.

If XFUN, XFERR or RFERR ever got set, we would read the FIFO status registers, but so far I have never seen any of those error bits set.
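A rough sketch of that handling in C, not the actual driver code. The function names are hypothetical, and the bit masks are assumptions that should be checked against the reference manual (they are at least consistent with the EIMR value f83e0000 in the dump). The point it illustrates is that a write-1-to-clear register should be written with only the bit being acknowledged; a read-modify-write such as `*eir |= TXF` would also clear every other pending event.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed EIR bit masks; verify against the MCF547x reference manual. */
#define FEC_EIR_TXF   UINT32_C(0x08000000) /* transmit frame interrupt */
#define FEC_EIR_XFUN  UINT32_C(0x00080000) /* Tx FIFO underrun         */
#define FEC_EIR_XFERR UINT32_C(0x00040000) /* Tx FIFO error            */
#define FEC_EIR_RFERR UINT32_C(0x00020000) /* Rx FIFO error            */

/* Acknowledge events by writing only their bits (write-1-to-clear).
 * 'eir' is a pointer to the EIR register; obtaining its address is
 * platform specific and not shown here. */
static void fec_eir_ack(volatile uint32_t *eir, uint32_t events)
{
    *eir = events;
}

/* True if a pending-event snapshot contains any FIFO error bit, in
 * which case the FIFO status registers are worth reading. */
static bool fec_eir_has_fifo_error(uint32_t pending)
{
    return (pending & (FEC_EIR_XFUN | FEC_EIR_XFERR | FEC_EIR_RFERR)) != 0;
}
```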

 

I suppose the TXF interrupt isn't strictly necessary. As a diagnostic, though, if the number of TXF interrupts is suddenly less than the number of DMA Tx interrupts, it would tell us the transmitter is hung.
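One way that diagnostic could be sketched in C. Everything here is hypothetical: the struct, the function name, and the assumption that each transmitted frame produces exactly one DMA Tx interrupt and one TXF interrupt. The two counters would be incremented in the respective ISRs (and would need to be volatile or otherwise protected in real code); the poll function would be called from a periodic timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the Tx hang diagnostic. */
typedef struct {
    uint32_t dma_tx_irqs;   /* incremented in the DMA Tx ISR           */
    uint32_t txf_irqs;      /* incremented when TXF is seen in the EIR */
    uint32_t last_txf_irqs; /* TXF count at the previous poll          */
    uint32_t stalled_polls; /* consecutive polls without Tx progress   */
} fec_tx_watch;

/* Call periodically. Returns true once TXF has lagged behind the DMA
 * count without advancing for max_stalled_polls consecutive polls.
 * max_stalled_polls should be at least 1. */
static bool fec_tx_watch_poll(fec_tx_watch *w, uint32_t max_stalled_polls)
{
    /* Modular difference, so counter wrap-around is harmless. */
    uint32_t lag = w->dma_tx_irqs - w->txf_irqs;
    bool backlog = (lag != 0) && (lag < UINT32_C(0x80000000));
    bool progressed = (w->txf_irqs != w->last_txf_irqs);

    w->last_txf_irqs = w->txf_irqs;
    if (backlog && !progressed)
        w->stalled_polls++;
    else
        w->stalled_polls = 0;

    return w->stalled_polls >= max_stalled_polls;
}
```

Requiring several stalled polls in a row avoids flagging the normal short window in which the DMA has finished a frame that the FEC is still sending.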


1,081 Views
Vikki_ERL
Contributor I

Hi,

 

We are also facing a similar kind of problem with MCF5485.

 

When using the MCF5485 and Ethernet for transferring files, it crashes on large file transfers (500 KB). Whenever the crash happens, the RXW (receive wait condition) bit in FECRFSR (FEC Receive FIFO Status Register) is set to 1, which causes RFERR to be set in the EIR register.

 

Please let me know how to get past this. For reference, I have attached snapshots of the error screen at the time of the crash and the status of the FEC registers during the crash.


1,081 Views
plattro_
Contributor II

hi Vikki

I think you are experiencing a different problem. The key characteristic of the problem I observe is that the Tx FIFO read/write and last frame read/write pointers are all at different locations. In your case, the Tx FIFO pointers are all the same, so you have a different problem than I have.

 

It looks to me like your Rx FIFO has unread data in it (RXW, FRM-RDY, ALARM), I guess about 270 bytes (subtract the Rx FIFO read and write pointers). Did your DMA stop? Does your DMA Rx ISR always call DMA_continue()? Are all Rx DMA buffer descriptors freed up and 'owned' by the DMA?

As far as I understand the DMA, the 'ALARM' bit set should be telling the DMA to run, unless it has stopped due to a buffer descriptor owned by the software.
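For the pointer subtraction mentioned above, a small C helper might look like this. It is only a sketch: the function name is made up, the FIFO size is passed in as a parameter rather than taken from the manual, and it assumes both pointers are byte offsets smaller than the FIFO size.

```c
#include <stdint.h>

/* Bytes written to a circular FIFO but not yet read.
 * Assumes fifo_size > 0 and both pointers are byte offsets below
 * fifo_size. Equal pointers are treated as empty; pointers alone
 * cannot distinguish a completely full FIFO from an empty one. */
static uint32_t fifo_bytes_pending(uint32_t write_ptr,
                                   uint32_t read_ptr,
                                   uint32_t fifo_size)
{
    /* Adding fifo_size before subtracting handles the case where the
     * write pointer has wrapped around and is below the read pointer. */
    return (write_ptr + fifo_size - read_ptr) % fifo_size;
}
```

For example, with a hypothetical 1024-byte FIFO, a write pointer of 0x20E and a read pointer of 0x100 give 270 bytes pending.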

 

All disclaimers apply, I have no idea. I'm just hoping that thinking about your problem maybe gives me some idea about my problem...

 

 
