5272 Ethernet buffer allocation problem


JonHarris
I am having issues with allocating and freeing buffers on the 5272 (using the standard driver code).  There is a case in my code where I may allocate a buffer and start to fill it, but then another Ethernet interrupt comes in before the buffer is completed and sent.  This can force me to allocate a second buffer, for example if the interrupt was caused by an ARP request that I need to respond to.  When that interrupt completes, I resume filling the first buffer.  If the first buffer does get sent, I have only a minor issue: the two packets go out in reverse order, and the second, interrupt-driven packet (e.g. the ARP reply) has to wait until the first packet is sent.  (I noticed that the driver has specific code for dealing with this in the case of ARP requests, but not for the general case.)  The much worse problem is when for some reason the first packet cannot be sent at all (an error condition, the destination route cannot be resolved, etc.).  Then I would like to de-allocate/free this packet, and that's where things get messy.  Since another buffer has already been allocated and queued for transmission, my ring buffer pointer ends up out of sync with the FEC's own pointer.
 
It seems that the FEC has its own internal pointer into the ring buffer, and if my own ring buffer pointer gets out of sync with it, it never recovers.  From then on, nothing goes out until I have queued 8 packets, at which point all 8 go out in a burst.  (I have 8 buffer descriptors in my ring.)  Furthermore, I don't see any way to change or even read the FEC's internal pointer, so detecting/correcting this is difficult.  I suppose resetting the FEC would work, though that seems drastic.
 
Any thoughts on this?  Is there any way to handle de-allocating a buffer, especially if another buffer has already been allocated?  Or should I try to avoid ever allocating a second buffer before I am finished with the first (which could be difficult with the way my code is structured)?  Another thought I had: when I want to de-allocate, I could still send out a packet, but make it a dummy packet sent to a loopback address or something (how?).  I have also considered generalizing the buffer-swapping code used for ARP requests to handle other cases where multiple buffers are allocated.
 
Please advise.
10 Replies

mccPaul
Hi Jon,
 
I don't think that there is an easy way to find out where the FEC is in the buffer descriptor ring (or indeed any way at all, but I would be happy to be corrected).
 
What you need to do is make sure that the BD ring is only modified when you have a frame buffer (or buffers) ready to transmit. The general case would be this:
 
  1. Application (APP) starts filling a frame buffer for some reason, interrupted by an IRQ.
  2. ISR decides that it wants to tx a frame so it allocates another buffer and constructs a frame in it.
  3. ISR then fills in the next buffer descriptor in the TX BD ring, updating the internal pointer so that it points to the next BD in the ring.
  4. When the Ready bit in the BD is set by the ISR, the FEC will start the TX process asynchronously.
  5. ISR completes.
  6. Carry on filling in the buffer from step 1.
  7. APP fills in the next buffer descriptor in the TX BD ring, updating the internal pointer so that it points to the next BD in the ring.
  8. When the Ready bit in the BD is set by the APP, the FEC will start the TX process asynchronously again, unless it is still transmitting the first frame.

Bingo: two frames sent, in the order you expect, and you are still in sync with the FEC. It should be obvious that the BD ring must only be modified with interrupts disabled (it is a critical section).
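
To make the critical section concrete, here is a minimal sketch in C of steps 3/7, assuming a hypothetical tx_bd_t layout and placeholder interrupt primitives. The flag values mirror typical FEC descriptor bits, but all of the names here are illustrative, not the 5272 example code's actual definitions.

    /* Illustrative TX buffer descriptor: 16-bit flags, 16-bit length,
       and a pointer to the frame data, mirroring the FEC's BD layout. */
    typedef struct {
        volatile unsigned short status;   /* R, W, L, TC flags            */
        unsigned short length;            /* frame length in bytes        */
        unsigned char *data;              /* pointer to the frame buffer  */
    } tx_bd_t;

    #define TX_BD_READY 0x8000            /* R: BD is owned by the FEC    */
    #define TX_BD_WRAP  0x2000            /* W: last BD in the ring       */
    #define TX_BD_LAST  0x0800            /* L: last BD of this frame     */
    #define TX_BD_TC    0x0400            /* TC: FEC appends the CRC      */
    #define NUM_TX_BD   8

    static tx_bd_t tx_ring[NUM_TX_BD];
    static int tx_next;                   /* software's next-BD index     */

    extern int  irq_disable(void);        /* placeholder critical-section */
    extern void irq_restore(int level);   /* primitives                   */

    /* Hand one *completed* frame to the FEC.  Returns 0 on success,
       -1 if the next BD is still owned by the FEC (ring full). */
    int fec_queue_frame(unsigned char *frame, unsigned short len)
    {
        int level = irq_disable();
        tx_bd_t *bd = &tx_ring[tx_next];

        if (bd->status & TX_BD_READY) {   /* FEC still owns this BD */
            irq_restore(level);
            return -1;
        }
        bd->data = frame;
        bd->length = len;
        /* Set R last: this is the moment the BD is handed to the FEC.
           Preserve W so the ring still wraps correctly. */
        bd->status = (bd->status & TX_BD_WRAP)
                   | TX_BD_READY | TX_BD_LAST | TX_BD_TC;
        tx_next = (tx_next + 1) % NUM_TX_BD;
        irq_restore(level);
        /* A write to the FEC's transmit-demand register would go here. */
        return 0;
    }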

Does this help? If not, I will happily expand on this.

Cheers,

Paul.

JonHarris
Hi Paul, thanks for the reply.  What you say mostly makes sense, though I still have a few questions.
 
The main problem right now is that in your step #1, in order to start filling a frame buffer, the APP first allocates a BD from the circular list and updates its internal pointer.  It does this so it has somewhere to put the data it is about to generate.  But perhaps this is not the right approach.  Or maybe I should be disabling interrupts between the point where I allocate the buffer and when I actually send it?  I was hesitant to do that, thinking it might be a fairly long period.
 
I also need to deal with the case of not being able to send a packet because the route cannot be resolved.  In other words, a packet is prepared, but then an ARP request needs to go out first, so the original packet gets "bumped" by the outgoing ARP request.  If the ARP is not successful, I need to throw away the original packet in a way that does not get out of sync with the FEC's pointer.  Thoughts on this?
 
A further complication: a FEC receive interrupt could come in after the ARP request was sent and before the response came back, one that itself required another packet to be sent out (e.g. someone else ARPing me).  The code enables interrupts after sending out an ARP request so it can receive the ARP response.  It gets sticky!


mccPaul
Hi Jon,
 
Are you talking about a TCP/IP stack here, or is this some application-defined or minimalist stack?
 
The BD ring is designed so that you can have a number of asynchronous processes going on, that use as many frame buffers as required. The buffer descriptors are only used when a frame is ready to be transmitted. You need to 'allocate' a BD or BDs only when you have a buffer that is ready to go or you end up in the pickle you describe.
 
So in more detail:
 
You obviously have to have a MAC address for your target before you can construct a frame to transmit. If you need to ARP the target to get the MAC address then your stack will have to put the outgoing data into a queue until the ARP reply comes in. Normally you would arrange for this to happen like so:
 
  1. Allocate a frame buffer and start to construct your Ethernet frame (protocol doesn't matter here, it could be UDP, TCP or something else).
  2. Discover that your ARP cache doesn't contain the required target MAC address, oh dear.
  3. Put the frame buffer into a queue of some kind and set a timer to poll the queue. You can set yourself a flag or counter so that you know how many times you've ARPed for this address if you want.
  4. Allocate another frame buffer and construct an ARP request frame for the target host.
  5. Disable interrupts or do something else to prevent another process trying to modify the BD ring.
  6. Get the next buffer descriptor and update the next BD ring pointer.
  7. Fill in the BD with the ARP frame buffer details and set the R bit in the BD.
  8. Re-enable interrupts or leave the critical section.
 
Now everything needs to be asynchronous. Your queue should be polled regularly to find out if any queued packets need to be transmitted. You can use this polling to send more ARPs out if you like or to abandon the send attempt if you don't get an ARP reply within a timeout period.
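
A sketch of the hold queue from step 3, polled from a timer, might look like this. The pending_t structure and the arp_cache_lookup, frame_free and ARP_RETRIES names are hypothetical (arp_request does exist in the example code), and fec_queue_frame is the sketch from earlier in the thread.

    #include <string.h>                   /* memcpy */

    #define MAX_PENDING 4
    #define ARP_RETRIES 3                 /* give up after this many ARPs */

    typedef struct {
        unsigned char *frame;             /* fully built frame, waiting   */
        unsigned short length;
        unsigned long dest_ip;            /* address being ARPed for      */
        int retries;                      /* ARPs sent so far             */
        int in_use;
    } pending_t;

    static pending_t pending[MAX_PENDING];

    extern int  arp_cache_lookup(unsigned long ip, unsigned char mac[6]);
    extern void arp_request(unsigned long ip);
    extern void frame_free(unsigned char *frame);
    extern int  fec_queue_frame(unsigned char *frame, unsigned short len);

    /* Called from a periodic timer: send queued frames whose MAC has
       arrived, re-ARP the rest, and abandon frames that time out. */
    void pending_poll(void)
    {
        unsigned char mac[6];
        int i;

        for (i = 0; i < MAX_PENDING; i++) {
            pending_t *p = &pending[i];
            if (!p->in_use)
                continue;
            if (arp_cache_lookup(p->dest_ip, mac)) {
                memcpy(p->frame, mac, 6);         /* patch the dest MAC */
                fec_queue_frame(p->frame, p->length);
                p->in_use = 0;
            } else if (++p->retries > ARP_RETRIES) {
                frame_free(p->frame);             /* no reply: give up  */
                p->in_use = 0;
            } else {
                arp_request(p->dest_ip);          /* ask again          */
            }
        }
    }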
 
You will have a receive ISR, and this should identify incoming ARP replies and cache the MAC addresses. If the rx ISR sees an ARP request, it could send a reply using the same tx code that locks access to the BD ring, so that the application is blocked. It is more usual for the rx ISR to peek into incoming frames so that unwanted ones can be discarded quickly. The rx ISR then copies the wanted packets to an rx queue, and the stack or application deals with the received packet the next time the rx queue is polled.
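
A sketch of that quick peek, assuming plain untagged Ethernet II framing; frame_wanted is a hypothetical helper the rx ISR could call before copying anything.

    /* Cheap filter for the rx ISR: keep ARP, and IPv4 frames carrying
       UDP or ICMP; drop everything else before any copying happens. */
    int frame_wanted(const unsigned char *frame, unsigned short len)
    {
        unsigned short ethertype;

        if (len < 14)                     /* runt: no complete header */
            return 0;
        ethertype = (unsigned short)((frame[12] << 8) | frame[13]);

        if (ethertype == 0x0806)          /* ARP */
            return 1;                     /* caller checks the target IP */
        if (ethertype == 0x0800 && len >= 34)      /* IPv4, full header */
            return frame[23] == 17 || frame[23] == 1;  /* UDP or ICMP */
        return 0;
    }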
 
If this makes any sense, you should see that there are two levels going on here. The higher level is a bunch of queues and timers that deal with frame buffers - these are buffers that you allocate yourself and fill in with Ethernet and protocol headers and payload. The lower level is the buffer descriptor ring - this is what you use to tell the FEC where your frame buffers are, but only when they are ready to go - this should be the last thing you do as you say goodbye to the frame, definitely not the first.
 
So you do need to change your approach; disabling interrupts for a long time is not going to work. And you need to make sure that your code can deal with all of these asynchronous transmits and receives.
 
Cheers,
 
Paul.

JonHarris

Hi Paul,

My application just does UDP (plus ICMP and ARP), so I guess you would say it is a minimalist stack. No TCP, and we don't have a real RTOS either.

I guess the main issue here is that we don't have the "two levels" you are talking about. You talk about allocating a frame buffer and getting a buffer descriptor. For me, those are the same thing. I allocate a buffer descriptor, which includes a pointer to a frame buffer, so I have a place to build my outgoing packet data. This is all based on the example code for the 5272 (see http://www.freescale.com/files/soft_dev_tools/software/app_software/code_examples/MCF5272SC.zip).

When I declare the buffer descriptor structure, I also declare 8 data buffers and map the pointers in the BD table to the buffers. From then on, the pointers in the BD table don't change (except in the case of an outgoing ARP, in which 2 packets are swapped).
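
For readers without the example code to hand, this is roughly the coupling being described, with illustrative names (the example's real structure is called NBUF and has more fields): each descriptor's data pointer is bound to one static buffer at init time and never changes.

    #define NUM_TX_BD 8
    #define BUF_SIZE  1600

    typedef struct {                      /* illustrative BD layout */
        volatile unsigned short status;
        unsigned short length;
        unsigned char *data;
    } tx_bd_t;

    static tx_bd_t tx_ring[NUM_TX_BD];
    static unsigned char tx_buf[NUM_TX_BD][BUF_SIZE];

    /* The coupling in question: allocating "a buffer" really means
       claiming a BD, because buffer i belongs permanently to BD i. */
    void nbuf_init_coupled(void)
    {
        int i;
        for (i = 0; i < NUM_TX_BD; i++) {
            tx_ring[i].status = 0;
            tx_ring[i].data = tx_buf[i];
        }
    }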

It sounds like you are advocating decoupling the data buffers from the buffer descriptors. So when I allocate a buffer, I would grab a 1600-byte buffer and fill it. When I was finally ready to send it, I would allocate a new buffer descriptor and change its pointer to point to the buffer I just filled. Right?

This would be a fairly significant departure from the example code, but not unmanageable. It just seems odd to me that the "official" example code would be written as it was. Or maybe it was assuming the application code would be disabling interrupts, etc. to prevent the issues we've been discussing.

Thanks again for your detailed and clearly written reply!

-Jon


mccPaul
Hi Jon,
 
I think I saw this example a couple of years ago but I must have struggled to understand it! Having spent a few minutes looking at the buffer scheme it uses, I'd have to conclude that you can't do what you want to do with it. In fact, I don't think it is a very good example as far as the FEC is concerned. This seems a shame, as otherwise it looks like very clear code.
 
The example appears to ignore some of the reasons for having separate buffers and buffer descriptors. Most importantly, the example's buffer scheme means that the FEC and the application become tightly coupled, but this is wrong because the FEC is an independent device running asynchronously.
 
I assume the example must work, but only because it is written to make every tx happen in a particular order so that the app can keep track of the BD ring next pointer state. For anything more complex, like only ARPing when you need to know an address, you will need to treat the frame buffers (the data element in the buffer descriptor) separately from the buffer descriptors.
 
It should be possible, though a bit awkward, to do this. The example is written as if the app 'owns' the BD ring exclusively, so you see NBUF*s all over the place. In fact, the BD ring is being shared between the FEC and the app, so the app should have a much lighter touch and must only use the BD ring to communicate with the FEC.
 
You need to replace the ubiquitous NBUF*s in the example code with a structure that contains your frame buffer. It is probably best to have a structure rather than just an array, so that you can keep other things there like timeout counters. Then you change nbuf_init() so that it creates the BD ring with the BD data pointers set to NULL. This function should probably also allocate your pool of frame buffers.
 
Your app can allocate itself as many frame buffers as it likes, and in fec_send you call nbuf_tx_allocate to get a BD and fill it in only when you are ready to send.
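
A sketch of what fec_send might become under this scheme. The tx_bd_t layout and flag values follow the earlier sketch, and the nbuf_tx_allocate signature here is a guess at the example's function, not its actual prototype.

    typedef struct {                      /* as in the earlier sketch */
        volatile unsigned short status;
        unsigned short length;
        unsigned char *data;
    } tx_bd_t;

    #define TX_BD_READY 0x8000
    #define TX_BD_LAST  0x0800
    #define TX_BD_TC    0x0400

    extern int  irq_disable(void);            /* placeholder primitives */
    extern void irq_restore(int level);
    extern tx_bd_t *nbuf_tx_allocate(void);   /* signature guessed */

    /* Decoupled send: the app owns the frame buffer right up to this
       point; only here is a BD claimed and pointed at the buffer. */
    int fec_send(unsigned char *frame, unsigned short len)
    {
        int level = irq_disable();            /* BD ring critical section */
        tx_bd_t *bd = nbuf_tx_allocate();

        if (bd == 0) {                        /* ring full */
            irq_restore(level);
            return -1;
        }
        bd->data = frame;                     /* attach the app's buffer */
        bd->length = len;
        bd->status |= TX_BD_READY | TX_BD_LAST | TX_BD_TC;
        irq_restore(level);
        return 0;
    }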
 
Cheers,
 
Paul.

JonHarris
I'm going to have a go at making the changes you are suggesting, i.e. separating the allocation of a frame buffer from the allocation of a buffer descriptor from the ring.  For starters, I am going to keep my software structure for frame buffers as similar to the existing NBUF structure as possible, to minimize code changes, and see how far that gets me.  I want to keep the actual frame buffer memory contiguous since, for best performance, the start of each buffer should lie on a 16-byte boundary.
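
One portable way to get that contiguous, 16-byte-aligned pool without compiler-specific alignment pragmas is to over-allocate and round the base address up. The pool_init and pool_buffer names are illustrative.

    #define NUM_BUFS 8
    #define BUF_SIZE 1600                 /* a multiple of 16, so every
                                             buffer stays aligned */

    static unsigned char pool_raw[NUM_BUFS * BUF_SIZE + 15];
    static unsigned char *pool;           /* 16-byte-aligned base */

    void pool_init(void)
    {
        unsigned long addr = (unsigned long)pool_raw;
        pool = (unsigned char *)((addr + 15) & ~15UL);
    }

    /* Buffer i starts on a 16-byte boundary because the base is
       aligned and BUF_SIZE % 16 == 0. */
    unsigned char *pool_buffer(int i)
    {
        return pool + (unsigned long)i * BUF_SIZE;
    }
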
"I assume the example must work, but only because it is written to make every tx happen in a particular order so that the app can keep track of the BD ring next pointer state."

I agree with that! As long as everything goes smoothly, it works just fine.  But it is not the most robust system.

"For anything more complex, like only ARPing when you need to know an address, you will need to treat the frame buffers (the data element in the buffer descriptor) separately from the buffer descriptors."

They do have specific code to handle the case of ARPing for an address to send to--see nbuf_tx_swap(), which is called by arp_request(). But they don't handle other cases where things could get out of order.

mccPaul
Hi Jon,
 
That sounds like a good plan. You may even be able to leave most of the code that deals with the current buffering scheme untouched and simply add an additional private buffer descriptor ring for the fec_send code. Once the NBUF is complete, you just need to copy the buffer descriptor bits into the next available private buffer descriptor.
 
You should end up with code that only manipulates the tx buffer descriptor ring in at most two places - the fec_send function and any tx complete/error ISR.
 
Cheers,
 
Paul.

JonHarris
Paul,
I have completed the modifications as you described, and it is working great!  The actual modifications were fairly straightforward and didn't impact the code too much.
The process took longer than expected because I ran into a number of issues along the way while testing my changes.  None of the issues were new; I just noticed them now because of the extensive testing I was doing on my new code.  Those issues are all resolved as well.  One change that helped considerably was to increase the number of Rx buffers allocated.  I have plenty of memory, so going from 8 to 64 buffers makes a huge difference in the packet-flood conditions that were giving me problems before.
 
Thanks again for all your help!
-Jon

mccPaul
Hi Jon,
 
Glad it worked. You may not need to bother if your rx is working well enough, but I have a few thoughts on that.
 
I didn't look at the rx code, but I assume that a similar thing is going on - the rx buffers are used by the FEC in order and must be processed by your stack in order. Normally, you can reduce the chances of rx packets flooding your stack by doing a small amount of work in the rx ISR. It is relatively quick to discard packets other than the ones that contain the protocols you are interested in (TCP, UDP, ARP and ICMP, say). The FEC will have filtered packets that aren't addressed to you, and you can usually check whether the remaining broadcast packets are meant for you - e.g. ARP requests for addresses that aren't yours can be tested and dropped very quickly.
 
This reduces the load on the stack processing enormously, but it assumes that you don't have to use your rx buffers sequentially. You should have a small set of rx buffers in the BD ring, and when you identify a received frame for your stack you copy the pointer to the data buffer from the BD into a queue. The pointer is replaced with a pointer to an empty buffer and the BD E flag is set so that the FEC can use the buffer. The rx ISR completes and your stack can deal with the received frame at its leisure. To make things easier, the rx data buffers can be 2KB long, which will accommodate the largest frame that the FEC will send you.
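
A sketch of that swap for one filled descriptor. The rx_bd_t layout, pool_alloc and rx_queue_put are illustrative names, and the E flag value mirrors typical FEC descriptor bits rather than quoting the example code.

    typedef struct {                      /* illustrative RX BD layout */
        volatile unsigned short status;
        unsigned short length;
        unsigned char *data;
    } rx_bd_t;

    #define RX_BD_EMPTY 0x8000            /* E: BD is owned by the FEC */

    extern unsigned char *pool_alloc(void);   /* fresh 2KB buffer or NULL */
    extern void rx_queue_put(unsigned char *frame, unsigned short len);

    /* Called from the rx ISR for one filled BD: queue the full buffer
       for the stack and give the FEC an empty one in its place. */
    void rx_swap_one(rx_bd_t *bd)
    {
        unsigned char *fresh = pool_alloc();

        if (fresh != 0) {
            rx_queue_put(bd->data, bd->length);   /* stack drains later */
            bd->data = fresh;                     /* FEC gets a new buffer */
        }
        /* If no buffer was free, the old one is reused and the frame
           is effectively dropped. */
        bd->status |= RX_BD_EMPTY;                /* return BD to the FEC */
    }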
 
Unless you expect huge amounts of data for your application, this scheme should work with as few as two rx BDs and maybe 4 to 8 rx data buffers.
 
Cheers,
 
Paul.

JonHarris
Thanks for the ideas.  I am already quickly throwing away packets I don't care about.  If I had a limited amount of memory, the method you describe would definitely be the way to go.  But the brute-force approach of having 64 buffers seems to be working fine for me.  I did some extensive torture testing, using 3 different computers all bombarding it with packets, and it worked well, so I will stick with this for now.
Thanks again,
Jon