Hi Johnny,
You should be ok with all the descriptors in SDRAM or IRAM. The descriptors themselves won't cause an underflow condition - the underflow conditions occurs when the EMAC DMA engine needs TX data to send (and the transfer is larger thena the EMAC FIFO) and can't get it fast enough to keep the packet on the wire uninterrupted.
This issue is related to the buffer locaiton used for the ethernet - it's memory transfer rate is slower than the ethernet requires. Look closely at the address of the ethernet buffers (pbuf->payload) and make sure they are located in IRAM or SDRAM. It's very possible you might be trying to send a PBUF_ROM type pbuf with it's payload located in read-only (slow) SPI FLASH memory. In this case, the Ethernet will underflow.
In you can't easily use the LWIP_DEBUG capability, you might try stick a few printfs in the driver in the TX function to dump the payload addresses for the packets to send.
Thanks for the assistance. I can't post any of the project here, company regulations would prevent that.
When I posted this problem, I was only using the first 32Kb of AHB SRAM (address 0x20000000 to 0x200800) for the lwip heap. In lwipopts.h I had MEM_SIZE defined as (31 * 1024). In lpc_emac_config.h I had LPC_NUM_BUFF_TXDESCS and LPC_NUM_BUFF_RXDESCS defined as 10 (so 20 in total). Hopefully, the heap being in AHB SRAM means the DMA should be able to access it quickly enough.
If you are saying that 7 tx descriptors should be enough, that leads me to believe there is a problem elsewhere in the code.
What I have done since is, stole some more of the AHB SRAM from the other core and I now have 61Kb to play with. I have set MEM_SIZE to now be (58 * 1024) which leaves me with 2028bytes free. I realised that because of the way our project has to work, the M0 had put the lpc_enetdata struct into external SDRAM. I have now re-located this struct into the spare 2028bytes in AHB SRAM. I believe this may have been the cause of the DMA underflow because this struct contains the actual descriptors. It wasn't enough just to have the pbuf memory in SRAM.
I will now put everything back to how it was before, except with the lpc_enetdata struct in AHB SRAM and see what happens.
Descriptors should be re-queued for transmit on underflow - this sounds like an oversight and should get a <a href="http://www.lpcware.com/node/add/ct-plus-task">bug tracker issue </a>created for it. However, you'll need to determine the reason for the underflow as re-queueing won't help if that issue exists.
>dn = 2 (number of descriptors needed)
Are you using the HTTP server example? I've seen cases where the pbuf chain contains a buffer that is a ROM buffer in SPIFLASH or slower memory. The Ethernet DMA will attempt to queue a buffer from any memory locaiton in a descriptor, but if the memory is too slow, the ethernet will underflow.
A mechanism was added to copy buffers located in slower memory to a faster pbuf prior to sending using the LPC_CHECK_SLOWMEM and LPC_SLOWMEM_ARRAY definitions. If you set LPC_CHECK_SLOWMEM to 1 and setup an array on memory addresses/ranges with slower memory using LPC_SLOWMEM_ARRA, the driver will automatically handle pbuf copying for buffers if it's needed.
Here's an example setup for external SPIFI FLASH..
>I am reluctant to increase LPC_NUM_BUFF_TXDESCS because (a) i don't have any more SRAM available and (b) it would only make this bug happen less often. At this point of development I want to make it happen more often so I can fix it properly.
It looks like you have 7 now, that's more than enough. You have 7 'in use', but likely not reclaimed correctly due to underflow.
>I tried enabling EMAC_DEBUG but the amount of trace that comes out is such that it slows the whole process down enough to criple the processor, so it's not representative of the normal operating conditions.
True, but the output is useful for debugging this specific issue. It gives an idea of where the buffers are being sent from. I wouldn't keep this one once LWIP is operating correctly. I can help debug this to an extent, but would need to see the messages the LWIP EMAC driver produces at run-time. Maybe you can post your project files?
It looks like the ethernet has underflowed.
DMA_STAT register = 0x006804E5 (Transmit interrupt | Transmit buffer unavailable | Transmit underflow | Early transmit interrupt).
Looking at the current TX descriptor, it still has bit 31 set (indicating it is owned by the DMA?).
tx_reclaim_idx = 6,
tx_fill_idx = 6,
tx_free_descs = 0
dn = 2 (number of descriptors needed)
txdesc[6].CTRLSTAT = 0xF0D00000
interestingly, txdesc[4] and txdesc[5] have bit 31 clear suggesting they might be free to use again?
The datasheet indicates that if an underflow occurs, the driver must explicitly issue a Transmit Poll Demand after rectifying the suspension cause. I have tried the following code in place of the while loop but it still doesn't work, as bit 31 never gets cleared by the DMA:
I am reluctant to increase LPC_NUM_BUFF_TXDESCS because (a) i don't have any more SRAM available and (b) it would only make this bug happen less often. At this point of development I want to make it happen more often so I can fix it properly.
I tried enabling EMAC_DEBUG but the amount of trace that comes out is such that it slows the whole process down enough to criple the processor, so it's not representative of the normal operating conditions.
For the code snippet above, the msDelay is probably a bad design choice on our part. Without reclaiming TX descriptors somewhere else, this can loop forever if there are not enough descriptors on the initial call. Calling lpc_tx_reclaim() is a good workaround for this.
There are a few things that could be happening here.
1) The pbuf chain might be a non-contiguous chained pbuf that needs more descriptors than are actually available. Is the value of 'dn' greater than the number defined by LPC_NUM_BUFF_TXDESCS? Try increasing the LPC_NUM_BUFF_TXDESCS value - this will also have a slight increase in required memory, but will allow a larger float of free TX descriptors.
2) Is the ethernet actually transmitting the packets or just holding them? Holding them might indicate another problem. You can try enabling LWIP_DEBUG and EMAC_DEBUG to get EMAC driver status messages on the UART to get an idea of what might be happening. Posting that here would be very helpful.
I have also run into the same problem. There is absolutely no way it will ever leave that while loop unless you are using NO_SYS == 0.
I don't suppose you ever found a solution to this? I have tried calling lpc_tx_reclaim() within that while loop but that didn't work.