Hi
I assume that you are using the NE64 with 8k SRAM. It's a great little device for a number of interesting jobs. (With extended memory the Ethernet interface only works at 10 Mbit/s, so I have stuck with the 80-pin part until now.)
However, the 8k SRAM also means that resources have to be used carefully. For more powerful jobs the Coldfire is the best choice - the NE64 remains the best choice for all other 'small' applications.
I tend to stick with small buffer sizes so as to get the most out of the application. This is at the expense of IP efficiency, since every frame carries the same header overhead however little data it holds, but this is often not a killer. If you want to get data from a remote sensor it is often not a big deal if it arrives chopped up into smaller frames rather than fewer frames of maximum size - it is still good to be able to get it over the Internet from a single-chip device with quite low power consumption.
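To put rough numbers on that trade-off, here is a minimal sketch that works out how much of each Ethernet frame is actual application data. It assumes IPv4 and TCP headers without options and counts the Ethernet header and FCS but not the preamble or inter-frame gap. The function name is made up for illustration.

```c
/* Header sizes for plain IPv4 + TCP without options, carried over Ethernet. */
#define ETH_OVERHEAD   18ul  /* 14-byte Ethernet header + 4-byte FCS */
#define IP_HEADER      20ul  /* IPv4 header without options */
#define TCP_HEADER     20ul  /* TCP header without options */
#define ETH_MIN_FRAME  64ul  /* shorter frames are padded up to this size */

/* Percentage of the bytes in one Ethernet frame that is application payload.
   unsigned long is used because int is only 16 bits wide on the HCS12. */
static unsigned long payload_efficiency_percent(unsigned long payload_bytes)
{
    unsigned long frame = payload_bytes + TCP_HEADER + IP_HEADER + ETH_OVERHEAD;

    if (frame < ETH_MIN_FRAME) {
        frame = ETH_MIN_FRAME;   /* padding added by the MAC */
    }
    return (payload_bytes * 100ul) / frame;
}
```

With these assumptions a 64-byte payload comes out at 52% and a full 1460-byte payload at 96%, so small frames roughly double the traffic for the same data, which is usually tolerable for sensor-type applications.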
TCP applications are no problem in this respect. Only DHCP is a bit of a nuisance since it requires 1k buffers to ensure that the BOOTP UDP frames really fit, sacrificing 1k which the application could well use. Note that although all the rx and tx buffers have to be the same size, it is not necessary to use all of the tx buffer (the amount of it used is under your control), so I tend to place variables starting in the middle of it if I know that I will not utilise it all.
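The tx buffer trick could look something like the following minimal sketch. It assumes the application never sends a frame longer than half of the buffer while the variables are in use (which rules out the period in which DHCP needs the full 1k). The array `tx_buffer` is only a stand-in for the real Ethernet tx buffer, and all names and sizes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define TX_BUFFER_SIZE 1024u  /* buffer size forced by DHCP/BOOTP */
#define TX_MAX_USED     512u  /* hypothetical limit on the application's own frames */
#define SCRATCH_SIZE   (TX_BUFFER_SIZE - TX_MAX_USED)

/* Stand-in for one Ethernet tx buffer (assumption: on real hardware this
   would be the buffer memory used by the Ethernet controller). */
static unsigned char tx_buffer[TX_BUFFER_SIZE];

/* Start of the upper part of the tx buffer, which is never transmitted
   and so can hold SCRATCH_SIZE bytes of application variables. */
static unsigned char *scratch_area(void)
{
    return &tx_buffer[TX_MAX_USED];
}

/* Copy a frame into the tx buffer, refusing anything that would
   overwrite the scratch region. Returns the number of bytes queued,
   or 0 if the frame is too long. */
static size_t tx_prepare(const void *frame, size_t length)
{
    if (length > TX_MAX_USED) {
        return 0;
    }
    memcpy(tx_buffer, frame, length);
    return length;
}
```

Funnelling every transmission through one length check like `tx_prepare` is what keeps the variables safe; a frame that slipped past the limit would silently corrupt them.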
If you are doing an application which receives TCP frames to be sent to a lower speed interface (classically the serial interface using XOFF/CTS protocol) then the lack of windowing support in the OpenTCP implementation becomes a problem. The reason is that without it there is no flow control mechanism across the network, and TCP frames will invariably arrive at a rate much faster than the data can be sent over the interface. In this case the window advertised by the TCP stack should tend to equal the space available in the serial output buffer, and it must close in order to choke the data - if not, data will be lost. Received TCP frames can also simply be ignored (dropped) if there is no room, but this results in ugly repetitions and is inefficient due to the repetition timeout delays.
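As a minimal sketch of the idea, assuming a stack that lets the application set the window it advertises (which is exactly what is missing here), the window can simply follow the free space in the serial output queue. The queue and all function names below are hypothetical and are not part of OpenTCP.

```c
#include <stddef.h>

#define SERIAL_BUF_SIZE 256u  /* hypothetical size of the serial output queue */

typedef struct {
    unsigned char data[SERIAL_BUF_SIZE];
    size_t head;    /* index of the next byte to write */
    size_t tail;    /* index of the next byte to send to the UART */
    size_t count;   /* bytes currently waiting to be sent */
} serial_queue_t;

/* Window to advertise in the next ACK: exactly the room left in the
   serial queue, so it closes to 0 when the queue is full. */
static unsigned short tcp_window_to_advertise(const serial_queue_t *q)
{
    return (unsigned short)(SERIAL_BUF_SIZE - q->count);
}

/* Queue received TCP payload for the serial port.
   Returns the number of bytes accepted; the rest would be lost. */
static size_t serial_enqueue(serial_queue_t *q, const unsigned char *src, size_t len)
{
    size_t n = 0;

    while (n < len && q->count < SERIAL_BUF_SIZE) {
        q->data[q->head] = src[n++];
        q->head = (q->head + 1u) % SERIAL_BUF_SIZE;
        q->count++;
    }
    return n;
}

/* Called as the UART drains: fetch the next byte to transmit.
   Returns 1 if a byte was available, 0 if the queue is empty. */
static int serial_dequeue(serial_queue_t *q, unsigned char *out)
{
    if (q->count == 0) {
        return 0;
    }
    *out = q->data[q->tail];
    q->tail = (q->tail + 1u) % SERIAL_BUF_SIZE;
    q->count--;
    return 1;
}
```

As long as the peer respects the advertised window, `serial_enqueue` never has to drop anything, and the window reopens byte by byte as `serial_dequeue` empties the queue. A real implementation would probably reopen the window in larger steps to avoid a flood of tiny frames.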
It's interesting stuff and sometimes quite challenging finding the best compromise.
Cheers
Mark Butcher
www.mjbc.ch