> The heapstat monitor (mh_stats()) therefore reports false values.
Have you fixed it so you can trust it?
> My problem is the freescale HTTP server:
> When my server is connected to the net it is sometimes "visited" by some
> unwanted guests (probably IP-scanners) and if this is the case the memory
> after the "attack" is not properly freed.
Are you sure that's what is happening? I suspect there is no bug.
But what should you do if the server does have a bug that leads to a memory leak?
Guess what Apache does? The last time I looked at the code (10 years ago, so it might have changed), an Apache server spawns multiple worker processes forming a "pool". Each one is allocated to incoming client requests. After a worker has responded to about 100 requests, IT IS KILLED AND RESTARTED! That is because it is assumed that it will leak, and rather than actually fix the bugs it is easier to simply kill it and clean up any mess. Of course the servers are running on top of an OS that keeps a separate memory pool per process and keeps a list of all file handles and sockets. So everything is cleanly released when the "kill" happens.
This is a lot harder to do on embedded systems!
You should be able to tell from the memory statistics how big each "leak" is. Then compare that "leak size" with all the request sizes, looking for a match. You could even log every allocation and free, logging the caller function address. That would let you know what code isn't freeing when it should.
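Here's a rough sketch of that allocation logging, assuming you can route the server's allocations through a pair of wrappers. The names track_alloc(), track_free(), track_report() and TRACK_MAX are made up for illustration and are not part of NicheLite. How you get the caller address depends on your compiler, so here it is simply passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define TRACK_MAX 256   /* hypothetical limit on tracked live blocks */

struct track_entry {
    void       *ptr;     /* block handed to the caller, NULL if slot is free */
    size_t      size;    /* requested size in bytes */
    const void *caller;  /* address (or any tag) identifying who allocated */
};

static struct track_entry track_table[TRACK_MAX];

/* Allocate a block and remember its size and caller. */
void *track_alloc(size_t size, const void *caller)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < TRACK_MAX; i++) {
        if (track_table[i].ptr == NULL) {
            track_table[i].ptr = p;
            track_table[i].size = size;
            track_table[i].caller = caller;
            return p;
        }
    }
    return p;   /* table full: the block is valid but untracked */
}

/* Free a block and forget it. */
void track_free(void *p)
{
    if (p == NULL)
        return;
    for (int i = 0; i < TRACK_MAX; i++) {
        if (track_table[i].ptr == p) {
            track_table[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Print every block still outstanding; return the total bytes outstanding. */
size_t track_report(FILE *out)
{
    size_t total = 0;
    assert(out != NULL);
    for (int i = 0; i < TRACK_MAX; i++) {
        if (track_table[i].ptr != NULL) {
            fprintf(out, "outstanding: %lu bytes, caller %p\n",
                    (unsigned long)track_table[i].size,
                    (void *)track_table[i].caller);
            total += track_table[i].size;
        }
    }
    return total;
}
```

Call track_report() before and after one of the "visits". Anything that shows up only in the second report is a candidate for the leak, and the caller tag tells you where to look.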
A standard "problem" (actually design characteristic) of TCP/IP is that a connection is expected to stay open FOR EVER unless deliberately closed. And it will stay open without traffic and without sending any data. This is not a bug.
There is a "Feature" of standard (large, on Linux and Windows) TCP/IP systems called "Keepalive". There's a conflict between selfish programmers who would like "keepalives" every few seconds and the "Old Net Gods" who rightly said "that doesn't SCALE" and so required the MINIMUM time for the keepalives to be 2 HOURS:
Keepalive - Wikipedia
So I suspect your "attacks" are performing HTTP Opens which are first causing TCP Opens, and then either aren't sending any data, or are sending a bit and then going off to attack something else without closing the connection.
The only thing that will cause the connection to drop is either a deliberate timeout in the server, or something that makes the server want to send some data on the connection. The retries on the send will eventually close the connection, but NOTHING ELSE WILL.
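A "deliberate timeout in the server" could look something like this minimal sketch. It assumes the server keeps a small table of its connections with a last-activity timestamp. The struct conn layout, sweep_idle() and the seconds-based clock are all hypothetical, and a real server would call its own socket close where the comment says so.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection record kept by the server. */
struct conn {
    int           in_use;         /* non-zero while the connection is open */
    unsigned long last_activity;  /* time of last received data, in seconds */
};

/*
 * Close every connection that has been idle for longer than `limit` seconds.
 * Returns the number of connections closed. Unsigned subtraction keeps the
 * comparison correct even if the seconds counter wraps.
 */
int sweep_idle(struct conn *conns, size_t n, unsigned long now,
               unsigned long limit)
{
    int closed = 0;
    assert(conns != NULL);
    for (size_t i = 0; i < n; i++) {
        if (conns[i].in_use && now - conns[i].last_activity > limit) {
            conns[i].in_use = 0;  /* a real server would close the socket here */
            closed++;
        }
    }
    return closed;
}
```

Run it from whatever periodic tick the server already has. Once a second is plenty.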
The proper way to monitor this is to add a command to the system (maybe even a web page) that lists or counts all the open connections. Linux and Windows both have "netstat" for this ("netstat -s" prints the per-protocol statistics). You want to write an equivalent for your system, especially if you can add a column listing how long each socket has been open. Then you can see if the number of open sockets is increasing without limit (and taking up memory without limit).
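As a sketch of what such a "netstat" equivalent might print, assuming you can fill in a table like the one below from the stack's own socket list: struct sock_info and list_sockets() are invented names, and I haven't checked which NicheLite structures hold this information.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical snapshot of one socket, filled in from the stack's own list. */
struct sock_info {
    int            in_use;        /* non-zero if the slot holds an open socket */
    unsigned char  remote_ip[4];  /* peer IPv4 address */
    unsigned short remote_port;   /* peer TCP port */
    unsigned long  opened_at;     /* time the socket was opened, in seconds */
};

/* Print one line per open socket with its age; return the number open. */
int list_sockets(FILE *out, const struct sock_info *tbl, int n,
                 unsigned long now)
{
    int open_count = 0;
    assert(out != NULL && tbl != NULL);
    fprintf(out, "%-4s %-21s %s\n", "Slot", "Remote", "Age(s)");
    for (int i = 0; i < n; i++) {
        char remote[24];  /* "255.255.255.255:65535" plus the terminator fits */
        if (!tbl[i].in_use)
            continue;
        snprintf(remote, sizeof remote, "%u.%u.%u.%u:%u",
                 (unsigned)tbl[i].remote_ip[0], (unsigned)tbl[i].remote_ip[1],
                 (unsigned)tbl[i].remote_ip[2], (unsigned)tbl[i].remote_ip[3],
                 (unsigned)tbl[i].remote_port);
        fprintf(out, "%-4d %-21s %lu\n", i, remote, now - tbl[i].opened_at);
        open_count++;
    }
    fprintf(out, "%d open\n", open_count);
    return open_count;
}
```

If the count only ever goes up, and the oldest entries are hours old with a remote address you don't recognise, that's your answer.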
Maybe modern servers have timeouts to handle this problem. Maybe the Freescale one is so old (and meant as a demo rather than a bulletproof on-the-internet server) that it doesn't have any sort of timeout, or it does, but it isn't turned on.
Is this the one you're running, or is it another one? Where's the source?
"Freescale\NicheliteColdfireLite\7.2 REG_ABI 20110524.zip\7.2 REG_ABI 20110524\CF_Lite_v3.2_MVDH_20110524_CW7.2\src\projects\example\freescale_HTTP_Web_Server"
You could rely on KEEPALIVEs to close sockets. But you might run out of them or memory way before two hours is up.
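Keepalive is a per-socket option that has to be switched on. On a standard BSD/POSIX sockets API that looks roughly like the sketch below. This is NOT the NicheLite mini-sockets API, whose call names and arguments may well differ. The enable_keepalive() wrapper is just for illustration.

```c
#include <assert.h>
#include <sys/socket.h>

/*
 * Turn on TCP keepalive probing for one socket (POSIX sockets).
 * Returns 0 on success, -1 on error with errno set.
 */
int enable_keepalive(int fd)
{
    int on = 1;
    assert(fd >= 0);
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}
```

A server would call this on each socket returned by accept().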
Here's how to configure it in the 2009 version, with a recommendation on dropping the timeout from 2 hours to as low as a minute (for embedded systems). It even includes source code and documents a bug in that version that means it won't work properly with Windows, but it also gives a patch to fix that.
https://www.nxp.com/docs/en/application-note/AN10775.pdf
Here's a post from 2010 saying it doesn't time out. But the original poster may not have turned it on. There's a post in there from Marc on his fixes to Nichelite too. I've searched the Freescale Nichelite documentation and didn't get a match on "keepalive" (or even "keep" or "timeout").
Here's what happens if you search the code for keywords:
$ find . -type f | xargs grep KEEPALIVE
./NicheLite/Source/h/msock.h:#define SO_KEEPALIVE 0x0008 /* keep connections alive */
./NicheLite/Source/mtcp/tcp_timr.c: if ((((M_SOCK)(tp->t_inpcb))->so_options & SO_KEEPALIVE) &&
$ find . -type f | xargs grep TCPT_KEEP
./NicheLite/Source/h/mtcp.h: * The TCPT_KEEP timer is used to keep connections alive. If an
./NicheLite/Source/h/mtcp.h: * an ack segment in response from the peer. If, despite the TCPT_KEEP
./NicheLite/Source/h/mtcp.h:#define TCPT_KEEP 2 /* keep alive */
./NicheLite/Source/mtcp/TCPAPI.C: tp->t_timer[TCPT_KEEP] = TCPTV_KEEP_INIT; /* initial connect keep alive */
./NicheLite/Source/mtcp/TCPIN.C: tp->t_timer[TCPT_KEEP] = tcp_keepidle;
./NicheLite/Source/mtcp/TCPIN.C: tp->t_timer[TCPT_KEEP] = TCPTV_PERSMAX; //FSL was TCPTV_KEEP_INIT;
./NicheLite/Source/mtcp/tcp_timr.c: case TCPT_KEEP: //FSL case 2
./NicheLite/Source/mtcp/tcp_timr.c: tp->t_timer[TCPT_KEEP] = (short)tcp_keepintvl;
./NicheLite/Source/mtcp/tcp_timr.c: tp->t_timer[TCPT_KEEP] = (short)tcp_keepidle;
./NicheLite/Source/h/mtcp.h:#define TCPTV_KEEP_INIT (75*PR_SLOWHZ) /* initial connect keep alive */
./NicheLite/Source/h/mtcp.h:#define PR_SLOWHZ 2 /* TCP ticks per second */
./NicheLite/Source/h/mtcp.h:#define TCPTV_SRTTDFLT (3*PR_SLOWHZ) /* assumed RTT if no info */
./NicheLite/Source/h/mtcp.h:#define TCPTV_PERSMIN (5*PR_SLOWHZ) /* retransmit persistance */
./NicheLite/Source/h/mtcp.h://#define TCPTV_PERSMAX (60*PR_SLOWHZ) /* maximum persist interval */
./NicheLite/Source/h/mtcp.h:#define TCPTV_PERSMAX (10*PR_SLOWHZ) /* maximum persist interval */ //FSL lowered
./NicheLite/Source/h/mtcp.h:#define TCPTV_KEEP_INIT (75*PR_SLOWHZ) /* initial connect keep alive */
./NicheLite/Source/h/mtcp.h:#define TCPTV_KEEP_IDLE (120*60*PR_SLOWHZ) /* dflt time before probing */
There's no demo code that sets that socket option anywhere. Maybe you should add it to the HTTP server where it opens its sockets. And you might like to change the "two hours" definition above for TCPTV_KEEP_IDLE to something quicker. Except there's a "FSL" comment in there that may have changed how this worked.
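To check the arithmetic on those defines: with PR_SLOWHZ at 2 ticks per second, TCPTV_KEEP_IDLE works out to 14400 ticks, which is the two hours. The first three defines below are copied from the grep output. KEEP_IDLE_1MIN and ticks_to_seconds() are hypothetical, just to show what a one-minute value would be. Note that tcp_timr.c casts the value to short, so presumably it has to stay below 32768 ticks.

```c
#include <assert.h>

#define PR_SLOWHZ        2                    /* TCP ticks per second (mtcp.h) */
#define TCPTV_KEEP_INIT  (75*PR_SLOWHZ)       /* initial connect keep alive (mtcp.h) */
#define TCPTV_KEEP_IDLE  (120*60*PR_SLOWHZ)   /* dflt time before probing (mtcp.h) */

/* Hypothetical replacement value: one minute of idle time before probing. */
#define KEEP_IDLE_1MIN   (60*PR_SLOWHZ)

/* Convert a slow-timer tick count to whole seconds. */
long ticks_to_seconds(long ticks)
{
    assert(ticks >= 0);
    return ticks / PR_SLOWHZ;
}
```

So ticks_to_seconds(TCPTV_KEEP_IDLE) is 7200 seconds, and the one-minute value is 120 ticks.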
The simplest way around all of these problems is to just reset the box periodically, or to have a monitor check memory and sockets and reset if it is about to run out. If it already crashes when it runs out of memory then you've already done that :-).
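The monitor can be as simple as this sketch. struct limits and should_reset() are hypothetical names, and you'd have to supply the free-memory and open-socket figures from your own heap statistics and socket list, then trigger the reset however your board does it (for example by letting a watchdog expire).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thresholds for the monitor. */
struct limits {
    size_t min_free_bytes;    /* reset if free heap drops below this */
    int    max_open_sockets;  /* reset if this many sockets are open */
};

/* Return 1 if the box is about to run out of memory or sockets, else 0. */
int should_reset(size_t free_bytes, int open_sockets,
                 const struct limits *lim)
{
    assert(lim != NULL);
    return free_bytes < lim->min_free_bytes ||
           open_sockets >= lim->max_open_sockets;
}
```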
That's assuming it is a TCP problem. It may still be a bug as you've said. In which case it should be easy to find and fix.
Tom