HTTP Server 4.0.2 System Crash On Receipt Of High Volume Of CGI POST Messages

Tim562
Senior Contributor I

Hi All,

     I'm using the HTTP Server from RTCS 4.0.2 and have discovered a problem. When the client webpage sends a large volume of CGI POST messages, the entire MQX system can crash. By "high volume" I mean this: I have a webpage with a gain slider on it, and as the slider is moved it generates a CGI POST message that is about 100 bytes long. If I only allow the webpage to send one message per second there's no problem, but if I don't pace it, MQX crashes after a few (5 to 10) seconds of continuous slider movement. It's not a problem with my CGI function that processes the incoming gain command (I've commented it out and just ignored these messages, and the same thing happens).

     The main CGI function that is called when any incoming message is received reads from the socket into a temp buffer (allocated for this session), calls the appropriate function to handle the message (based on its content), frees the memory allocated for the buffer and exits. The problem seems to occur when a new CGI POST message is received while a previous message is still being processed. Has anybody else run into a problem similar to this? When the system crashes (under debug control) I frequently observe that MQX is running the SOCK_STREAM_shutdown() function. Any thoughts on troubleshooting something like this? Thanks for any replies!

Best Regards,

Tim

_mqx_int CGI_ParseMsg(HTTPSRV_CGI_REQ_STRUCT *param)
     {
     uint_32 lTotalMsgBytes, lReadLength, lBytesRead, lReadval;
     _mqx_int lRetval = 0;
     char_ptr sRecvBuffer = NULL;
     char_ptr sBufferPtr = NULL;

     //
     //----------------------------------------------------------------------------
     if(param->request_method != HTTPSRV_REQ_POST)
          return(0);

     //Attempt to allocate a sRecvBuffer to store the received data
     //----------------------------------------------------------------------------
     lTotalMsgBytes = param->content_length;
     sRecvBuffer = _mem_alloc(lTotalMsgBytes);
     if(sRecvBuffer == NULL)
          {
          PSERVE_LogMsg(local->cOurSlot, "CGI_ParseMsg() - Unable To Allocate sRecvBuf", gcLOGENTRY_ERROR);
          return(0);
          }

     //Read in the data until all bytes (per lTotalRecvdMsgBytes value) are received
     //----------------------------------------------------------------------------
     sBufferPtr = sRecvBuffer; //Use sBufferPtr as a movable temp pointer so sRecvBuffer can always point to buffer start
     lBytesRead = 0;
     while(param->content_length > 0)
          {
          lReadLength = HTTPSRV_cgi_read(param->ses_handle, sBufferPtr, (lTotalMsgBytes-lBytesRead));
          param->content_length -= lReadLength;
          sBufferPtr += lReadLength;
          lBytesRead += lReadLength;
          }

     //Parse / decode the msg here
     //----------------------------------------------------------------------------
     :
     :
     :

     _mem_free(sRecvBuffer);
     return(0);
     }

1 Solution
Martin_
NXP Employee

Hi Tim, this one looks quite familiar to me, as we recently discovered a problem with _task_destroy_internal() on platforms with floating point context enabled - which is the case for MPC5125 I guess.

We fixed it for MQX 4.1.0, attached you may find the patch. Please give it a try.

I tested with the httpsrv example: a browser with 10 web pages open to HTTPSRV, all pages issuing CGI requests every 100 ms, and a CGI handler that allocates, uses and frees a 256-byte buffer. It seems to work (with the above patch applied).

28 Replies

Martin_
NXP Employee

Hi Tim,

please give a check with the attached scheduler change in the PSP project:

Tim562
Senior Contributor I

Hi Martin and Garabo,

    I've had a thought about the crashes I'm seeing and wonder if something I'm doing in my code is the problem. When I need to send a reply to a web client, I build that reply in a temp buffer (pBuf in example below), issue the response using HTTPSRV_cgi_write(), and then free the buffer like this:

uint_16 CGI_ServicePwordMsg(HTTPSRV_CGI_REQ_STRUCT *param, char_ptr pRecvBuffer)
    {
    uint_32 lReplyLen;
    char_ptr pBuf = NULL;
    HTTPSRV_CGI_RES_STRUCT response;

    pBuf = _mem_alloc(256);    //Allocate a 256 byte temp buffer
    if(pBuf == NULL)
        return(0);

/*
Function code goes here
  :
  lReplyLen = ???; //lReplyLen is set to actual number of bytes loaded in the buffer
*/

    //Fill out the response info and write the data
    //----------------------------------------------------------------------------
    ISSUE_RESPONSE:
    response.ses_handle = param->ses_handle;
    response.content_type = HTTPSRV_CONTENT_TYPE_PLAIN;
    response.status_code = 200;
    response.data = pBuf;
    response.data_length = lReplyLen;
    response.content_length = response.data_length;
    HTTPSRV_cgi_write(&response);

    //Release the allocated receive buffer
    //----------------------------------------------------------------------------
    if(_mem_free(pBuf) != MQX_OK)        //<-- OK TO RELEASE pBUF MEMORY HERE ???
        return(0);

    return(response.content_length);
    }

I'm wondering if my freeing the buffer immediately after the HTTPSRV_cgi_write() call is causing a problem? Perhaps the buffer memory can't be released back to MQX until after the reply transmission has completed? Perhaps a better practice is to allocate a persistent buffer to support replies to CGI service requests? I guess another good question would be, does HTTPSRV_cgi_write() block until it is finished transmitting the contents of the buffer? Thanks!

Best,

Tim

Martin_
NXP Employee

Hi Tim, try using "_mem_alloc_system()" to allocate your memory (or "_mem_alloc_system_zero()" in case the memory needs to be cleared).

Martin

Tim562
Senior Contributor I

Hi Martin,

    I'm pretty sure that I have discovered the culprit in my MQX crash problem. It appears to be the calls to  HTTPSRV_cgi_read() that my CGI service function makes to retrieve the CGI message into my temp buffer for processing. All of my incoming CGI messages are served by a single function I wrote that examines the start of the message and calls the appropriate function to decode and act on the received message. As a troubleshooting step I removed all code from this function except for the code that allocated and freed the temp buffer then exited. Doing this, my service function serviced over 10,000 received messages with no crashes. I next added code to set the temp buffer to all 0x00 and then all 0xFF bytes before it was freed (to ensure the buffer memory area was ok to write to), still no crashes. Finally I added in the code to call HTTPSRV_cgi_read() and store the received message in the temp buffer and MQX started crashing regularly.
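The buffer check described above (fill with 0x00, then with 0xFF, before freeing) could look roughly like the following. This is only a sketch in portable C: the function names are hypothetical, and it reads each byte back to confirm the write, which goes slightly beyond the fill-only test in the post.

```c
#include <assert.h>
#include <stddef.h>

/* Fill buf with `pattern`, then read every byte back.
 * Returns 1 if all len bytes hold the pattern, 0 otherwise.
 * volatile keeps the compiler from optimizing the accesses away. */
int buffer_holds_pattern(volatile unsigned char *buf, size_t len,
                         unsigned char pattern)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
        buf[i] = pattern;
    for (i = 0; i < len; i++)
        if (buf[i] != pattern)
            return 0;
    return 1;
}

/* Run the 0x00 pass followed by the 0xFF pass.
 * Returns 1 if both passes succeed, 0 on NULL or on any mismatch. */
int buffer_is_writable(void *buf, size_t len)
{
    if (buf == NULL)
        return 0;
    return buffer_holds_pattern(buf, len, 0x00) &&
           buffer_holds_pattern(buf, len, 0xFF);
}
```

In the CGI handler this would be called on the temp buffer just before it is freed. A passing result only shows that the block itself is usable; it says nothing about the allocator's bookkeeping around the block.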

    I'm posting the code for my stripped down CGI service function as I used it for these tests so you can see what I'm doing. It appears that the problem is not with repeatedly allocating and freeing a temp buffer, or with the memory space provided by _mem_alloc_system(), the problem only shows up when HTTPSRV_cgi_read() is called to load the received message data into the temp buffer. Here's my stripped down CGI service function:

_mqx_int CGI_ParseMsg(HTTPSRV_CGI_REQ_STRUCT *param)
   {
   uint_32 lTotalMsgBytes;
   uint_32 lReadLength, lBytesRead;
   static uint_32 lCallCounter = 0;
   char_ptr cBuf = NULL;
   char_ptr cBufPtr = NULL;

   //
   //----------------------------------------------------------------------------
   if(param->request_method != HTTPSRV_REQ_POST)
       return(0);

   //Attempt to allocate cBuf to store the received data
   //----------------------------------------------------------------------------
   lTotalMsgBytes = param->content_length;
   cBuf = _mem_alloc_system(lTotalMsgBytes);
   if(cBuf == NULL)
      return(0);

   //Read in the data until all bytes (per param->content_length) are received
   //----------------------------------------------------------------------------
   cBufPtr = cBuf;  //Use cBufPtr as a movable temp pointer so cBuf is not changed and can be freed
   lBytesRead = 0;
   while(param->content_length > 0)  //<-- *** Comment out this loop and things stop crashing ***
      {
      lReadLength = HTTPSRV_cgi_read(param->ses_handle, cBufPtr, (lTotalMsgBytes - lBytesRead));
      param->content_length -= lReadLength;
      cBufPtr += lReadLength;
      lBytesRead += lReadLength;
      }

   if(_mem_free(cBuf) != MQX_OK)
      return(0);

   lCallCounter++;     //Used to track how many incoming msgs this function has serviced
   return(0);
   }

I'm calling HTTPSRV_cgi_read() in a loop based on the example in the ...\rtcs\examples\httpsrv\cgi.c file. Do you see that I may be using it incorrectly? It usually works just fine. My service function may properly service dozens to hundreds of received CGI messages then, seemingly at random, MQX will crash as illustrated in my earlier posts. Either I'm using HTTPSRV_cgi_read() incorrectly or it's got a bug that is crashing MQX. Any other possibilities? Thanks for your help Martin, I appreciate it.

Best Regards,

Tim

karelm_
Contributor IV

Hi Tim,

Function HTTPSRV_cgi_read should return the number of bytes you requested. If its return value is less than what you requested, there is something wrong with the socket and you should not try to read more data. Instead, you should report an error. So in your case the code for reading the data should probably look like this:

/* Read CGI data, return if error occurs. */
lReadLength = HTTPSRV_cgi_read(param->ses_handle, cBufPtr, param->content_length);
if (lReadLength < param->content_length)
{
    printf("Error occurred during CGI read");
    _mem_free(cBuf);
    return(0);
}

Best regards,

Karel

Tim562
Senior Contributor I

Hi Karel,

     I have observed that when a web client transmits a large file (approx. 2 MB) using HTTP POST, that file data is often not completely available the first time HTTPSRV_cgi_read() is called. It often requires 2 or 3 calls before the complete file is returned, which is the reason for calling HTTPSRV_cgi_read() in a while loop. The large file transfer is a very rare event (it only happens when the web client is sending a new binary firmware image file) and is handled by a separate CGI function. I will modify my standard CGI service function to dump a received message (and flag an error) if HTTPSRV_cgi_read() doesn't return param->content_length and see if that affects the problem. Thanks for your idea, I'll give it a try and post back here any results.
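The two positions in this exchange (loop because data can arrive in pieces, but do not keep reading from a dead socket) can be combined. The sketch below is one way to do that, written as portable C rather than against the real RTCS API: `read_fn` is a hypothetical stand-in for a call shaped like HTTPSRV_cgi_read(), and `chunk_source`/`chunk_read` exist only to exercise the loop. The `max_empty_reads` limit is an assumption, not something the RTCS documentation prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a session read call: copies up to `max`
 * bytes into `dst` and returns the count (0 = nothing was read). */
typedef size_t (*read_fn)(void *session, char *dst, size_t max);

/* Read until `total` bytes are stored or the source stalls.
 * Partial reads are accepted; more than `max_empty_reads` consecutive
 * zero-length reads end the loop. Returns the bytes actually stored. */
size_t read_all(read_fn reader, void *session, char *buf, size_t total,
                unsigned max_empty_reads)
{
    size_t done = 0;
    unsigned empty = 0;

    assert(reader != NULL && buf != NULL);
    while (done < total) {
        size_t n = reader(session, buf + done, total - done);
        if (n == 0) {
            if (++empty > max_empty_reads)
                break;              /* give up instead of spinning forever */
            continue;
        }
        if (n > total - done)
            break;                  /* reader reported more than was asked for */
        empty = 0;
        done += n;
    }
    return done;
}

/* Test double: hands out a fixed message in chunks of at most `chunk` bytes. */
struct chunk_source {
    const char *data;
    size_t len;
    size_t pos;
    size_t chunk;
};

size_t chunk_read(void *session, char *dst, size_t max)
{
    struct chunk_source *s = session;
    size_t n = s->len - s->pos;

    if (n > s->chunk)
        n = s->chunk;
    if (n > max)
        n = max;
    memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}
```

The caller compares the return value with the expected content length and treats a shortfall as an error. Unlike the loops posted in this thread, this one cannot run forever if the read call keeps returning zero; whether that situation is related to the crash discussed here is not established.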

Best Regards,

Tim

pbanta
Contributor IV

Hi Tim,

What was the resolution to this?  I'm seeing a similar problem in MQX 4.1 in that when I'm trying to upload a file regardless of size the first call to HTTPSRV_cgi_read() always returns zero.  If I resubmit the form from the browser the file will be transferred.

Best regards,

Paul

Tim562
Senior Contributor I

Hi Martin,

    I changed all calls to _mem_alloc() in the CGI service functions to be calls to _mem_alloc_system() with pretty much the same results (out of memory crash). I had a breakpoint set in the _task_set_error() function and when it hit (on an attempt to allocate 496 bytes of memory) I made screen captures of the situation as reported by CodeWarrior. The breakpoint was in place at application startup and no hits were encountered (at startup, init and running) until the web interface started servicing CGI client requests. I'm pasting in the various screen captures and attaching a larger capture of the MQX memory blocks (I thought it too large to paste into this msg). I'm also attaching the extram.lcf file for the project in case it sheds any light. I've noticed that the MQX Memory blocks report indicates 87MB of memory available from 0x0A00_0420 to 0x0F7F_FFA0 (approx. 3.0MB used), which the extram.lcf file indicates is in the "kernel" memory area. I appreciate your suggestions Martin, and if you can think of anything I might be able to do to gather more info about what's happening I would be grateful. Thanks!

Best,

Tim

Captures were taken when _mem_alloc_zero() attempted to allocate 496 bytes and failed.

Current Stack when _task_set_error() breakpoint hit:

MQX task_set_error() stack capture on error(6).jpg

MQX Task Summary when _task_set_error() breakpoint hit (No TCP/IP error until after _task_set_error() returned):

MQX task_set_error() stack capture on error(6b).jpg

MQX Stack Usage Report (Notice no overflows):

MQX task_set_error() stack capture on error(6a).jpg

MQX Memory Pools report (tons of available memory right?):

MQX task_set_error() stack capture on error(6c).jpg

yb
Contributor IV

Hi Tim,

I've just read your posts and I think I have the same problem: https://community.freescale.com/message/388034#388034

For me, it's not with CGI requests, but with any request (SHTML, HTML, GIF, JPEG or other file extensions).

If there are too many requests too close together, HTTPSRV crashes...

I tried to add a patch in an external task to kill the open socket on timeout, but the result is not perfect.

Do you have any new information about your problem?

Yvan

Tim562
Senior Contributor I

Hi Yvan,

     See the patch for ta_dest.c that Martin posted in this topic for the solution that solved my problem. Hope it covers yours as well.

Best,

Tim

Tim562
Senior Contributor I

Hi Yvan,

     In my case I've narrowed down the problem to "I believe" calls to the HTTPSRV_cgi_read() function. I'm going to post a detailed message here in just a little bit in hopes that it's enough information that one of the Freescale guys can reproduce the problem and maybe suggest a solution. Good luck!

Best,

Tim

Tim562
Senior Contributor I

Hi Martin,

     I will give it a try and report back here. Did you have any opinion on my practice of freeing my temporary memory buffer immediately after calling HTTPSRV_cgi_write()? Hope all is well.

Best,

Tim

Tim562
Senior Contributor I

Hi Martin and Garabo,

     I'm starting to think that a high volume of CGI message traffic is not required to cause the MQX crash. I'm thinking that it just provides more opportunity for a crash. It looks like there may be an issue with the amount of memory allocated to the TCP/IP task (not the stack size). Do you know how I can increase that amount? How about any other information I might examine to determine what's happening here? Hope all is well.

Best Regards,

Tim

Luis_Garabo
NXP TechSupport
Hi Tim,

A good way to increase the memory dedicated to the TCP/IP task is to add this:

_RTCSTASK_stacksize = 4500;

I hope this helps,

Regards,

Garabo

Tim562
Senior Contributor I

Hi Garabo,

     Looking at the above stack usage screen capture, it looks like the stack size for the TCP/IP task isn't the problem, with only 17% utilization. It looks like the RTCS task was trying to allocate a memory chunk (400 to 500 bytes if I remember correctly) and was unable to do so. Examining the MQX/Memory Pools and MQX/Memory blocks info provided by CodeWarrior indicated there was approx. 130MB of RAM available to the system, yet the call to _mem_alloc_zero() failed and _task_set_error() (where I had a breakpoint set) was called to document that error. It seems to me that with around 130MB of RAM available to MQX, _mem_alloc_zero() should have been able to allocate a few hundred bytes?

     I'm trying to figure out where to look to determine how _mem_alloc_zero() could've been unable to provide the requested memory. Perhaps this problem wasn't the "cause" of the crash but just a symptom of some other problem? The MQX project seems to be completely stable until the web server is called into action, then it's only a matter of time before I run into this crash. Maybe it's an issue with porting the RTCS from 4.0.2 into MQX 3.8.1.1? Any thoughts on further troubleshooting? Thanks!

Best Regards,

Tim

Tim562
Senior Contributor I

Hi Martin and Garabo,

     I placed the new dispatch.s file in the "C:\Program Files (x86)\Freescale\Freescale MQX 3.8.1.1\mqx\source\psp\powerpc" folder (replacing the existing file), then rebuilt MQX and tried my application again but unfortunately got the same results. I believe this is occurring when the CGI messages come in faster than they can be serviced and multiple sockets are opened to handle them. I'm pasting in a couple of screen caps that I got when my application hit a breakpoint I set in the _task_set_error() function (indicated error number 4, which I believe is OUT_OF_MEMORY). Is there any other info I could try and capture that might provide more clues as to what's happening? Thanks much for your suggestion of trying the modification to the dispatch.s file, I appreciate it. Hope all is well.

Best Regards,

Tim Hutchinson

MQX task_set_error() stack capture on error(5).jpg         MQX task_set_error() stack capture on error(5a).jpg

MQX task_set_error() stack capture on error(5b).jpg

MQX task_set_error() stack capture on error(5e).jpg

stevejanisch
Contributor IV

Hello Tim:

I can say I've run into this issue in a past lifetime (i.e. a non-Freescale MQX job) about 8 years ago.  At the time I was led to believe that this was a fairly common occurrence for several of the open source web servers.  CGI messaging creates a new process handler for the request, and pounding the server with CGI messages crashed quite a few of them.

To get around this, a new protocol called FastCGI was developed.  FastCGI differs from standard CGI in that the process was persistent, so it reduced overhead since the server did not have to create and then destroy the process handler over and over again.

I know that this doesn't really help you with MQX, but perhaps you may find the information helpful, or you could see if Freescale plans on implementing FastCGI in the library (hopefully someone from Freescale will read this and let us know). It is also possible that you could look into the code of one of the open source servers and implement this in MQX (one in particular that was very good was the Hiawatha web server). At some point in time I plan on using this type of communication in my project, although I am far from the step where this would be necessary. Or, at worst, perhaps it would be possible to limit the requests on the server... although that probably isn't much help in the long run aside from stopping the whole OS from crashing.
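As an illustration of the last idea (limiting requests on the server), a counting gate that admits only a fixed number of requests at a time might look like this. It is a sketch in portable C with hypothetical names and is not part of RTCS. As written it is not thread-safe, so on MQX it would have to be guarded by a mutex or similar, and the limit value is an assumption to be tuned.

```c
#include <assert.h>
#include <stddef.h>

/* Counting gate: admits at most `limit` requests at the same time.
 * NOTE: no locking here; a real handler must serialize access. */
struct request_gate {
    unsigned in_flight;      /* requests currently being serviced */
    unsigned limit;          /* maximum allowed at once */
    unsigned long rejected;  /* how many requests were turned away */
};

void gate_init(struct request_gate *g, unsigned limit)
{
    assert(g != NULL);
    g->in_flight = 0;
    g->limit = limit;
    g->rejected = 0;
}

/* Returns 1 if the request may proceed, 0 if the gate is full. */
int gate_enter(struct request_gate *g)
{
    assert(g != NULL);
    if (g->in_flight >= g->limit) {
        g->rejected++;
        return 0;
    }
    g->in_flight++;
    return 1;
}

/* Call once for every successful gate_enter(), on every exit path. */
void gate_leave(struct request_gate *g)
{
    assert(g != NULL);
    if (g->in_flight > 0)
        g->in_flight--;
}
```

A handler would call gate_enter() first and reply with an error (or simply return) when it fails, then call gate_leave() before every return. This bounds the work in progress but does not remove whatever defect causes the crash.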

I'm very interested in what you come up with... as I said at some point perhaps we could work on a FastCGI port together.

Tim562
Senior Contributor I

Hi Steve,

     Thanks for the idea. I will look at this problem from that angle and see if I can affect it. Right now I've just dialed back the CGI message frequency to no more than one every 100 ms and that seems to work. Not a very satisfying fader experience for the user, but it beats crashing the OS. I will post any insights or breakthroughs back to this thread. Thanks again for your reply.
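The pacing described here was done on the web page side. The same minimum-interval rule can also be enforced on the receiving side; the sketch below shows the idea in portable C. The names are hypothetical, and `now_ms` is assumed to come from whatever millisecond tick source the system provides. Unsigned subtraction keeps the comparison correct when the 32-bit counter wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum-interval throttle: accepts at most one event per interval. */
struct throttle {
    uint32_t min_interval_ms;  /* e.g. 100 for one message per 100 ms */
    uint32_t last_accept_ms;   /* tick value of the last accepted event */
    int      primed;           /* 0 until the first event is accepted */
};

void throttle_init(struct throttle *t, uint32_t min_interval_ms)
{
    assert(t != NULL);
    t->min_interval_ms = min_interval_ms;
    t->last_accept_ms = 0;
    t->primed = 0;
}

/* Returns 1 if an event at time now_ms should be processed, 0 if it
 * arrived too soon after the previous accepted one. */
int throttle_allow(struct throttle *t, uint32_t now_ms)
{
    assert(t != NULL);
    if (t->primed &&
        (uint32_t)(now_ms - t->last_accept_ms) < t->min_interval_ms)
        return 0;
    t->last_accept_ms = now_ms;
    t->primed = 1;
    return 1;
}
```

For a slider, dropping intermediate values is usually acceptable as long as the final position is eventually sent, which is something the sender has to guarantee; this sketch does not handle that.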

Best,

Tim

Tim562
Senior Contributor I

Hi All,

     I was able to observe a crash where the debugger continued to function and was able to view Task Summary and Stack Usage info. Looks like I have two tasks in distress (main and TCP/IP) with no stack overflows. I've tried increasing the stack size in both the RTCS init (_RTCSTASK_stacksize = 100000;) and HttpdServer init ( params.script_stack = 100000; ) but it seems to make no difference. A buffer overflow or a failure to release allocated memory would be my usual suspects, but that doesn't seem to be the case here.

~Tim

TASK SUMMARY

pastedImage_0.png

STACK USAGE

pastedImage_1.png

Luis_Garabo
NXP TechSupport
Hi Tim,

Try adding a breakpoint in the function _task_set_error(). Then take a screenshot of the task stack and share it here so we can see where in the code these invalid pointers are coming from. Maybe we can figure out how to solve that.

Regards,

Garabo
