
Bug:  Data "corruption" when using detached tasks for CGI or SSI

Question asked by Bowe Neuenschwander on May 15, 2015
Latest reply on Dec 30, 2015 by Bowe Neuenschwander

We are using MQX for Kinetis SDK 1.1.  We think we have found a bug in the CGI handling.  I have searched the forums a little, and haven’t really seen anything similar.  In looking at the code for MQX for Kinetis SDK 1.2, it looks like this bug has not been fixed.

 

We finally saw this bug when we went to upload a relatively large file (28 KB) to a CGI endpoint where the handling function was run in its own task (which I will refer to as the handling task, as distinct from the CGI handler task).  Essentially, the CGI handler was not waiting for the handling task to finish, and would strip the HTTP data out (assuming it was leftover data that the handling task didn’t process) before the handling task could receive it.  If the handling function was run in the CGI handler task, then it was a function call for the CGI handler instead of a task create, so the CGI handler would wait for the handling function to finish.  On a small file, the handling task could grab the data quickly before the CGI handler trashed it.  So we only saw this issue on (relatively) larger files/data and when the handling function was run in its own task.
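To make the ordering problem concrete, here is a toy model in plain C. It is only a sketch of the sequence described above, not RTCS code: toy_session, toy_read(), toy_cleanup() and the two toy_run_*() functions are all made-up names, and the "cleanup" step stands in for the CGI handler discarding whatever it assumes is leftover data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a session's receive data; NOT an RTCS structure. */
typedef struct {
    size_t pending;   /* bytes received but not yet read */
    size_t delivered; /* bytes the handling function actually got */
    size_t discarded; /* bytes thrown away by the cleanup step */
} toy_session;

/* Handling function reads up to `want` bytes from the session. */
size_t toy_read(toy_session *s, size_t want)
{
    assert(s != NULL);
    size_t n = (want < s->pending) ? want : s->pending;
    s->pending   -= n;
    s->delivered += n;
    return n;
}

/* Handler cleanup: treats whatever is still pending as leftover data. */
void toy_cleanup(toy_session *s)
{
    assert(s != NULL);
    s->discarded += s->pending;
    s->pending = 0;
}

/* Inline call: cleanup runs only after the handling function returns. */
toy_session toy_run_inline(size_t upload, size_t chunk)
{
    toy_session s = { upload, 0, 0 };
    while (toy_read(&s, chunk) > 0) { /* read until empty */ }
    toy_cleanup(&s);
    return s;
}

/* Detached task with no wait: cleanup runs after the task has only
   managed `reads_before_cleanup` reads. */
toy_session toy_run_detached(size_t upload, size_t chunk,
                             size_t reads_before_cleanup)
{
    toy_session s = { upload, 0, 0 };
    for (size_t i = 0; i < reads_before_cleanup; i++) {
        toy_read(&s, chunk);
    }
    toy_cleanup(&s);
    while (toy_read(&s, chunk) > 0) { /* nothing left to read */ }
    return s;
}
```

In this model a 28 KB upload read in 1 KB chunks loses everything that had not been read when the cleanup ran, while an upload small enough to fit in the first read loses nothing, which matches what we saw.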

 

We have found a solution that seems to work.  We created a function modeled after RTCS_task_create() that creates the task and waits for it to complete (unsurprisingly called RTCS_task_create_and_wait()).  Instead of using a semaphore and waiting for the child (the handling task) to unblock the parent (the CGI handler task), we had the parent watch for the child to terminate.  We could not find an MQX function to wait for a task to complete or terminate, so we mimicked that behavior by waiting for the child task descriptor to no longer exist or to list a different parent (in case a new task is spawned by a different parent task with the same task ID shortly after our child task terminates).  We chose this method over the semaphore method because we felt it was more robust to the child task crashing, which we were concerned would probably cause the CGI handler task to deadlock in the semaphore-based approach.
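The termination check described above boils down to a single predicate, sketched here in plain C against a made-up task table so it can be tried outside MQX. toy_td, toy_get_td() and toy_child_alive() are hypothetical names for illustration only; in the real function the lookup is _task_get_td() and the parent field is the PARENT member of the task descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a task descriptor; NOT the MQX TD_STRUCT. */
typedef struct {
    uint32_t id;      /* task ID, 0 means the slot is unused */
    uint32_t parent;  /* ID of the task that created this one */
} toy_td;

/* Look up a descriptor by task ID; returns NULL when no such task exists. */
const toy_td *toy_get_td(const toy_td *table, size_t count, uint32_t id)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].id != 0 && table[i].id == id) {
            return &table[i];
        }
    }
    return NULL;
}

/* True while a task with child_id exists AND still lists us as parent.
   False once it has terminated, or once the ID belongs to a task that
   some other parent created. */
bool toy_child_alive(const toy_td *table, size_t count,
                     uint32_t child_id, uint32_t my_id)
{
    const toy_td *td = toy_get_td(table, count, child_id);
    return td != NULL && td->parent == my_id;
}
```

The parent keeps yielding for as long as this predicate is true. Both exit conditions (descriptor gone, or descriptor owned by someone else) are treated as "my child has finished".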

 

The following function prototype was added to rtcs25x.h:

 

extern uint32_t RTCS_task_create_and_wait
(
   char          *name,
   uint32_t           priority,
   uint32_t           stacksize,
   void (_CODE_PTR_  start)(void *, void *),
   void             *arg
);

 

 

And the following definition was added to rtcstask.c:

 

/*FUNCTION*-------------------------------------------------------------
*
* Function Name   : RTCS_task_create_and_wait
* Returned Values : uint32_t (error code)
* Comments        :
*     Create a task and wait for it to finish running.
*
*END*-----------------------------------------------------------------*/

uint32_t RTCS_task_create_and_wait
   (
      char          *name,
      uint32_t           priority,
      uint32_t           stacksize,
      void (_CODE_PTR_  start)(void *, void *),
      void             *arg
   )
{ /* Body */
   TASK_TEMPLATE_STRUCT    task_template;
   struct rtcs_task_state  task;

#if (RTCSCFG_ENABLE_ASSERT_PRINT==1) ||  (RTCSCFG_ENABLE_ASSERT==1)
  /* for TCP/IP task we bypass this check as for this one */
  /* priority = _RTCSTASK_priority */
  if(TRUE == _RTCS_initialized)
  {
    RTCS_ASSERT(priority>_RTCSTASK_priority);
  }
#endif


   RTCS_sem_init(&task.sem);
   task.start = start;
   task.arg   = arg;
   task.error = RTCS_OK;

   _mem_zero((unsigned char *)&task_template, sizeof(task_template));
   task_template.TASK_NAME          = name;
   task_template.TASK_PRIORITY      = priority;
   task_template.TASK_STACKSIZE     = stacksize;
   task_template.TASK_ADDRESS       = RTCS_task;
   task_template.CREATION_PARAMETER = (uint32_t)&task;
   _task_id myTID = _task_get_id();
   _task_id childTID = _task_create(0, 0, (uint32_t)&task_template);
   if (childTID == MQX_NULL_TASK_ID) {
      RTCS_sem_destroy(&task.sem);
      return RTCSERR_CREATE_FAILED;
   } /* Endif */

   // Wait until it's out on its own
   RTCS_sem_wait(&task.sem);
   RTCS_sem_destroy(&task.sem);

   // Watch it die
   TD_STRUCT_PTR childTD = (TD_STRUCT_PTR) _task_get_td(childTID);
   while (childTD != NULL)
   {
      // Paternity Test:  If this task is no longer my child, abandon it
      if (myTID != childTD->PARENT)
      {
         break;
      }
      // Yield so the child can run (_sched_yield() only gives way to tasks at the same priority)
      _sched_yield();
      childTD = (TD_STRUCT_PTR) _task_get_td(childTID);
   }

   return task.error;

} /* Endbody */

 

 

Then RTCS_task_create() was changed to RTCS_task_create_and_wait() in the function httpsrv_detach_script() in httpsrv_script.c:

 

/*
** Detach script processing to separate task
**
** IN:
**      HTTPSRV_SESSION_STRUCT* session - session structure pointer.
**      HTTPSRV_STRUCT *server - pointer to server structure (needed for session parameters).
**
** OUT:
**      none
**
** Return Value: 
**      none
*/
void httpsrv_detach_script(HTTPSRV_DET_TASK_PARAM* task_params)
{
    _mqx_uint  error;
    _mqx_uint  priority;

    error = _task_get_priority(MQX_NULL_TASK_ID, &priority);
    if (error != MQX_OK)
    {
        return;
    }

    error = RTCS_task_create_and_wait(HTTPSRV_DETACHED_SCRIPT_TASK_NAME, priority, task_params->stack_size, httpsrv_detached_task, (void *)task_params);
    if (error != MQX_OK)
    {
        return;
    }
}

 

 

Like I said, this seems to work for us, but it is probably not the best or most elegant solution.  I am certainly open to suggestions for improvement.

 

-Bowe
