MQX FTP Lockup

kcameron
Contributor III

Hi there,

We have a design using a K60F120 and MQX 4.0.2 with an FTP server, a Telnet server and a periodic UDP broadcast. We're using the Atheros AR4100 Wi-Fi chip and its driver, which currently only supports MQX 4.0.2.2 (support for 4.1 has apparently been developed but is caught up with "marketing and lawyers").


Our FTP server uses an MFS mounted to an SD card to allow us to copy files off of the SD card from an application running on a host PC; the Telnet server implements a command interface for the application on the PC; and the UDP broadcast is used as a keep alive and to convey the IP address and SSID to the host application so it knows which devices are available on the network to connect to.


We've got everything running rather nicely except for when the Wi-Fi signal fades out. After some period of time running with a poor Wi-Fi signal, the UDP task (which opens and closes a socket every second to perform its broadcast) can no longer obtain a socket. Looking at the sockets that are currently allocated in this condition reveals that there are some sockets allocated to FTP and Telnet, but none free. If we don't limit the maximum number of TCP connections then we end up running out of memory in our application. Funnily enough, Telnet usually keeps on chugging along with no problems.


Given the information we've gathered so far, it looks like FTP is using up all of the available sockets and not releasing them. We've tried playing around with the TCP and FTP timeouts to no avail.


I've seen others in this forum complain about similar issues and the answer is usually to move to MQX 4.1, but unfortunately we are limited to MQX 4.0 due to our use of the AR4100 and its lack of support for 4.1.


Any suggestions?

Martin_
NXP Employee

The sockets might be waiting for ACKs. RTCS enforces a minimum of 100 seconds for this timeout before the connection is dropped. In our latest development version we have removed this minimum, so if you want sockets to wait for an acknowledgement for, say, at most 5 seconds, you can set the OPT_CONNECT_TIMEOUT socket option on all FTP sockets. The modification is in tcp.c, in TCP_Process_open(): remove the three source lines shown below:

      if ( tcb->sndto_2 < TCP_SENDTIMEOUT_MIN ) {
         tcb->sndto_2 = TCP_SENDTIMEOUT_MIN;
      } /* Endif */

Then rebuild the RTCS library and use setsockopt() to set OPT_CONNECT_TIMEOUT on your FTP listening socket to, for example, 5 seconds. Your Telnet socket can still be configured for, say, 180 seconds.

The result of this change is that, for an FTP connection, if an acknowledgement is not received from the client within OPT_CONNECT_TIMEOUT, RTCS closes the socket.

How long to wait for the link to be re-established before giving up and dropping the socket depends on the application.
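To make the effect of that clamp concrete, here is a minimal stand-alone sketch. It does not use RTCS at all: the function name, the flag and the 100-second constant are assumptions for illustration (the constant is taken from the description above, and the real TCP_SENDTIMEOUT_MIN may differ).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the stack's lower bound, in milliseconds.
   The 100 s figure comes from the post above; the real constant may differ. */
#define SKETCH_SENDTIMEOUT_MIN_MS 100000u

/* Returns the timeout that would actually apply to a connection.
   clamp_enabled = 1 models the unmodified stack,
   clamp_enabled = 0 models the stack with the three lines removed. */
uint32_t sketch_effective_timeout(uint32_t requested_ms, int clamp_enabled)
{
    if (clamp_enabled && requested_ms < SKETCH_SENDTIMEOUT_MIN_MS) {
        return SKETCH_SENDTIMEOUT_MIN_MS;   /* short request is raised to the minimum */
    }
    return requested_ms;                    /* request is used as given */
}
```

With the clamp in place, a 5-second request silently becomes 100 seconds, which would explain why shortening the timeout alone appears to have no effect until the library is rebuilt without the check.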

-Martin

kcameron
Contributor III

Hi Martin,

Thanks for your quick response!

I've tried implementing your suggested changes and it does not seem to have affected the problem.

I've included my socket summary below for both the lockup condition and under normal operation (with FTP and Telnet both connected).

Lockup condition:

[Image: Socket Summary.jpg]

Normal operation:

[Image: Socket Summary good.jpg]

When I see the lockup on FTP and UDP there seem to be a few sockets in the "Ground" state that are not present normally. I have the maximum number of TCP connections limited to 7 for those pictures, and if I increase that number I get more "Ground" sockets under the lockup condition.

Martin_
NXP Employee

Kyle, did you check how the listening socket on port 3015 is created? In the normal case you show just the FTP and Telnet ports open. What is the third socket, listening on port 3015, in the lockup situation?

kcameron
Contributor III

Hi Martin,

I'm not sure what that is and I haven't been able to find out. We're only using FTP, Telnet and UDP. The first two are the only ones that should be creating listening sockets...

kcameron
Contributor III

Hi Martin,

I think I have found where the extra listen socket comes from - it's the FTP server creating a passive data socket in the FTPd_open_passive_data_connection() function in ftpd_cmd.c. When it does this it creates a listen socket, and then calls accept() on that.

Noticing that accept() gets called from there I thought maybe there might be an orphaned "Ground" socket being created there sometimes, too. To test this I copied 85 files of varying sizes on and off of the server - sure enough this seems to be creating problems, too. After a little while and when the wireless signal fades out I can cause a build up of "Ground" sockets this way on top of the situation I described above when I rapidly connect/disconnect a client and fade the Wi-Fi signal.

Once you've answered my pending questions above I'd be very appreciative if you could give this some thought.

Martin_
NXP Employee

Hi Kyle, I looked into the FTPd source code, and it looks to me as though the data socket created by FTPd_open_passive_data_connection() is shut down only when the client issues another command, such as "list". If you have a sequence of "pasv" followed by "list", all is fine. But what happens to the socket if "pasv" creates it and then the link goes down, so that no "list" ever arrives? The server should drop idle/inactive connections after some timeout, because I guess that by default inactive TCP connections stay alive forever.

So perhaps an application-level timeout should be added for this socket: if the client does not issue any data command after it has issued "pasv", the application should shut down the data socket.
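One way to picture such a timeout is a timestamp stored when "pasv" opens the socket and a check run periodically by the session. This is only a sketch: the struct, its fields and the function are hypothetical and are not part of FTPd_SESSION_STRUCT; the caller would still have to call shutdown() on the passive socket when the check fires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-session bookkeeping for the passive data socket. */
typedef struct {
    int      pasv_open;       /* nonzero while the passive listening socket exists */
    uint32_t pasv_opened_ms;  /* millisecond tick at which "pasv" created it */
} sketch_session;

/* Returns nonzero when the passive socket has been idle for limit_ms or longer.
   Unsigned subtraction keeps the result correct across tick counter wrap-around. */
int sketch_pasv_expired(const sketch_session *s, uint32_t now_ms, uint32_t limit_ms)
{
    if (!s->pasv_open) {
        return 0;             /* nothing to expire */
    }
    return (uint32_t)(now_ms - s->pasv_opened_ms) >= limit_ms;
}
```

The limit itself is an application choice, in the same way as the connection timeout discussed earlier in the thread.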

Also, to get rid of the ground sockets, use the approach with RTCS_selectset(). The function that creates the ground TCB is accept(). So if you do listen() followed by RTCS_selectset() followed by accept(), you will have no ground TCBs, because RTCS_selectset() would return the socket only after a connection is pending. Do this for all listening sockets in the FTPd server.
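The listen/select/accept ordering can be sketched without the real stack by passing the select and accept steps in as function pointers. Everything named sketch_ here is hypothetical; on the target the two callbacks would wrap RTCS_selectset() and accept(), and the fakes below exist only so the control flow can be exercised on a PC.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*sketch_select_fn)(uint32_t listen_sock, uint32_t timeout_ms);
typedef uint32_t (*sketch_accept_fn)(uint32_t listen_sock);

/* Polls the listening socket at most max_polls times and calls accept only
   once select reports a pending connection. Returns the accepted handle,
   or 0 if nothing arrived, in which case accept was never called. */
uint32_t sketch_accept_with_timeout(uint32_t listen_sock, uint32_t max_polls,
                                    uint32_t poll_ms,
                                    sketch_select_fn select_fn,
                                    sketch_accept_fn accept_fn)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (select_fn(listen_sock, poll_ms) != 0) {
            return accept_fn(listen_sock);
        }
    }
    return 0;
}

/* Test doubles: the fake select reports activity after a set number of polls. */
uint32_t sketch_polls_until_ready = 0;
uint32_t sketch_accept_calls = 0;

uint32_t sketch_fake_select(uint32_t sock, uint32_t timeout_ms)
{
    (void)timeout_ms;
    if (sketch_polls_until_ready == 0) {
        return sock;              /* connection pending */
    }
    sketch_polls_until_ready--;
    return 0;                     /* timed out, no activity */
}

uint32_t sketch_fake_accept(uint32_t sock)
{
    sketch_accept_calls++;
    return sock + 1;              /* pretend handle for the new connection */
}
```

Because accept is reached only after select succeeds, a faded link ends the wait with a 0 return instead of a blocked task and a half-built TCB.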

kcameron
Contributor III

Hi Martin,

Thanks for your reply!

The timeout suggestion for the listening socket is a good one - I'll implement that.

As for your suggestion to get rid of the "Ground" sockets, unfortunately RTCS_selectset() doesn't work in this case. The problem is that once any command following a PASV command is received it is expected that the server will initiate opening the data socket and start sending data. If I call RTCS_selectset() I never get any activity (data or connection requests) on the listening socket created in FTPd_open_passive_data_connection() from the client side, so the FTP server fails to open the data socket and gives up.

I've also found another problem that is much more severe:

If I fade out Wi-Fi after a command such as LIST has been received but just before accept() is called, then accept() never returns and my FTP server locks up completely - if I put breakpoints in FTPd_task() or FTPd_child(), they are never hit, so the code is not running at all.

To fix all of the problems I am now seeing, I need to find some method, other than RTCS_selectset(), to cause accept() to return. I see two ways to trigger this - I can either detect from the Wi-Fi driver that the link has been broken, or I can add a timeout. The problem is that I'm having a hard time figuring out where in the TCP stack I can actually cause accept() to return. Any suggestions?

kcameron
Contributor III

Hi Martin,

Never mind about RTCS_selectset() not working. It turns out I just needed to set up a for loop and call it repeatedly for a while in order to detect activity on the data socket, as follows. This fixes my problem where the FTP server locks up AND where I get a build-up of "Ground" sockets while copying a large number of files and fading out the Wi-Fi signal.

// Poll the listening socket for up to 50 iterations. Each iteration waits up to 100 ms in
// RTCS_selectset() and then sleeps another 100 ms, so the loop gives up after roughly 10 seconds.
for(uint_32 i = 0; i < 50; i++)
{
    // This checks for activity on listening socket and returns 0 (no activity) or session_ptr->PASV_SOCK (activity)
    temp_sock = RTCS_selectset(&session_ptr->PASV_SOCK, 1, 100);

    if(temp_sock != 0)
    {
        // Got some activity, so accept the new data connection and break
        new_sock = accept(temp_sock, (sockaddr*) &remote_addr, &length);
        break;
    }

    _time_delay(100);
}

I've also modified the code in FTPd_task() as follows. This fixes the first problem I was seeing, where the build-up of "Ground" sockets happens when I connect/disconnect from the FTP server rapidly and fade the Wi-Fi out. The changes from my last post of this code are (line numbers refer to the original listing, which is shown here without them):

  1. Line 129 (the if(sock != 0) test after RTCS_selectset()): I've added a check to see if RTCS_selectset() returned 0 or the socket handle - this prevents accept() from being called unnecessarily.
  2. Line 231 (the shutdown() in the final else branch): I've added to the else statement a shutdown() of the childsock if accept() fails. This appears to be all I need in order to prevent "Ground" sockets from building up.

/*TASK*-----------------------------------------------------------------
*
* Function Name    : FTPd_task
* Returned Value   : none
* Comments  :  FTP server.
*
*END*-----------------------------------------------------------------*/
void FTPd_task(pointer init_ptr, pointer creator)
{ /* Body */
   FTPd_CONTEXT               ftpd_context =  { 0 };
   sockaddr_in                laddr;
   uint_32                    sock, childsock, listensock;
   uint_32                    error=RTCS_OK;
   uint_16                    remote_addr_len;
   sockaddr_in                remote_addr = {0};
#if FTPDCFG_ENABLE_MULTIPLE_CLIENTS
   TASK_TEMPLATE_STRUCT_PTR   t_ptr;
#endif
   FTPd_SESSION_PTR           session_ptr;
   uint_32                    option;
   boolean                    dev_in_path = FALSE;
   int_16                     devlen = 0, rootdirlen = 0;
   uint_32                    parameter;
   FTPd_task_id = RTCS_task_getid();

#ifdef __MQX__
   /* Set up exit handler and context so that we can clean up if the FTP Server is terminated */
   _task_set_environment( _task_get_id(), (pointer) &ftpd_context );
   _task_set_exit_handler( _task_get_id(), FTPd_Exit_handler );
#endif
   laddr.sin_family      = AF_INET;
   laddr.sin_port        = IPPORT_FTP;
   laddr.sin_addr.s_addr = INADDR_ANY;
   // Create new listening socket
   ftpd_context.LISTENSOCK = socket(PF_INET, SOCK_STREAM, 0);
   listensock = ftpd_context.LISTENSOCK;
   if (listensock == RTCS_SOCKET_ERROR)
   {
      error = RTCSERR_OUT_OF_SOCKETS;
   }
   // When the socket is bound, RTCS allocates a send buffer of the specified number of bytes, which controls how much sent data RTCS
   // can buffer for the socket.
   // Recommended to be a multiple of the maximum segment size (536), where the multiple is at least three.
   if (!error)
   {
      option = FTPDCFG_BUFFER_SIZE;
      error = setsockopt(listensock, SOL_TCP, OPT_TBSIZE, &option, sizeof(option));
   }
   // When the socket is bound, RTCS allocates a receive buffer of the specified number of bytes, which controls
   // how much received data RTCS can buffer for the socket.
   // Recommended to be a multiple of the maximum segment size (536), where the multiple is at least three.
   if (!error)
   {
      option = FTPDCFG_BUFFER_SIZE;
      error = setsockopt(listensock, SOL_TCP, OPT_RBSIZE, &option, sizeof(option));
   }
   // Zero: RTCS waits indefinitely for outgoing data during a call to send().
   // Non-zero: RTCS waits for this number of milliseconds for outgoing data during a call to send().
   if (!error)
   {
      option = FTPDCFG_SEND_TIMEOUT;
      error = setsockopt(listensock, SOL_TCP, OPT_SEND_TIMEOUT, &option, sizeof(option));
   }
   // RTCS maintains the connection for this number of milliseconds. Must be minimum of 180,000
   if (!error)
   {
      option = FTPDCFG_CONNECT_TIMEOUT;
      error = setsockopt(listensock, SOL_TCP, OPT_CONNECT_TIMEOUT, &option, sizeof(option));
   }
   // Two times the maximum segment lifetime (which is a constant). Returned information is for the last
   // frame that the socket received.
   if (!error)
   {
      option = FTPDCFG_TIMEWAIT_TIMEOUT;
      error = setsockopt(listensock, SOL_TCP, OPT_TIMEWAIT_TIMEOUT, &option, sizeof(option));
   }

   // fixme cwp - added to see if we can fix ftp lockup on loss of wifi
   if (!error)
   {
       option = FTPDCFG_RECEIVE_TIMEOUT;  // Set this value to what you need (360000 for 6 minutes, for me)
       error = setsockopt(listensock, SOL_TCP, OPT_RECEIVE_TIMEOUT, &option, sizeof(option));
   }

   if (!error)
   {
       option = FTPDCFG_NO_NAGLE_ALGORITHM;
       error = setsockopt(listensock, SOL_TCP, OPT_NO_NAGLE_ALGORITHM, &option, sizeof(option));
   }
   // Bind the new socket to local address
   if (!error)
   {
      error = bind(listensock, (const sockaddr *)&laddr, sizeof(laddr));
   }
   // Start listening for incoming connection
   if (!error)
   {
      error = listen(listensock, 0);
   }
   // Stop here if something went wrong setting up the listening socket (this shouldn't happen normally)
   if (error)
   {
      RTCS_task_exit(creator, error);
   }

   // Resume the creator task (in this case it's falcon.c calling Falcon_initialize_networking(), which starts FTP and Telnet).
   RTCS_task_resume_creator(creator, RTCS_OK);
   for (;;)
   {
      remote_addr_len = sizeof(remote_addr);

      // Check to see if there is any activity on the listening socket. If there is, listensock will be returned, else 0
      sock = RTCS_selectset(&listensock, 1, FTPDCFG_CONNECT_TIMEOUT);

      if(sock != 0)
      {
          // There was activity on the listening socket, so accept the control socket connection
          childsock = accept(sock,(sockaddr *)&remote_addr, &remote_addr_len);

          if ((childsock != 0) && (childsock!=RTCS_SOCKET_ERROR))
          {
              // The connection was accepted successfully, so configure the FTP server
              session_ptr = (FTPd_SESSION_PTR) RTCS_mem_alloc_zero(sizeof (FTPd_SESSION_STRUCT));

              if ( session_ptr )
              {
                  _mem_set_type(session_ptr, MEM_TYPE_FTPd_SESSION_PTR);

                  session_ptr->DATA_BUFFER_SIZE = FTPDCFG_BUFFER_SIZE; // KDC: Changed this
                  session_ptr->DATA_BUFFER_PTR = RTCS_mem_alloc_zero(session_ptr->DATA_BUFFER_SIZE);

                  if (session_ptr->DATA_BUFFER_PTR == NULL)
                  {
                      _mem_free(session_ptr);
                      session_ptr = NULL;
                  }
                  else
                  {
                      _mem_set_type(session_ptr->DATA_BUFFER_PTR, MEM_TYPE_FTPd_DATA_BUFFER);
                  }
              }

              if (session_ptr == NULL)
              {
                  shutdown((uint_32)childsock, FTPDCFG_SHUTDOWN_OPTION);
              }
              else
              {
                  session_ptr->CONTROL_SOCK = (uint_32) childsock;
                  session_ptr->CONNECTED = TRUE;
                  /* set default data ports */
                  session_ptr->SERVER_DATA_SOCKADDR.sin_family      = AF_INET;
                  session_ptr->SERVER_DATA_SOCKADDR.sin_port        = IPPORT_FTPDATA;
                  session_ptr->SERVER_DATA_SOCKADDR.sin_addr.s_addr = INADDR_ANY;

                  session_ptr->USER_DATA_SOCKADDR.sin_family      = remote_addr.sin_family;
                  session_ptr->USER_DATA_SOCKADDR.sin_port        = remote_addr.sin_port;
                  session_ptr->USER_DATA_SOCKADDR.sin_addr.s_addr = remote_addr.sin_addr.s_addr;

#if FTPDCFG_USES_MFS
                  // initialize current directory and current filesystem
                  devlen = _io_get_dev_for_path(session_ptr->CURRENT_FS_NAME, &dev_in_path, FTPD_DEVLEN,(char *)FTPd_rootdir, NULL);

                  session_ptr->CURRENT_FS = _io_get_fs_by_name(session_ptr->CURRENT_FS_NAME);

                  error = ioctl(session_ptr->CURRENT_FS, IO_IOCTL_CHECK_DIR_EXIST,(pointer)FTPd_rootdir );

                  if (error)
                  {
#endif
                      session_ptr->CURRENT_FS = NULL;
                      session_ptr->CURRENT_FTP_DIR = NULL;
                      session_ptr->CURRENT_FS_DIR[0] = '\0';
#if FTPDCFG_USES_MFS
                  }
                  else
                  {
                      // set current fs dir (including root dir)
                      strcpy(session_ptr->CURRENT_FS_DIR,FTPd_rootdir+devlen);

                      rootdirlen = strlen(session_ptr->CURRENT_FS_DIR);

                      // set current FTP dir
                      session_ptr->CURRENT_FTP_DIR = session_ptr->CURRENT_FS_DIR + rootdirlen - 1;

                      // check if there is / at the end of root dir name
                      if(*(session_ptr->CURRENT_FTP_DIR) != '\\' && *(session_ptr->CURRENT_FTP_DIR) != '/')
                      {
                          session_ptr->CURRENT_FTP_DIR++;
                      }

                      session_ptr->CURRENT_FTP_DIR[0] = '\\';
                      session_ptr->CURRENT_FTP_DIR[1] = '\0';
                  }
#endif
#if FTPDCFG_ENABLE_MULTIPLE_CLIENTS
                  /* Create a task to look after it */
                  RTCS_detachsock(childsock);
                  t_ptr = _task_get_template_ptr(MQX_NULL_TASK_ID);

                  if (RTCS_task_create("FTPd_child", t_ptr->TASK_PRIORITY, t_ptr->TASK_STACKSIZE, FTPd_child, (pointer) session_ptr) != RTCS_OK )
                  {
                      RTCS_attachsock(childsock);
                      shutdown(childsock, FLAG_ABORT_CONNECTION);
                      _mem_free(session_ptr->DATA_BUFFER_PTR);
                      _mem_free(session_ptr);
                  }
#else
                  // The FTP server has been set up, so call FTPd_child(), which will start responding to commands from a client
                  FTPd_child((pointer) session_ptr,0);
#endif
              }
          }
          else
          {
              // Accepting the socket failed, so call shutdown to free any memory that was allocated for it.
              shutdown((uint_32)childsock, FLAG_ABORT_CONNECTION);
          }
      }
   }
}

Martin_
NXP Employee

Execution reaches line 231 (the shutdown() call in the final else branch) only if childsock is either zero or RTCS_SOCKET_ERROR, because it is the else for the if on line 134. So the code on line 231 effectively does nothing, because inside shutdown() there is an error check that returns if the given socket is zero or RTCS_SOCKET_ERROR.

Perhaps you can remove line 231, and hopefully everything still works well.
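The reasoning can be illustrated with a toy model of such a guard. This is not the RTCS implementation: the function, the counter and the error constant are hypothetical stand-ins that only mimic the "return early on an invalid handle" behaviour described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for RTCS_SOCKET_ERROR. */
#define SKETCH_SOCKET_ERROR 0xFFFFFFFFu

/* Counts how many sockets the toy shutdown actually released. */
uint32_t sketch_release_count = 0;

/* Toy shutdown: rejects invalid handles before doing any work. */
int sketch_shutdown(uint32_t sock)
{
    if (sock == 0 || sock == SKETCH_SOCKET_ERROR) {
        return -1;                /* invalid handle, nothing is released */
    }
    sketch_release_count++;       /* a real stack would free the socket here */
    return 0;
}
```

Since the else branch is only reached with one of the two invalid values, the call in that branch can never release anything.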

-Martin

kcameron
Contributor III

Hi Martin,

You are correct - that else statement is useless. Thanks for pointing it out!

kcameron
Contributor III

We've made a little more progress by adding some diagnostics to the TCP code. When we iterate through the list of active TCBs, we find that a number of blocks appear to be stuck in the "CLOSE_WAIT" state under the failure condition.
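For anyone wanting to reproduce this kind of diagnostic, the idea is just a walk over a linked list with a per-state count. The sketch below uses a made-up node type and state enum, since the real RTCS TCB structure and its field names are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TCP states and TCB node; the real RTCS types differ. */
typedef enum {
    SKETCH_LISTEN,
    SKETCH_ESTABLISHED,
    SKETCH_CLOSE_WAIT,
    SKETCH_GROUND
} sketch_tcp_state;

typedef struct sketch_tcb {
    sketch_tcp_state   state;
    struct sketch_tcb *next;
} sketch_tcb;

/* Walks the list from head and counts the blocks in the wanted state. */
size_t sketch_count_state(const sketch_tcb *head, sketch_tcp_state wanted)
{
    size_t count = 0;
    for (const sketch_tcb *t = head; t != NULL; t = t->next) {
        if (t->state == wanted) {
            count++;
        }
    }
    return count;
}
```

Logging the CLOSE_WAIT and "Ground" counts once per second while fading the signal would show whether they grow with each reconnect attempt.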

Martin_
NXP Employee

So, TCBs in the CLOSE_WAIT state might mean that your application does not call shutdown() on a connected socket in response to a received FIN.

Maybe you could add a debug condition to see if this is happening. For example, add a counter to the SOCK_STREAM_shutdown() function that increments with each shutdown request. Then add another counter to count incoming FINs in TCP4_Service_packet, something like: if(flags & FIN) {g_fincnt++;}

Then, when the lockup situation occurs, check whether these two counters match. If they do not match, it would mean the application does not call shutdown() when the remote peer sends a FIN.
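A stand-alone version of the two counters might look like the sketch below. The hook functions are hypothetical; in the real stack the increments would be placed inside SOCK_STREAM_shutdown() and the TCP receive path. Only the FIN bit value (0x01 in the TCP header flags) is standard. Note that the comparison is a heuristic, because shutdown() can also be called for connections the local side closes first.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TCP_FLAG_FIN 0x01u   /* FIN is the lowest bit of the TCP flags */

uint32_t g_fincnt = 0;              /* FINs seen from remote peers */
uint32_t g_shutdowncnt = 0;         /* shutdown requests made by the application */

/* Hypothetical hook: call for every received TCP segment. */
void sketch_on_segment(uint8_t flags)
{
    if (flags & SKETCH_TCP_FLAG_FIN) {
        g_fincnt++;
    }
}

/* Hypothetical hook: call at the start of the shutdown routine. */
void sketch_on_shutdown(void)
{
    g_shutdowncnt++;
}

/* Positive result: more FINs received than shutdowns issued. */
int32_t sketch_unanswered_fins(void)
{
    return (int32_t)(g_fincnt - g_shutdowncnt);
}
```

A value that keeps climbing during the lockup would point at sockets left in CLOSE_WAIT; a FIN counter that never moves suggests the hook is in a code path that is not being executed.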

kcameron
Contributor III

Hi Martin,

I think you're probably right that shutdown() is not called. When the Wi-Fi connection fades out, though, we will likely never receive a FIN.

I've implemented counters in both SOCK_STREAM_shutdown() and TCP4_Service_packet() and monitored their values. The counter in TCP4_Service_packet() is always zero, which I think means this function is never called. We definitely are using IPv4.

I've snooped around some more and I've got a theory about what's going on:

In ftpd.c, in the FTPd_task() function, a listening socket is created, set up and bound to listen for FTP connection requests. The code then enters an infinite loop that calls accept() to create a new socket for each FTP connection a client requests. Drilling down into what happens when accept() is called reveals that the TCB data structures are created, but never bound to an endpoint. This is where the "Ground" state comes from - the "Ground" state appears to be the guts of a socket that is set up and waiting to be connected to an endpoint.

Now, after looking around in these forums, I've found the following post:

Re: FTP socket problem

This person appears to have had a similar problem. They found that these sockets in the "Ground" state are created by the accept() function. If FTPd_stop() is called before accept() returns, the listening socket is released, but the "Ground" socket, hanging in limbo waiting for a connection, is not.

Could this be what is happening when the Wi-Fi signal fades out? An FTP connection with a client is started, but then part of the way through (before accept() returns) the connection fades out, which causes a timeout that closes the listening socket and not the "Ground" socket? How could I fix this? I've tried calling TCB_Close_TCB() followed by TCP_Process_release() on the sockets stuck in CLOSE_WAIT but it doesn't seem to have any effect.

Martin_
NXP Employee

When a connection timeout occurs due to a lost link, the TCP/IP task will close the TCB for that connection, and the socket layer will receive an error. So a recv() or send() will get an error. Now, perhaps your FTP client tries to open a new connection, so it sends a new SYN? Is the FTP client configured to reconnect automatically, or something like that? accept() does create the ground socket, that is right, but it creates just one per connection. So if you have more ground sockets, it might mean the FTP client tried to open more connections while the link was bad.

By the way, the only place in the code where a TCB goes into CLOSE_WAIT state is in the TCP_Process() function, when TCB is in SYN_RECEIVED or ESTABLISHED state and FIN is received (if ( (flags & FIN) != 0 ) {). So I guess the TCBs in CLOSE_WAIT state have received FIN.
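The transitions involved can be written down as a small state function. This is a simplified sketch based on the standard TCP state machine (RFC 793), not on the RTCS sources; the enum and function names are invented for illustration, and states outside the close path discussed here are left unchanged.

```c
#include <assert.h>

typedef enum {
    TCPS_SYN_RECEIVED,
    TCPS_ESTABLISHED,
    TCPS_CLOSE_WAIT,
    TCPS_LAST_ACK,
    TCPS_FIN_WAIT_1
} tcp_state_t;

/* The peer's FIN arrives. */
static tcp_state_t on_fin_received(tcp_state_t s)
{
    switch (s) {
    case TCPS_SYN_RECEIVED:
    case TCPS_ESTABLISHED:
        return TCPS_CLOSE_WAIT; /* peer is done sending; our side is still open */
    default:
        return s;               /* not modelled in this sketch */
    }
}

/* The local application closes the connection (shutdown() in RTCS terms). */
static tcp_state_t on_app_close(tcp_state_t s)
{
    switch (s) {
    case TCPS_CLOSE_WAIT:
        return TCPS_LAST_ACK;   /* our FIN goes out; only the final ACK is missing */
    case TCPS_SYN_RECEIVED:
    case TCPS_ESTABLISHED:
        return TCPS_FIN_WAIT_1; /* we close first */
    default:
        return s;
    }
}

/* A TCB sitting in CLOSE_WAIT is waiting for the application, not for the peer. */
static int waiting_for_application(tcp_state_t s)
{
    assert(s <= TCPS_FIN_WAIT_1);
    return s == TCPS_CLOSE_WAIT;
}
```

In this model nothing the peer sends moves a TCB out of CLOSE_WAIT; only the local close does, which is why a growing number of CLOSE_WAIT blocks points at the application side.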

When you mention FTPd_stop(), do you call this function? If so, please do not use it. The MQX function _task_abort() is very dangerous, because it sets the program counter to the exit handler without any synchronization with the task being aborted. If the task calls MQX or RTCS API functions, you never know what it is executing when _task_abort() is called. For example, it may hold a locked semaphore, or have sent a message to the TCP/IP task and be waiting for the response. If you then abort the task, the semaphore or the TCP/IP task doesn't know about it, and everything goes wrong from then on.

In our latest applications, we never use _task_abort(). When we need to stop a service like FTP, we use a synchronization mechanism to let the task destroy itself gracefully (by returning from the task body). You can look at FTPSRV or HTTPSRV in our latest MQX - we shutdown() the listening socket. A select() call waits for activity on the listening socket; when the socket is closed, select() returns an error and we gracefully free all resources, post semaphores, destroy all child tasks, and so on.

So, you should not use FTPd_stop(). If you need to release the FTP server, I suggest this approach: just call shutdown() on the listening socket, and change the for(;;) loop in FTPd_task() to handle this request.

Now, please be aware there is one more bug: when you shutdown() the listening socket, the accept() function will not return. We have fixed this bug for the next MQX release. For you, there is a workaround that works: call the RTCS_selectset() function before accept(), like this:

my_sock_set[0] = listensock;

sock = RTCS_selectset(...my_sock_set....);

childsock = accept(sock, ....);

If the socket has been shut down, the accept() after RTCS_selectset() will return RTCS_SOCKET_ERROR. Then, in FTPd_task(), release all memory the task has allocated, shut down all sockets it has created, and return from the task body.
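The overall shape of such a loop can be tried out on a host with a small simulation. Everything here is a stand-in: `sim_server_t`, `sim_wait_for_client()` and `SIM_SOCKET_ERROR` are hypothetical names that only mimic the "select, then accept" step and RTCS_SOCKET_ERROR, so this is a sketch of the control flow rather than RTCS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_SOCKET_ERROR 0xFFFFFFFFu /* stand-in for RTCS_SOCKET_ERROR */

typedef struct {
    bool     listener_open;   /* false once the listening socket is shut down */
    uint32_t pending_clients; /* simulated connection requests */
    uint32_t buffers_in_use;  /* resources the task must release before exiting */
} sim_server_t;

/* Stand-in for "RTCS_selectset() followed by accept()": returns a child
 * handle, or SIM_SOCKET_ERROR once the listening socket has been shut down.
 * Running out of simulated clients plays the role of another task calling
 * shutdown() on the listening socket. */
static uint32_t sim_wait_for_client(sim_server_t *srv)
{
    if (srv->pending_clients == 0u) {
        srv->listener_open = false;
    }
    if (!srv->listener_open) {
        return SIM_SOCKET_ERROR;
    }
    srv->pending_clients--;
    return 1000u + srv->pending_clients; /* arbitrary non-zero handle */
}

/* Task body: serve clients until the listener is shut down, then clean up
 * and return instead of being aborted from outside. */
static uint32_t sim_ftpd_task(sim_server_t *srv)
{
    uint32_t served = 0u;

    assert(srv != NULL);
    srv->buffers_in_use = 1u; /* e.g. a task-level allocation */

    for (;;) {
        uint32_t child = sim_wait_for_client(srv);
        if (child == SIM_SOCKET_ERROR) {
            break; /* stop request: leave the loop */
        }
        served++;  /* a real server would run the session and shut the child socket down */
    }

    srv->buffers_in_use = 0u; /* release everything the task allocated */
    return served;            /* returning ends the task gracefully */
}
```

The point of the structure is that every exit goes through the cleanup at the bottom of the task, so nothing is left allocated when the server is released.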

These issues are one of the reasons we migrated FTP to the new FTPSRV application. The new FTPSRV solves all of these FTP service release issues, supports IPv6, and allows multiple file transfers in parallel.

kcameron
Contributor III

Hi Martin,

I've implemented the fix you suggested (see below on line 126). However, I'm not really sure what to do in the else statement on lines 224-229. If I call shutdown() on listensock and childsock and then exit the task, my FTP server goes down and stays down.

I'd like to be able to recover more gracefully so that I can still connect a client again - can you please explain how I would do that?

/*TASK*-----------------------------------------------------------------

*

* Function Name    : FTPd_task

* Returned Value   : none

* Comments  :  FTP server.

*

*END*-----------------------------------------------------------------*/

void FTPd_task(pointer init_ptr, pointer creator)

{ /* Body */

   FTPd_CONTEXT               ftpd_context =  { 0 };

   sockaddr_in                laddr;

   uint_32                    sock, childsock, listensock;

   uint_32                    error=RTCS_OK;

   uint_16                    remote_addr_len;

   sockaddr_in                remote_addr = {0};

#if FTPDCFG_ENABLE_MULTIPLE_CLIENTS  

   TASK_TEMPLATE_STRUCT_PTR   t_ptr;

#endif  

   FTPd_SESSION_PTR           session_ptr;

   uint_32                    option;

   boolean                    dev_in_path = FALSE;

   int_16                     devlen = 0, rootdirlen = 0;

   FTPd_task_id = RTCS_task_getid();

  

#ifdef __MQX__

   /* Set up exit handler and context so that we can clean up if the FTP Server is terminated */

   _task_set_environment( _task_get_id(), (pointer) &ftpd_context );

   _task_set_exit_handler( _task_get_id(), FTPd_Exit_handler );

#endif

   laddr.sin_family      = AF_INET;

   laddr.sin_port        = IPPORT_FTP;

   laddr.sin_addr.s_addr = INADDR_ANY;

   /* Listen on TCP port */

   ftpd_context.LISTENSOCK = socket(PF_INET, SOCK_STREAM, 0);

   listensock = ftpd_context.LISTENSOCK;

  

   if (listensock == RTCS_SOCKET_ERROR)

   {

      error = RTCSERR_OUT_OF_SOCKETS;

   }

  

   // When the socket is bound, RTCS allocates a send buffer of the specified number of bytes, which controls how much sent data RTCS

   // can buffer for the socket.

   //Recommended to be a multiple of the maximum segment size (536), where the multiple is at least three.

   if (!error)

   {

      option = FTPDCFG_BUFFER_SIZE;  

      error = setsockopt(listensock, SOL_TCP, OPT_TBSIZE, &option, sizeof(option));

   }

  

   // When the socket is bound, RTCS allocates a receive buffer of the specified number of bytes, which controls

   // how much received data RTCS can buffer for the socket.

   //Recommended to be a multiple of the maximum segment size (536), where the multiple is at least three.

   if (!error)

   {

      option = FTPDCFG_BUFFER_SIZE;  

      error = setsockopt(listensock, SOL_TCP, OPT_RBSIZE, &option, sizeof(option));

   }     

   // Zero: RTCS waits indefinitely for outgoing data during a call to send().

   // Non-zero: RTCS waits for this number of milliseconds for outgoing data during a call to send().

   if (!error)

   {

      option = FTPDCFG_SEND_TIMEOUT;  

      error = setsockopt(listensock, SOL_TCP, OPT_SEND_TIMEOUT, &option, sizeof(option));

   }  

   // RTCS maintains the connection for this number of milliseconds. Must be minimum of 180,000

   if (!error)

   {

      option = FTPDCFG_CONNECT_TIMEOUT;  

      error = setsockopt(listensock, SOL_TCP, OPT_CONNECT_TIMEOUT, &option, sizeof(option));

   }  

   // Two times the maximum segment lifetime (which is a constant). Returned information is for the last

   // frame that the socket received.

   if (!error)

   {

      option = FTPDCFG_TIMEWAIT_TIMEOUT;  

      error = setsockopt(listensock, SOL_TCP, OPT_TIMEWAIT_TIMEOUT, &option, sizeof(option));

   } 

  

   // fixme cwp - added to see if we can fix ftp lockup on loss of wifi

   if (!error)

   {

       option = FTPDCFG_RECEIVE_TIMEOUT;  // Set this value to what you need (360000 for 6 minute, for me)

       error = setsockopt(listensock, SOL_TCP, OPT_RECEIVE_TIMEOUT, &option, sizeof(option));

   }

  

   if (!error)

   {

       option = FTPDCFG_NO_NAGLE_ALGORITHM;  

       error = setsockopt(listensock, SOL_TCP, OPT_NO_NAGLE_ALGORITHM, &option, sizeof(option));

   }

   if (!error)

   {

      error = bind(listensock, (const sockaddr *)&laddr, sizeof(laddr));

   }

   if (!error)

   {

      error = listen(listensock, 0);

   }

   if (error)

   {

      RTCS_task_exit(creator, error);

   }

  

   RTCS_task_resume_creator(creator, RTCS_OK);

   for (;;)

   {

      // Connection requested; accept it

      remote_addr_len = sizeof(remote_addr);

     

      watchdog_set_state_blocking(ftpd_task_index);

     

      // KDC: Added this at the suggestion of Freescale. This causes accept() to return RTCS_SOCKET_ERROR if the listening socket has been shut down

      sock = RTCS_selectset(&listensock, 1, 10000/*FTPDCFG_CONNECT_TIMEOUT*/);

      // KDC: end of above

     

      childsock = accept(sock,(sockaddr *)&remote_addr, &remote_addr_len);

     

      watchdog_set_state_active(ftpd_task_index);

     

      if ((childsock != 0) && (childsock!=RTCS_SOCKET_ERROR))

      {

         session_ptr = (FTPd_SESSION_PTR) RTCS_mem_alloc_zero(sizeof (FTPd_SESSION_STRUCT));

         if ( session_ptr )

         {

            _mem_set_type(session_ptr, MEM_TYPE_FTPd_SESSION_PTR);

            session_ptr->DATA_BUFFER_SIZE = FTPDCFG_BUFFER_SIZE; // KDC: Changed this

            session_ptr->DATA_BUFFER_PTR = RTCS_mem_alloc_zero(session_ptr->DATA_BUFFER_SIZE);

           

            if (session_ptr->DATA_BUFFER_PTR == NULL)

            {

               _mem_free(session_ptr);

               session_ptr = NULL;

            }

            else

            {

               _mem_set_type(session_ptr->DATA_BUFFER_PTR, MEM_TYPE_FTPd_DATA_BUFFER);

            }

         }

           

         if (session_ptr == NULL)

         {

            shutdown((uint_32)childsock, FTPDCFG_SHUTDOWN_OPTION);

         }

         else 

         {

            session_ptr->CONTROL_SOCK = (uint_32) childsock;

            session_ptr->CONNECTED = TRUE;

            /* set default data ports */

            session_ptr->SERVER_DATA_SOCKADDR.sin_family      = AF_INET;

            session_ptr->SERVER_DATA_SOCKADDR.sin_port        = IPPORT_FTPDATA;

            session_ptr->SERVER_DATA_SOCKADDR.sin_addr.s_addr = INADDR_ANY;

            session_ptr->USER_DATA_SOCKADDR.sin_family      = remote_addr.sin_family;

            session_ptr->USER_DATA_SOCKADDR.sin_port        = remote_addr.sin_port;

            session_ptr->USER_DATA_SOCKADDR.sin_addr.s_addr = remote_addr.sin_addr.s_addr;

#if FTPDCFG_USES_MFS

            //initialize current directory and current filesystem

            devlen = _io_get_dev_for_path(session_ptr->CURRENT_FS_NAME, &dev_in_path, FTPD_DEVLEN,(char *)FTPd_rootdir, NULL);

                       

            session_ptr->CURRENT_FS = _io_get_fs_by_name(session_ptr->CURRENT_FS_NAME);

           

            error = ioctl(session_ptr->CURRENT_FS, IO_IOCTL_CHECK_DIR_EXIST,(pointer)FTPd_rootdir ); 

           

            if (error) 

            {

#endif           

               session_ptr->CURRENT_FS = NULL;

               session_ptr->CURRENT_FTP_DIR = NULL;

               session_ptr->CURRENT_FS_DIR[0] = '\0';

#if FTPDCFG_USES_MFS              

            }

            else

            {

               // set current fs dir (including root dir)

               strcpy(session_ptr->CURRENT_FS_DIR,FTPd_rootdir+devlen);

              

               rootdirlen = strlen(session_ptr->CURRENT_FS_DIR);

               // set current FTP dir

               session_ptr->CURRENT_FTP_DIR = session_ptr->CURRENT_FS_DIR + rootdirlen - 1;

              

               //check if there is / at the end of root dir name

               if(*(session_ptr->CURRENT_FTP_DIR) != '\\' && *(session_ptr->CURRENT_FTP_DIR) != '/')

               {

                  session_ptr->CURRENT_FTP_DIR++;

               }

              

               session_ptr->CURRENT_FTP_DIR[0] = '\\';

               session_ptr->CURRENT_FTP_DIR[1] = '\0';

            }

#endif                    

            #if FTPDCFG_ENABLE_MULTIPLE_CLIENTS

               /* Create a task to look after it */

               RTCS_detachsock(childsock);

               t_ptr = _task_get_template_ptr(MQX_NULL_TASK_ID);

               if (RTCS_task_create("FTPd_child", t_ptr->TASK_PRIORITY, t_ptr->TASK_STACKSIZE, FTPd_child, (pointer) session_ptr) != RTCS_OK )

               {

                  RTCS_attachsock(childsock);

                  shutdown(childsock, FLAG_ABORT_CONNECTION);

                  _mem_free(session_ptr->DATA_BUFFER_PTR);

                  _mem_free(session_ptr);

               }

            #else

               FTPd_child((pointer) session_ptr,0);

            #endif

         }        

      }

      else

      {

          printf("Failed to create child socket...\n");

    

          // Not sure what to do here...

      }

   }

  

   watchdog_set_state_blocking(ftpd_task_index);

}

Martin_
NXP Employee

It is application dependent. In FTPSRV, we have defined a close request on the listening socket as the application's request to release the server. So we free all resources we have allocated, send messages to all child tasks telling them to destroy themselves, shut down all sockets we have created, and so on. FTPSRV can be safely started and released many times.

In your application, if you don't allow the FTP server to be stopped, then I guess a time delay would be sufficient. You never call shutdown() on the listening socket, so childsock will be RTCS_SOCKET_ERROR if accept() fails, for example because memory is too low for the new connection. After a delay, more memory may be available in the system, as other tasks finish and free resources.
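A sketch of how the loop could tell these cases apart and pace its retries. It assumes that RTCS_selectset() returns 0 when its timeout expires, which is worth checking against the RTCS documentation for your release; with a 10-second timeout as in the posted task, that case would otherwise reach accept() with a zero handle and end up in the else branch. The constants, the enum, and the doubling delay with a cap are illustrative choices, not part of RTCS.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_SELECT_TIMEOUT 0u          /* assumed "timeout expired" result */
#define SIM_SOCKET_ERROR   0xFFFFFFFFu /* stand-in for RTCS_SOCKET_ERROR */

#define RETRY_FIRST_MS 100u            /* hypothetical first delay */
#define RETRY_MAX_MS   5000u           /* hypothetical upper bound */

typedef enum {
    STEP_WAIT_AGAIN,       /* nothing happened: go back to waiting */
    STEP_PROCEED,          /* valid handle: continue with the next call */
    STEP_RETRY_AFTER_DELAY /* failure: sleep, then loop again */
} loop_step_t;

/* Decide what to do with the value returned by the select call. */
static loop_step_t after_select(uint32_t sock)
{
    if (sock == SIM_SELECT_TIMEOUT) {
        return STEP_WAIT_AGAIN; /* no client connected; do not call accept() */
    }
    if (sock == SIM_SOCKET_ERROR) {
        return STEP_RETRY_AFTER_DELAY;
    }
    return STEP_PROCEED;
}

/* Decide what to do with the value returned by accept(). */
static loop_step_t after_accept(uint32_t childsock)
{
    if (childsock == 0u || childsock == SIM_SOCKET_ERROR) {
        return STEP_RETRY_AFTER_DELAY; /* e.g. no memory for the new connection */
    }
    return STEP_PROCEED;
}

/* Delay to use before the next retry: starts small, doubles, and is capped.
 * Pass 0 to get the first delay; reset to 0 after a successful accept(). */
static uint32_t next_retry_delay_ms(uint32_t current_ms)
{
    assert(current_ms <= RETRY_MAX_MS);
    if (current_ms == 0u) {
        return RETRY_FIRST_MS;
    }
    if (current_ms >= RETRY_MAX_MS / 2u) {
        return RETRY_MAX_MS;
    }
    return current_ms * 2u;
}
```

In the task itself, the delay would be passed to a sleep call such as _time_delay() before looping back to the select step, so a memory shortage does not turn into a busy loop.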

kcameron
Contributor III

Hi Martin,

Thanks again for your prompt responses - you have no idea how frustrating it is when we're trying to meet deadlines and we have to wait for a long time for responses to help us when we're stuck!

We're not explicitly calling FTPd_stop() in any code that we've written. I just assumed (apparently wrongly), from the posts in the discussion I referenced, that FTPd_stop() gets called somewhere in the TCP code.

Thanks for the explanation of the bug in the for(;;) loop - now we're getting somewhere. I'll try to implement your suggested fix and report back.

kcameron
Contributor III

One more thing - to answer your question about the client trying to reconnect: yes, we have a Python script that opens and closes an FTP connection every second or so. We found this lets us reproduce the failure mode reliably when we combine it with attenuating the Wi-Fi signal.
