How do you detect (and handle!) a lwIP disconnection?


21,183 Views
dave408
Senior Contributor II

My application handles graceful closing of the socket, but it does not recognize when the Ethernet cable has been disconnected.

 

In my packet-handling loop, I check the result of all netconn commands used, which include:

netconn_recv

looping over netconn_data and netbuf_next to assemble a packet

netbuf_delete (no error check here, returns void)

netconn_write

 

When I remove the Ethernet cable, the loop never exits as I would expect.  Outside the loop is where I call netconn_close and netconn_delete.  Because the loop doesn't exit, I cannot reconnect to my device after plugging the cable back in.

 

I can't get any MQX task data when I pause execution, but at one time I did see that the lwIP tcpip_task was blocked on a semaphore.

 

Can anyone suggest ways that I should be handling cable disconnections in lwIP?
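
One possible way to let a loop like the one described above exit is to give the receive call a timeout and to treat a run of consecutive timeouts as a lost link. The sketch below is host-side C, not lwIP code: rx_result_t, serve_connection and scripted_rx are hypothetical names, and the receive hook stands in for netconn_recv plus the netbuf handling. It assumes the lwIP build in use offers a receive timeout (for example netconn_set_recvtimeout with LWIP_SO_RCVTIMEO enabled, where that is available).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; a real port would map lwIP's err_t values onto these. */
typedef enum { RX_OK, RX_TIMEOUT, RX_CLOSED, RX_ERROR } rx_result_t;

/* Hypothetical receive hook standing in for netconn_recv() plus packet assembly. */
typedef rx_result_t (*rx_fn_t)(void *ctx);

typedef enum { EXIT_PEER_CLOSED, EXIT_ERROR, EXIT_IDLE } loop_exit_t;

/* Runs until the peer closes, an error occurs, or max_idle consecutive
 * timeouts are seen. Any successful receive resets the idle counter. */
loop_exit_t serve_connection(rx_fn_t rx, void *ctx, unsigned max_idle)
{
    unsigned idle = 0;

    assert(rx != NULL);
    for (;;) {
        switch (rx(ctx)) {
        case RX_OK:
            idle = 0;                 /* traffic seen, link is alive */
            break;
        case RX_TIMEOUT:
            if (++idle >= max_idle) {
                return EXIT_IDLE;     /* quiet for too long, give up */
            }
            break;
        case RX_CLOSED:
            return EXIT_PEER_CLOSED;
        default:
            return EXIT_ERROR;
        }
    }
}

/* Host-side test double: replays a fixed script, then times out forever. */
typedef struct {
    const rx_result_t *script;
    size_t len;
    size_t pos;
} scripted_rx_t;

rx_result_t scripted_rx(void *ctx)
{
    scripted_rx_t *s = ctx;
    return (s->pos < s->len) ? s->script[s->pos++] : RX_TIMEOUT;
}
```

Whatever the exit reason, the caller would then run the existing cleanup (netconn_close and netconn_delete) and go back to accepting a connection. The idle limit is a trade-off: set too low, it drops a quiet but healthy peer.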

23 Replies

9,912 Views
haraldadolph
Contributor III

Hi all,

I'm using SDK version 2.6 for a K66 processor on proprietary hardware, with the latest release of MCUXpresso. We use FreeRTOS, lwIP and web sockets. Communication works in principle, but with the hurdles already mentioned regarding connecting and disconnecting the Ethernet cable.

It seems there is no workable solution for this, at least for the moment. I have investigated it in several steps, and have also changed the library code to give it multitasking capabilities.

NXP support mentioned earlier that SDK version 2.6 would handle a cable connect/disconnect properly, but that is not the case with the lwIP in 2.6. The only thing that behaves better than before is the initial connect status of the network: if no cable is connected initially and you then plug in the Ethernet cable, the system connects properly, and as long as there is no task with lower priority than the "communication task" that initiates the Ethernet handling, that is OK. In other words, the system is partly blocked in the "communication task", which sits in several loops for as long as no connection is available. That is not desirable in a multitasking system.

After all the changes and additions I made, I am able to connect and disconnect, but sometimes the cleanup of the lwIP system fails: the slowtimer hangs in an assert after disconnecting (a pcb that is still in the timer's list of pcbs has state "CLOSED" where "TIME_WAIT" is required). The question now is how (and where in the lwIP library) to handle the disconnect and the associated cleanup properly. Is it possible to modify the slowtimer so that it accepts a CLOSED-state pcb?

Thank you for your assistance.

9,912 Views
haraldadolph
Contributor III

Hi Ben, thanks for your quick answer.

I can already "see" the connect and disconnect, much as your code does.

But for me, there is no way to do a system reset at run time, neither on disconnect nor on connect.

So the only thing that helps is to gracefully reset and delete all the data structures lwIP built up during the most recent connection, and to reconstruct them on the next cable connect. I'm pretty far along with that, but the question of the slowtimer remains.

Greetings Harald.

0 Kudos

9,911 Views
benmccormick
Contributor III

When this function is called every 2 seconds:

PHY_DRV_Read(0, enetIfPtr->phyAddr, kEnetPhySR, &phyStatus);
if ((phyStatus & 0x04) == 0) {
    printf("Ethernet cable removed.\r\n");
}

It works just fine.

What happens when the cable gets plugged back in is another issue. I didn’t keep track of all the various states it could be in, so I just do a reset.

This is an older version of lwIP though, I haven’t upgraded.

0 Kudos

9,912 Views
mpazzi
Contributor III

Hi Ben McCormick,

I think the NXP team should offer a suggestion or a solution for this bug.

In my case I cannot reset the microcontroller.

Thanks

0 Kudos

9,936 Views
n_2t
Contributor I

Hi dave408, did you succeed?

0 Kudos

9,935 Views
dave408
Senior Contributor II

thinhnguyen It has been a long time since I worked on that project, and honestly, I cannot remember whether I personally solved that problem.  It's possible that one of my colleagues who took over did figure out how to handle disconnections.  I'll ask him.

9,936 Views
n_2t
Contributor I

Hi Dave, did you ask him?

I have tried to use PHY_DRV_GetLinkStatus, but linkstatus always returns false, so it did not work for me.

bool PHY_Get_Initialized_LinkStatus(void) {
    if (!g_initialized) return false;
    bool linkstatus = false;
    const int timeout = 10;  /* maximum number of polls; there is no delay between them */
    /* Poll until the PHY reports link up or the attempts run out. */
    for (int count = 0; (count < timeout) && !linkstatus; count++) {
        uint32_t result = PHY_DRV_GetLinkStatus(g_devNumber, g_enetIfPtr->phyAddr, &linkstatus);
        if (result != kStatus_ENET_Success) {
            linkstatus = false;  /* ignore the flag when the read itself failed */
        }
    }
    /* Returning the flag also covers a link that comes up on the last poll. */
    return linkstatus;
}
0 Kudos

9,935 Views
mpazzi
Contributor III

Hi everyone,

I have the same problem but no solution that works correctly.

Any suggestions from the NXP team?

Thanks so much

0 Kudos

9,936 Views
DavidS
NXP Employee

Hi Dave,

You are on the right track.

I did a test using the lwip_ping_bm_frdmk64f example in KSDK_v2 using KDS_3.2.

The packet sent in this example finally ends up in ethernetif.c low_level_output() function.  I made the following edits using #if 1's:

    /* Send a multicast frame when the PHY is link up. */
    if (kStatus_Success == PHY_GetLinkStatus(ENET, phyAddr, &link))
    {
        if (link)
        {
#if 1 //DES 1=test, 0=default code
            netif_set_link_up(&fsl_netif0);
#endif
            if (kStatus_Success == ENET_SendFrame(ENET, &g_handle, pucBuffer, packetBuffer->tot_len - ETH_PAD_SIZE))
            {
                return ERR_OK;
            }
        }
#if 1 //DES 1=test, 0=default code
        /* Reached when the link is down, and also when the link is up but ENET_SendFrame() failed. */
        netif_set_link_down(&fsl_netif0);
#endif
    }

My callback blinks the Blue LED fast when connected, and slow when disconnected.

#if 1 //DES 1=test, 0=default code
void delay(uint32_t loop_cnt, uint32_t blinks)
{
  uint32_t i, j;
  for (j = 0; j < blinks * 2; j++) {
    LED_BLUE_TOGGLE(); //DES blink
    for (i = 0; i < loop_cnt; i++) //DES delay
    {
      __asm("nop");
    }
  }
}

/* lwIP link callbacks receive the netif whose link state changed. */
void my_link_callback(struct netif *netif)
{
  if (netif_is_link_up(netif)) { //DES link up blink fast
    delay(1000000U, 8U);
  }
  else { //DES link down blink slow
    delay(4000000U, 8U);
  }
}
#endif

My main() had following:

    netif_set_default(&fsl_netif0);
    netif_set_up(&fsl_netif0);
#if 1 //DES 1=test, 0=default code
    netif_is_link_up(&fsl_netif0);
    netif_set_link_callback(&fsl_netif0, my_link_callback); //DES called when link transitions
#endif
    LWIP_PLATFORM_DIAG(("\r\n************************************************"));

And I added PCR initialization to pin_mux.c BOARD_InitPins():

    CLOCK_EnableClock(kCLOCK_PortB);
    /* Affects PORTB_PCR16 register */
    PORT_SetPinMux(PORTB, 16u, kPORT_MuxAlt3);
    /* Affects PORTB_PCR17 register */
    PORT_SetPinMux(PORTB, 17u, kPORT_MuxAlt3);
#if 1 //DES 1=test, 0=default code
    /* Led pin mux Configuration */
    PORT_SetPinMux(PORTB, 21U, kPORT_MuxAsGpio); //DES Blue LED on PTB21
#endif

Regards,

David
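
The same link check could also be moved out of the transmit path into a periodic poll, so that the link state is updated even when nothing is being sent. A minimal sketch, assuming hypothetical names (link_poller_t, link_poller_step, fake_phy_t): on the target the hooks would wrap PHY_GetLinkStatus, netif_set_link_up and netif_set_link_down, and the fake PHY exists only so the logic can be tried on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks. On the target, read_link would wrap PHY_GetLinkStatus()
 * and on_up/on_down would wrap netif_set_link_up()/netif_set_link_down(). */
typedef struct {
    bool (*read_link)(void *ctx, bool *link_up); /* false if the PHY read failed */
    void (*on_up)(void *ctx);
    void (*on_down)(void *ctx);
    void *ctx;
    bool last_up;                                /* last state that was reported */
} link_poller_t;

/* Polls once. Returns true if a link transition was reported through a hook. */
bool link_poller_step(link_poller_t *p)
{
    bool now_up = false;

    assert(p != NULL && p->read_link != NULL);
    if (!p->read_link(p->ctx, &now_up)) {
        return false;            /* read failed: keep the previous state */
    }
    if (now_up == p->last_up) {
        return false;            /* no change, nothing to report */
    }
    p->last_up = now_up;
    if (now_up) {
        if (p->on_up != NULL) p->on_up(p->ctx);
    } else {
        if (p->on_down != NULL) p->on_down(p->ctx);
    }
    return true;
}

/* Host-side fake PHY so the poller can be exercised without hardware. */
typedef struct {
    bool link;     /* simulated cable state */
    bool read_ok;  /* simulated success of the PHY read */
    int ups;       /* number of up notifications received */
    int downs;     /* number of down notifications received */
} fake_phy_t;

bool fake_read(void *ctx, bool *link_up)
{
    fake_phy_t *phy = ctx;
    *link_up = phy->link;
    return phy->read_ok;
}

void fake_up(void *ctx)   { ((fake_phy_t *)ctx)->ups++; }
void fake_down(void *ctx) { ((fake_phy_t *)ctx)->downs++; }
```

Calling link_poller_step from a timer or a low-priority task every second or so would be one option. In a threaded lwIP build the netif_* calls may need to run in the tcpip thread's context; that depends on the port and is worth checking.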

9,936 Views
dave408
Senior Contributor II

Actually, even if I am able to detect the problem, I'm not sure yet what to do with it.  The main issue is really that the lwip tcpip_thread remains blocked if I pull the Ethernet cable.  This prevents me from reconnecting to my device.

pastedImage_3.png

In tcpip_thread, this looks like the only place it could be held up:

pastedImage_4.png

Here's what I have found -- when the cable is disconnected, sys_arch_mbox_fetch returns SYS_ARCH_TIMEOUT, which calls a handler and then jumps to a label called "again".  This is what is causing the tcpip_thread to look like it's blocked on a semaphore in the TAD view -- because the cable is disconnected, there isn't any data to get from the mbox, and the code just loops back and tries again.

I have a massive hack that seems to get things going in the right direction.  I replaced the "goto again" in sys_timeouts_mbox_fetch with this:

      if (attempts++ < 10)
        goto again;
      else
        return;

What this does is allow the function to return after the timeout functions (tcpip, arp, dhcp, etc) time out a certain number of times.  It's clunky, but seems to work.  The return statement will result in the caller detecting an invalid message, which is logged and ignored, which I think will work for me.  One issue with this solution that bothers me is the 10 attempts.  I selected that because there are cases where we do need to retry pulling data from the mbox even when the cable is connected.  So 10 is basically just the threshold where I seem to have reliable network communications, but can also recognize a disconnected cable and recover in a timely manner when the cable is eventually reconnected.

However, there must be a better way to deal with disconnections than this.  I'll keep working on a better solution, but if you have any ideas, please let me know!  Thanks, DavidS

0 Kudos

9,936 Views
benmccormick
Contributor III

Hi Dave,

In my application, if the cable is pulled, then the Ethernet apps have nothing to talk to, so no packets are sent or received. When the cable is re-inserted, I do a System reset.

  Startup:
    cableStatus = 0;
    enet_main();
    PHY_DRV_Read(0, enetIfPtr->phyAddr, kEnetPhySR, &phyStatus);
    if ((phyStatus & 0x04) == 0x04) cableStatus = 1;  //If link up, then cable is attached.

    while (1) {
        PHY_DRV_Read(0, enetIfPtr->phyAddr, kEnetPhySR, &phyStatus);
        if (((phyStatus & 0x04) == 0) && (cableStatus == 1)) {
            printf("Ethernet cable removed.\n");
            cableStatus = 0;
        }
        if (((phyStatus & 0x04) == 0x04) && (cableStatus == 0)) {
            //Here when Ethernet cable inserted
            NVIC_SystemReset();
            for (;;);
        }
    }

0 Kudos

9,936 Views
dave408
Senior Contributor II

benmccormick​ I started to look into your solution that uses PHY_DRV_Read().  What I am seeing is that my PHY status is always 0x7849, whether my ethernet cable is connected or not.  However, I think that got me to dig some more and I ended up in low_level_init() in ethernetif.c -- in there, the library uses PHY_DRV_GetLinkStatus to determine whether or not there is a link with the client, which I think I'll be able to use now.  I'll keep updating this post with my progress.
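
For reference, assuming kEnetPhySR selects the standard IEEE 802.3 basic status register (register 1), the value 0x7849 can be decoded as sketched below. Bit 2 is the link-status bit that the 0x04 mask tests, and bit 5 is auto-negotiation complete; both are clear in 0x7849, so the PHY would be reporting no link and negotiation not finished. The macro and function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the IEEE 802.3 clause 22 basic status register (register 1).
 * Assumption: this is the register that kEnetPhySR refers to. */
#define BMSR_LINK_STATUS      (1u << 2)
#define BMSR_AUTONEG_ABILITY  (1u << 3)
#define BMSR_AUTONEG_COMPLETE (1u << 5)

/* True when the link-status bit is set. */
bool bmsr_link_up(uint16_t bmsr)
{
    return (bmsr & BMSR_LINK_STATUS) != 0u;
}

/* True when the PHY reports that auto-negotiation has completed. */
bool bmsr_autoneg_done(uint16_t bmsr)
{
    return (bmsr & BMSR_AUTONEG_COMPLETE) != 0u;
}

/* True when the PHY advertises that it is able to auto-negotiate. */
bool bmsr_autoneg_able(uint16_t bmsr)
{
    return (bmsr & BMSR_AUTONEG_ABILITY) != 0u;
}
```

With these helpers, bmsr_link_up(0x7849) and bmsr_autoneg_done(0x7849) are both false, while bmsr_autoneg_able(0x7849) is true. The standard defines the link-status bit as latching low, so after a link drop the first read can still show 0 and only a second read shows the current state.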

9,936 Views
gustavocosta
Contributor III

Hi dave408,

did you succeed? I'm having the same problem as you.

Any help would be appreciated.

Gustavo Costa,

R&D Engineer

0 Kudos

9,936 Views
dave408
Senior Contributor II

Thanks, benmccormick​!  I'll give those functions a try to see how I can make it work in my application.  Resetting the firmware isn't an option, but I should be able to figure something out.  My original post that has the hack in it to prevent the tcpip_thread from tight looping is flawed, so I needed something else.  I'll share my approach with everyone once I get it working correctly.

0 Kudos

9,936 Views
dave408
Senior Contributor II

I think this might be a potential start of a solution:

In ethernetif.c, low_level_init():

        result = PHY_DRV_GetLinkStatus(devNumber, enetIfPtr->phyAddr, &linkstatus);
        if (result == kStatus_ENET_Success)
        {
            if (linkstatus == true)

But I wonder if it's safe for me to call the PHY_* functions from an MQX task?

0 Kudos

9,936 Views
dave408
Senior Contributor II

Bummer... looks like this solution won't work with KSDK 1.2.  I cannot move to KSDK 2.0 yet.  If you have any suggestions that might work in a similar manner for KSDK 1.2, please let me know!  I'll start digging around for clues.

0 Kudos

9,936 Views
dave408
Senior Contributor II

Thank you for your help!  I will give this a try today and will let everyone know how it goes.

0 Kudos

9,936 Views
dave408
Senior Contributor II

Ok, so it looks like this is the missing link:

/**
 * Called by a driver when its link goes down
 */
void netif_set_link_down(struct netif *netif )
{
  if (netif->flags & NETIF_FLAG_LINK_UP) {
    netif->flags &= ~NETIF_FLAG_LINK_UP;
    NETIF_LINK_CALLBACK(netif);
  }
}

It looks like I have to call netif_set_link_callback and pass it a callback function to call when the cable is disconnected.  However, my next question is, if tcpip_task is blocked on a semaphore and I can't figure out what semaphore that is, how can I write a callback function that will allow my packet handling loop to exit gracefully and then accept a new connection?

EDIT -- I added the callback and passed it to netif_set_link_callback.  I also enabled the callback via LWIP_NETIF_LINK_CALLBACK.  Unfortunately, when I removed the cable, my callback function didn't get called.  Am I missing something else here?
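
The comment in the quoted source says netif_set_link_down is "called by a driver", which suggests the callback only runs if something in the port actually calls netif_set_link_up or netif_set_link_down when the PHY state changes; registering the callback alone would then not be enough. The host-side sketch below mirrors that logic with a stand-in struct (mock_netif_t is not lwIP's struct netif, and all names are hypothetical) and shows a callback that only sets a flag for the packet loop to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the part of lwIP's struct netif used here; not the real type. */
typedef struct mock_netif {
    bool link_up;
    void (*link_callback)(struct mock_netif *nif);
} mock_netif_t;

/* Mirrors the quoted netif_set_link_down(): acts only on an up -> down transition. */
void mock_set_link_down(mock_netif_t *nif)
{
    assert(nif != NULL);
    if (nif->link_up) {
        nif->link_up = false;
        if (nif->link_callback != NULL) nif->link_callback(nif);
    }
}

/* Counterpart for the down -> up transition. */
void mock_set_link_up(mock_netif_t *nif)
{
    assert(nif != NULL);
    if (!nif->link_up) {
        nif->link_up = true;
        if (nif->link_callback != NULL) nif->link_callback(nif);
    }
}

/* State shared between the callback and the packet-handling loop. */
static volatile bool g_link_lost = false;
static volatile int g_link_events = 0;

/* Keeps the callback short: record the new state and return. */
void on_link_change(mock_netif_t *nif)
{
    g_link_events++;
    g_link_lost = !nif->link_up;
}

/* The packet loop would test this after each receive call returns. */
bool packet_loop_should_exit(void)
{
    return g_link_lost;
}
```

The packet loop can only notice the flag when its blocking receive returns, so this pairs naturally with a receive timeout. A plain volatile flag is an assumption that suits a single writer; an RTOS event or semaphore may be the better fit.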

0 Kudos

9,936 Views
mpazzi
Contributor III

Hi dave408,

How did you resolve this issue? I have the same issue but no solution.

Thanks for your help.

Regards

0 Kudos