SOCK_STREAM_recv() bug drops data on receive timeout

bowe
Contributor III

We are using MQX for KSDK 1.3. We have found that the socket recv() function drops some data if a receive timeout occurs. We found this while transferring a relatively large file (~500 KB). I believe this is the relevant code in SOCK_STREAM_recv():

  error = RTCSCMD_issue(parms, TCP_Process_receive);
  if (error) {
    RTCS_setsockerror(sock, error);

    /* Start CR 2340 */
    /* If data was copied to the userbuf, but not all that
       the recv() asked for, and a timer was started that has
       now timed out, we need to return with the count, and not
       RTCS_ERROR */
    if (error == RTCSERR_TCP_TIMED_OUT) {
       int n;
       _task_stop_preemption();
       n = parms.TCB_PTR->rcvnxt - parms.TCB_PTR->rcvbufseq;
       _task_start_preemption();
       RTCS_EXIT2(RECV, RTCS_OK, n);
    }

When a receive timeout occurs (detected by line 10, the RTCSERR_TCP_TIMED_OUT check), the function is supposed to return the number of bytes it has partially filled the buffer with. However, we are seeing it return 0 even though there is data in the buffer (we added a memset to zero out the buffer before calling the socket recv function). I think line 13, the calculation n = rcvnxt - rcvbufseq, is the culprit, because it seems to always set n to 0. I am not sure what it should be changed to, because I do not completely understand the TCB (Transmission Control Block) and how it is used. My guess is that this line worked with a previous version of RTCS and was never updated.
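The memset check described above can be sketched as a small helper. This is only an illustration of the diagnostic, not RTCS code: the name buffer_was_written is made up here, and the check assumes the payload contains at least one non-zero byte, so an all-zero payload would go undetected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of RTCS): returns 1 if any byte in a
   buffer that was zeroed before the call is now non-zero, else 0.
   Limitation: data consisting only of zero bytes is not detected. */
static int buffer_was_written(const uint8_t *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return 1;
        }
    }
    return 0;
}
```

In use, the buffer would be zeroed with memset, passed to recv(), and then checked: a return value of 0 from recv() combined with a result of 1 from this helper is the symptom reported here.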

Solution
bowe
Contributor III

We believe we found a fix for this (at least it seems to work in our preliminary testing). I think the issue is that rcvbufseq has already been updated by the time we calculate the number of bytes, so we need to save its value before calling RTCSCMD_issue(). Since rcvbufseq will have been updated, and rcvnxt might be updated again before SOCK_STREAM_recv() gets to continue running, we want the difference between the new value of rcvbufseq and the saved one. So the code shown in the previous post changes to this (again, that code is in SOCK_STREAM_recv() in sock_stream.c):

  uint32_t prev_recvbufseq = parms.TCB_PTR->rcvbufseq;
  error = RTCSCMD_issue(parms, TCP_Process_receive);
  if (error) {
    RTCS_setsockerror(sock, error);

    /* Start CR 2340 */
    /* If data was copied to the userbuf, but not all that
       the recv() asked for, and a timer was started that has
       now timed out, we need to return with the count, and not
       RTCS_ERROR */
    if (error == RTCSERR_TCP_TIMED_OUT) {
       int n;
       n = parms.TCB_PTR->rcvbufseq - prev_recvbufseq;
       RTCS_EXIT2(RECV, RTCS_OK, n);
    }
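The snapshot-then-subtract idea in the fix can be sketched on its own, outside RTCS. In this sketch struct fake_tcb and bytes_copied are hypothetical stand-ins, not the real TCB layout or any RTCS function. It only shows that subtracting a saved uint32_t sequence number from the current one gives the number of bytes delivered, including when the sequence number wraps past 2^32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one TCB field used by the fix.
   This is NOT the real RTCS TCB structure. */
struct fake_tcb {
    uint32_t rcvbufseq; /* assumed to advance as data is copied to the user buffer */
};

/* Bytes copied since prev_rcvbufseq was sampled. Unsigned subtraction
   is performed modulo 2^32, so the result is still correct when the
   sequence number has wrapped around in between. */
static uint32_t bytes_copied(const struct fake_tcb *tcb, uint32_t prev_rcvbufseq)
{
    assert(tcb != NULL);
    return (uint32_t)(tcb->rcvbufseq - prev_rcvbufseq);
}
```

For example, a snapshot of 0xFFFFFFF0 followed by 0x20 bytes of data leaves the field at 0x00000010, and the subtraction still yields 0x20.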

We also made a second tweak so that httpsrv_read() in httpsrv_supp.c returns when there is no data to be read, instead of spinning senselessly in its loop. We force a return when the number of bytes received is 0 (the change is the received != 0 test in the if condition):

/* If there is some space remaining in user buffer try to read from socket */
while (read < len)
{
    uint32_t received;

    received = httpsrv_recv(session, dst+read, len-read, 0);
    if ((received != 0) && ((uint32_t)RTCS_ERROR != received))
    {
        read += received;
    }
    else
    {
        break;
    }
}
return(read);
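The behavior of this loop can be tried without a target board using a mock receive function. Everything in the sketch below is hypothetical scaffolding: mock_recv stands in for httpsrv_recv(), MOCK_ERROR stands in for (uint32_t)RTCS_ERROR (its value is an assumption), and read_until_idle mirrors the loop above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for (uint32_t)RTCS_ERROR; the actual value is assumed. */
#define MOCK_ERROR ((uint32_t)-1)

/* Hypothetical data source that hands out at most `chunk` bytes per call. */
struct mock_source {
    const uint8_t *data;
    size_t remaining;
    size_t chunk;
};

/* Stand-in for httpsrv_recv(): returns 0 once the source is drained,
   which emulates a receive timeout with no data. */
static uint32_t mock_recv(struct mock_source *src, uint8_t *dst, size_t len)
{
    size_t n = src->remaining;

    if (n > src->chunk) n = src->chunk;
    if (n > len) n = len;
    memcpy(dst, src->data, n);
    src->data += n;
    src->remaining -= n;
    return (uint32_t)n;
}

/* Same shape as the patched loop: stop on 0 bytes or on error. */
static size_t read_until_idle(struct mock_source *src, uint8_t *dst, size_t len)
{
    size_t read = 0;

    assert(src != NULL && dst != NULL);
    while (read < len) {
        uint32_t received = mock_recv(src, dst + read, len - read);
        if (received == 0 || received == MOCK_ERROR) {
            break;
        }
        read += received;
    }
    return read;
}
```

Without the zero check, a source that keeps returning 0 would leave read unchanged and the while condition true forever, which matches the lock-up described above.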

Carlos_Musich
NXP Employee

Hi Bowe,

Your workaround seems fine. Thank you so much for sharing it.

I need to report this issue to the MQX development team. I will let you know when they have a final fix.

Regards,

Carlos

Carlos_Musich
NXP Employee

Just FYI,

the report number for this issue is MQX-5683.

Regards,

Carlos
