Stuck in TCP/IP Task?
Hi All,
I'm seeing a problem in my project with two TWR-MPC-5125 modules communicating via Ethernet: one of the modules transmits something that leaves both modules stuck in the MQX TCP/IP task. It seems to happen only when both modules try to transmit to each other at the same time (5 or 6 small messages of 40k or so). When it happens, my normal application tasks all show ready (priorities around 10 and 11), and my timer task that services my LEDs every 100 ms (priority 2) runs properly, but the TCP/IP task (priority 6) never blocks to let other tasks run. If I hit the reset button on one of the modules, the other module returns to normal operation. It seems like something has put the socket into a state in which RTCS never lets the other (lower priority) tasks run, and that bad state is cleared when one of the modules is reset.
Any suggestions for what could cause a socket to do this, or what I should look for while troubleshooting? If I prevent one of the modules from transmitting its 5 or 6 messages until the other one has finished, everything works properly. Also, once these 5 or 6 40k messages are exchanged, continuous small (20 or 30 byte) messages are sent back and forth for days or weeks with no ill effects. I'm using MQX 3.8.1.1 patched with RTCS 4.0.1. Thanks for any guidance or ideas.
Best,
Tim
A little more information: I set a breakpoint on the MQX function _task_set_error() and have had no hits. Also, I increased the size of the MQX RTCS socket Tx and Rx buffers from the default size of 4380 bytes to 17,520 bytes (4 times larger). After increasing the buffer sizes I haven't seen the lockups again. Does this make sense? Do these buffers need to be sized for the largest message that will be transmitted or received to prevent the TCP/IP task from locking up? I assumed that if I passed send() a buffer pointer to more data than it could hold in its TxBuffer, it would just block until it had transferred all the data and then return. Is this incorrect? Thanks for any suggestions.
Best,
Tim
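Whether send() blocks until every byte has been queued or returns early after a partial transfer is not settled in this thread. A minimal sketch of a defensive send loop that works in either case is shown here. The names `send_fn`, `send_all`, `fake_sock` and `fake_send` are hypothetical and not part of RTCS; the real send() would have to be wrapped to match the `send_fn` signature, and the fake socket only exists so the loop can be exercised off target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper signature (an assumption, not the RTCS API):
 * returns the number of bytes accepted, which may be fewer than len,
 * or a negative value on error. */
typedef int32_t (*send_fn)(void *ctx, const uint8_t *buf, uint32_t len);

/* Call send_once repeatedly until all len bytes have been accepted.
 * Returns len on success or a negative value on failure. */
static int32_t send_all(send_fn send_once, void *ctx,
                        const uint8_t *buf, uint32_t len)
{
    uint32_t sent = 0;

    assert(send_once != NULL);
    assert(len <= (uint32_t)INT32_MAX);

    while (sent < len) {
        int32_t n = send_once(ctx, buf + sent, len - sent);
        if (n < 0) {
            return n;      /* pass the error code through */
        }
        if (n == 0) {
            return -1;     /* no progress; treated as an error in this sketch */
        }
        sent += (uint32_t)n;
    }
    return (int32_t)sent;
}

/* Test double: accepts at most `chunk` bytes per call and keeps counters. */
typedef struct {
    uint32_t chunk;   /* largest amount accepted in one call */
    uint32_t calls;   /* number of calls made so far */
    uint32_t total;   /* total bytes accepted so far */
} fake_sock;

static int32_t fake_send(void *ctx, const uint8_t *buf, uint32_t len)
{
    fake_sock *s = (fake_sock *)ctx;
    uint32_t n = (len < s->chunk) ? len : s->chunk;

    (void)buf;        /* the payload is not inspected by the fake */
    s->calls++;
    s->total += n;
    return (int32_t)n;
}
```

With a fake socket that accepts 4380 bytes per call, a 40000-byte message goes out in ten calls. If the real send() already blocks until everything is queued, the loop simply runs once.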
Hi Tim, this seems like a problem with acknowledgment. Sent data sits in the send buffer until it is acknowledged by the remote peer. So the lockup situation might be: the send buffer is full and the board is not able to receive the ACK?
If you want to be sure this is the problem, I think you can put a breakpoint where the PCB alloc fails. That would be in the source file "rtcspcb.c", when rtcs_pcb = RTCS_part_alloc(RTCS_data_ptr->RTCS_PCB_partition); returns NULL. If this occurs, it might be that all PCBs are scheduled for transmission and there is no PCB left to receive the ACK?
If this is the case, the fix is very easy: increase the number of PCBs available through the _RTCSPCB_init/_RTCSPCB_grow/_RTCSPCB_max global variables.
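To make the exhaustion scenario concrete, here is a toy counter model of a block pool driven by init/grow/max settings. It is not RTCS code, and it assumes the three variables mean "blocks created at start", "blocks added when the pool runs dry" and "hard ceiling"; all `toy_pool` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a growable block pool. Only counts blocks; no memory is managed. */
typedef struct {
    uint32_t capacity;  /* blocks created so far */
    uint32_t in_use;    /* blocks currently handed out */
    uint32_t grow;      /* blocks added each time the pool runs dry */
    uint32_t max;       /* ceiling on capacity */
} toy_pool;

static toy_pool toy_pool_make(uint32_t init, uint32_t grow, uint32_t max)
{
    toy_pool p;

    assert(init <= max);
    p.capacity = init;
    p.in_use = 0;
    p.grow = grow;
    p.max = max;
    return p;
}

/* Returns true if a block was handed out, false once the pool is
 * exhausted and cannot grow any further. */
static bool toy_pool_alloc(toy_pool *p)
{
    if (p->in_use == p->capacity) {
        uint32_t room = p->max - p->capacity;
        uint32_t add = (p->grow < room) ? p->grow : room;

        if (add == 0) {
            return false;   /* the NULL case a breakpoint would catch */
        }
        p->capacity += add;
    }
    p->in_use++;
    return true;
}

static void toy_pool_free(toy_pool *p)
{
    assert(p->in_use > 0);
    p->in_use--;
}

/* Allocate without freeing until the pool refuses; returns the count. */
static uint32_t toy_pool_drain(toy_pool *p)
{
    uint32_t n = 0;

    while (toy_pool_alloc(p)) {
        n++;
    }
    return n;
}
```

Under these assumptions, settings of 4/2/20 allow 20 outstanding blocks before an allocation fails, and settings of 16/8/64 allow 64. If every block is held by queued transmissions, the next allocation for an incoming ACK is the one that fails.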
If you have enough memory for 16 KB send/receive buffers, it will give you much better throughput.
-Martin
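The throughput remark can be checked with some arithmetic. The sketch below assumes a standard Ethernet TCP segment payload of 1460 bytes (1500-byte MTU minus 20-byte IP and 20-byte TCP headers) and uses the usual bound that at most one buffer's worth of unacknowledged data can be in flight per round trip. The function names are hypothetical, and any round-trip time passed in is a made-up input, not a measurement from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TCP payload per segment on plain Ethernet:
 * 1500-byte MTU - 20-byte IP header - 20-byte TCP header. */
#define TCP_MSS_ETHERNET 1460u

/* Number of full-size segments that fit in a socket buffer. */
static uint32_t full_segments(uint32_t buffer_bytes)
{
    return buffer_bytes / TCP_MSS_ETHERNET;
}

/* Number of times a buffer must be filled to carry one message (rounded up). */
static uint32_t buffer_fills(uint32_t message_bytes, uint32_t buffer_bytes)
{
    assert(buffer_bytes > 0);
    return message_bytes / buffer_bytes
         + ((message_bytes % buffer_bytes) != 0 ? 1u : 0u);
}

/* Upper bound on throughput in bytes per second: at most one window of
 * unacknowledged data per round trip. rtt_us is in microseconds. */
static uint64_t max_bytes_per_sec(uint32_t window_bytes, uint32_t rtt_us)
{
    assert(rtt_us > 0);
    return ((uint64_t)window_bytes * 1000000u) / rtt_us;
}
```

The default 4380-byte buffer holds exactly 3 full segments and 17520 bytes holds 12. A 40000-byte message needs 10 fills of the small buffer versus 3 of the large one, and for any given round-trip time the throughput ceiling scales by the same factor of 4.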
Hi Martin,
I tested your theory that RTCSPCB_alloc() may be unable to allocate a PCB and didn't see this happening. To catch it, I modified the MQX ...rtcs\source\tcpip\RTCSPCB_alloc() function slightly, adding an else branch with asm(nop); that I could set a breakpoint on (see code insert). Then I rebuilt MQX and recompiled my project with the Tx/Rx buffers shrunk back down to the default 4380 bytes (which caused the problem to reappear). I placed a breakpoint on the asm(nop); statement in the RTCSPCB_alloc() function and let the problem happen. Neither box A nor box B hit this breakpoint when the problem occurred.
At this point I have corrected the problem by increasing the size of the MQX Tx and Rx buffers from the default 4380 up to 17520 (4 times larger). Since doing this I haven't seen the problem again. What's your opinion: is this a reasonable solution? I worry about running into problems in the future as traffic increases and message sizes grow. Here are the values I used to init the Ethernet system. Do they look reasonable to you?
_RTCSPCB_init = 16; //Default value: 4
_RTCSPCB_grow = 8; //Default value: 2
_RTCSPCB_max = 64; //Default value: 20
_RTCS_msgpool_init = 64; //Default value: 4
_RTCS_msgpool_grow = 32; //Default value: 2
_RTCS_msgpool_max = 320; //Default value: 20
_RTCS_socket_part_init = 12; //Default value: 4
_RTCS_socket_part_grow = 8; //Default value: 2
_RTCS_socket_part_max = 160; //Default value: 20
_RTCSTASK_stacksize = 10000; //Default value: 10000
Thanks a lot for your advice, Martin. I appreciate it.
Best Regards,
Tim
RTCSPCB_PTR RTCSPCB_alloc(void)
{
    RTCS_DATA_PTR RTCS_data_ptr;
    RTCSPCB_PTR   rtcs_pcb;

    RTCS_data_ptr = RTCS_get_data();
    rtcs_pcb = RTCS_part_alloc(RTCS_data_ptr->RTCS_PCB_partition);

    if (rtcs_pcb != NULL) {
        rtcs_pcb->IP_SUM_PTR = NULL;
        _mem_zero(&rtcs_pcb->LINK_OPTIONS, sizeof(rtcs_pcb->LINK_OPTIONS));
    } /* Endif */
    else
        asm(nop); //To allow an error trap

    RTCSLOG_PCB_ALLOC(rtcs_pcb);
    return(rtcs_pcb);
}
Thank you for your reply, Martin. I will give this a try and post back here what I see. Your idea does make sense given what I'm seeing here. Also, when I increased the size of my send and receive buffers to 17,520 bytes, the problem disappeared.
Best,
Regards