RT1064 enet_send_frame() may block forever

tmeyer1
Contributor III

Hi,

My application continuously sends many ~200-byte UDP packets over Ethernet, and the thread that calls `udp_sendto()` eventually blocks forever (after hundreds of messages, at a random point, typically within ~2 minutes). I traced this to `ENET_SendFrame()` returning `kStatus_ENET_TxFrameBusy`, followed by the call to `xEventGroupWaitBits()` in `enet_send_frame()`. After further debugging I determined that the TX interrupt never arrives to set the TX event, even though `ENET_ActiveSend()` is being called. Interestingly, if I change `ENET_TXBD_NUM` to 1, the issue goes away. There seems to be an issue or race condition between the ISR and `ENET_SendFrame()`/`ENET_ActiveSend()`. Every so often the ISR seems to take slightly longer to be raised, and sometimes this causes the blocking issue.

Has anyone else seen this? Could this be a bug in the fsl_enet driver? 

Note: I also tried changing the enet_ethernetif_kinetis driver to time out inside `enet_send_frame()` instead of blocking forever waiting for the event. This changed the behavior slightly: I would sometimes recover from the missing TX ISR, but eventually (after 1-5 "busy" errors) I would stop getting TX ISRs altogether. I should also note that I never saw any other error interrupt flags (after I enabled them).
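For readers who want to picture the timeout variant described above, here is a minimal off-target sketch in C. Everything in it is a hypothetical stand-in: `fake_enet_t`, `fake_send_frame()`, and `fake_wait_tx_event()` only model the behavior of `ENET_SendFrame()` and the TX event wait, and are not SDK functions. On the target, the wait would presumably be `xEventGroupWaitBits()` called with a finite tick count instead of `portMAX_DELAY`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes modelling the outcomes discussed in the post. */
typedef enum { SEND_OK, SEND_TX_BUSY, SEND_TIMEOUT } send_status_t;

/* Simulated MAC state: free TX buffer descriptors, and whether the
 * TX interrupt will ever fire (false models the reported failure). */
typedef struct {
    int  free_txbd;
    bool tx_irq_fires;
} fake_enet_t;

/* Stand-in for ENET_SendFrame(): busy when no descriptor is free. */
static send_status_t fake_send_frame(fake_enet_t *e)
{
    if (e->free_txbd == 0) {
        return SEND_TX_BUSY;
    }
    e->free_txbd--;
    return SEND_OK;
}

/* Stand-in for a bounded wait on the TX event. Returns true if the
 * (simulated) TX ISR ran and released one descriptor. */
static bool fake_wait_tx_event(fake_enet_t *e, uint32_t timeout_ticks)
{
    (void)timeout_ticks; /* no real time passes in this simulation */
    if (e->tx_irq_fires) {
        e->free_txbd++;
        return true;
    }
    return false;
}

/* Retry a busy send a limited number of times instead of blocking forever. */
send_status_t send_frame_bounded(fake_enet_t *e, int max_retries,
                                 uint32_t timeout_ticks)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        send_status_t status = fake_send_frame(e);
        if (status != SEND_TX_BUSY) {
            return status;
        }
        /* Whether the event arrived or the wait timed out, try again. */
        (void)fake_wait_tx_event(e, timeout_ticks);
    }
    return SEND_TIMEOUT; /* caller can drop the frame or reset the MAC */
}
```

As the post notes, a bounded wait only keeps the sending thread from hanging; it did not stop the TX interrupts from eventually disappearing.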

For now, my workaround is to use only 1 TX buffer; it appears to be working well and is not missing TX ISRs.
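A sketch of how that workaround might be applied without editing the port file, assuming (not confirmed in this thread) that enet_ethernetif_kinetis.c guards its default with `#ifndef`, so that a project-level definition takes precedence:

```c
/* Project-level override, e.g. a -DENET_TXBD_NUM=1 compiler flag or a
 * header that is included before the lwIP port file. */
#define ENET_TXBD_NUM (1)

/* Assumed shape of the default in the port file: it only applies when
 * the project has not already defined the macro. */
#ifndef ENET_TXBD_NUM
#define ENET_TXBD_NUM (3)
#endif
```

With a single TX buffer descriptor only one frame can be queued at a time, so this may cost some throughput in exchange for avoiding the stall.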

 

*Setup*

RT1064 ENET1 attached to a Marvell switch via RMII

Custom project based off the SDK 2.8.2 code: lwip, FreeRTOS. However, I also saw this issue with the previous SDK.

FSL_SDK_ENABLE_DRIVER_CACHE_CONTROL is defined in my project settings.

Isolated Ethernet network with only my development system, the Ethernet switch, and the RT1064. 

 

Thanks,

Tim

6 Replies

HectorOrd
Contributor I

Hi, 

I am also facing the same issue: when `ENET_SendFrame()` returns `kStatus_ENET_TxFrameBusy`, the TX interrupt never comes in to set the TX event.

It is important to mention that this happens when I have `ENET_TXBD_NUM` set to 3. I tried Tim's workaround and set `ENET_TXBD_NUM` to 1, and it worked fine: every time `ENET_SendFrame()` returns `kStatus_ENET_TxFrameBusy`, it recovers and makes the TX buffer ready again. In contrast, the configuration with 3 TX buffers cannot recover from the busy state. In my case it takes about 3-4 hours to fail. I am using the RT1051 on a custom project based off the SDK 2.8.2 code: lwip, FreeRTOS.

Best regards, 

Hector

kerryzhou
NXP TechSupport

Hi tmeyer1,

Thanks for your interest in the NXP MIMXRT product. I would like to help you with this.

Can you also reproduce the issue on the MIMXRT1064-EVK board?

Do you mean this project:

\SDK_2.8.2_EVK-MIMXRT1064\boards\evkmimxrt1064\lwip_examples\lwip_udpecho\freertos

Does it also reproduce the issue after running for about 2 minutes?

Please confirm these points first; then I will try it on my side and check more details.

Hope this helps!

If you still have questions about it, please let me know!

Best Regards,

Kerry

-------------------------------------------------------------------------------

Note:

- If this post answers your question, please click the "Mark Correct" button. Thank you!

 

- We are following threads for 7 weeks after the last post, later replies are ignored

Please open a new thread and refer to the closed one, if you have a related question at a later point in time.

-----------------------------------------------------------------------------

tmeyer1
Contributor III

Hi Kerry,

I have not yet reproduced this on the EVK or with an example project. I started from that code base, but my software/hardware has diverged too far to go back to the EVK.

I attempted to reproduce this scenario on my EVK but could not. I also tried to create a simplified project to run on my board, and I couldn't reproduce the issue there either.

The other running threads must be introducing some race condition into the system that affects the ENET driver or peripheral.

Tim

kerryzhou
NXP TechSupport

Hi tmeyer1,

Thanks so much for the updated information.

Can you try adding some of the other related code to your simplified project and see whether you can then reproduce the issue?

In the project that shows the issue, have you tried disabling the cache? Does that bring any improvement?

Please also try enlarging the FreeRTOS heap size. Does that bring any improvement?

Hope this helps!

If you still have questions about it, please kindly let me know!

Best Regards,

Kerry

tmeyer1
Contributor III

Hi, a minor update on some further tests:

I have the following memory configuration:

FreeRTOS heap: in OCRAM, size is > 700 KB

newlib heap: the issue shows up with the heap in either OCRAM or DTCM; I tried increasing the size (0xC000)

Disabling the cache unfortunately slows down my FreeRTOS threads too much for the test to be useful: they slow down enough that they hardly send any data over Ethernet anymore.

My only guess so far is that something at the physical layer (noise?) is occurring between the switch and the RT1064, causing the RT1064 MAC driver to get into a bad state and then block forever.

Tim

kerryzhou
NXP TechSupport

Hi tmeyer1,

Thanks for the updated information.

Before you call `ENET_SendFrame()`/`ENET_ActiveSend()`, can you enter a critical section so that the call cannot be interrupted by other code? Does it then send normally?
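One way this experiment might look, as an off-target sketch in C. The `ENTER_CRITICAL()`/`EXIT_CRITICAL()` macros are stand-ins for FreeRTOS's `taskENTER_CRITICAL()`/`taskEXIT_CRITICAL()`, and `hw_send_frame()` is a hypothetical wrapper around the driver's send call; neither name comes from the SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds off-target. On the RT1064 these would
 * be taskENTER_CRITICAL()/taskEXIT_CRITICAL() from FreeRTOS's task.h. */
static int g_critical_depth = 0;
#define ENTER_CRITICAL() (g_critical_depth++)
#define EXIT_CRITICAL()  (g_critical_depth--)

/* Hypothetical wrapper around the driver's ENET_SendFrame() call. Here
 * it only records whether it ran inside the critical section. */
static bool g_sent_while_protected = false;

static int hw_send_frame(const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    g_sent_while_protected = (g_critical_depth > 0);
    return 0; /* 0 models a successful send */
}

/* Keep only the short descriptor update inside the critical section. */
int send_frame_protected(const uint8_t *data, size_t len)
{
    ENTER_CRITICAL();
    int status = hw_send_frame(data, len);
    EXIT_CRITICAL();
    return status;
}
```

Any wait for the TX event (for example `xEventGroupWaitBits()`) has to stay outside the critical section, since blocking FreeRTOS API calls must not be made while interrupts are masked.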

 

Kerry
