Hi,
I'm working on a project with the RT1060, and we're using MCULink and LinkServer:
daniel@daniel-XPS-15-9530:~/Downloads$ /usr/local/LinkServer/LinkServer --version
LinkServer v1.6.133 [Build 133] [2024-07-03 14:06:32]
daniel@daniel-XPS-15-9530:~/Downloads$ /usr/local/LinkServer/LinkServer probes
  # Description                      Serial
--- -------------------------------- -------------
  1 MCU-LINK (r0FF) CMSIS-DAP V3.146 RS4JT2HXOZPYH
Our OS is ThreadX, and we're using the Cortex-Debug extension in VSCode for debugging. We've recently started experiencing very slow debugging, where hitting a breakpoint takes more than 30s to trigger.
When reducing the number of threads we use in ThreadX to <=14, the debugging experience becomes fast again. When it's >=15 it's slow.
Here's how we run LinkServer:
/usr/local/LinkServer/LinkServer gdbserver --keep-alive work/connected/boards/bursen_c/MIMXRT1060_linkserver_config.json
and here's our config:
{
    "copyright": "Copyright 2023 NXP",
    "license": "SPDX-License-Identifier: BSD-3-Clause",
    "version": "1.0.0",
    "vendor": "NXP",
    "devices": [
        {
            "board": "MIMXRT1060-EVKB",
            "device": {
                "name": "MIMXRT1062xxxxB",
                "family": "MIMXRT1060",
                "memory": [
                    {
                        "location": "0x20000000",
                        "size": "0x00080000",
                        "type": "RAM"
                    },
                    {
                        "location": "0x20200000",
                        "size": "0x00080000",
                        "type": "RAM"
                    },
                    {
                        "location": "0x60000000",
                        "size": "0x01000000",
                        "type": "ExtFlash",
                        "flash-driver": "MIMXRT1060_SFDP_QSPI.cfx"
                    }
                ],
                "cores": [
                    {
                        "type": "cm7",
                        "name": "cm7"
                    }
                ]
            },
            "debug": {
                "no-packed": true,
                "protocol": "swd",
                "swo": true,
                "connect-script": "RT1060_connect.scp"
            }
        }
    ]
}
When we experience slow debugging, it seems to be when GDB is trying to get information about the various threads in the system, and we see the following in the log:
000019946+00000: 33-thread-info 2
0000023961+04015: -> ~"Ignoring packet error, continuing...\n"
0000023961+00000: Ignoring packet error, continuing...
0000023962+00001: -> 33^done,threads=[{id="2",target-id="Thread 536953240",details="\"System Timer Thread\" : TX_SUSPENDED",frame={level="0",addr="0x600dd7a8",func="__get_ipsr_value",args=[],file="/sdk/mcuxsdk/rtos/azure-rtos/threadx/ports/cortex_m7/gnu/inc/tx_port.h",fullname="/sdk/mcuxsdk/rtos/azure-rtos/threadx/ports/cortex_m7/gnu/inc/tx_port.h",line="489",arch="armv7e-m"},state="stopped"}]
0000023962+00000: 34-thread-info 3
0000027984+04022: -> ~"Ignoring packet error, continuing...\n"
0000027985+00001: Ignoring packet error, continuing...
0000027985+00000: -> 34^done,threads=[{id="3",target-id="Thread 536878512",details="\"thread_logging\" : TX_QUEUE_SUSP",frame={level="0",addr="0x600dd7a8",func="__get_ipsr_value",args=[],file="/sdk/mcuxsdk/rtos/azure-rtos/threadx/ports/cortex_m7/gnu/inc/tx_port.h",fullname="/sdk/mcuxsdk/rtos/azure-rtos/threadx/ports/cortex_m7/gnu/inc/tx_port.h",line="489",arch="armv7e-m"},state="stopped"}]
with the "Ignoring packet error, continuing" message coming after each thread-info call.
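A side note for reading these logs: the target-id values GDB prints are decimal. Shown as hex they fall inside the first RAM bank from the config above (0x20000000, size 0x00080000), which suggests they are the addresses of the ThreadX thread control blocks. That interpretation is an assumption, not something confirmed by NXP. A minimal sketch of the conversion, with hypothetical helper names:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a decimal GDB target-id as a hex address.
 * out should hold at least 11 bytes ("0x" + 8 hex digits + NUL). */
static int target_id_to_hex(uint32_t target_id, char *out, size_t out_len)
{
    return snprintf(out, out_len, "0x%08" PRIX32, target_id);
}

/* Bounds copied from the first RAM entry of the LinkServer config above. */
static int in_first_ram_bank(uint32_t addr)
{
    return addr >= 0x20000000u && addr < 0x20000000u + 0x00080000u;
}
```

For the two threads in the log, 536953240 is 0x20014198 and 536878512 is 0x20001DB0, both inside that RAM bank.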
Do you know of anything we can do to fix this?
Thanks,
Daniel
Hi @MulattoKid
To summarize this case: we tried both the NXP RT1060 EVK and the RT1170 with the same code, but could not reproduce the issue.
1. It works well with CMSIS-DAP + MCUXpresso.
2. It works well with J-Link + MCUXpresso.
3. We can't support Cortex-Debug (a non-NXP product, which might be the bottleneck). You mentioned before: ' .......running a GDB server, and the Cortex-Debug extension in VSCode for GDB.' https://stackoverflow.com/questions/78837355/slow-breakpoint-triggering-with-gdb-and-linkserver-mcul...
4. We recommend installing and using our extension and providing feedback: MCUXpresso for VS Code - Visual Studio Marketplace extension + VS Code
B.R,
Sam
Hi @Sam_Gao ,
I appreciate your help on this, and we can look at using NXP's extension instead. However, it'd still be great to be able to see which arguments are being used by MCUXpressoIDE to launch LinkServer's GDBServer.
Hi, @MulattoKid
It's hard for me to find the right people to provide the arguments the IDE uses to launch the GDB server, so I would suggest debugging with the MCUXpresso-for-VS-Code plugin from NXP instead. It is based on similar arguments and produces a trace log while debugging.
Please follow https://www.nxp.com/design/training/getting-started-with-mcuxpresso-for-visual-studio-code:TIP-GETTI... to install the plugin; it shows the necessary trace log.
B.R,
Sam
OK, thanks anyway. We'll try using the extension when we have some available time
Hi @MulattoKid
Thanks for your questions and great input on this issue; we will look into it.
B.R,
Sam
I've tested with a newer version of gdb-multiarch, and arm-none-eabi-gdb from the latest Arm toolchain, and the behavior is the same, so I suspect the issue lies in either MCULink or LinkServer.
I wanted to try using a JLink Plus instead, but I'm having problems getting it to work. Even just erasing the flash using JFlashExe fails, even though the program is able to connect to the core through the JLink. Any ideas? This is the log I'm getting:
Connecting ...
- Connecting via USB to probe/ programmer device 0
- Probe/ Programmer firmware: J-Link V11 compiled Dec 4 2023 10:22:45
- Probe/ Programmer S/N: 601016349
- Device "MIMXRT1061XXX5B" selected.
- Target interface speed: 4000 kHz (Fixed)
- VTarget = 3.348V
- InitTarget() start
- InitTarget() end - Took 1.15ms
- Found SW-DP with ID 0x0BD11477
- DPIDR: 0x0BD11477
- CoreSight SoC-400 or earlier
- Scanning AP map to find all available APs
- AP[1]: Stopped AP scan as end of AP map has been reached
- AP[0]: AHB-AP (IDR: 0x04770041)
- Iterating through AP map to find AHB-AP to use
- AP[0]: Core found
- AP[0]: AHB-AP ROM base: 0xE00FD000
- CPUID register: 0x411FC271. Implementer code: 0x41 (ARM)
- Cache: L1 I/D-cache present
- Found Cortex-M7 r1p1, Little endian.
- FPUnit: 8 code (BP) slots and 0 literal slots
- CoreSight components:
- ROMTbl[0] @ E00FD000
- [0][0]: E00FE000 CID B105100D PID 000BB4C8 ROM Table
- ROMTbl[1] @ E00FE000
- [1][0]: E00FF000 CID B105100D PID 000BB4C7 ROM Table
- ROMTbl[2] @ E00FF000
- [2][0]: E000E000 CID B105E00D PID 000BB00C SCS-M7
- [2][1]: E0001000 CID B105E00D PID 000BB002 DWT
- [2][2]: E0002000 CID B105E00D PID 000BB00E FPB-M7
- [2][3]: E0000000 CID B105E00D PID 000BB001 ITM
- [1][1]: E0041000 CID B105900D PID 001BB975 ETM-M7
- [1][2]: E0042000 CID B105900D PID 004BB906 CTI
- [0][1]: E0040000 CID B105900D PID 000BB9A9 TPIU-M7
- [0][2]: E0043000 CID B105F00D PID 001BB101 TSG
- I-Cache L1: 32 KB, 512 Sets, 32 Bytes/Line, 2-Way
- D-Cache L1: 32 KB, 256 Sets, 32 Bytes/Line, 4-Way
- Executing init sequence ...
- Initialized successfully
- Target interface speed: 4000 kHz (Fixed)
- Found 1 JTAG device. Core ID: 0x0BD11477 (None)
- Connected successfully
Erasing chip ...
- 4096 sectors, 1 range, 0x60000000 - 0x6FFFFFFF
- Start of determining flash info (Bank 0 @ 0x60000000)
- ERROR: Failed to perform RAMCode-sided Prepare()
- ERROR: Error while determining flash info (Bank 0 @ 0x60000000)
- ERROR: Failed to erase chip
Disconnecting ...
- Disconnected
I was able to get the JLink working. I put it into "Serial Downloader Mode", and could then erase the flash through JFlashExe.
Upon hitting a breakpoint at the same place in the code, I'm immediately put into interactive debugging and don't have to wait for all the thread info to be read out. However, the reason seems to be that the JLink is only aware of the current callstack and has no knowledge of the other threads in the system, i.e. I'm limited to inspecting the callstack of the thread where the breakpoint was hit.
So, I think the original question still stands: any idea why it's so slow when using more than 14 threads?
If I set remotetimeout to be 1 (which seems to be the lowest usable value), it brings me to the breakpoint faster, and everything still works. It's still slow, but it's faster. Could there be a bug in some protocol that's triggered when the number of threads (or something related) gets to a certain number?
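As a rough cross-check of why lowering remotetimeout helps: the log excerpts in this thread show about 4 s per -thread-info request with the default setting (GDB's documented default for remotetimeout is 2 seconds) and two 1 s "Timed out" messages per request with remotetimeout set to 1. Assuming every thread costs the same two timeouts, which is an extrapolation from the few requests visible in the logs, the total stall can be estimated as below. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Estimated stall after a stop, in milliseconds.
 * Assumption: each per-thread query times out timeouts_per_thread times
 * (2 in the logs in this thread) before GDB gives up and continues. */
static uint32_t estimated_stall_ms(uint32_t thread_count,
                                   uint32_t remotetimeout_s,
                                   uint32_t timeouts_per_thread)
{
    return thread_count * timeouts_per_thread * remotetimeout_s * 1000u;
}
```

With the 19 threads listed in the trace further down, this gives 76000 ms at the default timeout and 38000 ms with remotetimeout 1, which is in line with the "more than 30s" delay and with "still slow, but faster".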
I enabled "set debug remote 1", and now I'm seeing this:
0000018993+00000: -> ~"Thread 7 hit Breakpoint 2, app () at /workspaces/connected2/app/src/app.cpp:652\n"
0000018993+00000: Thread 7 hit Breakpoint 2, app () at /workspaces/connected2/app/src/app.cpp:652
0000018993+00000: -> ~"652\t auto message = application_queue->receive(Wait::DontWait());\n"
0000018993+00000: 652 auto message = application_queue->receive(Wait::DontWait());
0000018993+00000: -> *stopped,reason="breakpoint-hit",disp="keep",bkptno="2",frame={addr="0x6011da94",func="app",args=[],file="/workspaces/connected2/app/src/app.cpp",fullname="/workspaces/connected2/app/src/app.cpp",line="652",arch="armv7e-m"},thread-id="7",stopped-threads="all"
0000018993+00000: mi2.status = stopped
0000018994+00001: 31-thread-list-ids
0000018994+00000: -> &"[remote] Sending packet: $qfThreadInfo#bb\n"
0000018994+00000: [remote] Sending packet: $qfThreadInfo#bb
0000019002+00008: -> &"[remote] Received Ack\n"
0000019002+00000: [remote] Received Ack
0000019002+00000: -> &"[remote] Packet received: m20015AB8,20002230,20002410,20002500,20002320,200028C0,200027D0,200026E0,200025F0,2003A810,20039158,2003B5F0,200029B0,20002AA0,20002B90,20002C80,20002D70,20002E60,20002F50\n"
0000019002+00000: [remote] Packet received: m20015AB8,20002230,20002410,20002500,20002320,200028C0,200027D0,200026E0,200025F0,2003A810,20039158,2003B5F0,200029B0,20002AA0,20002B90,20002C80,20002D70,20002E60,20002F50
0000019002+00000: -> &"[remote] Sending packet: $qsThreadInfo#c8\n"
0000019002+00000: [remote] Sending packet: $qsThreadInfo#c8
0000019002+00000: -> &"[remote] Received Ack\n"
0000019002+00000: [remote] Received Ack
0000019002+00000: -> &"[remote] Packet received: lOK\n"
0000019002+00000: [remote] Packet received: lOK
0000019002+00000: -> 31^done,thread-ids={thread-id="2",thread-id="3",thread-id="4",thread-id="5",thread-id="6",thread-id="7",thread-id="8",thread-id="9",thread-id="10",thread-id="11",thread-id="12",thread-id="13",thread-id="14",thread-id="15",thread-id="16",thread-id="17",thread-id="18",thread-id="19",thread-id="20"},current-thread-id="7",number-of-threads="19"
0000019002+00000: 32-thread-info 2
0000019002+00000: -> &"[remote] Sending packet: $qfThreadInfo#bb\n"
0000019002+00000: [remote] Sending packet: $qfThreadInfo#bb
0000019008+00006: -> &"[remote] Received Ack\n"
0000019008+00000: [remote] Received Ack
0000019008+00000: -> &"[remote] Packet received: m20015AB8,20002230,20002410,20002500,20002320,200028C0,200027D0,200026E0,200025F0,2003A810,20039158,2003B5F0,200029B0,20002AA0,20002B90,20002C80,20002D70,20002E60,20002F50\n"
0000019008+00000: [remote] Packet received: m20015AB8,20002230,20002410,20002500,20002320,200028C0,200027D0,200026E0,200025F0,2003A810,20039158,2003B5F0,200029B0,20002AA0,20002B90,20002C80,20002D70,20002E60,20002F50
0000019008+00000: -> &"[remote] Sending packet: $qsThreadInfo#c8\n"
0000019008+00000: [remote] Sending packet: $qsThreadInfo#c8
0000019008+00000: -> &"[remote] Received Ack\n"
0000019008+00000: [remote] Received Ack
0000019008+00000: -> &"[remote] Packet received: lOK\n"
0000019008+00000: [remote] Packet received: lOK
0000019008+00000: -> &"[remote] Sending packet: $qThreadExtraInfo,20002d70#44\n"
0000019008+00000: [remote] Sending packet: $qThreadExtraInfo,20002d70#44
0000019020+00012: -> &"[remote] Received Ack\n"
0000019020+00000: [remote] Received Ack
0000019020+00000: -> &"[remote] Packet received: 226d6f64756c65732f6d6963726f6f63707022203a2054585f534c454550\n"
0000019020+00000: [remote] Packet received: 226d6f64756c65732f6d6963726f6f63707022203a2054585f534c454550
0000019020+00000: -> &"[remote] Sending packet: $Hg20002d70#6e\n"
0000019020+00000: [remote] Sending packet: $Hg20002d70#6e
0000019020+00000: -> &"[remote] Received Ack\n"
0000019020+00000: [remote] Received Ack
0000019020+00000: -> &"[remote] Packet received: OK\n"
0000019020+00000: [remote] Packet received: OK
0000019020+00000: -> &"[remote] Sending packet: $g#67\n"
0000019020+00000: [remote] Sending packet: $g#67
0000019026+00006: -> &"[remote] Received Ack\n"
0000019026+00000: [remote] Received Ack
0000019026+00000: -> &"[remote] read_frame: Bad checksum, sentsum=0x25, csum=0xe2, buf=000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ce924200000000000000000\\000002B90,20002C80,20002D70,20002E60,20002F50\n"
0000019026+00000: [remote] read_frame: Bad checksum, sentsum=0x25, csum=0xe2, buf=000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ce924200000000000000000\000002B90,20002C80,20002D70,20002E60,20002F50
0000020027+01001: -> &"[remote] getpkt_or_notif_sane_1: Timed out.\n"
0000020027+00000: [remote] getpkt_or_notif_sane_1: Timed out.
0000021028+01001: -> &"[remote] getpkt_or_notif_sane_1: Timed out.\n"
0000021028+00000: [remote] getpkt_or_notif_sane_1: Timed out.
0000021028+00000: -> ~"Ignoring packet error, continuing...\n"
0000021028+00000: Ignoring packet error, continuing...
0000021029+00001: -> 32^done,threads=[{id="2",target-id="Thread 536882544",details="\"modules/microocpp\" : TX_SLEEP",frame={level="0",addr="0x00000000",func="??",args=[],arch="armv7e-m"},state="stopped"}]
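For anyone following the trace above: in GDB's remote serial protocol each packet has the form $payload#checksum, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits, and the qThreadExtraInfo reply is hex-encoded ASCII. The sketch below (helper names are hypothetical) recomputes the checksums GDB sent and decodes the thread description. It does not attempt to explain the "Bad checksum" on the reply to the g packet; it only shows how the values in the log are formed.

```c
#include <stddef.h>
#include <stdint.h>

/* RSP checksum: modulo-256 sum of the payload bytes between '$' and '#'. */
static uint8_t rsp_checksum(const char *payload)
{
    uint8_t sum = 0;
    for (const char *p = payload; *p != '\0'; p++) {
        sum = (uint8_t)(sum + (uint8_t)*p);
    }
    return sum;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode hex-encoded text (as in a qThreadExtraInfo reply) into out.
 * Returns the number of characters written (excluding the NUL), or -1
 * on malformed input or if out is too small. */
static int rsp_hex_to_text(const char *hex, char *out, size_t out_len)
{
    size_t n = 0;
    if (out_len == 0) return -1;
    while (hex[0] != '\0') {
        int hi, lo;
        if (hex[1] == '\0') return -1;      /* odd number of digits */
        hi = hex_nibble(hex[0]);
        lo = hex_nibble(hex[1]);
        if (hi < 0 || lo < 0) return -1;    /* not a hex digit */
        if (n + 1 >= out_len) return -1;    /* keep room for the NUL */
        out[n++] = (char)((hi << 4) | lo);
        hex += 2;
    }
    out[n] = '\0';
    return (int)n;
}
```

Applied to the log, rsp_checksum("qfThreadInfo") gives 0xbb and rsp_checksum("g") gives 0x67, matching the packets GDB sent, and the hex reply decodes to "modules/microocpp" : TX_SLEEP, the same text that appears in the details field of the -thread-info result.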
I received some information on StackOverflow, and it points to there being an issue on your end, either with LinkServer's GDB server, or the FW on the MCULink. See https://stackoverflow.com/questions/78837355/slow-breakpoint-triggering-with-gdb-and-linkserver-mcul... for more details.
It'd be great if this could be investigated, and an update to either LinkServer or MCULink's FW could be provided if an issue is found.
Hi @MulattoKid
Thanks for your questions; we are checking this issue and trying to find the root cause. Here is the current status:
1. We cannot reproduce the problem on a local setup. The IDE performs well even with 40+ ThreadX threads (ThreadX example). Could you provide an ELF image that runs on the reference RT1060 EVK/EVKB board? We would like to reproduce the issue based on yours.
2. J-Link is another probe with Azure RTOS GDB thread awareness support (check the User Guide); J-Link thread awareness might be faster.
B.R,
Sam
Hi @Sam_Gao,
Thanks for your response.
I've attached an ELF that's a modified version of the threadx_demo sample in the SDK for the 1060EVK. Here's how threadx_demo.c looks:
#include <assert.h>

#include "pin_mux.h"
#include "clock_config.h"
#include "board.h"
#include "tx_api.h"
#include "fsl_debug_console.h"

int main(void)
{
    /* Init board hardware. */
    BOARD_ConfigMPU();
    BOARD_InitBootPins();
    BOARD_InitBootClocks();
    BOARD_InitDebugConsole();

    PRINTF("THREADX example ...\r\n");

    /* Enter the ThreadX kernel. */
    tx_kernel_enter();

    return 0;
}

#define THREAD_COUNT (14U)
#define THREAD_STACK_SIZE (1024U)

static TX_THREAD threads[THREAD_COUNT];
static uint8_t thread_stacks[THREAD_COUNT][THREAD_STACK_SIZE];

void thread_entry(ULONG thread_input)
{
    while (1)
    {
        /* Cast so the argument matches the %i conversion. */
        PRINTF("Thread %i\r\n", (int)thread_input);
        tx_thread_sleep(100);
    }
}

/* Define what the initial system looks like. */
void tx_application_define(void *first_unused_memory)
{
    TX_THREAD_NOT_USED(first_unused_memory);

    /* All threads share the same name, entry function, priority and time slice. */
    for (uint32_t i = 0; i < THREAD_COUNT; i++)
    {
        UINT status = tx_thread_create(&threads[i], "thread N", thread_entry, i, thread_stacks[i], THREAD_STACK_SIZE, 1, 1, 1, TX_AUTO_START);
        assert(status == TX_SUCCESS);
    }
}
I'm still running this on our custom PCB, through the same setup as before (Cortex-Debug extension in VSCode, MCULink and LinkServer), but I'm now seeing the same issue happening when having 14 threads. At 14 threads the debugging is really slow, while at 13 threads it's fast.
Are you also using an MCULink with MCUXpresso IDE, or are you using a JLink?
For reference, here is our launch.json for Cortex-Debug:
{
    "configurations": [
        {
            "name": "MIMXRT1060-EVKB Blinky LinkServer cortex-debug",
            "type": "cortex-debug",
            "request": "launch",
            "servertype": "external",
            "gdbTarget": "localhost:3334",
            "cwd": "${workspaceFolder}",
            "executable": "boards/evkbmimxrt1060/azure_rtos_examples/threadx_demo/armgcc/flexspi_nor_debug/threadx_demo.elf",
            "armToolchainPath": "/home/daniel/work/nxp/arm-gnu-toolchain-13.2.Rel1-x86_64-arm-none-eabi/bin", // needed for the gdb
            "runToEntryPoint": "main", // or "ResetISR"
            "showDevDebugOutput": "raw",
            "showDevDebugTimestamps": true,
            "preLaunchCommands": [
                "show remotetimeout",
                "show remotewritesize"
            ]
        }
    ]
}
Hi @MulattoKid
We tried the same code with over 15 threads, but the problem is not reproducible.
We use RT1060 EVKB with CMSIS-DAP/J-Link(on board, LPCXpresso55S69, default) and 1170-EVK + J-Link(onboard, LPCXpresso55S69).
B.R,Sam
Hi @Sam_Gao,
Right, we haven't tried on an EVK, only our custom PCB + MCULink. We do have a standalone JLink Pro, so I hope to test that sometime next week with our setup. I'll report back here.
Thanks, please let me know if this issue can be reproduced.
Note: With more than 14 threads it works well, as shown below.
So, this is interesting: when debugging with MCULink+LinkServer+VSCode it's slow, but when debugging the same code with MCULink+IDE it's fast, even with 20 threads. This is the same ELF I shared with you.
I believe the IDE also uses LinkServer internally, so is there a way to view what arguments it's using etc.? We run LinkServer like this:
LinkServer gdbserver --keep-alive work/connected/boards/bursen_c/MIMXRT1060_linkserver_config.json
with the JSON file looking like this:
{
    "copyright": "Copyright 2023 NXP",
    "license": "SPDX-License-Identifier: BSD-3-Clause",
    "version": "1.0.0",
    "vendor": "NXP",
    "devices": [
        {
            "board": "MIMXRT1060-EVKB",
            "device": {
                "name": "MIMXRT1062xxxxB",
                "family": "MIMXRT1060",
                "memory": [
                    {
                        "location": "0x20000000",
                        "size": "0x00080000",
                        "type": "RAM"
                    },
                    {
                        "location": "0x20200000",
                        "size": "0x00080000",
                        "type": "RAM"
                    },
                    {
                        "location": "0x60000000",
                        "size": "0x01000000",
                        "type": "ExtFlash",
                        "flash-driver": "MIMXRT1060_SFDP_QSPI.cfx"
                    }
                ],
                "cores": [
                    {
                        "type": "cm7",
                        "name": "cm7"
                    }
                ]
            },
            "debug": {
                "no-packed": true,
                "protocol": "swd",
                "swo": true,
                "connect-script": "RT1060_connect.scp"
            }
        }
    ]
}
as we've altered the memory configuration.
Hi @MulattoKid
Got it. I am looking for someone on the tools team to check the arguments used by the IDE; please bear with me.
BTW, it works well with both 'RT1060 EVKB with CMSIS-DAP (MCU-Link)' and 'RT1170 EVK with J-Link'; please see below. Please note: All-Stop (the default is Non-Stop) means RTOS thread awareness is enabled. This applies to any RTOS.
Note
All-stop and non-stop refer to something else actually... But that's the effect of using all- vs. non-stop with LinkServer.
Non-Stop Mode (Debugging with GDB) (sourceware.org)
All-Stop Mode (Debugging with GDB) (sourceware.org)
As described in the User Guide, you'll see only the executing thread in Debug view, when RTOS GDB thread awareness is disabled. You'll also see in the User Guide that J-Link needs an extra activation step for RTOS GDB Thread Awareness - see "4.2 SEGGER J-Link probes"
Would you please try the NXP IDE, MCUXpresso, to reproduce it?
B.R,
Sam
Maybe I wasn't clear in my previous comment, but I did test with MCUXpressoIDE, and debugging works fine using the MCULink.
I'm really struggling to get the JLink to work with our setup. It struggles to set breakpoints etc., so it would be very useful if you could test with an MCULink.