CW7.0 slow BDM console I/O...


4,784 Views
RichTestardi
Senior Contributor II
Hi all,
 
I recently upgraded my CW from 6.4 to 7.0.  I'm running CW7.0, Build 15 on a M52221DEMO board.
 
In general, CW7.0 works fine (and even has the option to *not* hijack all file extensions at install time -- hurray!!!).
 
However, when I use the BDM console I/O, I print at about 50 baud!  It seems every character printed is taking the following code path:
 
  0: TRKAccessFile( ), at console_io_cf.c:196
  1: __access_file( ), at console_io_cf.c:312
  2: __write_console( ), at console_io_cf.c:258
  3: TERMIO_PutChar( ), at printf_tiny_IO.c:1166
  4: _out( ), at printf_tiny_IO.c:1121
  5: vprintf( ), at printf_tiny_IO.c:348
  6: printf( ), at printf_tiny_IO.c:1188
  7: main( ), at main.c:16
  8: _startup( ), at startcf.c:295
 
And it seems that the "trap #14" in TRKAccessFile is taking ~200ms.
 
Please note I am not using a UART, but rather, the "46 Trap #14 for Console I/O" as selected by the CONSOLE_INTERNAL_RAM target.  The minimal example program created by the wizard exhibits this behavior for that target.
 
In CW6.4, BDM Console I/O was *much* faster.
 
Does anyone know how to get the old behavior back?  I'd really like to stick with CW7.0 if I can.
 
Thanks in advance!
 
-- Rich
 
11 Replies

TudorS_
NXP Employee
Hello.
 
The console I/O library that uses the TRAP #14 mechanism (and includes the I/O functions such as printf()) was changed to print one character at a time, for reasons of code size on Kirin-based boards. Kirin cores generally have no external buses, so the I/O code needs to occupy as little of the microcontroller's internal RAM as possible. Hence the printf_tiny_IO.
 
You will notice that the IO speed is not affected on more powerful cores which make use of external RAM.

tim35ca
Contributor I
Can someone tell me how to turn off this new tiny_IO stuff in V7.0?


Tim

RichTestardi
Senior Contributor II
If it helps, I just use a minimal printf from the attached file now -- I don't include any library routines.  (Of course if you want more than printf, this won't help!)
 
-- Rich
 

tim35ca
Contributor I
Thanks Rich,

I can probably get some of what I need from that. I'm actually not using the console I/O -- I just use sprintf for the UART and a character LCD. I was having problems getting an answer on another thread, so I searched and found this thread referencing tiny IO. I'm still interested in how I would go about getting V7.0 to use the same library switches as V6.4. This particular project is released as source (long story), and if I have to modify the default libraries in CodeWarrior, I'll need to explain to the end users how to do the same.

Tim

TudorS_
NXP Employee
Setting the following flags in ansi_prefix.CF.size.h returns to MSL's buffered I/O, compatible with the CWCF 6.x SIZE libs:
 
#define _MSL_C_TINY_IO       0
#define _MSL_TINY_FILE_IO    0
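For context, here is a sketch of how those overrides might sit in the prefix file. The surrounding comments are illustrative only -- check your own copy of ansi_prefix.CF.size.h for its exact layout, since it varies between CodeWarrior builds:

```c
/* ansi_prefix.CF.size.h (excerpt -- hypothetical layout for illustration) */

/* 0 = buffered MSL console I/O (the CWCF 6.x SIZE behavior)
 * 1 = printf_tiny_IO, one TRAP #14 per character (the CW 7.0 default) */
#define _MSL_C_TINY_IO       0
#define _MSL_TINY_FILE_IO    0
```

After changing the prefix file, rebuild the MSL libraries (or the project, if it builds MSL from source) so the new settings take effect.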

RichTestardi
Senior Contributor II
Hi Tudor,
 
Thanks for your response!
 
> You will notice that the IO speed is not affected on more powerful cores which make use of external RAM.
 
OK -- I can always do my own buffering to make up for the one-character-at-a-time change (I seem to have plenty of RAM so far :-).
 
However, independent of the one-character-at-a-time change, it seems a single Trap #14 takes 10-20x as long for CW7.0 as it did for CW6.4 -- do you have any ideas why this might be, or if I can get the old behavior back?  That seems to be the larger problem.
 
Thanks again.
 
-- Rich

TudorS_
NXP Employee
Hi Rich.
 
Not only have the console I/O library functions been changed, but also the user TRAP #14 handling mechanism in the debugger itself, for a more unified approach across all cores. This unfortunately means degraded console I/O performance on Kirins (though I wouldn't say 10-20x). It also means the 6.4 behaviour is no longer present in 7.0. This will be subject to further optimization.
 
Your M52221DEMO board has a UART port. If you initialize it via your own code, or via the code provided in the INTERNAL_RAM target of the CodeWarrior project for this board, you can view the output of the UART version of printf() in a HyperTerminal window. This should be faster, as there is no TRAP handling involved.
 
-- Tudor

RichTestardi
Senior Contributor II
Hi Tudor,
 
Thanks for your explanation...  A unified approach is always good, and I am able to work with the console I/O as-is (after adding simple line buffering to it).
 
I'll look into using the UART, but I've actually already assigned those pins to GPIO, and I don't think I have any extra at the moment (using the 64 pin part on the DEMO board -- we'll soon have PCBs with the 100 pin part).
 
Thank you again.
 
-- Rich
 

J2MEJediMaster
Specialist I
It might have to do with the default memory access size in your .mem file changing when you upgraded. This value affects the data transfer rate between your host system and the target board, and might be the culprit. For an explanation of how to change the access size for better throughput, consult FAQ-28195. HTH.

---Tom

RichTestardi
Senior Contributor II
Hi Tom,
 
Thanks for the quick response!
 
I checked my .mem files and they are all 4-byte, like:
 
  //         Memory Map:
  //         ----------------------------------------------------------------------
  range      0x00000000 0x0001FFFF 4 Read    // 128 KByte Internal Flash Memory
  reserved   0x00020000 0x1FFFFFFF
  range      0x20000000 0x20003FFF 4 ReadWrite                 // 16 Kbytes Internal SRAM
  reserved   0x20008000 0x3FFFFFFF
  //         $IPSBAR_BASE   $IPSBAR_BASE + 0x1FFFFF // Memory Mapped Registers
  reserved   $IPSBAR_BASE + 0x200000  0xFFFFFFFF
 
So I think they are OK.
 
I'm experiencing more like a factor of 1000 slowdown than a factor of 4, as well. :-(
 
-- Rich

 

RichTestardi
Senior Contributor II
And I have a bit more info...
 
It seems the slow BDM console I/O has two separate causes:
 
1. in CW7.0, every character printed results in a "Trap #14"; whereas, in CW6.4, every *line* printed results in a "Trap #14", and
2. in CW7.0, the "Trap #14" seems to have slowed down by 10-20x or so.
 
Anyway, if I'm printing 50 character lines, printf2(), below, goes 50x faster than printf(), just by doing simple line buffering:
 
//**************************************************************************
 
#include <stdarg.h>
#define assert(x)  if (! (x)) { asm { halt } }
 
static
void TERMIO_PutChar2(char ch)
{
    static char buffer[128];
    static int i;
   
    assert(i < sizeof(buffer)); /* check before writing to avoid overflow */
    buffer[i++] = ch;
    if (ch == '\n') {
        __write_console(1, buffer, &i, 0L);
        i = 0;
    }
}
 
int printf2(const char *, ...);
int printf2(const char *format, ...)
{
    int i;
    va_list args;
    set_printf(TERMIO_PutChar2); /* set up TERMIO_PutChar2 for writing */
    va_start(args, format);
    i = vprintf(format, args);
    va_end(args);
    return i;
}
 
//**************************************************************************