A couple of things jump out. First of all, the PC and SP values that you see come from the start of the S19 file: they are the first two words of the app's vector table, and they are fixed by the linker at build time.
Example S19 file header, 2nd line (spaces added for clarity):
S3510000A000 00FC0020 F1220100....
so the app will load at 0x0000A000
SP will be set to 0x2000FC00 (data bytes read little-endian)
PC will be set to 0x000122F1 (read little-endian; bit 0 is the Thumb bit, so the code itself starts at 0x000122F0)
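If you want to check these values without flashing anything, the decoding above can be sketched in a few lines of C. This is only a minimal sketch: the names `s19_read_vectors` and `s19_vectors_t` are my own (they are not from AN4368), it assumes the record is an S3 record that starts exactly at the vector table, and it does not verify the record checksum.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type, not part of AN4368. */
typedef struct {
    uint32_t load_addr; /* address field of the S3 record      */
    uint32_t sp;        /* first vector table word (initial SP) */
    uint32_t pc;        /* second vector table word (reset PC)  */
} s19_vectors_t;

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read 8 hex characters as a 32-bit value.
 * little_endian = 0: first byte is the most significant (address field).
 * little_endian = 1: first byte is the least significant (data as stored
 * in Flash). */
static int read_u32(const char *p, int little_endian, uint32_t *out)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        int hi = hex_nibble(p[2 * i]);
        int lo = hex_nibble(p[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        uint32_t byte = (uint32_t)((hi << 4) | lo);
        if (little_endian) v |= byte << (8 * i);
        else               v = (v << 8) | byte;
    }
    *out = v;
    return 0;
}

/* Parse the S3 record that carries the start of the vector table.
 * Layout: "S3", 2 chars byte count, 8 chars address, then data.
 * Returns 0 on success, -1 on a malformed or too-short record. */
int s19_read_vectors(const char *rec, s19_vectors_t *out)
{
    if (strlen(rec) < 28 || rec[0] != 'S' || rec[1] != '3') return -1;
    if (read_u32(rec + 4, 0, &out->load_addr) != 0) return -1;
    if (read_u32(rec + 12, 1, &out->sp) != 0) return -1;
    if (read_u32(rec + 20, 1, &out->pc) != 0) return -1;
    return 0;
}
```

Feeding it the start of the example record with the spaces removed (`"S3510000A00000FC0020F1220100"`) gives the same three values as listed above.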
The printf's added to the original AN4368 make it easy to detect an error in these values. The AN4368 bootloader "lives" in an area of Flash below the app in the memory space, and it must be protected from being overwritten. So the app must have all of its code past the end of the bootloader. The bootloader uses IMAGE_ADDR to set the start of usable memory where the app can reside. It is important that your linker file uses the same start point for the app, so that it agrees with IMAGE_ADDR.
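As a rough illustration of the kind of check those printf's let you do by eye, here is a sketch of a sanity test on the three values. The function name and the `FLASH_END`, `SRAM_START` and `SRAM_END` constants are my own assumptions based on the K70 linker map in this post (1 MB of Flash, stack in internal SRAM); adjust them for your part. `IMAGE_ADDR` is the bootloader's setting, with the 0xA000 value used in this post.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, taken from the K70 memory map in this post. */
#define IMAGE_ADDR 0x0000A000u /* start of app area, as in the bootloader */
#define FLASH_END  0x00100000u /* one past the last Flash address         */
#define SRAM_START 0x1FFF0000u /* internal SRAM origin                    */
#define SRAM_END   0x20010000u /* SRAM origin + 0x00020000                */

/* Returns true if the values decoded from the S19 file are plausible
 * for an app linked to run behind the bootloader. */
bool vectors_look_sane(uint32_t load_addr, uint32_t sp, uint32_t pc)
{
    /* The image must start exactly where the bootloader expects it. */
    if (load_addr != IMAGE_ADDR) return false;

    /* Initial SP: word aligned and inside SRAM. The top of SRAM itself
     * is allowed because the stack grows downwards. */
    if ((sp & 0x3u) != 0u) return false;
    if (sp <= SRAM_START || sp > SRAM_END) return false;

    /* Reset vector: Thumb bit set, target inside the app's Flash area. */
    if ((pc & 0x1u) == 0u) return false;
    uint32_t entry = pc & ~0x1u;
    if (entry < IMAGE_ADDR || entry >= FLASH_END) return false;

    return true;
}
```

With the example values, `vectors_look_sane(0x0000A000, 0x2000FC00, 0x000122F1)` passes. An app built with the debug linker file would be rejected because it loads at 0x00000000 instead of IMAGE_ADDR.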
For an MQX project that I'm using with an AN4368-type loader, I have two linker files: one for debugging without the loader, and the other for production. Below is the start of the linker file for a K70 project, basically the default settings kicked out by CW for an MQX example:
vectorrom (RX): ORIGIN = 0x00000000, LENGTH = 0x00000400
cfmprotrom (R): ORIGIN = 0x00000400, LENGTH = 0x00000020
rom (RX): ORIGIN = 0x00000420, LENGTH = 0x000FFBE0 # Code + Const data
ram (RW): ORIGIN = 0x70000000, LENGTH = 0x08000000 # DDR2 - RW data
sram (RW): ORIGIN = 0x1FFF0000, LENGTH = 0x00020000 # SRAM - RW data
Now this is the production mapping when using the loader (note that all three Flash regions move up by 0xA000, and the length of "rom" shrinks by the same amount):
vectorrom (RX): ORIGIN = 0x0000A000, LENGTH = 0x00000400 <---- this agrees with IMAGE_ADDR in the bootloader
cfmprotrom (R): ORIGIN = 0x0000A400, LENGTH = 0x00000020
rom (RX): ORIGIN = 0x0000A420, LENGTH = 0x000F5BE0 # Code + Const data
ram (RW): ORIGIN = 0x70000000, LENGTH = 0x08000000 # DDR2 - RW data
sram (RW): ORIGIN = 0x1FFF0000, LENGTH = 0x00020000 # SRAM - RW data
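The Flash numbers in both maps follow from the image start address, so it is easy to recompute them if you ever move IMAGE_ADDR. A small sketch of the arithmetic is below. The struct and function names are made up for illustration, and `FLASH_END` assumes the 1 MB Flash implied by the default map above.

```c
#include <stdint.h>

/* Assumed: 1 MB of program Flash, as implied by the default K70 map. */
#define FLASH_END      0x00100000u
#define VECTORROM_LEN  0x00000400u
#define CFMPROTROM_LEN 0x00000020u

/* Hypothetical types, only used for this illustration. */
typedef struct {
    uint32_t origin;
    uint32_t length;
} region_t;

typedef struct {
    region_t vectorrom;
    region_t cfmprotrom;
    region_t rom;
} flash_map_t;

/* Derive the three Flash regions from the address where the image starts
 * (0 for the debug build, IMAGE_ADDR for the production build). */
flash_map_t flash_map_for(uint32_t image_addr)
{
    flash_map_t m;

    m.vectorrom.origin  = image_addr;
    m.vectorrom.length  = VECTORROM_LEN;

    m.cfmprotrom.origin = image_addr + VECTORROM_LEN;
    m.cfmprotrom.length = CFMPROTROM_LEN;

    /* "rom" takes whatever is left up to the end of Flash. */
    m.rom.origin = m.cfmprotrom.origin + CFMPROTROM_LEN;
    m.rom.length = FLASH_END - m.rom.origin;

    return m;
}
```

`flash_map_for(0x0000A000)` reproduces the production numbers (rom at 0x0000A420 with length 0x000F5BE0), and `flash_map_for(0)` reproduces the debug ones.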
Also make sure you have enabled RAM usage for the MQX vector table!!
It is possible to debug the startup of the app from the loader, but it's a bit of a nightmare. However, it is essential to do this if you have some kind of clock startup issue in the app, so here is how I've done it:
1. Load up the debug version of app in CW, have it halt at startup entry point (not 'main' in your app, I mean the startup code itself).
2. Step through the startup code, or set a breakpoint to halt at the jump to the MQX app (startup code varies with processor, but you will find a 'jsr' to main somewhere; on ARM parts such as the K70 it will be a 'bl' or 'blx' instead).
3. Step into that call to the main app entry and follow through the code in disassembly mode. You can use 'step over' for this.
4. When you get accustomed to the disassembly of the app's startup it gets easier. We really only need to see where the clocks, PLL and GPIO get set up.
5. This is the hard part: get a printout of the disassembled code. CW doesn't lend itself to this, so I had to capture screens using Windows ALT-PRTSC to copy to the clipboard and then stick each screenshot in a Word or WordPad document. Now you have an image of the disassembled MQX startup code.
6. Presuming that you have a good app, set up CW to debug the bootloader. Put a breakpoint in the 'Switch_mode' func ahead of the jump to app.
7. Run the loader, and let it load the app. It will halt at 'Switch_mode'.
8. Step into your app from the loader. Unfortunately, all you will see is the disassembled code (that's why we made the printout earlier).
9. The big debug technique: use 'step over' until you hit something that fails. It will be obvious: if the clocks are not initialized, CW will complain about bad memory space and control will be lost, or the app will lock up as you said, waiting for PLL lock.
This is all very messy, but I found that my app crashed very early on so it was entirely practical to do the steps above. When the chips are down (pun intended) you have to resort to this level of debug, which effectively takes us back to the programmer's world of 1980.
Another observation: if you can, add early markers to your app, such as a GPIO pin toggle or serial output, so you can see whether the app is running very early on. If the pulse period or the serial output is wrong, then you clearly have a clock setup problem. The best way to debug this is to run the app by itself and, presuming it runs successfully, halt somewhere, go to the various CPU peripheral regs and capture (screenshots again, CW has no capture or print for this) all the important regs for the successfully running app. Then compare them with the same regs after the unsuccessful bootload of the app to see what's different. You will either have to explain away the differences, or you will stumble onto a critical difference that is the source of the error.
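If you would rather not compare screenshots by eye, you can type the captured values into two tables and let a few lines of C do the comparison. This is just a sketch of the idea: the `reg_sample_t` type and `reg_diff` function are my own invention, and it assumes both tables list the same registers in the same order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One captured register: a label and the value read in the debugger. */
typedef struct {
    const char *name;
    uint32_t    value;
} reg_sample_t;

/* Compare two captures taken at the same point in the app: 'good' from
 * the standalone run, 'bad' from the run started by the bootloader.
 * Both arrays must hold the same registers in the same order.
 * Prints each mismatch to 'out' (pass NULL to stay silent) and returns
 * the number of registers that differ. */
size_t reg_diff(const reg_sample_t *good, const reg_sample_t *bad,
                size_t n, FILE *out)
{
    size_t diffs = 0;

    for (size_t i = 0; i < n; i++) {
        if (good[i].value != bad[i].value) {
            diffs++;
            if (out != NULL) {
                fprintf(out, "%-12s standalone=0x%08lX  via loader=0x%08lX\n",
                        good[i].name,
                        (unsigned long)good[i].value,
                        (unsigned long)bad[i].value);
            }
        }
    }
    return diffs;
}
```

Fill the two arrays with whatever you captured (the clock generator and clock divider regs are the obvious candidates for a clock problem) and call `reg_diff(good, bad, n, stdout)`. Every line it prints is a difference you need to explain.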
This whole discussion of course relies on the problem being in software, not hardware. You must verify that VCC and RESET on the CPU are within spec and that there's not some other hardware issue.