-- FIXED -- Code works via JTAG but not after USB DFU -- FIXED --

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Sun Oct 26 10:09:26 MST 2014
I am working on a project with a client using the LPC4357. We have a hard deadline of Nov 3 to ship the product to a first customer. I’ll try to succinctly describe the issue and then I can supply more data to work through it.

Up until today, I have been using the debugger (LPC-LINK-2 and RedProbe) to load code. All is well, as I am just trying to get the first version of the application code ready. Our hardware is set up for DFU boot on USB0. We are shipping the customer an initial version and will be providing them some updates down the road, so this piece is important.

1.) I have verified that our DFU boot circuit works by making a simple IO test program. It simply toggles an IO on the M4 core. I generate the BIN file with the correct checksum and program it with no issues using the DFUSec utility.
2.) As I am using both cores in the LPC4357, I have a second BIN file for the second core, and it programs using another step in the DFUSec utility. The other BIN is generated in the M0 core project.
3.) At this point I believe I am correctly generating BIN files with proper checksums, as the test program boots up just fine.
4.) When I tried to load my real application, I noticed that everything seemed to hang early in initialization. (I have a buzzer and LCD on our hardware for output.) Keep in mind, my code works fine over the JTAG debugger. No issues whatsoever.
5.) So, I started using a bisect method on my code (commenting out init code to see where things were hanging). I was able to find that calls to the USB libraries (I am using the ROM driver on USB0 and the USB host software driver on USB1 from LPCOpen) were the issue. It appears (hard to tell without a debugger) that the program hangs around the clock/PLL routines. I have not gotten to the exact problem, as it first seemed like USB1, but after removing that init call, USB1 and the SD card are also a problem.

6.) There is something different about the DFU path, because when I use JTAG the code runs just fine. After it is loaded via JTAG, it restarts fine after power-up, etc.

Now, I am able to keep working with the JTAG until the ship date but I really need to figure this one out.    I searched the forums and could not find a similar issue.

I will supply you with whatever you guys need to work this one out. I am a bit frantic here (as evidenced by my weekend work schedule). Any help would be appreciated.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Wed Oct 29 08:24:51 MST 2014
I verified the solution in my last post. Everything appears to work correctly!

Thank you for all the help. It is appreciated.

I changed the forum thread title so people know that the problem has a solution

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Tue Oct 28 10:52:06 MST 2014
Ok.....

I think I almost have it. I just did a load and I *think* everything is OK. Here is my build script.

I took the safe approach and included everything that can possibly target the FLASH. It seemed easier than figuring out what to remove.

I can load without a crash... More testing to do, but I think I am almost out of the shark tank. I'll post with final results.

arm-none-eabi-size "${BuildArtifactFileName}"

arm-none-eabi-objcopy -v  -j ".text" -j ".data" -j ".data_RAM2" -j ".data_RAM2" -j ".data_RAM3" -j ".data_RAM4" -j ".data_RAM5" -j ".data_RAM6" -j ".data_RAM7" -j ".data_RAM8" -j ".ARM.extab" -j ".ARM.exidx" -O binary "${BuildArtifactFileName}" "${BuildArtifactFileBaseName}.bin"

arm-none-eabi-objcopy -v  -j ".text_Flash2" -O binary "${BuildArtifactFileName}" "TM0.bin"

checksum -p ${TargetChip} -d "${BuildArtifactFileBaseName}.bin"

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Tue Oct 28 09:22:34 MST 2014
That sounds like a possibility, but here are 2 other data points.

1.) I used Hex Workshop to do a binary compare of 0x1A000000 to 0x1A080000 against 2 files:

a.)   The .bin from my objcopy
b.)   An export of the memory after I have JTAG downloaded the code

They are the same

2.)   I have attached the linker files. Starting at line 36:

/* MAIN TEXT SECTION */
    .text : ALIGN(4)
    {
        FILL(0xff)
        __vectors_start__ = ABSOLUTE(.) ;
        KEEP(*(.isr_vector))

        /* Global Section Table */
        . = ALIGN(4) ;
        __section_table_start = .;
        __data_section_table = .;
        LONG(LOADADDR(.data));
        LONG(    ADDR(.data));
        LONG( SIZEOF(.data));

It looks like .data gets placed in .text for me by the linker. It seemed like this would get me what I want.
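For context, the LONG(...) entries in the linker fragment above record only where .data is loaded from, where it runs, and how large it is. The initial values themselves would then sit in their own load image in flash. Below is a minimal host-side sketch of the copy-down a startup routine might perform with such an entry. The struct, the function name, and the byte-array model of flash and RAM are assumptions made for illustration, not the LPCXpresso startup code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one entry of the data section table.
 * Offsets stand in for the addresses LOADADDR()/ADDR() would emit. */
typedef struct {
    uint32_t load_off;  /* LOADADDR(.data): where the initial values sit in flash */
    uint32_t run_off;   /* ADDR(.data): where the variables live in RAM */
    uint32_t size;      /* SIZEOF(.data) in bytes */
} data_section_entry;

/* Copy-down as a reset handler would do it before main() runs.
 * Whatever bytes are in flash at load_off end up in the variables. */
static void copy_down(const data_section_entry *e,
                      const uint8_t *flash, size_t flash_len,
                      uint8_t *ram, size_t ram_len)
{
    assert((size_t)e->load_off + e->size <= flash_len);
    assert((size_t)e->run_off + e->size <= ram_len);
    memcpy(ram + e->run_off, flash + e->load_off, e->size);
}
```

The point of the sketch is that the table in .text carries no data of its own: if the bytes at the load offset never reach flash, the copy still runs and simply copies whatever happens to be there.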

lpcware · ‎06-15-2016

Content originally posted in LPCWare by lpcxpresso-support on Tue Oct 28 08:39:43 MST 2014
So, there is your problem...

You are only converting the .text section. An image in flash consists of the code (.text) and the initialised data (.data).

As a MINIMUM, you also need to convert the .data section(s). Your objcopy command will need to change to remove sections you don't want, rather than only keeping a single section. You will need to examine your image to see what other sections need to be kept.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Tue Oct 28 08:08:04 MST 2014
The map file is attached.

Here is the post build step:

arm-none-eabi-size "${BuildArtifactFileName}"
arm-none-eabi-objcopy -v -j ".text" -O binary "${BuildArtifactFileName}" "${BuildArtifactFileBaseName}.bin"
arm-none-eabi-objcopy -v -j ".text_Flash2" -O binary "${BuildArtifactFileName}" "TM0.bin"
checksum -p ${TargetChip} -d "${BuildArtifactFileBaseName}.bin"
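As background on the checksum step: to the best of my knowledge, the LPC43xx boot ROM treats a flash image as valid user code when the first eight vector table words sum to zero, so the word at offset 0x1C holds the two's complement of the sum of entries 0 to 6. The sketch below computes that value on the host. The function name is hypothetical, and whether the checksum utility does exactly this is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Value to store in vector table entry 7 (offset 0x1C) so that the
 * 32-bit sum of entries 0..7 wraps to zero. Entry 7 of the input is
 * ignored. Assumption: this mirrors the boot ROM's validity check. */
static uint32_t lpc_vector_checksum(const uint32_t vectors[8])
{
    uint32_t sum = 0;
    for (size_t i = 0; i < 7; i++) {
        sum += vectors[i];          /* unsigned arithmetic wraps modulo 2^32 */
    }
    return 0u - sum;                /* two's complement of the sum */
}
```

Note that a correct checksum only covers the vector table. It says nothing about whether the rest of the image, such as initialised data, made it into the BIN file.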

lpcware · ‎06-15-2016

Content originally posted in LPCWare by lpcxpresso-support on Tue Oct 28 06:40:50 MST 2014
Sounds to me like you are missing some initialisation data.

So
1. How are you generating your image?
2. How are you converting to binary?

Post your map file and the exact command line for converting to binary.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Tue Oct 28 06:28:11 MST 2014
OK.

I got the debugger to attach while running. I also have an interrupt handler for the HardFault to look at the stacked registers. I used a while(Variable == SomeValue){} loop to hold my program so I could debug.
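For readers unfamiliar with the stacked registers: on exception entry a Cortex-M core pushes eight words (r0 to r3, r12, lr, pc, xPSR) onto the active stack, and bit 2 of the EXC_RETURN value in LR tells the handler whether that was the main or the process stack. Here is a minimal sketch of decoding such a frame, written so it can be exercised on a host. The type and function names are hypothetical, and the basic frame without floating-point state is assumed.

```c
#include <stdint.h>

/* Register order of the basic Cortex-M exception frame. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} stacked_frame;

/* Decode the eight words at the stack pointer that was active
 * when the fault was taken. */
static stacked_frame decode_frame(const uint32_t *sp)
{
    stacked_frame f = { sp[0], sp[1], sp[2], sp[3],
                        sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* Bit 2 of EXC_RETURN: 1 means the frame is on the process stack
 * (PSP), 0 means it is on the main stack (MSP). */
static int frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & 0x4u) != 0u;
}
```

The stacked pc is the address to look up in the map file or disassembly. For an imprecise fault it may point some instructions past the one that caused the problem.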

A couple of interesting things:

1.) This was the start of the program:

volatile uint32_t DebugStop = 1;
int main(void)
{

while(DebugStop == 1){}

The code would pass the while loop after being programmed from DFU! But if I download via JTAG, it works. If I change to:

volatile uint32_t DebugStop = 0;
int main(void)
{

while(DebugStop == 0){}

it works after USB DFU! It APPEARS that the copy-down is not happening correctly!

2.) I get a hard fault in calls to rand(). I have attached an image of the disassembly window.

What is scary to me is that rand() appears to use malloc. Since I cannot see the source code, I cannot verify this. I looked at the sources for Newlib and did not see a malloc in rand(). I am starting to think that maybe the flash is getting corrupted by DFU. I cannot imagine rand() using malloc. Also, there is no "free" in the disassembly window.

It seems to be crashing many instructions after rand() (highlighted). Now, this was the value of the stacked PC in my hard fault routine. I can verify that it was the rand() function crashing, as it would go to the hard fault during a single step.

Now, if I remove the rand() calls, I simply get the crash somewhere else.

This new data confuses me even more... It almost seems that the flash isn't getting programmed correctly... The copy-down/init problem is very curious (unless I am mistaken about how that works, but I believe the problem in 1.) is genuine).
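One reading of observation 1.) that is consistent with the fix found later in the thread: a variable initialised to 1 lives in .data and takes its value from flash, while a variable initialised to 0 is typically placed in .bss and is simply zero-filled. If the BIN file carries only .text, the flash behind it may still be in its erased state (0xFF). The host-side sketch below models that. The sizes, names, and the placement rule for zero initialisers are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

enum { TEXT_SIZE = 16, DATA_SIZE = 4, FLASH_SIZE = 32 };

/* Build a model flash image. Erased flash reads 0xFF. When include_data
 * is 0, only the first TEXT_SIZE bytes (".text") are programmed, as with
 * an objcopy that keeps only the .text section. */
static void build_flash(uint8_t flash[FLASH_SIZE], uint32_t init_value,
                        int include_data)
{
    memset(flash, 0xFF, FLASH_SIZE);     /* erased state */
    memset(flash, 0x00, TEXT_SIZE);      /* placeholder for code bytes */
    if (include_data) {
        memcpy(flash + TEXT_SIZE, &init_value, DATA_SIZE);  /* .data load image */
    }
}

/* What the variable holds when main() starts. A zero initialiser is
 * assumed to land in .bss (zero-filled, nothing read from flash); any
 * other initialiser lands in .data and is copied from flash. */
static uint32_t value_after_startup(const uint8_t flash[FLASH_SIZE],
                                    uint32_t init_value)
{
    uint32_t v = 0;                      /* .bss: cleared by startup code */
    if (init_value != 0) {
        memcpy(&v, flash + TEXT_SIZE, DATA_SIZE);  /* .data copy-down */
    }
    return v;
}
```

With a .text-only image the model yields 0xFFFFFFFF for DebugStop = 1, so while(DebugStop == 1){} falls straight through, whereas DebugStop = 0 still reads 0 and the loop holds. A JTAG download programs the complete image, which would explain why both variants behave there.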

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Mon Oct 27 13:54:52 MST 2014
I put the stack in an easy place for me to inspect how "deep" it is going. In the examples I tested, it is only about 25%.

It is very possible that there is a routine during startup that is using the vStackTop symbol (although I have it fixed to the correct location as well). It may be that this is causing the problem.

BTW: The problem does occur under both Redlib and Newlib.

As far as the JTAG load goes, I can understand that there is some init happening during the process, BUT the code does work from a cold boot after a JTAG load. It is my understanding that the debugger does not insert additional code in FLASH.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by ehughes on Mon Oct 27 13:47:48 MST 2014
Thanks for the tips.

I got setup on Sunday with the "attach" process.   I was trying to insert the break point instruction to stop the code but your suggestion seems better.

I installed a HardFault handler (found on site) that allows me to trace what is going on. From my preliminary investigation, there is a hard fault occurring, I just don't know where. I can't understand why it would be different because code was loaded via JTAG vs DFU. Should it not be the same binary image?

I think the main PLL is OK. My code makes it quite a bit in (long after board_sysinit()). I do not use printf (although I do use sprintf). None of the code uses malloc (not sure about the USB code).

Here is another random piece of information:

1.) The M4 stack is not in the first RAM section. I defined another section and manually put it there (by changing the first entry of the IRQ table to point to an array I allocated in another area). I would assume that if this strategy was bad, I would have seen it via a JTAG load. I was trying to pull the stack out of the first 32K area to leave that exclusively for variables.

Time is ticking... I'll post more later.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by TheFallGuy on Mon Oct 27 10:32:05 MST 2014

Quote:
When relocating the SP maybe you must also tell the linker that the top of stack is now elsewhere. I have not done this myself, just speculating. Otherwise there could evolve conflicts with BSS/heap/data during runtime.

You don't need to tell the linker anything about the stack. However, you are right to speculate that you have to be careful with placement. There is no stack checking, so if the stack is not large enough, it is very easy to overwrite whatever is adjacent to it in RAM - either resulting in stack corruption, or data corruption.
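Since there is no hardware stack checking, a common way to see how deep a stack has gone is to paint it with a known pattern at startup and later count how many words were overwritten. A minimal sketch, assuming a descending stack held in a word array. The pattern and the function names are arbitrary choices for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu   /* arbitrary pattern chosen for this sketch */

/* Fill the whole stack area with the pattern before it is used. */
static void stack_paint(uint32_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* The stack grows downward from stack[words - 1] toward stack[0], so the
 * untouched part is the run of pattern words starting at index 0. */
static size_t stack_words_used(const uint32_t *stack, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}
```

A result equal to the full size means the stack has at least reached its limit and may have run into whatever sits below it in RAM.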

lpcware · ‎06-15-2016

Content originally posted in LPCWare by mch0 on Mon Oct 27 09:07:56 MST 2014
OK, looks like the PLL is not your problem.

sprintf() may still implicitly use malloc(), although I think it is not necessary. But who knows?
If you have nothing special in it, you could probably switch to nanolib just to see whether the problem goes away.

The main difference between JTAG-download and DFU is obviously, that the chip is already partly initialized when DFU was running before your code. The actual initialization depends on the boot mode (see Fig 14 in the UM).

When relocating the SP maybe you must also tell the linker that the top of stack is now elsewhere. I have not done this myself, just speculating. Otherwise there could evolve conflicts with BSS/heap/data during runtime.

But since it did work with JTAG-download it should not make a difference, unless the DFU-bootloader merely jumps to the entry point instead of also loading SP from the vector table (as would a real reset). You checked that the SP is relocated indeed after DFU-induced restart?

Good luck anyway,

Mike

lpcware · ‎06-15-2016

Content originally posted in LPCWare by lpcxpresso-support on Mon Oct 27 02:16:39 MST 2014
You can debug your application after you have downloaded it using "attach mode" debugging.

Attach mode debugging allows you to connect to a running system, and does not download the application to flash.

To do this, edit your Launch Configuration:
http://www.lpcware.com/content/faq/lpcxpresso/launch-configuration-menu
and change the "Attach only" option to "true"

Make sure you are debugging the image used to create the binary.

You can then download your binary using DFUSec and reset it, to start running. You can then start the Attach mode debug session to see where your code is. If you want to stop your code from running until you have attached the debugger, the simplest way of doing this is to put this code at the place you want to stop:

volatile int wait_for_debugger = 1 ;
while (wait_for_debugger)
   ;

When the debugger attaches, it will be stuck in this loop. You can then use the debugger to change the wait_for_debugger variable and then debug your application.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by mch0 on Sun Oct 26 12:20:40 MST 2014
Hi,

I had a similar problem: when loading via JTAG all was well, but when booting "cold" (in my case from SPIFI) the system died.
Thread is here:
Thread

In my case the problem was a faulty malloc() in Redlib. This has been corrected by NXP, see thread. You can follow the instructions there if that's your problem.

But much earlier I also ran into a problem when changing CPU frequency (main PLL) and your description rang a bell.
In my case the 4370 (similar to 4357) would die when I called the supplied LPCOpen PLL-routine.
It did not die when called from SPIFI but locked when called from SRAM. For the LPC4357 I can't say since you probably execute from internal flash which the LPC4370 does not have.

Maybe you can test that by checking whether you call Chip_Clock_SetupMainPLL() at some point. If so, restrict it to a change below 110 MHz and see whether your code still freezes. If this function turns out to be the problem, you may use my replacement (comes without guarantee but works for me rock stable).

/* Directly set the PLL1 frequency
 *
 * Input Clock is divided by 3 to obtain a finer range.
 * For 12MHz crystal we can thus get CCO frequencies that are a multiple of 4 in the range from 156 to 320 MHz.
 * This is important if Ethernet is used, because we want to have a 50 MHz clkout signal for the PHY chip.
 * We can derive that frequency if we use 50/100/150/200 MHz as the PLL1 output clock.
 */
uint32_t Chip_Clock_SetupMainPLL_mch(CHIP_CGU_CLKIN_T Input, uint32_t fout)
{
    volatile uint32_t delay = 1000;
    uint32_t PLL1Reg, cco, msel, nsel = 2, psel;
    uint32_t fin = Chip_Clock_GetClockInputHz(Input)/(nsel+1);

    // sanity checks
    if ((fout>204000000) || (fout <10000000)) return 0;

    // determine required postscaler
    cco=fout;
    psel=0;
    if (fout < 156000000)
    {
        cco*=2;
        for (; cco<156000000; cco*=2)
        {
            psel++;
        }
    }
    // determine M
    msel=cco/fin-1;

    // now set up new loop parameters without direct mode on
    PLL1Reg = (Input << 24) | (msel << 16) | (nsel << 12) | (1 << 11) | (psel << 8);
    LPC_CGU->PLL1_CTRL = PLL1Reg;

    // wait until PLL1 has locked
    while (!Chip_Clock_MainPLLLocked());

    // connect base clock to PLL1 now
    Chip_Clock_SetBaseClock(CLK_BASE_MX, CLKIN_MAINPLL, true, false);

    /* Wait for at least 50uSec */
    while(delay--) {}

    // if f>156 MHz use direct output (no postscaler)
    if (fout > 156000000)
    {
        PLL1Reg |= (1<<7);
        LPC_CGU->PLL1_CTRL = PLL1Reg;
    }

    return Chip_Clock_GetMainPLLHz();
}
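To check the divider arithmetic without hardware, the same calculation can be pulled into a pure function and run on a host. This is a sketch that mirrors only the arithmetic of the routine above. The struct and function names are hypothetical, and no registers are touched.

```c
#include <stdint.h>

/* Divider settings the routine above would derive for a target frequency. */
typedef struct {
    uint32_t nsel;    /* pre-divider field (input divided by nsel + 1) */
    uint32_t msel;    /* feedback divider field (multiply by msel + 1) */
    uint32_t psel;    /* post-divider field */
    uint32_t cco_hz;  /* CCO frequency the loop settles on */
    int      direct;  /* 1 if the routine finally enables direct output */
} pll1_params;

/* Returns 1 and fills *out on success, 0 if fout_hz is outside the
 * 10 MHz to 204 MHz range accepted by the original routine. */
static int pll1_compute(uint32_t xtal_hz, uint32_t fout_hz, pll1_params *out)
{
    const uint32_t nsel = 2;
    const uint32_t fin = xtal_hz / (nsel + 1);
    uint32_t cco = fout_hz;
    uint32_t psel = 0;

    if (fout_hz > 204000000u || fout_hz < 10000000u) {
        return 0;
    }
    if (fout_hz < 156000000u) {
        cco *= 2;
        for (; cco < 156000000u; cco *= 2) {
            psel++;                  /* each doubling adds one postscaler step */
        }
    }
    out->nsel = nsel;
    out->msel = cco / fin - 1;
    out->psel = psel;
    out->cco_hz = cco;
    out->direct = fout_hz > 156000000u;
    return 1;
}
```

With a 12 MHz crystal the divided input is 4 MHz, so a 204 MHz target gives msel = 50 with direct output, while a 50 MHz target gives a 200 MHz CCO with msel = 49 and psel = 1.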

Hope you can match your deadline, good luck!

Mike

LPC43xx