HELP - project suddenly stopped booting unless debugging session active

3,323 Views
comsosysarch
Contributor III

Working with IAR on bare-metal C/asm project.

 

I have had everything running for months, working through various issues until mid-August I made a successful demo.

Stashed away the demo code in an archive and a few weeks later started making mods to add more functionality.

 

At some point over a two-week period, I ran into a problem where my code would boot up and run just fine when a debug session was active, and I could even disconnect from the debugger and it would keep running. BUT... any attempt to reset or reboot without an active debug session would cause the processor to hang (and obviously since the debugger is not attached I do not know what is going on).

 

The stuff I added is code to make a few ASCII commands non-blocking (i.e. a state machine call in lieu of directly sequential processing). I also added the SYSTICK timer to assist in this.

 

For a while I tried commenting out chunks of code to find the offending piece. That had me chasing down an array initializer, but when I left -just- that code in, it also booted just fine. Not sure if there is some issue with improper data overlay or what. IAR thinks it is a Freescale Kinetis issue, although they are not more specific.

 

Any ideas on what to try? Or how to gather more info without using the debugger and thereby avoiding the issue?

 

Maybe I ship a few thousand units with the J-Link attached? And require the customer to hook it to a PC, running a licensed copy of IAR? Mmmmm... maybe not.

 

So what can I do other than wipe out two weeks of coding, retry, and maybe wind up down the same cul-de-sac?

 

I do not think my CPU is inadvertently locked. And I had the luxury of flashing another CPU and watching it do the same thing. Luckily I am able to revert to the older code in the archive and have a working product. But I'd like to have those features I worked on.

 

0 Kudos
Reply
1 Solution
2,617 Views
mjbcswitzerland
Specialist V

Hi

 

Do you work with the watchdog or disable it? The debugger may be disabling it for you and so cause things to work normally when it is connected.

 

When not working with the watchdog it needs to be disabled within a few instructions after a reset otherwise an attempt to write it will cause a watchdog reset to fire. Is it possible that you added a few instructions to the startup code which delay this process and trigger the problem?

 

See: http://www.utasker.com/docs/KINETIS/uTaskerV1.4_Kinetis_demo.pdf

and http://www.utasker.com/forum/index.php?topic=1664.0 

 

Regards

 

Mark

 


0 Kudos
Reply
9 Replies
2,617 Views
RobertSignum
Contributor I

Hi,

 

On Cortex, certain DAP registers may change the behaviour of the core (for example disable the watchdog or override sleep modes etc.). In other words, the CPU may never sleep while running under control of the debugger, whereas in the real case it may go to sleep and (maybe?) never wake up.

 

Please be aware that running code may also affect the debug registers (though hardly anybody does that on purpose).
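
One way to make the "debugger present or not" difference visible from the code itself is to read the core's debug control register. A minimal sketch, assuming a Cortex-M3/M4 core where bit 0 (C_DEBUGEN) of the Debug Halting Control and Status Register at 0xE000EDF0 is set while a debug session has halting debug enabled. The function name is hypothetical, and the register value is passed in as a parameter so the logic can be tried on a PC:

```c
#include <stdint.h>

/* C_DEBUGEN is bit 0 of the Cortex-M DHCSR register (address 0xE000EDF0). */
#define DHCSR_C_DEBUGEN (1u << 0)

/* Hypothetical helper: returns 1 if the given DHCSR value indicates that
 * halting debug is enabled, 0 otherwise. */
static int debug_enabled(uint32_t dhcsr_value)
{
    return (dhcsr_value & DHCSR_C_DEBUGEN) != 0u;
}
```

On the target the call would look something like debug_enabled(*(volatile uint32_t *)0xE000EDF0u), with the result shown on an LED so that the two boot paths can be told apart without a debugger attached.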

 

Our JTAGjet emulator will allow you the following:

 

1. You connect emulator to PC and target.

2. You power-up your board and it should be free-running (emulator by itself will NOT touch the CPU).

3. In the moment when your CPU seems to be dead you may start debugger and possibly you may see where your code is running.

 

>Maybe I ship a few thousand units with the J-Link attached?

One JTAGjet may be less expensive than 1000s of J-Links :-)

 

Robert

 

0 Kudos
Reply
2,617 Views
comsosysarch
Contributor III

I disable the watchdog timer in one of my first code lines - almost the first lines of code I wrote when I started the project, and the only sleep I use is a wfi in my main loop (this being a somewhat typical bare-metal C/asm super-loop program). I can add code lines enough (using an LED) to assure myself that the code is never running all the way to the point of the wfi instruction.

 

And I have a working project like this, just with the recent changes (adding systick support, changing modem commands to non-blocking using state machine) this problem with failure to boot outside a debug session began. In fact I can flash the old code on which this is based and it boots just fine without the debugger.

 

While I am using a custom PCB (which works great) I have checked the power to the chip during failure and there does not seem to be any difference between that and when it works on the debugger (my board has its own external power and does not use debugger power). And again, it works with the earlier version regardless of debugger presence.

 

I am using almost all the 128kB RAM and while not using much flash I am using a non-volatile data storage table in the top half of flash and running out of the bottom half. But after finding many issues to overcome with the flash setup I have had that licked for a while now and it certainly also seems to work on the older code.

 

I admit I am completely stumped - how do I troubleshoot something when the failure does not occur on the debugger and without the debugger it will not even run to the first lines of main()?
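
One possible way to get information out of a board that will not run with the debugger detached is to look at why the last reset happened and blink that out on the LED. The following is only a sketch: the mask values are assumptions for a Kinetis-style system reset status register and must be checked against the reference manual for the exact part and silicon revision, and all names are hypothetical. The status byte is passed in as a parameter so the decoding can be tried on a PC:

```c
#include <stdint.h>

/* Assumed bit positions in the reset status byte; verify for your device. */
#define SRS_POR_MASK  0x80u /* assumed: power-on reset */
#define SRS_PIN_MASK  0x40u /* assumed: external reset pin */
#define SRS_WDOG_MASK 0x20u /* assumed: watchdog reset */

typedef enum {
    RESET_POWER_ON,
    RESET_PIN,
    RESET_WATCHDOG,
    RESET_OTHER
} reset_cause_t;

/* Hypothetical helper: classify a reset status byte. The watchdog flag is
 * checked first because it is the case of interest here. */
static reset_cause_t classify_reset(uint8_t srs)
{
    if (srs & SRS_WDOG_MASK) {
        return RESET_WATCHDOG;
    }
    if (srs & SRS_POR_MASK) {
        return RESET_POWER_ON;
    }
    if (srs & SRS_PIN_MASK) {
        return RESET_PIN;
    }
    return RESET_OTHER;
}
```

If the board keeps reporting a watchdog reset right after power-up, that points at the startup path rather than at the application code.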

 

At the moment, barring someone else coming to the rescue with an idea, I am looking at starting from the last working code and repeating two weeks of programming step-by-step, although the little bit of trial and error stuff I have tried leads me to believe it has something to do with the startup code maybe depending on things like local and global variable initialization that for some reason behaves differently on the debugger? But IAR thought it was a Kinetis problem of some kind. So I am almost clueless on this one.

 

the part is marked:

PK60N512VLQ100

OM33Z

QEAK1109A

 

0 Kudos
Reply
2,617 Views
mjbcswitzerland
Specialist V

Hi

 

Try running the IAR simulator and checking out whether the code is calling __iar_program_start() before or after the disabling of the watchdog.

 

__iar_program_start() usually performs variable initialisation and then passes control to the user's main(). When extra variables are added to a project, the variable initialisation takes longer, and so disabling the watchdog in main() will start failing.
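
To illustrate why more variables mean a longer delay before main(), here is a rough sketch of what a C startup routine typically does: copy initialised variables from their flash image into RAM and zero the rest. This is not IAR's actual implementation; the function name and parameters are hypothetical, and the return value is just a word count used as a proxy for startup time:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of C startup: copy the initialised-data image to RAM,
 * then zero the uninitialised area. Returns the number of words written. */
static size_t init_variables(uint32_t *data_ram, const uint32_t *data_flash,
                             size_t data_words,
                             uint32_t *bss, size_t bss_words)
{
    size_t touched = 0;

    for (size_t i = 0; i < data_words; ++i) {
        data_ram[i] = data_flash[i]; /* one copy per initialised word */
        ++touched;
    }
    for (size_t i = 0; i < bss_words; ++i) {
        bss[i] = 0u;                 /* one store per zeroed word */
        ++touched;
    }
    return touched;
}
```

Every initialised or zeroed variable adds iterations to these loops, and all of them run before the first line of main().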

 

In my IAR project I don't call __iar_program_start() from reset since it had this problem, but instead call an intermediate routine that immediately disables the watchdog and then calls __iar_program_start(). Then there were no such problems anymore.

 

Regards

 

Mark

 

 

 

0 Kudos
Reply
2,617 Views
comsosysarch
Contributor III

Well, making those new static locals global __no_init made my symptom disappear.

So I suspect the watchdog was indeed kicking off during the variable initialization.
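
For anyone wanting to try the same workaround, a minimal sketch of the pattern follows. The variable names and buffer size are hypothetical, and the NO_INIT macro is only there so the example also compiles with a non-IAR compiler on a PC (where it expands to nothing). The idea is to keep the variables out of the startup initialisation and set them explicitly once the watchdog has been dealt with:

```c
#include <stdint.h>
#include <string.h>

/* __no_init is an IAR extended keyword; fall back to a plain variable
 * on other compilers so the sketch can be built on a PC. */
#if defined(__ICCARM__)
#define NO_INIT __no_init
#else
#define NO_INIT
#endif

#define RX_BUFFER_SIZE 256u /* hypothetical size */

NO_INIT uint8_t rx_buffer[RX_BUFFER_SIZE]; /* hypothetical modem buffer */
NO_INIT uint32_t command_state;            /* hypothetical state variable */

/* Hypothetical helper: call from main() after the watchdog is disabled,
 * since __no_init variables hold undefined values after reset. */
static void modem_state_init(void)
{
    memset(rx_buffer, 0, sizeof rx_buffer);
    command_state = 0u;
}
```

The cost is that nothing can rely on these variables holding a known value until modem_state_init() has run.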

 

I know __iar_program_start is the function address in my reset vector table.

The first three lines of my main() are...

WDOG_BASE_PTR->UNLOCK = 0xC520;
WDOG_BASE_PTR->UNLOCK = 0xD928;
WDOG_BASE_PTR->STCTRLH &= ~WDOG_STCTRLH_WDOGEN_MASK;

 ...where WDOG_BASE_PTR is an address defined in the device header file. (Actually, I exaggerate: these are inside a sub-function call, so technically the register pushes and branch occur before them.)

 

I would like to know what source file to modify to insert these prior to the variable initializations (these lines certainly do not require any C variables to execute). Because __iar_program_start is one of those functions buried in the IDE files somewhere and not built out to my project, it will take some digging.

 

I do see a low_level_init.c file that looks promising - it executes prior to variable inits and is intended for exactly this purpose. However, I am not sure my system is set up to actually call that function or how to know since that is all part of the somewhat hidden IAR init code.

 

But thanks, you made me go back and look at something that I was ignoring as unlikely (man that means the K60 watchdog is on a _very_ short leash).

 

0 Kudos
Reply
2,617 Views
mjbcswitzerland
Specialist V

Hi

 

With IAR I do this:

 

__root const RESET_VECTOR __vector_table @ ".intvec" = {
    (void *)(RAM_START_ADDRESS + SIZE_OF_RAM),   // stack pointer to top of RAM
    disable_watchdog,                            // start address
};

 where

 

typedef struct stRESET_VECTOR
{
    void  *ptrResetSP;             // initial stack pointer
    void  (*ptrResetPC)(void);     // initial program counter
} RESET_VECTOR;

 

and the program starts at

 

// This is the first function called so that it can immediately disable the watchdog so that it
// doesn't fire during variable initialisation
//
static void disable_watchdog(void)
{
    UNLOCK_WDOG();                                   // open a window to write to watchdog
    WDOG_STCTRLH = (WDOG_STCTRLH_STNDBYEN | WDOG_STCTRLH_WAITEN |
                    WDOG_STCTRLH_STOPEN | WDOG_STCTRLH_ALLOWUPDATE |
                    WDOG_STCTRLH_CLKSRC);            // disable watchdog
    __iar_program_start();                           // now call the IAR initialisation code which initialises variables and then calls main()
}

 

Regards

 

Mark

 

0 Kudos
Reply
2,617 Views
comsosysarch
Contributor III

Just to offer up another way to do this, I found once I knew what the problem was that it was fairly easy in IAR to just add the following (I put it in my main.c file but it could go anywhere the compiler can find it)...

#pragma language=extended
__interwork int __low_level_init(void); // Initialize hardware

__interwork int __low_level_init(void)
{
    // disable watchdog timer
    WDOG_BASE_PTR->UNLOCK = 0xC520;
    WDOG_BASE_PTR->UNLOCK = 0xD928;
    WDOG_BASE_PTR->STCTRLH &= ~WDOG_STCTRLH_WDOGEN_MASK;

    // enable separate fault interrupts
    SystemControl_BASE_PTR->SHCSR |= SCB_SHCSR_MEMFAULTENA_MASK | SCB_SHCSR_BUSFAULTENA_MASK | SCB_SHCSR_USGFAULTENA_MASK;

    // disable flash caching since it does not work on Kinetis
    FMC_BASE_PTR->PFB0CR &= ~(FMC_PFB0CR_B0SEBE_MASK | FMC_PFB0CR_B0IPE_MASK | FMC_PFB0CR_B0DPE_MASK | FMC_PFB0CR_B0ICE_MASK | FMC_PFB0CR_B0DCE_MASK);
    FMC_BASE_PTR->PFB1CR &= ~(FMC_PFB1CR_B1SEBE_MASK | FMC_PFB1CR_B1IPE_MASK | FMC_PFB1CR_B1DPE_MASK | FMC_PFB1CR_B1ICE_MASK | FMC_PFB1CR_B1DCE_MASK);

    // set up high-speed EXTAL as system clock
    MCG_BASE_PTR->C2 = MCG_C2_RANGE(2); // select oscillator 8-32 MHz range although it is kept off since using external 50MHz clock
    MCG_BASE_PTR->C1 = MCG_C1_CLKS(2) | MCG_C1_FRDIV(2); // select EXTAL with FLL DIV of 128 (39.0625 kHz)
    while (MCG_BASE_PTR->S & MCG_S_IREFST_MASK); // wait for Reference clock to switch to external reference
    while ((MCG_BASE_PTR->S & MCG_S_CLKST_MASK) != MCG_S_CLKST(2)); // wait for MCGOUT to switch over to the external reference clock

    // set up SysTick timer, will enable later in main C code
    SysTick_BASE_PTR->CSR = SysTick_CSR_CLKSOURCE_MASK; // select CPU core clock
    SysTick_BASE_PTR->RVR = EXTAL_CLOCK_FREQ / 10; // set reload register for 100ms based on EXTAL frequency
    SysTick_BASE_PTR->CVR = 0; // clear SysTick current value

    // set up and initialize GPIO ports (mostly edited out)
    SIM_BASE_PTR->SCGC5 |= SIM_SCGC5_PORTA_MASK | SIM_SCGC5_PORTB_MASK | SIM_SCGC5_PORTC_MASK | SIM_SCGC5_PORTD_MASK | SIM_SCGC5_PORTE_MASK;

    return 1; // 1 to run seg_init
}
#pragma language=default

 ...and this all executes before main() and even before all the normal C init such as variable initialization.

You need to ensure the use of extended language, and you need to follow the function name and definition exactly, but you can put anything you want in the function body. Just note that you cannot count on any C variables being initialized for use in the "low level init" function, and if you return 0 the C variable init will be skipped entirely.

I think the program boot has the following steps then...

1. reset vector table provides stack pointer and PC address of __iar_program_start (which in turn calls the "low level init" function), and link register is set

2. "low level init" as shown above, or whatever you put in it, is run

3. C variables are initialized if you returned "1" from "low level init"

4. main() starts
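
The ordering in the steps above can be modelled on a PC with a small simulation. This is only a sketch of the sequence as described in this thread, not IAR's startup code; every name in it is a hypothetical stand-in:

```c
#include <stddef.h>

#define MAX_STEPS 8

typedef enum { STEP_LOW_LEVEL_INIT, STEP_DATA_INIT, STEP_MAIN } boot_step_t;

static boot_step_t boot_log[MAX_STEPS]; /* order in which steps ran */
static size_t boot_steps;               /* number of steps recorded */

static void log_step(boot_step_t step)
{
    if (boot_steps < MAX_STEPS) {
        boot_log[boot_steps++] = step;
    }
}

/* Stand-ins for the real routines. */
static int sim_low_level_init_run(void)  { log_step(STEP_LOW_LEVEL_INIT); return 1; }
static int sim_low_level_init_skip(void) { log_step(STEP_LOW_LEVEL_INIT); return 0; }
static void sim_data_init(void)          { log_step(STEP_DATA_INIT); }
static void sim_main(void)               { log_step(STEP_MAIN); }

/* Models the startup path: low-level init first, variable init only if it
 * returned non-zero, then main(). */
static void sim_reset(int (*low_level_init)(void))
{
    boot_steps = 0;
    if (low_level_init() != 0) {
        sim_data_init();
    }
    sim_main();
}
```

Calling sim_reset(sim_low_level_init_run) records three steps, while sim_reset(sim_low_level_init_skip) records only two because the variable initialisation is skipped, which matches the "return 0" behaviour mentioned above.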

0 Kudos
Reply
2,617 Views
mjbcswitzerland
Specialist V

Hi

I just checked a new demo project in the latest IAR version. It looks as though the method has been changed to the following:

 

1) They are using void start(void) as startup location.

2) This then does this:

 

void start(void)
{
    /* Disable the watchdog timer */
    wdog_disable();

    /* Copy any vector or data sections that need to be in RAM */
    common_startup();

    /* Perform processor initialization */
    sysinit();

 

...

 

As can be seen, the first operation is to disable the watchdog and only then to initialise variables.

 

 

 

The reason why I don't use this directly is because I prefer the code to be the same irrespective of the compiler used. This avoids the need for an assembler file in the project (ARM in fact gave this feature as a major advantage of the Cortex M3 when it was initially released, although almost every example one sees starts from an assembler file, and the 68000 was in fact doing the same startup for the last 25 years..).

 

Regards

 

Mark

 

 

0 Kudos
Reply
2,617 Views
comsosysarch
Contributor III

btw...

 

I am not using -any- IAR compiler optimization (anywhere that needed optimizing was done by hand looking at asm listings). I am sure I am not the only one who has found lots of times that a lot of the optimization in fact is much worse at what it supposedly optimizes than just leaving it turned off.

 

I am using all that RAM but 99.9% (sic) of it is "__no_init" variables because of past problems I have had with various C IDEs and microcontrollers that never make it past variable initialization (especially large arrays) to disable the watchdog. That said, yeah, there are a very few static local variables that cannot be declared "__no_init", and yeah, some of those are new in the "broken" code. Maybe one thing to try is making them global __no_init variables instead.

 

As stated, watchdog disable is the first couple lines of main(), and it uses direct register writes not any sort of macro so I would presume it should be very fast but I will check -that- in the asm listing as well. All my IO port inits are well after the watchdog is disabled.

 

0 Kudos
Reply