Debugging Hard Fault

Discussion created by lpcware Employee on Jun 15, 2016
Latest reply on Jun 15, 2016 by lpcware
Content originally posted in LPCWare by martinmckee on Wed Dec 26 23:35:30 MST 2012
Okay, full disclosure: I'm quite new to 32-bit microcontrollers in general (I'm used to 8-bit AVR) and to the ARM Cortex-M0 in particular. One consequence is that I have not had to deal with exceptions (or faults) before.

The issue I'm having is that I have some code that implements a cooperative task-handling API. Everything was going fine until I added another, more complicated, version of a function. The two earlier versions I implemented work just fine. Calling the third seems to trigger a Hard Fault. But here's the bit I don't understand: the Hard Fault is triggered within a function that I am calling that does the bulk of the work. This function is identical for all three implementations; it simply takes different parameters (created by the calling functions).

The struct being operated upon is declared as:

typedef struct COOP_TASK {
    //
    // Control Members
    //
    bool enabled;
    coop_task_priority_t priority;
    uint8_t id;

    //
    // Event Members
    //
    coop_event_function_t event;
    void * event_data;
    coop_size_t event_data_size;

    //
    // Task Implementation Members
    //
    coop_task_function_t func;
    void * task_data;
    coop_size_t task_data_size;

    //
    // Task List
    //
    struct COOP_TASK * next_task;
} coop_task_t;
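Because the struct contains pointer members, the compiler lays it out on the assumption that every coop_task_t starts at an address that satisfies the strictest member alignment. A minimal sketch of how that assumption could be checked for any pointer handed out by an allocator follows. The four typedefs at the top are guesses, since they are not shown in the post, and coop_task_ptr_is_aligned is a hypothetical helper name. The actual offsets and alignment depend on the target and on the real typedefs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED typedefs: the real definitions are not shown in the post. */
typedef uint8_t  coop_task_priority_t;
typedef uint16_t coop_size_t;
typedef void (*coop_event_function_t)(void *);
typedef void (*coop_task_function_t)(void *);

typedef struct COOP_TASK {
    bool enabled;
    coop_task_priority_t priority;
    uint8_t id;

    coop_event_function_t event;
    void *event_data;
    coop_size_t event_data_size;

    coop_task_function_t func;
    void *task_data;
    coop_size_t task_data_size;

    struct COOP_TASK *next_task;
} coop_task_t;

/* Hypothetical helper: true when p meets the alignment that the
 * compiler assumes for every coop_task_t object (C11 _Alignof). */
static bool coop_task_ptr_is_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(coop_task_t)) == 0u;
}
```

A check like this could be placed right after the pool allocation, for example as an assertion, so that a misplaced block is caught before any member is written.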

And the code that triggers the Fault looks like this:

task->enabled = true;
task->priority = _priority;
task->id = tasker_system.next_id;
task->event_data = _event_data; // This line triggers it...
task->event = _event;
task->func = _func;
task->task_data = _task_data;
task->next_task = 0;

The task memory is allocated by a pool manager, which seems to be working just fine. When I step through and trap the instruction that actually causes the fault, it ends up being:
str r2, [ r3, #8 ]
Here r3 correctly holds the base address of the structure (0x100002b), and the computed access address is 0x1000033. Obviously there is an unaligned access happening, but the question is, why? The code that is acting up is all generated by the compiler, and the other two calls to the function work just fine (alone or together with this last version of the call). For what it's worth, this all happens the same way (with the same addresses) with or without optimization (O0, O2, or Os).
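The address arithmetic can be reproduced with a small sketch. The helper names store_target and is_aligned are hypothetical and not part of the code in this post. The sketch relies on one fact about the Cortex-M0: a word store (str) to an address that is not a multiple of 4 raises a Hard Fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address written by "str rX, [rBase, #offset]". */
static uint32_t store_target(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* True when addr is a multiple of align (align must be a power of two). */
static bool is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1u)) == 0u;
}
```

With the values from the debugger, store_target(0x100002b, 8) gives 0x1000033, and is_aligned reports false for both the target and the base address when checked against 4. Since the offset itself is a multiple of 4, the misalignment comes from the value in r3, which may make the pointer returned for this particular call worth a closer look.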

I'm out of ideas.

Martin Jay McKee