It's a little hard to tell what's going on based upon your description. The ARM Cortex-M0+ Devices Generic User Guide (document DUI0662A) may shed some light, specifically section 2.4 "Fault handling":
"Faults are a subset of exceptions, see Exception model on page 2-16. All faults result in the HardFault exception being taken or cause Lockup if they occur in the NMI or HardFault handler.
The faults are:
• execution of an SVC instruction at a priority equal or higher than SVCall
• execution of a BKPT instruction without a debugger attached
• a system-generated bus error on a load or store
• execution of an instruction from an XN memory address
• execution of an instruction from a location for which the system generates a bus fault
• a system-generated bus error on a vector fetch
• execution of an Undefined instruction
• execution of an instruction when not in Thumb-State as a result of the T-bit being previously cleared to 0
• an attempted load or store to an unaligned address
• if the device implements the MPU, an MPU fault because of a privilege violation or an attempt to access an unmanaged region."
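Since every one of these causes ends up in the same HardFault handler, one way to narrow things down is to look at the eight words the core pushes onto the stack on exception entry (R0-R3, R12, LR, PC, xPSR). The sketch below only decodes such a frame once you have a pointer to it. The names stacked_frame_t, XPSR_T_BIT and frame_was_thumb are made up for illustration, and getting the frame pointer inside the handler (from MSP or PSP) is toolchain-specific and not shown.

```c
#include <stdint.h>

/* Hypothetical view of the eight words stacked on exception entry,
 * in the order they appear in memory (lowest address first). */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code */
    uint32_t pc;    /* address of the faulting / interrupted instruction */
    uint32_t xpsr;  /* program status register at the time of the fault */
} stacked_frame_t;

/* The T-bit lives in bit 24 of the xPSR (EPSR part). */
#define XPSR_T_BIT (1u << 24)

/* Returns 1 if the stacked xPSR shows Thumb state, 0 if the T-bit was clear. */
static inline int frame_was_thumb(const stacked_frame_t *frame)
{
    return (frame->xpsr & XPSR_T_BIT) != 0u;
}
```

If frame_was_thumb() returns 0, the stacked PC is probably the address that was branched to with bit[0] cleared, so the instruction to look for is the BX, BLX or POP that loaded it.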
I recently encountered a HardFault caused by a BLX instruction that was clearing the T-bit. I didn't see anything wrong with my C source code. I found the problem as I single-stepped through the resultant assembly language. I found the cause in section 3.3.2 "Restrictions when using PC or SP" of the same manual:
"When you update the PC with a BX, BLX, or POP instruction, bit[0] of any address must be 1 for correct execution. This is because this bit indicates the destination instruction set, and the Cortex-M0+ processor only supports Thumb instructions. When a BL or BLX instruction writes the value of bit[0] into the LR it is automatically assigned the value 1."
In my case, the destination address for the BLX instruction had its least significant bit cleared, i.e., bit[0] = 0. Sure enough, when the BLX instruction executed, I saw the T-bit clear in the Execution Program Status Register (EPSR). As indicated above, bit[0] of the destination address must be 1 to stay in Thumb state. I fixed this by simply incrementing the destination address (which was a pointer variable) prior to the execution of the BLX instruction.
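For what it's worth, here is a minimal sketch of that kind of fix in C. It assumes the destination is held as a raw integer address before being turned into a function pointer. The helper names thumb_target and is_thumb_target are hypothetical, not from any ARM header. It uses a bitwise OR rather than an increment, because the OR is harmless if bit[0] is already set, whereas incrementing an address that is already odd would clear bit[0] again.

```c
#include <stdint.h>

/* Hypothetical helper: force bit[0] of a branch target to 1 so that
 * BX/BLX keeps the core in Thumb state. Safe to apply more than once. */
static inline uintptr_t thumb_target(uintptr_t addr)
{
    return addr | (uintptr_t)1u;
}

/* Hypothetical helper: returns 1 if bit[0] is already set, 0 otherwise. */
static inline int is_thumb_target(uintptr_t addr)
{
    return (int)(addr & (uintptr_t)1u);
}
```

On the target you would then do something like `void (*fn)(void) = (void (*)(void))thumb_target(raw_addr);` before calling through fn. Converting an integer to a function pointer is implementation-defined in C, so check how your toolchain handles it. Function addresses taken by the compiler and linker themselves (e.g. `&my_function`) normally already have bit[0] set. It's addresses computed by hand, or read out of a table or from flash, that tend to cause this problem.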
I'm not sure if this applies to your particular scenario but I hope that it gives you an idea as to where to look.
Best Regards,
Derrick