Tips for making a multiple software image bootloader for MC9S12XEP100?

utwig · ‎11-15-2013

Hi,

I have seen a lot of application notes and forum posts about bootloaders. But these are always about replacing the whole software on a microcontroller. On the contrary, I am trying to make a system with multiple software images and an ability to update just one of the images at a time.

My ultimate goal is to program a system with multiple (e.g. 4) redundant software images. When the system starts, the bootloader would run a checksum over the first image to see whether it is corrupted. If the image seems okay, execution of image 1 would start. If the image is corrupted, the same checksum inspection would be done on the second image, and so on.
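A minimal sketch of the selection loop described above, written as portable C so it can be tried on a host. The descriptor type `image_desc_t`, the function names, and the simple additive checksum are illustrative assumptions, not part of any vendor library. A real bootloader would point `start` at the image's P-Flash range and would probably prefer a CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: where an image lives and the checksum stored with it. */
typedef struct {
    const uint8_t *start;      /* first byte of the image */
    size_t         length;     /* image size in bytes */
    uint16_t       stored_checksum;
} image_desc_t;

/* 16-bit additive checksum (assumption: a CRC would catch more error patterns). */
static uint16_t image_checksum(const uint8_t *data, size_t length)
{
    uint16_t sum = 0;
    size_t i;

    for (i = 0; i < length; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Returns the index of the first image whose checksum matches, or -1 if none does. */
static int select_boot_image(const image_desc_t *images, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (image_checksum(images[i].start, images[i].length)
                == images[i].stored_checksum) {
            return (int)i;
        }
    }
    return -1;
}
```

A return value of -1 means that no image passed the check, which the bootloader has to handle explicitly, for example by staying in update mode.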

Another feature of this system would be the ability to perform an update to replace one of the software images with a new image. Currently my plan is to achieve this by putting the normal subroutines of each image to distinct P-Flash areas using pragmas e.g. #pragma CODE_SEG CODE_VER_A and #pragma CODE_SEG CODE_VER_B etc. Segment CODE_VER_A would contain functions like A_function1(), A_function2()... Segment CODE_VER_B would contain similar functions named B_function1(), B_function2().. My plan is to do something similar also with the XGATE code of each of the software images. The software update would be done with data coming in via an SCI module and stored to a RAM or D-Flash buffer. This data would be used to overwrite the code segments of a software image not currently in use.

This approach causes some problems with the Interrupt Service Routines. Each ISR can be directly defined only once in the source files. However, I want every image to have its own subroutines for handling the interrupts. My idea is to achieve this by redirecting each ISR to an image-specific interrupt handling subroutine, which passes the interrupt on to the relevant subroutine for servicing it:

Example of a general ISR (all ISRs will have a similar structure):

#pragma CODE_SEG __NEAR_SEG NON_BANKED

interrupt VectorNumber_Vsci3 void SCI3_Receive(void) {

switch(version){

case 0x01:

_A_IsrRedirect(VectorNumber_Vsci3);

return;

case 0x02:

_B_IsrRedirect(VectorNumber_Vsci3);

return;

case 0x03:

_C_IsrRedirect(VectorNumber_Vsci3);

return;

case 0x04:

_D_IsrRedirect(VectorNumber_Vsci3);

return;

}

}

Image specific interrupt handling subroutine for image version "A":

#pragma CODE_SEG ISR_HNDLR_VER_A

void _A_IsrRedirect(unsigned char vec_num){

switch(vec_num){

case VectorNumber_1:

_A_Service_1();

break;

case VectorNumber_2:

_A_Service_2();

break;

case VectorNumber_Vsci3:

_A_SCI3_Receive();

break;

//etc...

}

//else

//RESET!

}

Example of an interrupt servicing subroutine in image version "A":

#pragma CODE_SEG CODE_VER_A

void _A_SCI3_Receive(void){

//Enable further interrupts

_asm CLI;

//Clear the SCI3 XGATE interrupt flag

XGIF_4F_40 = XGIF_4F_40_XGIF_44_MASK;

//Enable SCI3 interrupt to start transmitting data

SCI3CR2 |= SCI3CR2_TIE_MASK;

}

Do you think that this is a viable approach to achieve what I am trying to do? Or do you have any ideas to make the implementation simpler? The point of having the intermediate image specific interrupt handling subroutine is to have only one interface between the ISRs (that cannot be updated) and each of the software images (A, B, C, D) interrupt handling subroutines.

Best Regards,

Timo

kef2 · ‎11-15-2013

I have seen a lot of application notes and forum posts about bootloaders. But these are always about replacing the whole software on a microcontroller.

No. They are, or should be, always about replacing only the application part. Replacing the bootloader is in general the same as shooting yourself in the foot.

Are you aware that S12XE flash is ECC protected? Even if your users are supposed to receive FW updates every week, you aren't going to force automatic FW upgrades without user permission, are you? If you aren't, then I see no point in having more than two app images in flash. The second could be either a backup copy of the same code or perhaps an older version, in case an unsatisfied user wants to roll back. OK, you could store more old versions, but several working versions? Compiling all code as position independent would slow down code execution quite a lot. Compiling the same code for different addresses will raise coding and testing expenses. I respect your fresh idea, but I don't like it.

The vector problems you mentioned do not exist on the S12X. Both CPU12X and XGATE have programmable interrupt vector base addresses. See the IVBR and XGVBR register descriptions. No problem at all.
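To illustrate how the relocation works, here is a small host-side sketch of the address arithmetic: the CPU takes the high byte of an interrupt vector address from IVBR and the low byte from the fixed offset of the interrupt source. The helper name `vector_address` is hypothetical, and the offset 0x88 for SCI3 in the note below is the value that comes up later in this thread.

```c
#include <stdint.h>

/* Value IVBR holds after reset on the S12X: vectors are fetched from 0xFFxx. */
#define IVBR_RESET_VALUE 0xFFu

/* Address the CPU fetches a vector from: IVBR is the high byte, the
   per-source vector offset is the low byte. */
static uint16_t vector_address(uint8_t ivbr, uint8_t vector_offset)
{
    return (uint16_t)(((uint16_t)ivbr << 8) | vector_offset);
}
```

With `IVBR_RESET_VALUE` and offset 0x88 this gives 0xFF88; with IVBR set to 0x30 the same vector is fetched from 0x3088, so each image can carry its own table and only IVBR has to change when switching images.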

utwig · ‎11-18-2013

Thanks for your reply!

In my message, I was referring to updating a part of the application software. I understand that the bootloader needs to be protected.

Also, I did not tell you about the application or why we want redundant versions of the application code. The MCU is used in remote-controlled scientific equipment, where it faces all kinds of harsh environmental conditions. Thus, redundant software versions were deemed necessary in case of, e.g., physical damage to some of the memory modules. More importantly, the system will be very hard to access locally, so we don't want to risk losing the application code to a failed firmware update. This is why we want at least two versions, one of which is never updated without the BDM connection.

Thanks for the tip about IVBR and XGVBR. I took another look at some of the application note example code and came up with a way to relocate the HCS12X interrupt vectors:

#define CPU12IVBR 0x3000 //The vectors can be placed anywhere in non-paged flash or RAM, I believe?

#define SCI3Ch 0x88 //

static void _A_SetupIVBR(void) {

IVBR = (CPU12IVBR >> 8);

*((unsigned int *__near)(CPU12IVBR + SCI3Ch)) = (unsigned int)SCI3_Handler;

//etc...

}

The ISRs are then formulated using the following syntax:

interrupt void SCI3_Handler(void){

//Clear the SCI3 XGATE interrupt flag

XGIF_4F_40 = XGIF_4F_40_XGIF_44_MASK;

//Enable SCI3 interrupt to start transmitting data

SCI3CR2 |= SCI3CR2_TIE_MASK;

}

Cheers,

Timo

kef2 · ‎11-18-2013

Ah, cosmic rays or some other kind of ionizing radiation? OK, I see. But in this case you are still in trouble, I think, and your approach is questionable. You can't have a 100% robust memory backup in a single unit. At the very least, the main reset vector is not movable; it is always at FFFE. If this location fails, you'll get a dead unit. The same goes for the bootloader. Are you going to keep several bootloader copies in the same unit? How is this supposed to provide a reliable backup? Several shielded units may give better results...

Regarding your code: yes, something like that. Though if it's a harsh environment, the interrupt vectors should be placed in flash, I think, not in RAM.

utwig · ‎11-27-2013

Thanks for the points you gave. I understand the limitations of this implementation. However, there are trade-offs in every design, and I don't want to go into the details that led to our design architecture here...

At the moment it seems that it would be more feasible to have the interrupt vector table in the RAM. However, while testing this I found out something strange. I modified the (MC9S12XEP100) RAM definitions in the .PRM file as follows:

/* non-paged RAM */

RAM = READ_WRITE DATA_NEAR 0x2000 TO 0x3EFF ALIGN 2[1:1]; /* word align for XGATE accesses */

RAM_RO = READ_ONLY DATA_NEAR 0x3F00 TO 0x3FFF ALIGN 2[1:1]; /* word align for XGATE accesses */

In the PLACEMENT section of the .PRM file, I did not put anything into the RAM_RO segment. At the beginning of the application code, I copy the interrupt vectors in use into the address range defined by RAM_RO. After running the application for a while and, e.g., triggering interrupts, strange things start to happen. Something overwrites parts of the RAM_RO segment, and the program eventually crashes.

Is it not enough to declare a RAM area as "READ_ONLY" and place nothing in that segment to prevent the RAM area from unplanned write accesses? In that case I guess using the MPU is the only way to protect the RAM range? But I wonder which kind of process is rewriting the RAM? From the Project.MAP file I can see that there is indeed nothing allocated to the addresses defined by RAM_RO. In the application code, I manually write the vectors only once to the address range. Still, while running the application, the same operations always cause the same unexpected changes in the RAM_RO segment leading to a crash in the end.

I tried to use some other definitions for RAM_RO address range, and some of these ranges did not have the same problem. Nevertheless, I would like to understand what causes the writes to happen to make sure that they don't.

Cheers,

Timo

kef2 · ‎11-27-2013

First, you should keep in mind that what appears at the 16-bit CPU12X address 0x3F00 has 3 more aliases. 1) CPU12X global addressing: GPAGE=0x, global offset=0xFF00. 2) CPU12X paged RAM: RPAGE=0xFF, CPU12X address=0x1F00. 3) XGATE address=0xFF00.
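A hedged sketch of how these aliases can be computed for any address in the non-paged RAM window. It assumes the S12XE mapping in which local addresses 0x2000-0x3FFF correspond to global addresses 0x0FE000-0x0FFFFF, RPAGE supplies global address bits 19:12 for the 0x1000-0x1FFF window, and XGATE sees RAM through the low 16 bits of the global address. The struct and function names are made up for illustration; check the memory map in the reference manual before relying on the constants.

```c
#include <stdint.h>

/* All views of one RAM location (names are illustrative, not from any header). */
typedef struct {
    uint32_t global_addr;  /* CPU12X global address */
    uint8_t  rpage;        /* RPAGE value that maps the location into the window */
    uint16_t window_addr;  /* address inside the 0x1000-0x1FFF RPAGE window */
    uint16_t xgate_addr;   /* address as seen by XGATE */
} ram_aliases_t;

/* local_addr is assumed to lie in the non-paged RAM range 0x2000-0x3FFF. */
static ram_aliases_t ram_aliases(uint16_t local_addr)
{
    ram_aliases_t a;

    a.global_addr = 0x0FE000ul + (uint32_t)(local_addr - 0x2000u);
    a.rpage       = (uint8_t)(a.global_addr >> 12);
    a.window_addr = (uint16_t)(0x1000u | (a.global_addr & 0x0FFFu));
    a.xgate_addr  = (uint16_t)(a.global_addr & 0xFFFFu);
    return a;
}
```

For 0x3F00 this yields RPAGE 0xFF with window address 0x1F00 and XGATE address 0xFF00, matching the list above. Running every pointer and stack setting through such a helper is one way to check whether anything else lands in the protected range.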

If your RAM_RO isn't used in any of these aliases, then most likely some code misbehaves and overwrites your RAM. Also make sure that XGATE and CPU12X stack pointers aren't pointing to any of these aliases.

The MPU of course helps prevent writes to specific areas, but an unknown overwrite is a very serious issue, which the MPU may hide for a while until it bites you again... BTW, if CPU12X code causes this problem, then try enabling the MPU interrupt. In the ISR, at some offset from SP, it must be possible to find the address of the routine that caused the wrong access.

The S12XE built-in debug circuits allow setting a breakpoint on access to a range of addresses. Try figuring out how to set up such a breakpoint; right now I can't give you instructions on how to do it.

utwig · ‎11-27-2013

Good thing that you mentioned the stack pointers!

It turns out that I had removed lines related to the XGATE stack from the source files in the very beginning of the project. (Back then, I thought the lines were related to the example functions generated by CodeWarrior) Adding the following lines back to the code fixed the problem:

main.c:

/* Two stacks in XGATE core3 */

#pragma DATA_SEG XGATE_STK_L

word XGATE_STACK_L[1];

#pragma DATA_SEG XGATE_STK_H

word XGATE_STACK_H[1];

//Added in setupXGATE():

/* when changing your derivative to non-core3 one please remove next five lines */

XGISPSEL= 1;

XGISP31= (unsigned int)(void*__far)(XGATE_STACK_L + 1);

XGISPSEL= 2;

XGISP74= (unsigned int)(void*__far)(XGATE_STACK_H + 1);

XGISPSEL= 0;

Project.prm:

//Added in SEGMENTS

	RAM_XGATE_STK_L_ = NO_INIT DATA_FAR	0xF81000 TO 0xF8107D;
	RAM_XGATE_STK_L = NO_INIT DATA_FAR	0xF8107E TO 0xF8107F;
	RAM_XGATE_STK_H_ = NO_INIT DATA_FAR	0xF81080 TO 0xF810FD;
	RAM_XGATE_STK_H = NO_INIT DATA_FAR	0xF810FE TO 0xF810FF;

//Added in PLACEMENT

XGATE_STK_L INTO RAM_XGATE_STK_L;

XGATE_STK_H INTO RAM_XGATE_STK_H;

I guess the XGATE stack (while its location was not defined by the .PRM file) somehow ended up in the RAM area that I was using for the interrupt vector table.

kef2 · ‎11-27-2013

Out of reset, the initial XGISP74 and XGISP31 values are 0. After one push on the downward-growing stack, the XGATE SP is 0xFFFE, which is non-paged CPU12X RAM at 0x3FFE.
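The wrap-around can be reproduced on a host with plain 16-bit arithmetic. This is only a sketch: `push_word` and `xgate_to_cpu_local` are hypothetical helpers, and the fixed 0xC000 offset assumes that XGATE addresses 0xE000-0xFFFF alias the CPU12X non-paged RAM window 0x2000-0x3FFF, which is consistent with the 0xFFFE and 0x3FFE pair above.

```c
#include <stdint.h>

/* One 16-bit push on a downward-growing stack: returns the new stack pointer. */
static uint16_t push_word(uint16_t sp)
{
    return (uint16_t)(sp - 2u);   /* wraps modulo 0x10000 */
}

/* XGATE RAM address in 0xE000-0xFFFF -> CPU12X non-paged address 0x2000-0x3FFF. */
static uint16_t xgate_to_cpu_local(uint16_t xgate_addr)
{
    return (uint16_t)(xgate_addr - 0xC000u);
}
```

Starting from an uninitialized stack pointer of 0, the first push lands at XGATE address 0xFFFE, that is CPU12X address 0x3FFE, which is inside the 0x3F00-0x3FFF range that was being overwritten.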
