That's right. I never said there was a problem with Flash programming, only that I was looking for a way to reduce the RAM required (especially on the QG8), and that the manuals leave unanswered the question of exactly when Flash access is inhibited.
Anyway, I managed to shorten the example code given in Fig. 4-12 of the HCS08 Family Reference Manual from 24 bytes to 21 bytes + 2 for the JSR/BSR.
This was done by making these changes (RAM portion of code listed below):
1. Preload A with the value to write just before calling the routine.
2. Use immediate addressing mode for the loading of the command byte.
3. Have the loader patch the address in STA FLASH with the actual target address, which eliminates the need to load HX inside the routine and then use STA ,X. (We couldn't realistically avoid using HX for calling the RAM routine, because any other method would mean extra stack used by the loader portion, so the benefit would be lost.)
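To make the patching in step 3 concrete, here is a rough sketch of what such a loader could look like. It is not the actual loader, only an illustration under these assumptions: the name Program_Byte and the label ?Copy are hypothetical, HX holds the Flash address and A the value on entry, the routine is copied onto the stack back to front, and the caller has already masked interrupts.

```asm
; Hypothetical loader sketch. Entry: HX = Flash address, A = value to program
Program_Byte        psha                          ;save value to program
                    pshx                          ;save address, low byte
                    pshh                          ;save address, high byte
                    ldhx      #?RAM_Execute_End   ;HX -> last byte of the routine (RTS)
?Copy               lda       ,x
                    psha                          ;copy to stack, last byte first
                    aix       #-1
                    cphx      #?RAM_Execute-1
                    bne       ?Copy               ;until the first byte is copied
          ;routine is now at 1,sp..?RAM_Needed,sp with the saved H, X, A above it
                    lda       ?RAM_Needed+1,sp    ;address high byte
                    sta       2,sp                ;patch @2
                    lda       ?RAM_Needed+2,sp    ;address low byte
                    sta       3,sp                ;patch @3
                    lda       #mByteProg          ;or another command code
                    sta       5,sp                ;patch @5
                    tsx                           ;HX -> routine on stack
                    lda       ?RAM_Needed+3,sp    ;preload A with the value
                    jsr       ,x                  ;run the RAM copy
                    ais       #?RAM_Needed+3      ;drop routine + saved H, X, A
                    rts                           ;A still holds the routine's result
```

The offsets 2, 3 and 5 are the same 1-based positions the routine's header calls @2, @3 and @5, since 1,sp is the last byte pushed, which is the routine's first byte.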
But I had hoped to do even better by being able to do the "sta FLASH" and "lda...sta FCMD" outside this routine.
tonyp@acm.org
;*******************************************************************************
; Purpose: RAM routine to do the job we can't do from Flash
; Input : A = value to program
; Note(s): This routine is modified in RAM by its loader at @2,3 and @5
;        : Stack needed: 21 bytes + 2 for JSR/BSR
?RAM_Execute        sta       FLASH               ;FLASH (@2,@3) is replaced
                    lda       #mByteProg          ;mByteProg (@5) is replaced
                    sta       FCMD                ;Step 2 - Write command to FCMD
                    lda       #FCBEF_
                    sta       FSTAT               ;Step 3 - Write FCBEF_ in FSTAT
                    nop                           ;required delay
?RAM_Execute.Loop   lda       FSTAT               ;Step 4 - Wait for completion
                    lsla                          ;check FCCF_ for completion
                    bpl       ?RAM_Execute.Loop
?RAM_Execute_End    rts                           ;on exit, A = FSTAT shifted left once (bit 7 = FCCF); caller isolates error bits
?RAM_Needed equ *-?RAM_Execute
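Since the loop exits with FSTAT shifted left once, bit 7 of A (FCCF) is always set on return, so a caller that wants a plain zero/non-zero result has to isolate the error flags first. A minimal sketch, assuming the usual HCS08 FSTAT layout (FPVIOL in bit 5, FACCERR in bit 4, so bits 6 and 5 after the shift) and a hypothetical ?Program_Failed label:

```asm
                    jsr       ,x                  ;HX -> copy of ?RAM_Execute in RAM
                    and       #%01100000          ;keep FPVIOL (bit 6) and FACCERR (bit 5)
                    bne       ?Program_Failed     ;hypothetical error handler
```

This costs the caller two bytes for the AND, but they sit in Flash rather than in the RAM copy.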