Using a K70 with 1 MB of program flash, we found that parts of the ROM can become unreadable after programming. We are using MQX 4.1 but have seen it happen in 4.0 as well.
The firmware runs from one half of PFLASH while it attempts to reprogram the other half using MQX's ftfe driver, but the programming fails. The first sector is erased, and most of it can be read back except the first 16 bytes: trying to read anything from the first 16 bytes (via a pointer to ROM) resets the CPU. We tried to recover the flash using block erase, to no avail. The only solution seems to be a chip erase via JTAG.
Has anybody seen this issue? Is there any way to fix it? Any way to avoid it?
Hi Andras Lipoth,
Would you please share your source code for a review? Thanks for your patience!
Have a great day,
Kan
Hi Kan,
This is the function we use to program the flash. I'd like to stress that it works approximately 99.99% of the time: we have programmed the flash hundreds of times on hundreds of units and it got corrupted only twice so far. However, we would like to scale up our production and this is becoming an issue, as it is happening in the field (over-the-air firmware update).
//-----------------------------------------------------------------------------
// Copy one bank worth of data from firmware partition to flash
// (Skip the last block of Bank3, as it is used for flash swap indicator)
// part_block: block offset in firmware partition
// bankname: flash bank to write to
static bool WriteBank( uint32_t part_block, const char *bankname ) {
    MQX_FILE_PTR flash_file;
    bool success = true;
    PROG_DEBUG( "Opening %s\n", bankname );
    /* open flash device */
    if (NULL == ( flash_file = fopen(bankname, NULL) ) ) {
        PROG_ERROR( "Couldn't open %s\n", bankname );
        return false;
    }
    uint32_t sector_size;
    ioctl( flash_file, FLASH_IOCTL_GET_SECTOR_SIZE, &sector_size );
    /* Unprotecting the FLASH might be required */
    uint32_t ioctl_param = 0;
    ioctl(flash_file, FLASH_IOCTL_WRITE_PROTECT, &ioctl_param);
    ioctl(flash_file, FLASH_IOCTL_ENABLE_SECTOR_CACHE, NULL);
    char *buf;
    if ( NULL == ( buf = ((char *)ddrmalloc( BANK_SIZE ) ) ) ) {
        PROG_ERROR( "Couldn't allocate %d bytes of memory\n", BANK_SIZE);
        fclose( flash_file );
        return false;
    }
    int sectors;
    if (0 == strcmp(bankname, BANK2)) {
        sectors = BANK_SIZE / sector_size ;
    } else {
        // bank3: leave the last sector intact (used as
        // flash indicator)
        sectors = (BANK_SIZE / sector_size) - 1;
    }
    int blocks = BANK_SIZE / PM_SECTOR_SIZE;
    success = ReadPartition( buf, part_block, blocks);
    if ( success ) {
        int bytes_to_write = sectors * sector_size;
        if ( bytes_to_write != write( flash_file, buf, bytes_to_write ) ) {
            PROG_ERROR( "Couldn't write data to flash\n" );
            success = false;
        }
    }
    fclose( flash_file );
    free( buf );
    return success;
}
Hi
I have seen the behaviour described. It occurs when a non-blank phrase is programmed again with bits that attempt to set '0' to '1'. When this happens, the 'line' of Flash that the phrase is on cannot be read, and any attempt to read from it results in a bus error. I don't know whether the 'line' of Flash is equal to the phrase length (I don't remember exactly), but my "theory" is that the Flash uses ECC to detect and correct errors in a phrase (or line). Once the phrase is incorrectly programmed (only one write between erases is specified), the ECC bits (also saved in Flash) no longer match the data, and so the phrase is effectively a failed Flash line. (It may not be exactly like that, but the NXP ARM7 processors used a 128-bit ECC which behaved similarly, except that it didn't cause a bus error when read: the content was just "junk" when used incorrectly, since the ECC was trying to correct the content, which mangled it totally.)
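To make the rule above concrete, here is a minimal host-side sketch in C that classifies a proposed phrase write. The names `phrase_write_kind` and `classify_phrase_write` are hypothetical and not part of MQX or the ftfe driver; the sketch only encodes the rule as stated in this post (one write between erases, and no '0' to '1' changes), not the actual ECC behaviour of the part.

```c
#include <stdint.h>

/* Hypothetical classification of a proposed 8-byte phrase write. */
typedef enum {
    PHRASE_WRITE_OK,          /* target phrase is erased (all ones)          */
    PHRASE_WRITE_REPROGRAM,   /* target not erased; write only clears bits,  */
                              /* which is still a second write before erase  */
    PHRASE_WRITE_ZERO_TO_ONE  /* target not erased and the new data would    */
                              /* need at least one bit to go from 0 to 1     */
} phrase_write_kind;

/* current: what the phrase holds now; wanted: what is about to be programmed. */
static phrase_write_kind classify_phrase_write(uint64_t current, uint64_t wanted)
{
    if (current == UINT64_MAX) {
        return PHRASE_WRITE_OK;
    }
    /* Bits that are 1 in 'wanted' but 0 in 'current' cannot be programmed. */
    if ((wanted & ~current) != 0) {
        return PHRASE_WRITE_ZERO_TO_ONE;
    }
    return PHRASE_WRITE_REPROGRAM;
}
```

Only `PHRASE_WRITE_OK` is within the "one write between erases" rule; the other two results are both worth logging, with `PHRASE_WRITE_ZERO_TO_ONE` being the case described above as making the line unreadable.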
I don't know whether a full erase is needed or not - probably as you tried just sector erases to no avail - but the source of the problem is a SW error since it is doing something illegal in the first place.
When I work with the Kinetis I use the uTasker simulator, which emulates Flash and will raise an exception if it detects any illegal Flash use, so that the SW errors are seen immediately. I didn't actually see the issue until investigating someone else's project where bad writing was taking place. There is no 'crash' when the bad write happens; the exception takes place later, when code tries to use the content again.
To debug this you can add a check as low down as possible in the Flashing routine where all writes to the K70 Flash MUST be phrase writes (aligned 8 bytes).
Simply do something like this (assuming ptrWord is an unsigned long pointer to the Flash address where the phrase begins):
    if ((unsigned long)ptrWord & 0x7) {
        print("Some crazy is writing with unaligned address!!!");
    }
    if ((*ptrWord != 0xffffffff) || (*(ptrWord + 1) != 0xffffffff)) { // check that the phrase is erased since a write will otherwise wreak havoc
        print("Some one is trying to kill my Flash!!!");
    }
    // .. the actual programming operation follows
Set breakpoints on the possible errors and then use the call stack to find out who the culprit is.
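The same check can be extended to a whole write range. The following is only a sketch that can be tried on a host against a RAM buffer; `check_before_program`, `flash_check_result` and `PHRASE_SIZE` are hypothetical names, and the 8-byte phrase size is taken from the description above. On the real part, reading a phrase that is already corrupted would itself cause the bus error, so this guard is for catching the bad write, not for recovering from it.

```c
#include <stddef.h>
#include <stdint.h>

#define PHRASE_SIZE 8u  /* assumed K70 program phrase size in bytes */

typedef enum {
    FLASH_CHECK_OK,         /* aligned and fully erased: safe to program */
    FLASH_CHECK_UNALIGNED,  /* address or length is not phrase aligned   */
    FLASH_CHECK_NOT_ERASED  /* at least one byte in the range is not 0xFF */
} flash_check_result;

/* dest: start of the range about to be programmed; len: its length in bytes. */
static flash_check_result check_before_program(const void *dest, size_t len)
{
    const uint8_t *p = (const uint8_t *)dest;

    if (((uintptr_t)dest % PHRASE_SIZE) != 0 || (len % PHRASE_SIZE) != 0) {
        return FLASH_CHECK_UNALIGNED;
    }
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0xFF) {
            return FLASH_CHECK_NOT_ERASED;
        }
    }
    return FLASH_CHECK_OK;
}
```

Called right before the programming operation, anything other than `FLASH_CHECK_OK` is a good place for the breakpoint.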
Regards
Mark