MFS FAT Corruption

timias
Contributor IV

We had a unit returned that we believe was powered down while MFS was in an unsafe state. It ended up on my desk with a wiped boot sector, so I designed an experiment to replicate the problem: cutting power to a unit with MFS in various states, such as files open, mid-write, and mid-deletion (this one really jacks things up, BTW). I was able to seriously mangle things, and was surprised to find that Microsoft chkdsk.exe couldn't make things even partly right.

Then I remembered seeing a reference in the MFS source that set #define MFSCFG_NUM_OF_FATS 1. Some white papers on chkdsk.exe suggested that a single FAT might prevent a chkdsk-type program from fixing common errors, since the secondary FAT provides a backup and is relied on for certain repairs (two FATs are the standard).
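To make the "backup FAT" idea concrete, here is a minimal sketch of how a repair tool could locate each FAT copy and compare them. It assumes the standard FAT layout, where the FAT copies sit back to back right after the reserved sectors. The function names are hypothetical; this is not how chkdsk.exe or MFS is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of FAT copy `copy` (0-based) from the start of the volume.
 * The FAT copies follow the reserved sectors, one after another. */
static uint64_t fat_copy_offset(uint16_t reserved_sectors,
                                uint32_t sectors_per_fat,
                                uint16_t bytes_per_sector,
                                unsigned copy)
{
    uint64_t first_sector =
        (uint64_t)reserved_sectors + (uint64_t)copy * sectors_per_fat;
    return first_sector * bytes_per_sector;
}

/* Compare two FAT copies already read into memory.
 * Returns the index of the first differing byte, or -1 if they match. */
static long fat_first_mismatch(const uint8_t *fat0, const uint8_t *fat1,
                               size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (fat0[i] != fat1[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

With only one FAT on the volume there is nothing to compare against, which fits the suspicion that a single-FAT format leaves a repair tool with less to work from.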

Well, adding #define MFSCFG_NUM_OF_FATS 2 to my User_Config.h file really made a mess of things.

Below is a screenshot of the directory structure after rebuilding MQX 3.8 with the new setting and then running my code, which calls

ioctl(fshandle, IO_IOCTL_DEFAULT_FORMAT,NULL);

followed by several hundred ioctl(fshandle, IO_IOCTL_CREATE_SUBDIR, (uint_32_ptr)dirName) and fwrite operations.

QUESTION TIME: If I missed a step, what do I need to do to make it all work nicely?

pastedImage_0.png

5 Replies

DavidS
NXP Employee

Hi Robert,

Thank you for sharing your experience with one and all.

Sorry for the hassle too.

Regards,

David

Cdn_aye
Senior Contributor I

Hi Robert

We use the FAT system as well. What we did was flush the file at each collection; then, after n flushes (64 in our case), we close the file and reopen it. We also sized our data array as a set of structures totalling 512 bytes. FAT writes best in 512-byte blocks, although you can use other sizes. That way we only lose the last block if a problem occurs, and all the file table information is updated properly. Without this method, we found that a lot of things can cause the FAT system to go sideways, with 0 blocks written, names all muddled up, and so on.

The last file software we used was our own, and there a flush moved the end-of-file marker (0x55, 0xAA) to just after the record written, so all was well if the system crashed. But the FSL FAT system doesn't move the marker except on a close, and if the file is not closed the whole sorry mess is lost.

There is an overhead in doing it this way, but at least all the entries are completed correctly. It is on our to-do list to modify the system to move the EOF after each flush.
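For anyone wanting to try the flush-and-reopen scheme, here is a rough sketch of it using standard C stdio calls so it can be tried on a PC first. The logger_t type and function names are made up for illustration. On MQX the same pattern would go through the MQX file I/O calls, so check the mode strings and return values against your MQX version.

```c
#include <stdio.h>

#define RECORD_SIZE        512  /* one 512-byte block per record */
#define FLUSHES_PER_REOPEN 64   /* close and reopen after this many flushes */

/* Hypothetical logger state; names are illustrative only. */
typedef struct {
    FILE       *fp;
    const char *path;
    unsigned    flushes;  /* flushes since the file was last opened */
} logger_t;

/* Open (or reopen) the log in binary append mode. Returns 0 on success. */
static int logger_open(logger_t *lg, const char *path)
{
    lg->path = path;
    lg->flushes = 0;
    lg->fp = fopen(path, "ab");
    return lg->fp ? 0 : -1;
}

/* Write one record, flush it, and cycle the file every
 * FLUSHES_PER_REOPEN flushes so the directory entry gets updated. */
static int logger_write(logger_t *lg, const unsigned char rec[RECORD_SIZE])
{
    if (fwrite(rec, 1, RECORD_SIZE, lg->fp) != RECORD_SIZE) {
        return -1;
    }
    if (fflush(lg->fp) != 0) {
        return -1;
    }
    if (++lg->flushes >= FLUSHES_PER_REOPEN) {
        if (fclose(lg->fp) != 0) {
            lg->fp = NULL;
            return -1;
        }
        return logger_open(lg, lg->path);
    }
    return 0;
}

/* Close the log. Returns 0 on success. */
static int logger_close(logger_t *lg)
{
    int rc = lg->fp ? fclose(lg->fp) : 0;
    lg->fp = NULL;
    return rc == 0 ? 0 : -1;
}
```

The close/reopen interval is a trade-off: a smaller value narrows the window in which a power cut can lose data, at the cost of more directory and FAT updates.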

Regards

Robert Lewis

timias
Contributor IV

Interestingly enough, I decided to rebuild one more time, redeploy, and retest, and this time it seemed to work fine. I am 100% convinced that the PSP/BSP build was not out of sync, as I did a complete clean and rebuild before deploying. Could the fact that the SD card had earlier been formatted with a single FAT be a contributing factor?

In either case, I am not sure why it has decided to work. Clearly something didn't work right the first time. Any ideas where to start looking, or how to reproduce the results?

timias
Contributor IV

I seem to have confirmed that the corruption happens when formatting a disk with MFSCFG_NUM_OF_FATS 2 where the previous format had been done with MFSCFG_NUM_OF_FATS 1. But if you format it again from the corrupted state with MFSCFG_NUM_OF_FATS 2, it works fine. UGH, I thought I was going nuts. FYI, the format command takes a really long time when it corrupts.
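One way to check which layout is actually on the card before and after a format is to read the FAT count straight out of the boot sector. The sketch below uses the standard FAT BPB field offsets (number of FATs at byte 16) and the 0x55, 0xAA signature at bytes 510 and 511. It assumes the buffer already holds the volume boot sector (on a partitioned card that is the first sector of the partition, not the MBR), and the struct and function names are hypothetical. How you read the raw sector on MQX is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define BOOT_SECTOR_SIZE 512

/* A few BPB fields of interest; struct name is illustrative only. */
typedef struct {
    uint16_t bytes_per_sector;    /* BPB offset 11, little-endian */
    uint8_t  sectors_per_cluster; /* BPB offset 13 */
    uint16_t reserved_sectors;    /* BPB offset 14, little-endian */
    uint8_t  num_fats;            /* BPB offset 16 */
} fat_bpb_summary_t;

/* Read a 16-bit little-endian value from a byte buffer. */
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Parse a FAT volume boot sector.
 * Returns 0 on success, -1 on bad arguments or a missing signature. */
static int fat_read_bpb(const uint8_t *sec, size_t len, fat_bpb_summary_t *out)
{
    if (sec == NULL || out == NULL || len < BOOT_SECTOR_SIZE) {
        return -1;
    }
    if (sec[510] != 0x55 || sec[511] != 0xAA) {
        return -1;  /* no boot sector signature, e.g. a wiped sector */
    }
    out->bytes_per_sector    = rd16le(sec + 11);
    out->sectors_per_cluster = sec[13];
    out->reserved_sectors    = rd16le(sec + 14);
    out->num_fats            = sec[16];
    return 0;
}
```

If num_fats still reads 1 after formatting with MFSCFG_NUM_OF_FATS 2, that would point at the format step rather than at the later directory and file writes.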

Fabi
Contributor III

The patch from aimozg may help you (see Howto flush MFS? (Use case: mfs onto ramdisk)). Good luck!
