The root cause of this issue is an overflow of an unsigned 32-bit number when computing the location of a sector. The code is confusing to follow because the same LOCATION is sometimes expressed as a SECTOR number and sometimes as a BYTE offset.
On my SD card, a sector is 512 bytes. In mfs_rw.c a sector location is converted into a byte location by multiplying the sector number by the number of bytes in a sector. This is all done with 32-bit variables, so the result overflows once the sector number reaches 0x0080_0000: multiplying by 512 gives a byte location of 0x0001_0000_0000, which is stored in a 32-bit variable as 0x0000_0000. Writing to that location corrupts the file system.
By the way, a sector location of 0x0080_0000 with 512 bytes per sector corresponds to 4 GB. When I tested with a 4 GB SD card I never saw the issue. We will be using 16 GB going forward.
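As a minimal sketch of the arithmetic described above (the helper names are made up for illustration and are not part of MFS), the same sector-to-byte conversion can be done once in 32 bits and once in 64 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sector -> byte offset using only 32-bit arithmetic.
   The result wraps modulo 2^32. */
static uint32_t byte_offset_32(uint32_t sector, uint32_t sector_power)
{
    assert(sector_power < 32);  /* shifting by 32 or more is undefined */
    return sector << sector_power;
}

/* Same conversion, widened to 64 bits before the shift, so nothing is lost. */
static uint64_t byte_offset_64(uint32_t sector, uint32_t sector_power)
{
    assert(sector_power < 32);
    return (uint64_t)sector << sector_power;
}
```

With 512-byte sectors (sector_power of 9), byte_offset_32(0x00800000, 9) wraps to 0, while byte_offset_64(0x00800000, 9) gives 0x100000000. The last sector below the 4 GB boundary, 0x007FFFFF, still fits in 32 bits as 0xFFFFFE00, which matches the 4 GB card never showing the problem.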
The following changes pass the test I created as described in the original post:
mfs_format.c
Original
sector_size = drive_ptr->SECTOR_SIZE;
error_code = ioctl(drive_ptr->DEV_FILE_PTR, IO_IOCTL_GET_NUM_SECTORS, &num_sectors);
#if !MQX_USE_IO_OLD
if (error_code == -1 && errno == NIO_ENOTSUP)
{
    num_sectors = lseek(drive_ptr->DEV_FILE_PTR, 0, SEEK_END) / sector_size;
    error_code = MFS_NO_ERROR;
}
Changes
if (error_code == -1 && errno == NIO_ENOTSUP)
{
    num_sectors = _nio_lseek(drive_ptr->DEV_FILE_PTR, 0, SEEK_END, &error) / sector_size;
    error_code = MFS_NO_ERROR;
}
Explanation
lseek returns off_t, which is a 32-bit value for my compiler. Here lseek returns the size of the device in bytes, which overflows once the device is larger than 4 GB. (The limit is more likely 2 GB, since off_t is a signed type.) _nio_lseek returns a 64-bit number.
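To illustrate why the width of the return type matters here, the sketch below pushes a byte count through a 32-bit type before dividing by the sector size. The helper names are hypothetical and the 16 GiB size is only an illustrative value, not the exact capacity of any particular card:

```c
#include <assert.h>
#include <stdint.h>

/* What is left of a 64-bit byte count after it passes through a 32-bit type. */
static uint32_t truncate_to_32(uint64_t byte_count)
{
    return (uint32_t)byte_count;  /* keeps only the low 32 bits */
}

/* Number of whole sectors in a device of byte_count bytes. */
static uint64_t sectors_from_bytes(uint64_t byte_count, uint32_t sector_size)
{
    assert(sector_size != 0);  /* avoid division by zero */
    return byte_count / sector_size;
}
```

For a device of exactly 16 GiB (0x400000000 bytes) and 512-byte sectors, the 64-bit path yields 0x02000000 sectors, while the truncated byte count is 0 and so yields 0 sectors.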
mfs_rw.c
Changes are made to the functions MFS_Write_device_sectors and MFS_Read_device_sectors
Original
uint32_t attempts;
int32_t num, expect_num, seek_loc, shifter;
char *data_ptr;
_mfs_error error;
...
MFS_LOG(printf("MFS_Write_device_sectors %d %d\n", sector_number, sector_count));
if (sector_number > drive_ptr->MEGA_SECTORS)
{
    return MFS_SECTOR_NOT_FOUND;
}
if (drive_ptr->BLOCK_MODE)
{
    shifter = 0;
    seek_loc = sector_number;
    expect_num = sector_count;
}
else
{
    shifter = drive_ptr->SECTOR_POWER;
    seek_loc = sector_number << shifter;
    expect_num = sector_count << shifter;
}
#if MQX_USE_IO_OLD
fseek(drive_ptr->DEV_FILE_PTR, seek_loc, IO_SEEK_SET);
#else
lseek(drive_ptr->DEV_FILE_PTR, seek_loc, SEEK_SET);
#endif
Changes
uint32_t attempts;
int64_t seek_loc;
int32_t expect_num;
int32_t num, shifter;
int nio_error;
char *data_ptr;
_mfs_error error;
...
if (sector_number > drive_ptr->MEGA_SECTORS)
{
    return MFS_SECTOR_NOT_FOUND;
}
if (drive_ptr->BLOCK_MODE)
{
    shifter = 0;
    seek_loc = sector_number;
    expect_num = sector_count;
}
else
{
    shifter = drive_ptr->SECTOR_POWER;
    seek_loc = (int64_t)sector_number << shifter;
    expect_num = sector_count << shifter;
}
#if MQX_USE_IO_OLD
fseek(drive_ptr->DEV_FILE_PTR, seek_loc, IO_SEEK_SET);
#else
_nio_lseek(drive_ptr->DEV_FILE_PTR, seek_loc, SEEK_SET, &nio_error);
#endif
Explanation
Commented out MFS_LOG since it is not needed. Changed seek_loc to int64_t since it holds a BYTE location converted from a SECTOR number. Cast sector_number to int64_t before the shift, since shifting by SECTOR_POWER multiplies it by 512 and the result overflows if it stays 32-bit. Changed the call from lseek to _nio_lseek because of the off_t issue.
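One detail worth spelling out is that the cast has to be applied to sector_number itself, not to the result of the shift. A small sketch of the difference, using hypothetical function names and assuming sector_number is an unsigned 32-bit value:

```c
#include <assert.h>
#include <stdint.h>

/* Too late: the shift is evaluated in 32 bits and wraps before widening. */
static int64_t cast_after_shift(uint32_t sector_number, uint32_t shifter)
{
    assert(shifter < 31);
    return (int64_t)(sector_number << shifter);
}

/* Correct: the operand is widened first, so the shift is done in 64 bits. */
static int64_t cast_before_shift(uint32_t sector_number, uint32_t shifter)
{
    assert(shifter < 31);  /* keeps the result within the positive int64_t range */
    return (int64_t)sector_number << shifter;
}
```

For sector 0x00800000 and a shift of 9, cast_after_shift returns 0 and cast_before_shift returns 0x100000000, which is why the change writes (int64_t)sector_number << shifter rather than casting the whole expression.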
part_mgr.c
These changes need to be applied to the _io_part_mgr_write and _io_part_mgr_read functions. (Note: Make sure to call read instead of write when editing _io_part_mgr_read.)
Original
uint64_t location;
uint64_t part_start;
uint64_t part_end;
int32_t result;
...
result = lseek(pm_struct_ptr->DEV_FILE_PTR, location, SEEK_SET);
if (result >= 0)
{
    result = write(pm_struct_ptr->DEV_FILE_PTR, data_ptr, num);
}
Changes
int64_t location;
int64_t part_start;
int64_t part_end;
int32_t result;
...
location = _nio_lseek(pm_struct_ptr->DEV_FILE_PTR, location, SEEK_SET, error);
if (location >= 0)
{
    result = write(pm_struct_ptr->DEV_FILE_PTR, data_ptr, num, error);
}
else
{
    if (error)
    {
        *error = MFS_ERROR_SEEK;
    }
    result = -1;
}
Explanation
_nio_lseek returns a signed 64-bit number because a value less than 0 indicates an error. Changed location, part_start, and part_end to signed as well. My application will only address up to 16 GB, so the signed/unsigned difference will not affect it.
Additional notes:
off_t is 32 bits for my compiler (GCC). It was mentioned in another post that the comp.h file for an IAR project explicitly defines off_t as a signed 64-bit type. I tried redefining off_t in comp.h for the GCC project, but I ran into many issues and didn't feel like trying to solve them, fearing I could unintentionally break something else. My solution was to remove all calls to lseek and replace them with _nio_lseek.
comp.h can be found at \KSDK_1.3.0\rtos\mqx\mqx\source\psp\cortex_m\compiler\iar
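To check how wide off_t is on a given toolchain, something like the sketch below could be compiled into a test build. It assumes the toolchain provides off_t through <sys/types.h>; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width of off_t in bits for the compiler and C library in use. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* Nonzero when off_t is at least 64 bits wide, which is enough
   to hold a byte offset beyond 4 GB. */
static int off_t_is_64bit(void)
{
    return off_t_bits() >= 64;
}
```

If off_t_is_64bit() returns 0, byte offsets passed through lseek are limited by the 32-bit type, which is the situation described above for GCC.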