Well, it's not as simple as that. Simple calculations like this were possible on very old devices (I have nostalgic memories of the 8051), but modern devices are not deterministic in that way. Because of performance features (flash line buffer / flash mini-cache, prefetching, data cache on the core) and other influences (such as the crossbar switch and other bus masters), you can get different results depending on the current conditions. Even ARM does not provide detailed instruction timing.
If you read several bytes that are stored sequentially in flash, only the first access takes longer because of the flash wait states. After that, the corresponding flash line is already loaded in the buffer, so the remaining bytes can each be read in a single cycle. Beyond that, it depends on whether all the bytes lie within a single flash line, on prefetching, and so on.
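To make the line-buffer effect concrete, here is a rough host-side model in C. It is only a sketch: the line width, the wait-state count and the function name `model_read_cycles` are assumptions made up for illustration, and the model ignores prefetching, caches and bus contention, so it will not match a real device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameters, for illustration only. Real values come from
   the device reference manual and the clock configuration. */
#define FLASH_LINE_BYTES   8u   /* assumed width of one flash line */
#define FLASH_WAIT_STATES  3u   /* assumed wait states per line fetch */

/* Estimate the cycles needed to read `count` sequential bytes starting
   at `addr`, assuming a single line buffer and nothing else
   (no prefetch, no cache, no other bus masters). */
static uint32_t model_read_cycles(uint32_t addr, uint32_t count)
{
    assert(count <= UINT32_MAX - addr);   /* address range must not wrap */

    uint32_t cycles = 0;
    uint32_t buffered_line = UINT32_MAX;  /* sentinel: nothing buffered yet */

    for (uint32_t i = 0; i < count; i++) {
        uint32_t line = (addr + i) / FLASH_LINE_BYTES;
        if (line != buffered_line) {
            cycles += FLASH_WAIT_STATES;  /* new line: pay the wait states */
            buffered_line = line;
        }
        cycles += 1;                      /* the access itself */
    }
    return cycles;
}
```

With these assumed numbers, `model_read_cycles(0, 4)` gives 7 cycles (one line fetch), while `model_read_cycles(6, 4)` gives 10 because the four bytes straddle a line boundary and two lines have to be fetched.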
Generally, the best and simplest way to check the timing of a critical code section is to do it empirically: toggle a pin before and after the code executes and measure the pulse width with an oscilloscope.
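A minimal sketch of that pin-toggle approach in C could look like the following. The register `fake_gpio_out`, the pin number and the helper names are placeholders invented for this example; on a real device you would replace `GPIO_OUT` with the actual GPIO output register from the reference manual.

```c
#include <stdint.h>

/* Hypothetical stand-in for a GPIO output data register, so the sketch
   also compiles on a host. Replace with the real register on a device. */
static volatile uint32_t fake_gpio_out;
#define GPIO_OUT   (fake_gpio_out)
#define PROBE_PIN  (1u << 5)              /* assumed pin number */

static inline void probe_high(void) { GPIO_OUT |= PROBE_PIN; }
static inline void probe_low(void)  { GPIO_OUT &= ~PROBE_PIN; }

/* Example data; a const table is typically placed in flash on an MCU.
   It is volatile so the compiler really performs the reads between the
   two pin writes instead of precomputing the result. */
static const volatile uint8_t table[16] = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
};

/* The code section being measured: sum sequential bytes of the table. */
static uint32_t timed_sum(void)
{
    uint32_t sum = 0;

    probe_high();                         /* rising edge: start of section */
    for (uint32_t i = 0; i < sizeof table; i++) {
        sum += table[i];
    }
    probe_low();                          /* falling edge: end of section */

    return sum;
}
```

The pulse seen on the oscilloscope also includes the time of the pin writes themselves, so it helps to measure an empty section once and subtract that overhead.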
Regards,
Lukas