Thanks for the reply.
1. In "AN4745" it is mentioned:

The core has access to SRAM_U from the System bus, whose speed is similar to that of the I/D bus. So where does this delay come from?
2. The RM says that the crossbar connects masters to slaves simultaneously, so there should not be any delay (in my opinion). Are you saying that, because of the crossbar, the DMA's access speed to SRAM will be lower than the core's? (E.g., with core clock = 80 MHz and sys_clk = 80 MHz, will the DMA access clock be 80 MHz or less?)