I'm doing detailed pipeline analysis for a critical section of high-speed DSP code. I'm considering a trick to speed it up -- placing a heavily used (but small) data item in on-chip RAM (IMMR) instead of pulling it from cache/DDR. My other, larger data items have no chance of fitting on-chip, so the data cache will already be under pressure serving them. I just want to get a single 16-word data element on-chip with guaranteed single-cycle access to minimize data cache traffic.
Question: Are any wait states imposed on IMMR reads or writes, or should they execute as fast as accesses that hit in the data cache?
Question: Ref Manual section 3.2 says:
To guarantee that the results of any sequence of writes to configuration registers are in effect, the final configuration register write should be followed immediately by a read of the same register, and that should be followed by a sync instruction. Then accesses can safely be made to memory regions affected by the configuration register write.
If I have to execute a write/read/sync sequence for every IMMR access, it will be too slow and I will pursue another path. The write/read/sync requirement seems quite unusual; I've never seen it for on-chip peripheral registers on any other CPU. Or am I misinterpreting the language, and it applies ONLY to setting up IMMBAR? If the three-step sequence really is required for every access, can you share another solution to achieve my goal?
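For reference, the write/read/sync sequence quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the reference manual: `cfg_write_commit` and `HW_SYNC` are hypothetical names, the register is assumed to be a memory-mapped 32-bit word, and GCC-style inline assembly is assumed. On a non-PowerPC build the macro falls back to a compiler barrier so the sketch can still be compiled on a host.

```c
#include <stdint.h>

/* HW_SYNC is a hypothetical macro: the PowerPC `sync` instruction when
   building for a PowerPC target, otherwise only a compiler barrier so
   this sketch still compiles on a host machine. */
#if defined(__powerpc__) || defined(__PPC__)
#define HW_SYNC() __asm__ volatile ("sync" ::: "memory")
#else
#define HW_SYNC() __asm__ volatile ("" ::: "memory")
#endif

/* Write a configuration register, read the same register back, then sync.
   Returns the value that was read back. */
static inline uint32_t cfg_write_commit(volatile uint32_t *reg, uint32_t value)
{
    uint32_t readback;

    *reg = value;      /* final configuration register write */
    readback = *reg;   /* read of the same register */
    HW_SYNC();         /* sync before touching affected memory regions */
    return readback;
}
```

If this sequence were needed on every access, each store to the 16-word item would cost a store, a load, and a `sync`, which is the overhead the question is about.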
Question: I need sixteen 32-bit RAM locations (the trick is to treat some IMMR registers as RAM). My candidates are:
- PCI Mailbox Registers (128 words)
- Ethernet (eTSEC) MACXXADDRY Registers (32 words)
Do you see any issues (side effects) from using these registers as RAM or should they function just fine?
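One way to check whether a candidate block of registers behaves like plain RAM is a pattern write/readback test, sketched below. This is a minimal sketch under assumptions: `regs_readback_mismatches` is a hypothetical helper, the candidate registers are assumed to be contiguous 32-bit words, and the base address has to be taken from the reference manual. It only catches bits that do not hold a written value (reserved or read-only bits, aliased registers). It cannot detect side effects outside the registers themselves, such as an interrupt raised by a write, if the peripheral has any.

```c
#include <stddef.h>
#include <stdint.h>

/* Write each pattern to every candidate word and read it back.
   Returns the number of mismatches (0 means every bit was stored).
   Note: this overwrites whatever the registers held before. */
static size_t regs_readback_mismatches(volatile uint32_t *regs, size_t count)
{
    static const uint32_t patterns[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    size_t mismatches = 0;

    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; ++p) {
        /* Fill all words first, each with a distinct value, so that
           aliasing between registers shows up on readback. */
        for (size_t i = 0; i < count; ++i)
            regs[i] = patterns[p] ^ (uint32_t)i;

        for (size_t i = 0; i < count; ++i)
            if (regs[i] != (patterns[p] ^ (uint32_t)i))
                ++mismatches;
    }
    return mismatches;
}
```

On the target, `regs` would point at the first candidate register and `count` would be 16. A nonzero result means at least one word cannot be used as general-purpose storage.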
I'm open to any solution. Is there a better way? The above are just a few examples of what I discovered quickly. The basic trick is to repurpose unused peripheral regs as local RAM.
Have others used this trick before? Success?