Hello,
Feel free to jump to the questions at the end of this post, as my experiment description is quite verbose and may be superfluous...
I have been doing some benchmarking with the S32K396-BGA-DC1 running in S32DS for S32. My reference for timing expectations is an MPC5777CEVB running the same program in S32DS for Power Architecture. I usually run it at 264MHz but have also tried it a couple of times at 300MHz (not often, because my chip is the old part number that is not supposed to run at 300MHz).
I believe I am running the MPC5777C in its optimal configuration (stack in cache, optimized Flash read/prefetch) as recommended by AN5191. My program takes about 750us to run at 264MHz and about 650us at 300MHz (it is mostly sequential code that does not rely heavily on caches).
As I am a beginner on S32K3, I inserted my code in the Dio_Example_S32K396 example available in "S32K396 AUTOSAR R21-11 RTD 4.0.0 P14 D2403 Example Projects" - its linker directives allocate code/constants in Flash, data in SRAM and stack in DTCM. I also used the configuration tool to set the following clocks (using the "Mode A+" values presented in Table 122 of the S32K39/37 reference manual):
PLL_PHI0 = 320MHz
PLL_PHI1 = 480MHz
CORE_CLK = 160MHz
AIPS_PLAT_CLK = 80MHz
AIPS_SLOW_CLK = 40MHz
HSE_CLK = 80MHz
DCM_CLK = 40MHz
LBIST_CLK = 40MHz
QSPI_MEM_CLK = 320MHz
CM7_CORE_CLK = 320MHz
While I expected the new MCU to beat the old one due to its faster and more efficient core, I was surprised to see that it ran my program faster than the 264MHz MPC5777C but slower than the 300MHz MPC5777C. Both processors had very similar performance when I allocated all my data in the CPU0 DTCM, which is not something that would work for a larger application that would use most of the SRAM.
I did not configure the Flash by hand, but it seems the configuration tool (or the example startup code) had set its wait states to 5 (which seems to match the 160MHz CORE_CLK), although it did not enable prefetching. I suppose this is a bit critical for my code, as disabling prefetching in the MPC5777C caused a 100us increase in its execution time.
Having said all this, my questions are:
- Can I enable prefetching using the S32DS configuration tool? If this is not possible, what would be the cleanest way of doing this? I do not know where my _start function comes from...
- Given the 5 wait states of the Flash (vs. 3 of the MPC5777C when its peripheral clock runs at 132MHz or even 150MHz), is it correct to assume the MPC5777C Flash memory performs better than the S32K396 one?
- Is there any document like the AN5191 that provides guidance for extracting the best performance of the S32K396?
Thanks!
Ricardo
Hello,
- Can I enable prefetching using the S32DS configuration tool? If this is not possible, what would be the cleanest way of doing this? I do not know where my _start function comes from...
No — prefetching is not configurable via the S32DS Configuration Tool (as of the latest RTD and SDK versions).
You’ll need to manually configure the Flash prefetch settings in your startup code.
- Given the 5 wait states of the Flash (vs. 3 of the MPC5777C when its peripheral clock runs at 132MHz or even 150MHz), is it correct to assume the MPC5777C Flash memory performs better than the S32K396 one?
MPC5777C:
S32K396:
- Is there any document like the AN5191 that provides guidance for extracting the best performance of the S32K396?
This is the closest equivalent to AN5191 for the S32K3 family. It provides:
Best regards,
Peter