Questions about SPT Kernels (range_2048smp_512crp_16ch_zrpad.pspt) in S32R45_RSDK_0.9.5

CSBoy
Contributor II

Hi:

While reading the source code of range_2048smp_512crp_16ch_zrpad.pspt, I got a little confused about the memory layout of the FFT output written to OPRAM by thd_scs0 and thd_scs2, as follows:

Thread thd_scs0 uses OR_0_0_0 (bank4-7) to cache FFT results:

set #(OR_0_0_0 | (OR_2_0_0 << 24)), WR_24

set #(OR_0_1_0 | (OR_2_64_0 << 24)), WR_25

Thread thd_scs2 also uses OR_0_0_0 (bank4-7) to cache FFT results:

set #(OR_0_8_0 | (OR_2_0_0 << 24)), WR_29

set #(OR_0_9_0 | (OR_2_64_0 << 24)), WR_30

However, for 16 channels and 2048 sampling points, the required memory size is 2048/2*16 = 16384, while the size of a bank is 512*16 = 8192. A single bank can therefore store the data of only 8 channels, so 16 channels with 2048 sampling points need two banks.
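The arithmetic above can be checked with a short Python sketch. The numbers come from this post; the variable names are made up for illustration and are not identifiers from the kernel source.

```python
# Sizes as stated in the post; names are illustrative only.
SAMPLES = 2048             # sampling points per channel
CHANNELS = 16
BANK_ELEMENTS = 512 * 16   # bank size as stated in the post -> 8192

out_per_channel = SAMPLES // 2                         # the post's 2048/2
required = out_per_channel * CHANNELS                  # total output size
channels_per_bank = BANK_ELEMENTS // out_per_channel   # channels that fit in one bank
banks_needed = -(-required // BANK_ELEMENTS)           # ceiling division

print(required, BANK_ELEMENTS, channels_per_bank, banks_needed)
# prints: 16384 8192 8 2
```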

Moreover, when the FFT result is finally copied from OPRAM to DDR, the code uses two banks:

copy.trans .cmplx, OPRAM_BANK_SIZE, OR_0_0_0, OR_2_0_0, 0x1, 0x1, 0x10, 0x10

copy.trans .cmplx, OPRAM_BANK_SIZE, OR_1_0_0, OR_3_0_0, 0x1, 0x1, 0x10, 0x10

Therefore, I think thd_scs2 should cache its FFT result at OR_1_0_0 (bank4-7), and the code should be modified as follows:

set #(OR_1_0_0 | (OR_2_0_0 << 24)), WR_29

set #(OR_1_1_0 | (OR_2_64_0 << 24)), WR_30

4 Replies

barrybruce
Contributor I

I agree with you. Thanks for sharing.


GaryRK
NXP Employee

Hi CSBoy,

What's missing from the analysis above is the OPRAM logical-to-physical address mapping in SPT3.1. For SCS2, this is set up so that logical banks 4-7 map to physical banks 12-15, so even though the SPT commands in the SCS2 thread reference banks 4-7, they access physical memory in banks 12-15.

Please see line 671 and line 683 in the SPT kernel source file.

Best regards,

Gary

 


CSBoy
Contributor II

Hi Gary,

Thank you for your reply, but it makes me even more confused. Line 671 in the SPT kernel source file is "set 0x7, SPR_3", which means "SET Read NE Controller (BANK12-15) write SW controller (BANK4-7)". I think this means the data is written to physical bank4-7. This is also illustrated in section 3.1.2 (Range kernels 16CH) of the "spt_kernels_rrm_design_document.pdf" document, as shown below: the SCS0 and SCS2 data are combined into a single bank, which I think can only support a range FFT of up to 1024 points.

CSBoy_0-1661390484335.png

 

Best regards,

CSBoy


GaryRK
NXP Employee

Hi CSBoy,

Yes, you are right. Sorry, I got the mapping mixed up in my previous reply.

The same physical memory banks can be used to store the final output data set (the output of the vmt commands after the rdx sequence) because the vmt commands use a dest_add_inc operand of 0x10. For example, line 583:

vmt.shift_sq2s2.ind .cmplx .rst_sum .in_48 .op_off .no_sq1 .no_sq2s1 WR_46 .no_sq2s3, OUTPUT_SAMPLES_PER_CH, WR_25, 0x1, 0x10, WR_47

Therefore, although both SCS0 and SCS2 write to the same bank, they do not overwrite each other's output, because they use different destination address columns and have dest_add_inc = 0x10. It also means that the combined output of both SCS for all 16 channels spans 2 OPRAM banks because, as you observe, 2 banks are needed for 16K output elements.
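As a rough illustration of the interleaving described above, the following Python sketch models the destination addresses only. It is not SPT code. It assumes (this is not confirmed by the kernel source quoted here) that each channel starts at its own column offset, with SCS0 taking columns 0-7 and SCS2 taking columns 8-15, and that addresses are counted in output elements. The bank size of 512*16 elements is taken from the original question.

```python
DEST_ADD_INC = 0x10            # dest_add_inc operand from the quoted vmt command
OUTPUT_SAMPLES_PER_CH = 1024   # 2048 / 2, as in the original question
BANK_ELEMENTS = 512 * 16       # bank size as stated in the original question

def dest_addresses(column):
    """Element addresses written for one channel starting at `column`."""
    return [column + DEST_ADD_INC * k for k in range(OUTPUT_SAMPLES_PER_CH)]

# Assumption for illustration: SCS0 handles columns 0..7, SCS2 handles columns 8..15.
scs0 = {a for col in range(0, 8) for a in dest_addresses(col)}
scs2 = {a for col in range(8, 16) for a in dest_addresses(col)}

overlap = scs0 & scs2                        # addresses written by both threads
highest = max(scs0 | scs2)                   # last element address touched
banks_spanned = highest // BANK_ELEMENTS + 1

print(len(overlap), len(scs0 | scs2), banks_spanned)
# prints: 0 16384 2
```

Under these assumptions the two threads never write the same address, and together they fill 16384 consecutive elements, which is two banks of 8192.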

Best regards,

Gary
