can't get high resolution PMU cycles On 1046a with LSDK 21.08 & 5.10.35-rt39-dirty


946 Views
shiMiao
Contributor I

Hi,

I am working with an LS1046A running LSDK 21.08; the kernel is 5.10.35-rt39-dirty.

I am also using DPDK. According to DPDK's suggestion:

/**
 * This is an alternative method to enable rte_rdtsc() with high resolution
 * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
 * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
 * access to PMU cycle counter from user space is not enabled by default in
 * arm64 linux kernel.
 * It is possible to enable cycle counter at user space access by configuring
 * the PMU from the privileged mode (kernel space).
 *
 * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
 * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
 * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
 * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
 * val |= (BIT(0) | BIT(2));
 * isb();
 * asm volatile("msr pmcr_el0, %0" : : "r" (val));
 */

 

asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
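
As a reading aid for the register writes quoted above, here is a minimal sketch that decodes the three PMU control values. It assumes the values were captured in kernel space by some other means; the `pmu_snapshot` struct and both helper names are hypothetical and are not part of DPDK or the kernel. The bit positions follow the quoted comment (bits 0 and 2 of PMUSERENR_EL0, bit 0 of PMCR_EL0, bit 31 of PMCNTENSET_EL0).

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT64(n) (UINT64_C(1) << (n))

/* Bit positions as used in the quoted DPDK comment. */
#define PMUSERENR_EN  BIT64(0)   /* EL0 access to PMU registers */
#define PMUSERENR_CR  BIT64(2)   /* EL0 read access to the cycle counter */
#define PMCR_E        BIT64(0)   /* global enable for the counters */
#define PMCNTENSET_C  BIT64(31)  /* enable bit for PMCCNTR_EL0 */

/* Hypothetical snapshot of the registers, taken in kernel space. */
struct pmu_snapshot {
    uint64_t pmuserenr_el0;
    uint64_t pmcr_el0;
    uint64_t pmcntenset_el0;
};

/* True if EL0 is allowed to read the cycle counter (EN or CR set). */
bool pmu_user_access_enabled(const struct pmu_snapshot *s)
{
    return (s->pmuserenr_el0 & (PMUSERENR_EN | PMUSERENR_CR)) != 0;
}

/* True if the cycle counter is both globally and individually enabled. */
bool pmu_cycle_counter_running(const struct pmu_snapshot *s)
{
    return (s->pmcr_el0 & PMCR_E) != 0 &&
           (s->pmcntenset_el0 & PMCNTENSET_C) != 0;
}
```

These registers exist per core, so a check like this would probably need to run on every core that calls rte_rdtsc(). A counter that is readable from user space but not enabled on the reading core is one plausible reason for a value that stays at 0.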

The value I read from pmccntr_el0 is 0, while I get the correct value on other ARMv8 platforms.

How can I get the right value?

The attached file '开发板串口日志_启动完整日志.txt' (development board serial console log, full boot log) is the full startup log.

Thanks.

4 Replies

893 Views
yipingwang
NXP TechSupport

There is no TSC on AArch64, so rte_rdtsc() needs an assembly port. It looks like this DPDK API is not working well: it either just reads the system counter (frequency from CNTFRQ_EL0) or reads the wrong PMU counter. Either way, it is a DPDK API issue.

The implementation of rte_tsc_hz() has not changed much from DPDK 20.xx to the current 22.xx. On AArch64 the hardware must support the PMU, and a macro must be defined to use the ARM64 PMU for a precise clock. RTE_ARM_EAL_RDTSC_USE_PMU is not enabled by default, so DPDK just reads the system counter frequency with asm volatile("mrs %0, cntfrq_el0" : "=r" (freq)); this is not as precise as the TSC on x86.
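
To illustrate the default (non-PMU) path described above, here is a minimal user-space sketch that reads the generic counter and converts ticks to nanoseconds. The function names `read_counter`, `counter_hz` and `ticks_to_ns` are made up for this example and are not DPDK APIs. It assumes Linux leaves CNTVCT_EL0 and CNTFRQ_EL0 readable from EL0, and it falls back to CLOCK_MONOTONIC on non-AArch64 hosts only so that the sketch builds everywhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read a free-running counter: CNTVCT_EL0 on AArch64, otherwise a
 * CLOCK_MONOTONIC value in nanoseconds (fallback for other hosts). */
uint64_t read_counter(void)
{
#if defined(__aarch64__)
    uint64_t ticks;
    /* isb keeps the counter read from being speculated early */
    __asm__ volatile("isb; mrs %0, cntvct_el0" : "=r"(ticks) : : "memory");
    return ticks;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
#endif
}

/* Frequency of the counter returned by read_counter(), in Hz. */
uint64_t counter_hz(void)
{
#if defined(__aarch64__)
    uint64_t freq;
    __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(freq));
    return freq;
#else
    return UINT64_C(1000000000); /* fallback counter ticks in ns */
#endif
}

/* Convert ticks to ns; split into whole seconds and remainder so the
 * multiplication does not overflow for realistic counter frequencies. */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t hz)
{
    const uint64_t ns_per_s = UINT64_C(1000000000);
    return (ticks / hz) * ns_per_s + (ticks % hz) * ns_per_s / hz;
}
```

With a counter frequency of 25 MHz, used here purely as an example value, one tick corresponds to 40 ns. That resolution is why the generic counter is coarser than a cycle counter running at CPU frequency.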
dpdk source/config/arm/meson.build

dpdk source/lib/librte_eal/arm/rte_cycles.c

uint64_t
get_tsc_freq_arch(void)
{
#if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
    return __rte_arm64_cntfrq();
#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
#define CYC_PER_1MHZ 1E6
    /* Use the generic counter ticks to calculate the PMU
     * cycle frequency.
     */
    uint64_t ticks;
    uint64_t start_ticks, cur_ticks;
    uint64_t start_pmu_cycles, end_pmu_cycles;
    ...

/** Read generic counter frequency */
static __rte_always_inline uint64_t
__rte_arm64_cntfrq(void)
{
    uint64_t freq;

    asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
    return freq;
}
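
The excerpt above stops before the measurement itself. As a sketch of the general idea stated in its comment (using generic counter ticks to derive the PMU cycle frequency), the arithmetic could look like the following. The 1/10 second window, the rounding up to the next 1 MHz, and the helper names `round_up_to_multiple` and `estimate_cycle_hz` are assumptions of this example, not something shown in the post.

```c
#include <stdint.h>

#define CYC_PER_1MHZ UINT64_C(1000000)

/* Round value up to the next multiple of `multiple` (multiple > 0). */
uint64_t round_up_to_multiple(uint64_t value, uint64_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

/* cycles:             PMU cycles counted while the generic counter
 *                     advanced by one measurement window
 * windows_per_second: 10 for a 1/10 second window
 * Returns the estimated cycle counter frequency in Hz. */
uint64_t estimate_cycle_hz(uint64_t cycles, uint64_t windows_per_second)
{
    return round_up_to_multiple(cycles, CYC_PER_1MHZ) * windows_per_second;
}
```

For example, 159999123 cycles counted in a 1/10 second window would give an estimate of 1600000000 Hz. If the PMU cycle counter never advances, the cycle delta is 0 and so is the estimated frequency.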
Please use the following link to read the whole post:
https://forums.developer.nvidia.com/t/get-200mhz-arm-frequency-when-using-dpdk-on-dpu/209615/7


880 Views
shiMiao
Contributor I

Thank you for your reply.

How can I confirm whether my board supports the PMU?

The bootloader and kernel startup log is attached to my initial post; it shows "Machine model: LS1046A PSCB Board".

Can you help confirm whether the PMU is available? If not, is there another way to get a precise clock in user space?

Thanks.

Best regards


840 Views
yipingwang
NXP TechSupport

We recommend running such tests in kernel space or in TFA/U-Boot.

We tried the following test code in TFA (EL3). On the 1046ardb, the average time for one read of cntvct_el0 appears to be several ns.

uint64_t tsc1, tsc2, tsc;

asm volatile("mrs %0, cntvct_el0" : "=r" (tsc1));
for (int i = 0; i < 10000; i++) {
    asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
}
asm volatile("mrs %0, cntvct_el0" : "=r" (tsc2));

printf("sub: %lu\n", tsc2 - tsc1);

 

clock_gettime and similar precise APIs are implemented in the kernel, also with a read of cntvct_el0.
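
For the user-space side of the earlier question, a comparable measurement can be sketched with clock_gettime(CLOCK_MONOTONIC). This is only an illustration of the same loop-and-subtract pattern as the firmware test above; the helper names `timespec_diff_ns` and `avg_clock_read_ns` are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Difference end - start in nanoseconds. */
int64_t timespec_diff_ns(const struct timespec *start,
                         const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * INT64_C(1000000000)
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Average cost in ns of one clock_gettime() call, measured over
 * `iterations` back-to-back calls (iterations must be > 0). */
double avg_clock_read_ns(int iterations)
{
    struct timespec t1, t2, tmp;

    clock_gettime(CLOCK_MONOTONIC, &t1);
    for (int i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &tmp);
    clock_gettime(CLOCK_MONOTONIC, &t2);

    return (double)timespec_diff_ns(&t1, &t2) / (double)iterations;
}
```

Calling `avg_clock_read_ns(10000)` mirrors the 10000-iteration loop of the firmware test. The result depends on the platform and kernel configuration, so no expected value is given here.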


681 Views
hemantagrawal
NXP Employee

Check this:

https://doc.dpdk.org/guides/prog_guide/profile_app.html

You need to load the kernel module, and then you can access the API from user space.
