Can't get high-resolution PMU cycles on LS1046A with LSDK 21.08 and 5.10.35-rt39-dirty

1,304 views
shiMiao
Contributor I

Hi,

I am working with an LS1046A on LSDK 21.08; the kernel is 5.10.35-rt39-dirty.

I am also using DPDK. According to the suggestion in DPDK's source:

/**
 * This is an alternative method to enable rte_rdtsc() with high resolution
 * PMU cycles counter.The cycle counter runs at cpu frequency and this scheme
 * uses ARMv8 PMU subsystem to get the cycle counter at userspace, However,
 * access to PMU cycle counter from user space is not enabled by default in
 * arm64 linux kernel.
 * It is possible to enable cycle counter at user space access by configuring
 * the PMU from the privileged mode (kernel space).
 *
 * asm volatile("msr pmintenset_el1, %0" : : "r" ((u64)(0 << 31)));
 * asm volatile("msr pmcntenset_el0, %0" :: "r" BIT(31));
 * asm volatile("msr pmuserenr_el0, %0" : : "r"(BIT(0) | BIT(2)));
 * asm volatile("mrs %0, pmcr_el0" : "=r" (val));
 * val |= (BIT(0) | BIT(2));
 * isb();
 * asm volatile("msr pmcr_el0, %0" : : "r" (val));
 */
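For readers following the register sequence quoted above, here is a minimal sketch of what the individual bits mean. The bit positions are those defined by the ARMv8-A architecture; the macro and function names (`PMU_BIT`, `pmu_pmcr_value`, `pmu_cycle_counter_ready`, and so on) are hypothetical helpers made up for illustration and are not part of DPDK or the kernel. Note that these registers exist per core, so the setup has to run on every CPU the application may use.

```c
#include <stdint.h>

/* Hypothetical helper macro: a single bit in a 64-bit register value. */
#define PMU_BIT(n) (UINT64_C(1) << (n))

#define PMCR_E        PMU_BIT(0)  /* PMCR_EL0.E: enable the counters */
#define PMCR_C        PMU_BIT(2)  /* PMCR_EL0.C: reset the cycle counter */
#define PMUSERENR_EN  PMU_BIT(0)  /* PMUSERENR_EL0.EN: EL0 access to the PMU */
#define PMUSERENR_CR  PMU_BIT(2)  /* PMUSERENR_EL0.CR: EL0 reads of PMCCNTR_EL0 */
#define PMCNTENSET_C  PMU_BIT(31) /* PMCNTENSET_EL0.C: enable PMCCNTR_EL0 */

/* Value to write back to PMCR_EL0, given the value just read from it. */
uint64_t pmu_pmcr_value(uint64_t current)
{
    return current | PMCR_E | PMCR_C;
}

/* Value the quoted sequence writes to PMUSERENR_EL0. */
uint64_t pmu_userenr_value(void)
{
    return PMUSERENR_EN | PMUSERENR_CR;
}

/* Value the quoted sequence writes to PMCNTENSET_EL0. */
uint64_t pmu_cntenset_value(void)
{
    return PMCNTENSET_C;
}

/*
 * Hypothetical diagnostic: given register values read back in kernel
 * space on one core, report whether an EL0 read of PMCCNTR_EL0 could be
 * expected to return a running count on that core.
 */
int pmu_cycle_counter_ready(uint64_t pmcr, uint64_t cntenset, uint64_t userenr)
{
    return (pmcr & PMCR_E) != 0 &&
           (cntenset & PMCNTENSET_C) != 0 &&
           (userenr & (PMUSERENR_EN | PMUSERENR_CR)) != 0;
}
```

If EL0 access is granted but the counter is not enabled on the current core, a read that always returns the same value is one possible symptom; this is an assumption about the cause, not something the thread confirms.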

 

asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));

Reading pmccntr_el0 this way returns 0 on this board, whereas I get the correct value on other ARMv8 platforms.

How can I get the right value?

The attachment '开发板串口日志_启动完整日志.txt' (serial console log from the development board) is the full startup log.

Thanks.

4 Replies

1,251 views
yipingwang
NXP TechSupport

AArch64 has no TSC, so this needs an assembly port. It looks like this DPDK API is not working well: it either just reads the system counter frequency (CNTFRQ_EL0) or reads the wrong PMU counter. Either way, it is a DPDK API issue.

The implementation of rte_tsc_hz() has changed little from DPDK 20.xx to the latest 22.xx. On AArch64 the hardware must support the PMU, and a macro must be defined so that the ARM64 PMU is used to get a precise clock. RTE_ARM_EAL_RDTSC_USE_PMU is not enabled by default, so the code just reads the system counter, asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));, which is not as precise as the TSC on x86.
dpdk source/config/arm/meson.build

dpdk source/lib/librte_eal/arm/rte_cycles.c

uint64_t
get_tsc_freq_arch(void)
{
#if defined RTE_ARCH_ARM64 && !defined RTE_ARM_EAL_RDTSC_USE_PMU
    return __rte_arm64_cntfrq();
#elif defined RTE_ARCH_ARM64 && defined RTE_ARM_EAL_RDTSC_USE_PMU
#define CYC_PER_1MHZ 1E6
    /* Use the generic counter ticks to calculate the PMU
     * cycle frequency.
     */
    uint64_t ticks;
    uint64_t start_ticks, cur_ticks;
    uint64_t start_pmu_cycles, end_pmu_cycles;

/** Read generic counter frequency */
static __rte_always_inline uint64_t
__rte_arm64_cntfrq(void)
{
    uint64_t freq;

    asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
    return freq;
}
Please see the following link for the whole post:
https://forums.developer.nvidia.com/t/get-200mhz-arm-frequency-when-using-dpdk-on-dpu/209615/7
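To make the relationship between the generic counter and a precise time value concrete, here is a minimal sketch that reads the counter and its frequency and converts ticks to nanoseconds. It assumes a Linux userspace where cntvct_el0 and cntfrq_el0 are readable from EL0; on other architectures it falls back to clock_gettime so the sketch still builds. All function names are hypothetical, and the 25 MHz and 1.6 GHz figures used in the note after the code are illustrative values, not measurements from this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read the generic counter (falls back to CLOCK_MONOTONIC ns elsewhere). */
uint64_t generic_counter_read(void)
{
#if defined(__aarch64__)
    uint64_t v;
    /* isb keeps the read from being executed ahead of earlier code. */
    __asm__ __volatile__("isb\n\tmrs %0, cntvct_el0" : "=r"(v));
    return v;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Frequency of generic_counter_read() in Hz. */
uint64_t generic_counter_hz(void)
{
#if defined(__aarch64__)
    uint64_t f;
    __asm__ __volatile__("mrs %0, cntfrq_el0" : "=r"(f));
    return f;
#else
    return 1000000000ull; /* the fallback above counts nanoseconds */
#endif
}

/* Convert counter ticks to nanoseconds; returns 0 if hz is 0. */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t hz)
{
    if (hz == 0)
        return 0;
    /* Split into whole seconds and remainder to avoid overflow. */
    return (ticks / hz) * 1000000000ull + (ticks % hz) * 1000000000ull / hz;
}

/*
 * Estimate the rate of another counter (for example a CPU cycle counter)
 * from how many of its cycles elapsed during `ticks` generic counter ticks.
 */
uint64_t estimate_counter_hz(uint64_t cycles, uint64_t ticks, uint64_t hz)
{
    if (ticks == 0)
        return 0;
    return cycles * hz / ticks;
}
```

With a 25 MHz generic counter, ticks_to_ns(1, 25000000) gives 40, so the resolution is 40 ns per tick, which is why the generic counter is coarser than a cycle counter running at CPU frequency. Likewise, 160,000,000 cycles observed over 2,500,000 ticks at 25 MHz works out to 1.6 GHz.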


1,238 views
shiMiao
Contributor I

Thank you for your reply.

How can I confirm whether my board supports the PMU?

The bootloader and kernel startup log in my initial attachment shows "Machine model: LS1046A PSCB Board".

Can you help confirm whether the PMU is available? If not, is there another way to get a precise clock in user space?

Thanks.

Best regards
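One way to probe from user space whether the kernel exposes a working cycle counter is the Linux perf_event_open system call, which does not need direct access to the PMU registers. The sketch below is a minimal example under that assumption; whether it succeeds on a given board depends on the kernel having a PMU driver bound and on the perf_event_paranoid setting. The function names `open_cycle_counter`, `count_user_cycles` and `busy_work` are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a user-mode CPU cycle counter for the calling thread; -1 on failure. */
int open_cycle_counter(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1; /* count user-mode cycles only */
    attr.exclude_hv = 1;

    /* pid 0 = calling thread, cpu -1 = any CPU, no group, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Small workload to measure; volatile keeps the loop from being removed. */
void busy_work(void)
{
    volatile uint64_t sink = 0;

    for (uint64_t i = 0; i < 100000; i++)
        sink += i;
}

/* Cycles spent in fn(), or -1 if the counter is unavailable. */
int64_t count_user_cycles(void (*fn)(void))
{
    uint64_t count = 0;
    ssize_t n;
    int fd = open_cycle_counter();

    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    n = read(fd, &count, sizeof(count));
    close(fd);
    return n == (ssize_t)sizeof(count) ? (int64_t)count : -1;
}
```

A return value of -1 from count_user_cycles(busy_work) suggests the kernel does not offer a usable cycle counter to this process, while a positive count suggests the PMU is present and driven by the kernel.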


1,198 views
yipingwang
NXP TechSupport

We recommend running such tests in kernel space or in TF-A/U-Boot.

We tried the following test code in TF-A (EL3); on the LS1046ARDB, the average time for a read of cntvct_el0 appears to be several ns.

uint64_t tsc1, tsc2, tsc;

asm volatile("mrs %0, cntvct_el0" : "=r" (tsc1));
for (int i = 0; i < 10000; i++) {
    asm volatile("mrs %0, cntvct_el0" : "=r" (tsc));
}
asm volatile("mrs %0, cntvct_el0" : "=r" (tsc2));

printf("sub: %lu\n", tsc2 - tsc1);

 

clock_gettime and similar precise APIs are implemented in the kernel, and they also read cntvct_el0.
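A userspace counterpart of the test above can be sketched with clock_gettime itself. This is a minimal example that measures the average cost of one clock_gettime(CLOCK_MONOTONIC) call; the function names `monotonic_ns` and `avg_clock_gettime_ns` are hypothetical, and the result depends on the system it runs on.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
uint64_t monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost in ns of one clock_gettime call over `iters` calls. */
double avg_clock_gettime_ns(unsigned int iters)
{
    struct timespec ts;
    uint64_t start, end;

    if (iters == 0)
        return 0.0;
    start = monotonic_ns();
    for (unsigned int i = 0; i < iters; i++)
        clock_gettime(CLOCK_MONOTONIC, &ts);
    end = monotonic_ns();
    return (double)(end - start) / (double)iters;
}
```

Calling avg_clock_gettime_ns(10000) mirrors the 10,000-iteration loop in the TF-A test, but reports nanoseconds per call instead of raw counter ticks.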


1,039 views
hemantagrawal
NXP Employee

Check this:

https://doc.dpdk.org/guides/prog_guide/profile_app.html

You need to load the kernel module, and then you can access the API from userspace.
