Good morning,
we're running a benchmark on our 2 platforms, imx7d and imx8mmini, to investigate an issue we're facing.
The test is based on a basic communication between 2 threads, implemented with the pthread library, and it works as follows:
1) One thread waits on a condition variable
2) the other thread wakes it up
3) the elapsed time is computed
4) threads exchange roles
5) cycle restarts
The problem is that this test performs very differently on the imx7 and imx8 platforms.
To analyze this in more depth, we took the measurements under 3 different conditions:
- threads on the same core
- threads on different cores
- the OS scheduler decides thread-core affinity on its own
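On Linux, the first two conditions are typically obtained with pthread_setaffinity_np(), a GNU extension. The sketch below shows one way to do it; the helper names pin_thread_to_cpu and thread_is_pinned_to are hypothetical, and it is an assumption that the pinning is done per thread rather than, for example, with taskset.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Pin `thread` to a single CPU. Returns 0 on success or an errno value. */
static int pin_thread_to_cpu(pthread_t thread, int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Returns 1 if the affinity mask of `thread` contains only `cpu`, else 0. */
static int thread_is_pinned_to(pthread_t thread, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}
```

With these helpers, "same core" means pinning both threads to the same CPU number (for example 0 and 0), "different cores" means two different numbers (for example 0 and 1), and in the third condition no affinity is set at all. Reading the mask back with thread_is_pinned_to() is a cheap way to confirm the pinning really took effect before measuring.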
The most critical condition, the one in which the 2 platforms perform very differently, is when the threads run on different cores. We even tried taking 2 cores offline on the imx8 platform but, as can be seen below, nothing changes. Can someone help us with this situation? Are we missing something?
Further details:
We used these system settings to make the system more stable for all the tests:
- scaling_governor: performance
- dynamic frequency scaling driver: disabled
(i.e. on imx8: echo 0 > /sys/bus/platform/drivers/imx_busfreq/busfreq/enable)
Data
imx7d
uname -r: 4.9.11+gf1a31cc
Forcing threads on different cores
[T1] Average is 10 us;
[T2] Average is 14 us;
Forcing threads on same core
[T1] Average is 10 us;
[T2] Average is 14 us;
OS scheduler decides thread-core affinity
[T1] Average is 11 us;
[T2] Average is 15 us;
imx8mmini (all cores online)
uname -r: 4.14.78-imx_4.14.78_1.0.0_ga_dev+g991fec2
Forcing threads on different cores
[T1] Average is 493 us;
[T2] Average is 458 us;
Forcing threads on same core
[T1] Average is 10 us;
[T2] Average is 10 us;
OS scheduler decides thread-core affinity
[T1] Average is 507 us;
[T2] Average is 448 us;
imx8mmini (only 2 cores online)
uname -r: 4.14.78-imx_4.14.78_1.0.0_ga_dev+g991fec2
Forcing threads on different cores
[T1] Average is 474 us;
[T2] Average is 379 us;
Forcing threads on same core
[T1] Average is 10 us;
[T2] Average is 10 us;
OS scheduler decides thread-core affinity
[T1] Average is 480 us;
[T2] Average is 384 us;
As can be seen, the difference is significant. We're available to run any further useful tests, just let us know.
Any help would be appreciated, thanks!