GPU thermal settings


michaelworster
Contributor IV

I'm using an i.MX6DL part in my design and have been running into issues with GPU clock scaling under thermal throttling.

To debug this, I want to adjust some of the thermal scaling points.

From the code, this path appears to run only if gcdENABLE_FSCALE_VAL_ADJUST is true and CONFIG_DEVICE_THERMAL is defined.

Then, in the function thermal_hot_pm_notify(), if the GPU is ready and registered with the system, the driver reads the original, max, and min frequency scaling values for the GPU and applies the min value. Once the GPU cools down, it restores the original value.

I can see that a sysfs node should be created for the gpu3DMinClock attribute, but I have not yet identified the name given to this attribute.

  1. Where are these initially defined?
  2. What is the sysfs handle created for setting GPU frequencies?

I'd like to change the max and min frequencies and see how our issue behaves.


Bio_TICFSL
NXP TechSupport

Hello michael,

i.MX devices support the thermal driver, which can signal an overheat event to the GPU driver. Once the GPU driver receives the event, it can enable the GPU DFS feature to reduce the GPU frequency to N/64 of the originally designated clock.
The default N factor is 1 in the original BSP release; the end user can reconfigure it with the command below:
echo N > /sys/bus/platform/drivers/galcore/gpu3DMinClock
The current setting can be checked as follows:
cat /sys/bus/platform/drivers/galcore/gpu3DMinClock
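
To make the N/64 arithmetic concrete, here is a minimal C sketch. The sysfs path is the one quoted above; the helper names fscale_to_hz and write_gpu3d_min_clock, the 1..64 range check, and the base clock passed in are illustrative assumptions, not part of the BSP.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: effective GPU clock for scale factor n,
 * given a base clock in Hz. Assumes n must lie in 1..64 and
 * returns 0 for an out-of-range n. */
uint64_t fscale_to_hz(uint64_t base_hz, unsigned n)
{
    if (n < 1 || n > 64)
        return 0;
    return base_hz * n / 64;
}

/* Hypothetical helper: write n to a sysfs attribute such as
 * /sys/bus/platform/drivers/galcore/gpu3DMinClock.
 * Returns 0 on success, -1 on failure. */
int write_gpu3d_min_clock(const char *path, unsigned n)
{
    FILE *f;
    int ok;

    if (n < 1 || n > 64)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%u\n", n) > 0;
    if (fclose(f) != 0)   /* sysfs write errors can surface at close */
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, with an assumed base clock of 64,000,000 Hz, fscale_to_hz(64000000, 1) gives 1,000,000 Hz and fscale_to_hz(64000000, 32) gives 32,000,000 Hz, which shows how large the gap is between the default N of 1 and a setting of 32.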

You will not be able to reduce power consumption; in other words, it will remain the same throughout.
If frequency scaling is disabled, changing the CPU frequency on the fly is no longer available.

By default, the BSP sets the GPU clock to 1/64 when it receives the high-temperature thermal notification. At that speed the GPU is so slow that it may crash in some use cases.

Customers can adjust it to 32/64 or 16/64 in the following file:

drivers/mxc/gpu-viv/hal/os/linux/kernel/platform/freescale/gc_hal_kernel_platform_imx6q14.c

Note that at a higher GPU frequency, if the temperature keeps rising, the chip can still crash because it exceeds its operating temperature.


diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/platform/freescale/gc_hal_kernel_platform_imx6q14.c b/driver
index 4778aa9..8fa3b98 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/platform/freescale/gc_hal_kernel_platform_imx6q14.c
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/platform/freescale/gc_hal_kernel_platform_imx6q14.c
@@ -237,6 +237,7 @@ static int thermal_hot_pm_notify(struct notifier_block *nb, unsigned long event,

     if (event && !bAlreadyTooHot) {
         gckHARDWARE_GetFscaleValue(hardware,&orgFscale,&minFscale, &maxFscale);
+        minFscale = 32;
         gckHARDWARE_SetFscaleValue(hardware, minFscale);
         bAlreadyTooHot = gcvTRUE;
         gckOS_Print("System is too hot. GPU3D will work at %d/64 clock.\n", minFscale);
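
The throttle-and-restore behaviour around this hunk can be pictured with a small user-space model. This is a simplified sketch, not driver code: the struct and function names are hypothetical, and the cool-down branch is inferred from the description earlier in the thread (the diff above only shows the hot branch).

```c
#include <stdbool.h>

/* Simplified user-space model of the GPU scaling state; not driver code. */
struct gpu_fscale_model {
    unsigned org;      /* scale in use before throttling (N of N/64) */
    unsigned min;      /* scale applied while too hot */
    unsigned current;  /* scale currently applied */
    bool too_hot;      /* plays the role of bAlreadyTooHot */
};

/* event != 0: overheat notification; event == 0: temperature back to normal. */
void model_thermal_notify(struct gpu_fscale_model *g, unsigned long event)
{
    if (event && !g->too_hot) {
        g->org = g->current;   /* remember the scale to restore later */
        g->current = g->min;   /* throttle to the minimum scale */
        g->too_hot = true;
    } else if (!event && g->too_hot) {
        g->current = g->org;   /* restore the pre-throttle scale */
        g->too_hot = false;
    }
    /* Repeated notifications of the same kind change nothing. */
}
```

With min set to 32, as in the patch, a model that starts at 64 drops to 32 on the first hot event, ignores further hot events, and returns to 64 once a cool event arrives.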

The imx_thermal driver (drivers/thermal/imx_thermal.c) is used to cool down the system. When the chip temperature reaches 85 °C (#define IMX_TEMP_PASSIVE  85000), the thermal driver notifies the CPU and GPU drivers to reduce their frequencies so the chip can cool.
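
The value 85000 is in millidegrees Celsius, so it corresponds to 85 °C. A minimal sketch of the comparison, assuming hypothetical helper names (celsius_to_millicelsius, reached_passive_trip) that do not come from the driver:

```c
#include <stdbool.h>

#define IMX_TEMP_PASSIVE 85000  /* millidegrees Celsius, value quoted from the driver */

/* Hypothetical helper: convert whole degrees Celsius to millidegrees. */
static inline long celsius_to_millicelsius(long celsius)
{
    return celsius * 1000;
}

/* Hypothetical helper: true once the temperature has reached the trip point.
 * The trip point is a parameter so other thresholds can be tried out. */
bool reached_passive_trip(long temp_millicelsius, long trip_millicelsius)
{
    return temp_millicelsius >= trip_millicelsius;
}
```

For instance, reached_passive_trip(celsius_to_millicelsius(84), IMX_TEMP_PASSIVE) is false, while the same call with 85 is true.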

"gpu_powermanagment" is not related to this. To disable GPU cooling, change "gcdENABLE_FSCALE_VAL_ADJUST" to 0 in the GPU driver file "drivers/mxc/gpu-viv/hal/kernel/inc/gc_hal_options.h".

But if your device's cooling is inadequate, keeping the GPU at full speed after the chip has reached 85 °C may let the temperature keep rising. The automotive i.MX6 chip's maximum operating temperature is 125 °C; above that, the system will still crash.

If you can make sure that your device will never reach such a high temperature (125 °C for the i.MX6 chip) under any use case, you can disable the CPU and GPU cooling methods.

Regards


michaelworster
Contributor IV

Thanks for the information and helpful tips into the code base. I see where the passive temp is set to 85C in the code. Could you explain the purpose of the IMX_TEMP_PASSIVE_COOL_DELTA?

This code seems to assume a standard consumer-grade i.MX6, since 85C is hardcoded. Is there anywhere in the code that accounts for other grades (industrial or automotive) of i.MX6? In our design we currently use an industrial grade i.MX6DL, and it's my understanding that the "trip" temp for this chip should be 95C instead of 85C. Is that accounted for anywhere?

Last question: I appreciate you pointing out where the Min GPU frequency can be adjusted during a cooling period. For our use case, however, is there a simple way to adjust the Max GPU rate? We DO NOT require the full 64/64 GPU frequency, as our application only serves simple text-based webpages. Can we simply introduce an artificial ceiling on the GPU frequency, capping the Max value to force a cooler run?

Thanks
