
GPU hang on sabre running Android 5.1.1

Question asked by neeraj sharma on May 3, 2017
Latest reply on Oct 27, 2017 by John Smith

Hi,

 

We have a custom platform based on the SabreSD i.MX6Q reference design, running Lollipop L5.1.1_2.0.0 GA (LMY47V) [kernel: 3.14.38-138123].

We have been seeing a problem where the GPU and the display hang at random.

The system remains in that state until it is rebooted.

 

I have found that driving Chrome with random input events is the fastest way to reproduce this problem.

[Running "monkey --throttle 100 -p com.android.chrome 1000000" infinitely]
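For anyone who wants to repeat the test from a host PC, the restart loop could be sketched roughly as below. This is only a sketch under assumptions: it assumes `adb` is on the host's PATH with the board attached, and the helper names `monkey_command` and `run_forever` are made up for illustration. Only the monkey flags themselves are taken from the command above.

```python
import subprocess

PACKAGE = "com.android.chrome"  # package named in the post


def monkey_command(package=PACKAGE, throttle_ms=100, events=1000000):
    """Build the argument list for one monkey run (same flags as in the post).

    Running it through `adb shell` is an assumption; the post may have run
    monkey directly on the device shell.
    """
    return ["adb", "shell", "monkey", "--throttle", str(throttle_ms),
            "-p", package, str(events)]


def run_forever(runner=subprocess.call, max_runs=None):
    """Start monkey again every time it exits.

    max_runs=None keeps looping until interrupted; a number limits the loop.
    `runner` is injectable so the loop can be exercised without a device.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        runner(monkey_command())
        runs += 1
    return runs
```

Calling `run_forever()` with no arguments keeps restarting monkey until it is interrupted (Ctrl+C), which matches the "infinitely" part of the test.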

 

I tried Android 5.1.1 2.0.0-ga-rc4 on the Sabre board and can reproduce the problem there as well.

It takes some time before the screen becomes unresponsive. The Android shell remains responsive underneath.

Attached are a video and a logcat capture of the problem occurring on the Sabre board.

 

Is this a known issue, or could you help us fix it?

 

I have already posted this elsewhere and was advised to post it on this forum.

 

HISTORY:

Already tried sabresd_6dq-eng 5.1.1 2.1.0-ga-rc3; the problem occurs on this build as well.

Tried galcore.powerManagement=0; with this setting the problem occurs every time the system resumes from suspend.

Tried the NXP-provided patch that disables the GPU's power management feature; the issue still occurs.

 

We have also seen the problem occur at random when Chrome is not in use.

Testing the Chrome package with monkey simply helps us reproduce it more often.

[Running "monkey --throttle 100 -p com.android.chrome 1000000" infinitely]

 

The Chrome version that reproduces the problem for us is 57.0.2987.132.

This version does not have the "Merge Tab" setting.

 

Chrome can also be obtained from: The Open GApps Project

On our platform we have seen signatures like:

 

03-20 13:49:40.478 156 156 I kernel : <6>[36394.351361] fence timeout on [d01dda00] after 3000ms
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] objs:
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] --------------
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] viv timeline viv_sync
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt signaled@2623.897486
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt signaled@33633.696669
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt signaled@33644.294139
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt signaled@33644.889483
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt active
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt active
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] pt active
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552]
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] viv timeline viv_sync
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552]
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] viv timeline viv_sync
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552]
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] fences:
03-20 13:49:40.478 156 156 W kernel : <4>[36394.351552] --------------
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-62573: signaled
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt signaled@2623.897486
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64567: signaled
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt signaled@33633.696669
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64577: signaled
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt signaled@33644.294139
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64622: signaled
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt signaled@33644.889483
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64624: active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64625: active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] [00000000] viv sync_fence-64626: active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645] viv timeline_pt active
03-20 13:49:40.479 156 156 W kernel : <4>[36394.351645]
03-20 13:49:40.480 163 176 E Fence : Throttling EGL Production: fence 36 didn't signal in 3000 ms

 

03-20 13:50:29.885 156 156 W kernel : <4>[36443.761373] [galcore]: GPU[0] hang, automatic recovery.
03-20 13:50:29.885 156 156 W kernel : <4>[36443.761450] [galcore]: recovery done
03-20 13:50:34.880 163 933 W SurfaceFlinger: setTransactionState timed out!
03-20 13:50:34.884 163 176 W SurfaceFlinger: setTransactionState timed out!

This looks similar to: IMX AXI BUS ERROR, GPU hang
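
To check a long logcat capture for the same signatures, something like the following could be used. It is a rough sketch, not an NXP tool: the regular expressions are written from the log lines quoted above, the names `SIGNATURES` and `scan` are hypothetical, and it assumes the capture is in the same format as the excerpt (date and time as the first two fields).

```python
import re

# Patterns derived from the log lines quoted in this post.
SIGNATURES = {
    "fence_timeout": re.compile(r"fence timeout on \[[0-9a-f]+\] after (\d+)ms"),
    "egl_throttle": re.compile(
        r"Throttling EGL Production: fence \d+ didn't signal in (\d+) ms"),
    "gpu_hang": re.compile(r"\[galcore\]: GPU\[(\d+)\] hang, automatic recovery"),
    "sf_timeout": re.compile(r"SurfaceFlinger: setTransactionState timed out!"),
}


def scan(lines):
    """Return a (timestamp, signature_name) pair for each matching log line."""
    hits = []
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                # Assumes the first two whitespace-separated fields are the
                # date and time, as in the excerpt above.
                timestamp = " ".join(line.split()[:2])
                hits.append((timestamp, name))
                break  # at most one signature per line
    return hits
```

Passing the lines of a saved capture to `scan` gives a short timeline of fence timeouts, GPU recoveries and SurfaceFlinger timeouts, which makes it easier to see whether the fence timeout always comes before the galcore recovery message.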

Original Attachment has been moved to: sabreLog.txt.zip
