Segmentation fault when calling eglMakeCurrent


charlesung
Contributor III

We are using OpenVG with X11 on the i.MX6, and eglMakeCurrent occasionally crashes with a segmentation fault. We are not sure of the cause. The input parameters appear to be valid at the time of the call. Below is the call stack when it dies.

 

 

==4346== Process terminating with default action of signal 11 (SIGSEGV)

==4346==  Access not within mapped region at address 0x394

==4346==    at 0x5E9DF64: gcoSURF_ReferenceSurface (gc_hal_user_surface.c:12505)

==4346==    by 0x5F90CC7: _CreateSurfaceObjects (gc_egl_surface.c:604)

==4346==    by 0x5F917CF: veglResizeSurface (gc_egl_surface.c:1389)

==4346==    by 0x5F8D34B: veglMakeCurrent (gc_egl_context.c:2508)

==4346==    by 0x5F8DD43: eglMakeCurrent (gc_egl_context.c:2633)

 

 

I have also attached the valgrind output that shows the steps that lead to this.

Original Attachment has been moved to: eglmakecurrent_bad.txt.zip

1 Solution
charlesung
Contributor III

This works. Thanks.

67 Replies
chingling_wang
NXP Employee

Karina, I created a Jira ticket for the GPU driver team:

http://sw-jira.freescale.net/browse/MGS-1900

karina_valencia
NXP Apps Support

chinglingwang, please continue with the follow-up.

chingling_wang
NXP Employee

Hi Karina,

I think this ticket can be closed.

charlesung
Contributor III

Please make sure the fix gets included back in the main line and will be in the next formal release. Thanks.

charlesung
Contributor III

Do you have any code that you use to verify OpenVG on X11 that you can share?

chingling_wang
NXP Employee

If GPU 8.4 does not fix your issue, I have no clue so far what causes the seg fault. Yes, we do have OpenVG samples we can share. All the OpenVG sample applications are in /opt/fsl-gpu-sdk/OpenVG, and you can download the demo framework to build them. The demo framework is released with every Yocto release.

You can also download it from git stash at sw-stash.freescale.net/scm/gtec/demo-framework.git; it is also released as a standalone package on the NXP website.

The demo framework samples are all C++. Do you need a sample application in C for OpenVG? I can package one for you. They are also in old releases such as the 3.10.17 BSP.

sebastient
Contributor V

Have you looked at the numerous "Conditional jump or move depends on uninitialised value" issues reported in Charles' log? They include source file names and line numbers for you to confirm, and they are scattered across gc_hal_user_os.c, gc_hal_user_hardware_vg.c, gc_egl_surface.c, etc.

These should be the starting point for investigating the reported issue.

chingling_wang
NXP Employee

I set an email watch on this thread, but the notifications do not seem to reach my mailbox. Sorry about this.

I looked into the output, and it seems OpenVG cannot allocate the EGL surface memory specified by your app.

From your log,

Native win geometry after resize 0,48,1280,672

Number of queued expose event 0

Expose region 0,0,1280,672

1. If the region is 0,48,1280,672, the actual surface size is 1280x(672-48); I think 0,48 is the starting point. Why is the expose region 0,0,1280,672, which makes the surface size 1280x672? It is getting bigger.

2. 672 is not 16 bits aligned. Could you pass your surface parameters 16 bits aligned? Is your display 1280x720?

3. I have also attached our sample OpenVG applications, which you can take as a reference for OpenVG EGL setup.

charlesung
Contributor III

None of the examples demonstrates the case of the X11 window being resized after surface creation, which is where we think the problem is. Our guess is that something in the driver handles that, but we really do not know without the source code.

chingling_wang
NXP Employee

When resizing, you need to destroy the old surface you created before, then call eglCreateSurface again. I didn't see that in your log.

The reason your valgrind output points to conditional branches is that, in the driver, the GPU checks the buffer size. It thinks what you want is bigger than the maximum buffer size it has, so it returns an out-of-memory error, exits, and does some cleanup.

And yes, the width and height should both preferably be 16 bits aligned when creating the EGL surface.

Usually 8 bits is OK, but I talked to a driver engineer, and 16 bits aligned is better.

I will try to see if I can get a resize sample for OpenVG.

charlesung
Contributor III

Back to what we saw in the valgrind output. Based on the function names that we saw, we were guessing that it is handling the resize somehow. We can be totally off but it does appear to us that this is the case.

==1199== Conditional jump or move depends on uninitialised value(s)

==1199==    at 0x5F8A2F0: veglFreeRenderList (gc_egl_surface.c:2861)

==1199==    by 0x5F8A81B: veglResizeSurface (gc_egl_surface.c:1412)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

==1199== Conditional jump or move depends on uninitialised value(s)

==1199==    at 0x5F3839C: gcoVGHARDWARE_ScheduleVideoMemory (gc_hal_user_hardware_vg.c:8030)

==1199==    by 0x5E97087: _FreeSurface (gc_hal_user_surface.c:910)

==1199==    by 0x5E9767F: gcoSURF_Destroy (gc_hal_user_surface.c:2135)

==1199==    by 0x5F8A337: veglFreeRenderList (gc_egl_surface.c:2875)

==1199==    by 0x5F8A81B: veglResizeSurface (gc_egl_surface.c:1412)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

==1199==

==1199== Invalid read of size 4

==1199==    at 0x5F89260: veglAddRenderListSurface (gc_egl_surface.c:2789)

==1199==   by 0x5F89D33: _CreateSurfaceObjects (gc_egl_surface.c:600)

==1199==    by 0x5F8A853: veglResizeSurface (gc_egl_surface.c:1440)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

==1199==

==1199== Invalid read of size 4

==1199==    at 0x5F89D40: _CreateSurfaceObjects (gc_egl_surface.c:603)

==1199==    by 0x5F8A853: veglResizeSurface (gc_egl_surface.c:1440)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

==1199==    by 0x64C37CB: QVgQpaContext::makeCurrent(void*) (qvgqpacontext.cpp:91)

==1199==    by 0x64C0B5F: QVgQpaBackingStore::beginPaint(QRegion const&) (qvgqpabackingstore.cpp:2359)

==1199==    by 0x496D0DB: ??? (in /usr/lib/libQt5Widgets.so.5.5.1)

==1199==  Address 0x62ad9d0 is 8 bytes inside a block of size 24 free'd

==1199==    at 0x4845FFC: free (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)

==1199==

==1199== Invalid read of size 4

==1199==    at 0x5E9DFDC: gcoSURF_ReferenceSurface (gc_hal_user_surface.c:12505)

==1199==    by 0x5F89D4B: _CreateSurfaceObjects (gc_egl_surface.c:604)

==1199==    by 0x5F8A853: veglResizeSurface (gc_egl_surface.c:1440)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

==1199==    by 0x64C37CB: QVgQpaContext::makeCurrent(void*) (qvgqpacontext.cpp:91)

==1199==    by 0x64C0B5F: QVgQpaBackingStore::beginPaint(QRegion const&) (qvgqpabackingstore.cpp:2359)

==1199==    by 0x496D0DB: ??? (in /usr/lib/libQt5Widgets.so.5.5.1)

==1199==  Address 0x394 is not stack'd, malloc'd or (recently) free'd

==1199==

==1199== Process terminating with default action of signal 11 (SIGSEGV)

==1199==  Access not within mapped region at address 0x394

==1199==    at 0x5E9DFDC: gcoSURF_ReferenceSurface (gc_hal_user_surface.c:12505)

==1199==    by 0x5F89D4B: _CreateSurfaceObjects (gc_egl_surface.c:604)

==1199==    by 0x5F8A853: veglResizeSurface (gc_egl_surface.c:1440)

==1199==    by 0x5F8635B: veglMakeCurrent (gc_egl_context.c:2508)

==1199==    by 0x5F86D53: eglMakeCurrent (gc_egl_context.c:2633)

chingling_wang
NXP Employee

I talked with our driver people, and they clarified:

1. To resize a window, you only need X to do the resize and move, such as XResizeWindow() or XMoveResizeWindow(). No other action is required from the user side; EGL in the driver will take care of everything.

2. The driver will also take care of all alignment; no action is needed from the user side.

I attached a simple app that changes the window position and size; it works OK without a seg fault.

To test it on Yocto and see the window size and position change:

killall matchbox-window-manager

X :0 -noreset &

I am not sure what causes your seg fault. It may be the driver, but we need to reproduce it in simple code so that the GPU driver people can debug it.

charlesung
Contributor III

We kind of know that is the way from playing with it. However, we really do not have a simple way to reproduce the problem other than what we have. It works most of the time, but if we reorder things in the high-level code, like enabling or disabling printf, then it may show up. All we have that we think will be useful for you is the valgrind output, which we hope will provide enough info for your driver team. So please let your driver team take a peek at the valgrind output and see whether there is enough information there to figure out why it ends up causing a seg fault. If there really isn't enough information there, then we will see what else we can do to get you more information.

chingling_wang
NXP Employee

More questions: why do you have so many eglMakeCurrent() calls? The resize is done by X11; no action is required on the user side for EGL. I tested it, and it works this way.

1. Do you have several threads doing your rendering, so that you need to call eglMakeCurrent() frequently?

2. You have two attachments, eglmakecurrent_bad.txt and eglmakecurrent_bad_2.txt. The stacks they show are a little different, and in your first post you have

==4346== Process terminating with default action of signal 11 (SIGSEGV)

==4346==  Access not within mapped region at address 0x394

==4346==    at 0x5E9DF64: gcoSURF_ReferenceSurface (gc_hal_user_surface.c:12505)

==4346==    by 0x5F90CC7: _CreateSurfaceObjects (gc_egl_surface.c:604)

==4346==    by 0x5F917CF: veglResizeSurface (gc_egl_surface.c:1389)

==4346==    by 0x5F8D34B: veglMakeCurrent (gc_egl_context.c:2508)

==4346==    by 0x5F8DD43: eglMakeCurrent (gc_egl_context.c:2633)

That stack is also a little different. Are they just random, and are all of these stack dumps possible? Or which one shall I use?

Thanks.

charlesung
Contributor III

The first one is with the driver that comes with the yocto release. The second one is with the latest gpu driver that you told me to try.

chingling_wang
NXP Employee

So the valgrind output is not the log from one run of your program? Was it edited by you over the many times the seg fault happened, so that these are all the random possible conditional jumps? I thought it was the output log from one run. But you only need to call eglMakeCurrent() once; after a resize, there is no need to call it again. Do you mean eglMakeCurrent() may seg fault the first time it is called?

charlesung
Contributor III

Just for your information. What we are building is a vg backend for Qt. At the high level, the Qt app can have many different windows. Every time it goes from one window to another, we will have to switch the egl surface so the rendering can be done on the corresponding window.

As for the valgrind output, it is truncated to show just the activity inside eglMakeCurrent. It starts at the time we enter the eglMakeCurrent function and runs to the point where it seg faults. Again, the entire log shows what happens inside the eglMakeCurrent function until it tips over.

chingling_wang
NXP Employee

So it is the output log from one run of your application (it is OK that it is truncated to only eglMakeCurrent)?

But it still doesn't make sense: how can your application still be running after getting a seg fault? So is it output edited by you to cover all the seg faults that happened? When the seg fault happens, it may happen in many different locations in the driver, although eglMakeCurrent() is always there?

If you change the window size in the same thread, there is no need to call eglMakeCurrent. So the many different windows belong to different threads, and this is why I saw so many eglMakeCurrent() calls in your log?

charlesung
Contributor III

The log is output from valgrind, which is a debugging tool. You pass your app to it, and it runs the app and analyzes it for you.

And I still do not understand what in the log makes you think we are making many eglMakeCurrent calls.

chingling_wang
NXP Employee

When I build the GPU code myself and replace them on the SD card, the part3 you gave gets past eglMakeCurrent() without a seg fault, even though the GPU source is the same as the one released for BSP 4.1.15.

I got another error:

eglDisplay = 21928548, surface = 22607156, context = 21939692

**************** Done eglMakeCurrent! ******************

./part3: symbol lookup error: /usr/lib/qt5/plugins/platforms/libvgqpa.so: undefined symbol: vgImageFlushDirectVIV

I tried writing an app that changes the EGL surface, resizes, and repositions; it runs OK without a seg fault.

Do you have any other apps that can reproduce this issue consistently?
