iMX6Q / Yocto: X11 unstable (SIGSEGV) with acceleration enabled.


cristianmarussi
Contributor II

Hi,

I'm building a Yocto-based distribution for a SabreSD i.MX6Q reference board.

With the standard fbdev X driver I have no problems (and no acceleration either). When I enable the accelerated X11 driver vivante_drv.so in xorg.conf, acceleration apparently works (as verified in Xorg.0.log and with a fullscreen glxgears running at 300 FPS), BUT trouble arises when I launch simple cairo/GTK test applications. For example, the first run of 'gtkperf -a' goes smoothly, but when I launch the same test AGAIN after it completes, X crashes with a segmentation fault as soon as the test ends and the program quits. I can reproduce this behaviour consistently with gtkperf or with another application written only with cairo (it draws a simple arc): the segfault always occurs on the second invocation under the accelerated X. After enabling debug and symbols in X and in the Vivante EXA and DRI drivers, and with the help of gdb, I was able to locate the problem here, inside the EXA driver:

-------------------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x2adcf2fc in memset () from /lib/libc.so.6
(gdb) bt
#0  0x2adcf2fc in memset () from /lib/libc.so.6
#1  0x3317ffcc in CleanSurfaceBySW (galInfo=<value optimized out>,
    pPixmap=<value optimized out>, pPix=0x227130)
    at vivante_gal/vivante_gal_surface.c:696
#2  0x331832a0 in VivModifyPixmapHeader (pPixmap=0x2326a0, width=16,
    height=<value optimized out>, depth=1, bitsPerPixel=64, devKind=64,
    pPixData=0x0) at vivante_exa/vivante_pixmap.c:243
#3  0x331d3514 in exaModifyPixmapHeader_driver (pPixmap=0x2326a0, width=16,
    height=16, depth=2303648, bitsPerPixel=0, devKind=64, pPixData=0x0)
    at exa_driver.c:160
#4  0x331d3260 in exaCreatePixmap_driver (pScreen=0x3568b8, w=16, h=16,
    depth=<value optimized out>, usage_hint=1) at exa_driver.c:110
#5  0x00042cf8 in ServerBitsFromGlyph (pfont=0x2324f0,
    ch=<value optimized out>, cm=0x7efffba0, ppbits=0x5190c) at glyphcurs.c:98
#6  0x00037118 in AllocGlyphCursor (source=228, sourceChar=0, mask=228,
    maskChar=1, foreRed=0, foreGreen=0, foreBlue=0, backRed=4294967295,
    backGreen=4294967295, backBlue=4294967295, ppCurs=0x7efffc08,
    client=0x19a8a0, cid=0) at cursor.c:377
#7  0x0003753c in CreateRootCursor (unused1=<value optimized out>,
    unused2=<value optimized out>) at cursor.c:507
#8  0x00021d84 in main (argc=6, argv=0x7efffde4, envp=<value optimized out>)
    at main.c:235
(gdb)
--------------------

As an additional note, each time one of those applications terminates successfully on the first iteration, even before X crashes, X complains about leaks. I think this is where it all starts:

3 XSELINUXs still allocated at reset
SCREEN: 0 objects of 132 bytes = 0 total bytes 0 private allocs
COLORMAP: 0 objects of 4 bytes = 0 total bytes 0 private allocs
DEVICE: 0 objects of 24 bytes = 0 total bytes 0 private allocs
CLIENT: 0 objects of 144 bytes = 0 total bytes 0 private allocs
WINDOW: 0 objects of 28 bytes = 0 total bytes 0 private allocs
PIXMAP: 2 objects of 76 bytes = 152 total bytes 0 private allocs
GC: 0 objects of 52 bytes = 0 total bytes 0 private allocs
CURSOR: 0 objects of 4 bytes = 0 total bytes 0 private allocs
CURSOR_BITS: 0 objects of 4 bytes = 0 total bytes 0 private allocs
DBE_WINDOW: 0 objects of 12 bytes = 0 total bytes 0 private allocs
TOTAL: 2 objects, 152 bytes, 0 allocs
2 PIXMAPs still allocated at reset
PIXMAP: 2 objects of 76 bytes = 152 total bytes 0 private allocs
GC: 0 objects of 52 bytes = 0 total bytes 0 private allocs
CURSOR: 0 objects of 4 bytes = 0 total bytes 0 private allocs
CURSOR_BITS: 0 objects of 4 bytes = 0 total bytes 0 private allocs
DBE_WINDOW: 0 objects of 12 bytes = 0 total bytes 0 private allocs
TOTAL: 2 objects, 152 bytes, 0 allocs
1 PICTUREs still allocated at reset
TOTAL: 0 objects, 0 bytes, 0 allocs

-----------------------
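For anyone who wants to track these reports automatically across runs, here is a minimal C sketch that recognizes the summary lines of the form "<count> <TYPE>s still allocated at reset" seen in the output above. The function name parse_leak_line and its signature are made up for illustration and are not part of the X server; the line format is taken only from the output pasted in this post.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of the form
 *   "<count> <TYPE>s still allocated at reset"
 * On a match, store the count and the type name (without the trailing
 * plural 's') and return 1. Return 0 for any other line. */
int parse_leak_line(const char *line, int *count, char *type, size_t type_len)
{
    char name[64];
    int n = 0;
    int consumed = 0;   /* stays 0 unless the whole pattern matched */

    if (type_len == 0)
        return 0;

    /* %n is only reached when the literal tail matched as well. */
    if (sscanf(line, "%d %63s still allocated at reset%n",
               &n, name, &consumed) != 2 || consumed == 0)
        return 0;

    /* The server prints the type followed by 's'; drop that suffix. */
    size_t len = strlen(name);
    if (len > 1 && name[len - 1] == 's')
        name[len - 1] = '\0';

    *count = n;
    snprintf(type, type_len, "%s", name);
    return 1;
}
```

Feeding each line of the server output to this function and summing the counts would give one number per reset, which makes it easier to compare the first and the second iteration.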

I investigated the issue further inside CleanSurfaceBySW() in vivante_gal/vivante_gal_surface.c using simple print statements and saw that the fault is indeed at the memset operation, which references an invalid memory address (but NOT a NULL pointer). With debug output enabled in the Vivante EXA driver, this is what the first, successful invocation of gtkperf looks like (I've deliberately cut some lines from the output):

---------------------------
time:  1.00
---
Total time: 26.51
Quitting..       <<<<<<<<<<<<<<<<<================= THIS IS GTKPERF FINISHING
# ---- initPixmapQueue   <<<<<<<<<<<<<<<<<<============= this is vivante_drv.so cleaning up
Wrapper address 331ea000 in fb (331ea000-3336a000) to pixmap, offset=0
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c5aa8  -  logicalAddr:0x331ea000  -  size:1572864
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c7410  -  logicalAddr:0x3b930000  -  size:345600
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c78b8  -  logicalAddr:0x3b839000  -  size:302400
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c78b8  -  logicalAddr:0x3b839000  -  size:302400
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c78b8  -  logicalAddr:0x3b839000  -  size:302400
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
--------------------------------------------

All is fine here, but things go really wrong during the cleanup after the second execution of gtkperf completes:

---------------------
Quitting..
# ---- initPixmapQueue
Wrapper address 331ea000 in fb (331ea000-3336a000) to pixmap, offset=0
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c4860  -  logicalAddr:0x331ea000  -  size:1572864
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c7410  -  logicalAddr:0x3b930000  -  size:345600
==>>>>>>>>>CleanSurfaceBySW():700---EXIT------
==>>>>>>>>>CleanSurfaceBySW():669---ENTER------
======>>>> mVideoNode:0x333c5700  -  logicalAddr:0x3392a000  -  size:345600
==>>>>>>>>>CleanSurfaceBySW():692---------
==>>>>>>>>>CleanSurfaceBySW():694---------   <<<<<<<<<<<<<==================== after here SEGV
-------------------------------

After line 694 (in my copy of the code, which is full of debug prints), it attempts to zero the surface with memset using those mVideoNode values:

CleanSurfaceBySW()

   ....

     memset((char *)surf->mVideoNode.mLogicalAddr,0,surf->mVideoNode.mSizeInBytes);
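
To illustrate the kind of guard that seems to be missing here, below is a small C sketch. It is NOT the Vivante code: struct video_node, its mapped flag, and both function names are hypothetical. Since the faulting address is non-NULL, a plain NULL check would not catch it; the sketch assumes the driver could record whether a mapping is still live and invalidate that record whenever the mapping is torn down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the driver's video node. The field names are
 * illustrative only and do not mirror the real Vivante structures. */
struct video_node {
    void  *logical_addr;   /* CPU-visible mapping of the surface */
    size_t size_in_bytes;  /* length of that mapping */
    int    mapped;         /* set when mapped, cleared on unmap/reset */
};

/* Zero the surface only if the node still describes a live mapping.
 * Returns 0 when the memory was cleared, -1 when the clear was skipped. */
int clean_surface_checked(struct video_node *node)
{
    if (node == NULL || !node->mapped ||
        node->logical_addr == NULL || node->size_in_bytes == 0)
        return -1;

    memset(node->logical_addr, 0, node->size_in_bytes);
    return 0;
}

/* To be called wherever the mapping is released, so that a stale
 * non-NULL address cannot be reused after a server reset. */
void video_node_invalidate(struct video_node *node)
{
    node->mapped = 0;
    node->logical_addr = NULL;
    node->size_in_bytes = 0;
}
```

Whether the real driver has an equivalent place to invalidate its nodes at reset time is exactly what I could not determine from the traces above.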

Finally, this is my current configuration (Freescale GPU binaries are from the latest LTIB release, 3.0.35_4.1.0, where applicable):

(To be safe, I tried to align all the packages related to DRI/EXA and the GPU binaries with the kernel version, 3.0.35. I don't know if this matters; in any case I had the same problems with both the Freescale binaries and the 3.5.7 binaries on this same 3.0.35 kernel.)

- Kernel 3.0.35
- Linaro toolchain, softfp
- X 1.10.4 (I also tried 1.10.1 and 1.11.2) with DRI enabled; Xorg.0.log --> http://pastebin.com/auJm0PFj
- Mesa 9.0.2
- Freescale gpu-viv-bin-mx6q 3.0.35; this is installed AFTER Mesa so that it properly overrides the libGL/libEGL/libGAL/libVIVANTE .so files, and all symlinks properly point to the '-x11' variants
- xf86-video-viv 3.0.35 (EXA and DRI), BUT nothing changes with xf86-video-imxfb-vivante-3_3.5.7-1.0.0-alpha.2-r11
- imx-lib / imx-vpu-lib 3.0.35
- firmware-imx-3.0.35
- libdrm 2.4.24
- Mesa-demos 8.0.1
- my xorg.conf --> http://pastebin.com/HUKtvGWJ
- /dev/galcore has permission 666
- kernel is configured with DRM and DRM Vivante enabled, and both seem OK (found at boot):

   ## dmesg | grep drm
    [drm] Initialized drm 1.1.0 20060810
    [drm] Initialized vivante 1.0.0 20120216 on minor 0

Any hints?

Bye

Cristian


LeonardoSandova
Specialist I

Which set of branches are you using (dora?)? Which image? If you modified any metadata (.bb or .conf files), please post it.

On the other hand, there are some known issues with X11, and these will be fixed by the beta Yocto Freescale release; but first we need to identify your setup and see whether your issue is the same as the ones already detected.

cristianmarussi
Contributor II

Hi Leonardo,

I've made a custom distro, built piece by piece using Yocto/OpenEmbedded tools and recipes, but NOT strictly using the latest recipes in the dora branch (which, as I understand it, is NOT supposed to be officially stable regarding X11). Instead I've used SOME of the application source versions from the latest LTIB release (3.0.35_4_1_0), which are supposed to be more stable, and wrapped them in proper recipes to build them with Yocto into our distro. For example, I've taken the latest Vivante driver 3.0.35 (EXA/DRI components) from LTIB, and the latest (3.0.35) LTIB gpu-viv-bin/imx-firmware/imx-lib, BUT NOT the X11 server, which is too outdated in LTIB (1.6.1) to compile against the Vivante driver provided by LTIB itself. (In any case, I've seen many people on this same forum using different versions of X11.)

So in the end the only reference I can give you now is that the core application versions I'm using are the ones I stated in my previous post (with ALL the stated variations of X11, drivers, binaries, etc.), but at this point you cannot easily reproduce the exact same distro as mine from the standard Yocto dora branch.

My failing test case, using the above-mentioned combinations of application versions, is simply to launch 'gtkperf -a' multiple times (twice, in fact) in a row with an X reset in between. Because we have no window manager, gtkperf is the one and only client, and in such a scenario X resets and cleans up after the last client terminates. This is where XSELINUX detects the leak, and where the subsequent invocation of gtkperf (after the X reset) causes X itself to die with a segfault inside the EXA driver.
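
The sequence above can be scripted. Here is a minimal sketch in C, assuming an X server is already running without -noreset, DISPLAY is set, and gtkperf is in PATH. run_repeatedly is a hypothetical helper name of my own. It only reports the exit status of the client; whether the server survived each reset still has to be checked separately (for example in Xorg.0.log).

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run `cmd` through the shell `runs` times in sequence.
 * Returns 0 if every run exited with status 0, otherwise the 1-based
 * index of the first run that failed. */
int run_repeatedly(const char *cmd, int runs)
{
    for (int i = 1; i <= runs; i++) {
        int status = system(cmd);

        /* system() returns a wait status; treat anything other than a
         * normal exit with code 0 as a failure. */
        if (status == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
            fprintf(stderr, "run %d of '%s' failed (status %d)\n",
                    i, cmd, status);
            return i;
        }
    }
    return 0;
}
```

Calling run_repeatedly("gtkperf -a", 2) corresponds to the two back-to-back runs described above; since each run is the only client, each exit should trigger one X reset.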


I understand this non-standard (non-dora) build process could make it hard for you to reproduce and triage the bug (unless you're able to trigger it on dora as well, simply by using gtkperf in the same manner). At the moment, the only further reference I can give you is the configure options of the core components such as X11, GTK+, pango, cairo, etc., so that you can possibly tell whether this bug is exposed by some particular combination of my configuration options. I understand this is far from optimal for your debugging purposes.


Thank You


Bye


Cristian



LaurenPost
NXP Employee

When we build all our binaries we build with the Yocto tool chain.  I am not sure if you can mix and match toolchains using linaro unless you are specifically using the swfp binaries we provided for graphics.

The dora branch has now been fully upstreamed with our 3.10.9-1.0.0_alpha release.  I think the only things not upstreamed are the wayland-weston and the image recipes.

The 3.10.9-1.0.0 alpha release did work with 3.0.35 if you do not want to use the 3.10.9 kernel.

cristianmarussi
Contributor II

Hello Lauren and thank you for your feedback.

I've already tried the latest X11 Vivante EXA/DRI driver, 3.10.9-1.0.0 alpha (using the proper options to compile it against an older kernel, as stated in the makefile itself), but I observed the same behaviour described above (X11 segfaults at the 2nd reset).

By the way, I'd also like to ask what the 'versioning requirements' (if any) of the Freescale-provided packages are across the distro in general. Besides the above driver, which can be compiled against older kernels, what about other packages like gpu-viv-bin-mx6q, imx-lib, imx-vpu-lib and firmware-imx? Should they be exactly the same version as the kernel (I saw there are 3.0.35, 3.5.7 and 3.10.9 releases of some of those packages), or should I use the latest available version regardless of the underlying kernel release?

Thank you

cristianmarussi
Contributor II

Some additional and more precise data about

-->> the X configure options::

configure --build=i686-linux --host=arm-fsl-linux --target=arm-fsl-linux --prefix=/usr --exec_prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/libexec --datadir=/usr/share --sysconfdir=/etc --sharedstatedir=/com --localstatedir=/var --libdir=/usr/lib --includedir=/usr/include --oldincludedir=/usr/include --infodir=/usr/share/info --mandir=/usr/share/man --disable-config-hal --enable-config-udev --enable-composite --enable-kdrive --enable-dri2 --enable-dri --enable-xv --enable-xvmc --enable-glx --enable-aiglx --enable-xfbdev --enable-xorg --enable-dga --enable-libdrm --with-default-font-path=/usr/share/fonts/X11/misc --disable-xf86vidmode --disable-xnest --disable-dmx --disable-xephyr --disable-xfake --disable-xselinux --disable-xinerama --disable-xvfb --disable-xquartz --disable-xwin --disable-ipv6 --disable-unit-tests --disable-xdmcp --disable-docs --disable-devel-docs --without-dtrace --without-xmlto --without-fop

-->> the used Linaro  toolchain and basic compilation options from X config.log:

configure:4488: arm-fsl-linux-gcc -march=armv7-a -mtune=cortex-a9 -mfpu=neon -mfloat-abi=softfp --version >&5

arm-fsl-linux-gcc (Freescale MAD -- Linaro 2011.07 -- Built at 2011/08/10 09:20) 4.6.2 20110630 (prerelease)

Copyright (C) 2011 Free Software Foundation, Inc.

This is free software; see the source for copying conditions.  There is NO

warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

--> X invocation:

X -dpms -fn "Fixed" -nocursor -retro

cristianmarussi
Contributor II

As an additional note, here is something that is NOT the solution BUT (luckily) solves my specific use case:

launching X with the -noreset option prevents X11 from re-initializing when the last client terminates... and life goes on.

This is clearly only a workaround that avoids going through the X and EXA cleanup procedures, but in my case it works perfectly

(barring additional trouble caused by the underlying leaks reported by XSELINUX).

So, to summarize the issue:

-  with an accelerated X11 server everything runs smoothly if it is launched with the -noreset option

-  if not, after the last client exits (or crashes; we won't have any WM as a last permanent client), something leaks, as signalled by XSELINUX

  --->>  at the next iteration, when the last client you launched exits again and another X reset is triggered, X segfaults inside the Freescale EXA driver vivante_drv.so

The related backtrace (seen after two iterations of 'gtkperf -a' with a 'resetting' X) is:

----------------------------------------
(gdb) bt
#0  0x2adcf2fc in memset () from /lib/libc.so.6
#1  0x3317ffcc in CleanSurfaceBySW (galInfo=<value optimized out>,
    pPixmap=<value optimized out>, pPix=0x227130)
    at vivante_gal/vivante_gal_surface.c:696
#2  0x331832a0 in VivModifyPixmapHeader (pPixmap=0x2326a0, width=16,
    height=<value optimized out>, depth=1, bitsPerPixel=64, devKind=64,
    pPixData=0x0) at vivante_exa/vivante_pixmap.c:243
#3  0x331d3514 in exaModifyPixmapHeader_driver (pPixmap=0x2326a0, width=16,
    height=16, depth=2303648, bitsPerPixel=0, devKind=64, pPixData=0x0)
    at exa_driver.c:160
#4  0x331d3260 in exaCreatePixmap_driver (pScreen=0x3568b8, w=16, h=16,
    depth=<value optimized out>, usage_hint=1) at exa_driver.c:110
#5  0x00042cf8 in ServerBitsFromGlyph (pfont=0x2324f0,
    ch=<value optimized out>, cm=0x7efffba0, ppbits=0x5190c) at glyphcurs.c:98
#6  0x00037118 in AllocGlyphCursor (source=228, sourceChar=0, mask=228,
    maskChar=1, foreRed=0, foreGreen=0, foreBlue=0, backRed=4294967295,
    backGreen=4294967295, backBlue=4294967295, ppCurs=0x7efffc08,
    client=0x19a8a0, cid=0) at cursor.c:377
#7  0x0003753c in CreateRootCursor (unused1=<value optimized out>,
    unused2=<value optimized out>) at cursor.c:507
#8  0x00021d84 in main (argc=6, argv=0x7efffde4, envp=<value optimized out>)
    at main.c:235
(gdb)
-----------------------------------------------

I mention again that I used gtkperf because I in fact obtained DIFFERENT backtraces with different programs, BUT the final segfaulting steps were always inside the EXA driver, albeit at different points in the code.

Bye

Cristian

LeonardoSandova
Specialist I

Hi Cristian. Your issue has been reported internally.

cristianmarussi
Contributor II

Thank you

OtavioSalvador
Senior Contributor II

Hello,

This is clearly a bug in the Vivante driver, and I think it should be checked by them and/or the Freescale GPU guys. LeonardoSandovalGonzalez, please try to contact the right people about this issue; it'd be better to fix the root cause of the problem than to work around it.

LeonardoSandova
Specialist I

Cristian, can you please send this issue to the meta-freescale list? That list is Yocto-specific, so you will get more specialized eyes on this area.
