
[OpenGL] Texel fetch takes too much time on imx6 GPU (GC2000)

Question asked by Dilip Kumar on Jul 29, 2015
Latest reply on Aug 6, 2015 by Dilip Kumar
Branched to a new discussion

I have a question regarding OpenGL ES 2.0 on the i.MX6 Quad.


I have an OpenGL app that does multiple texel fetches (up to 12) to calculate each pixel in the fragment shader.


The fragment shader code is as follows:

varying vec4 gh_TexCoord;   // texture coordinate passed in from the vertex shader
uniform sampler2D source;

void main(void) {
    vec4 tmp1;
    vec4 tmp2;

    // Fetches 1-4: one texture2D call per channel, red and blue swapped
    tmp1.b = float(texture2D(source, vec2(gh_TexCoord)).r);
    tmp1.g = float(texture2D(source, vec2(gh_TexCoord)).g);
    tmp1.r = float(texture2D(source, vec2(gh_TexCoord)).b);
    tmp1.a = float(texture2D(source, vec2(gh_TexCoord)).a);

    // Fetches 5-8: same texel again, added to the first result
    tmp2.b = tmp1.b + float(texture2D(source, vec2(gh_TexCoord)).r);
    tmp2.g = tmp1.g + float(texture2D(source, vec2(gh_TexCoord)).g);
    tmp2.r = tmp1.r + float(texture2D(source, vec2(gh_TexCoord)).b);
    tmp2.a = tmp1.a + float(texture2D(source, vec2(gh_TexCoord)).a);

    // Fetches 9-12: same texel again, added to the running sum
    gl_FragColor.b = tmp2.b + float(texture2D(source, vec2(gh_TexCoord)).r);
    gl_FragColor.g = tmp2.g + float(texture2D(source, vec2(gh_TexCoord)).g);
    gl_FragColor.r = tmp2.r + float(texture2D(source, vec2(gh_TexCoord)).b);
    gl_FragColor.a = tmp2.a + float(texture2D(source, vec2(gh_TexCoord)).a);
}


As you can see, I do multiple texture fetches and simple additions to calculate the final output color. This is just an example and does not do anything useful. gh_TexCoord is a varying that is used in the vertex shader to calculate the position of the pixel. The vertex calculation is also straightforward and does not involve any complex calculations. I have set up the vertex data as 4 points so that I can call glDrawArrays to draw a triangle fan that forms a rectangular plane. The size of the output buffer is set to 1920x1080.
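For illustration, here is a minimal C sketch of what such a 4-point triangle fan could look like. The Vertex layout, the coordinate values, and the fan_area helper are assumptions made for this example and are not taken from the actual app; the helper only checks that the fan covers the whole clip-space rectangle.

```c
#include <stddef.h>

/* Hypothetical vertex layout: clip-space position (x, y) and texture coordinate (u, v). */
typedef struct { float x, y, u, v; } Vertex;

/* Four corners in counter-clockwise order, as would be passed to
 * glDrawArrays(GL_TRIANGLE_FAN, 0, 4). Values are assumed, not from the original app. */
static const Vertex quad_fan[4] = {
    { -1.0f, -1.0f, 0.0f, 0.0f },   /* bottom-left  */
    {  1.0f, -1.0f, 1.0f, 0.0f },   /* bottom-right */
    {  1.0f,  1.0f, 1.0f, 1.0f },   /* top-right    */
    { -1.0f,  1.0f, 0.0f, 1.0f },   /* top-left     */
};

/* Sum the areas of the triangles (v0, vi, vi+1) that a triangle fan produces. */
static double fan_area(const Vertex *v, size_t count) {
    double area = 0.0;
    for (size_t i = 1; i + 1 < count; ++i) {
        double ax = v[i].x - v[0].x,     ay = v[i].y - v[0].y;
        double bx = v[i + 1].x - v[0].x, by = v[i + 1].y - v[0].y;
        double cross = ax * by - ay * bx;
        area += 0.5 * (cross < 0.0 ? -cross : cross);
    }
    return area;
}
```

With these assumed coordinates the fan is two triangles that together cover the full 2x2 clip-space square, so every one of the 1920x1080 output pixels runs the fragment shader once.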


My issue is that the texture2D calls take a lot of processing time on the i.MX6 GPU. If I comment out a few of the texture2D calls above, the draw time drops considerably. In fact, each texture2D call costs about 3 ms of processing time per frame, which seems like too much. What could be the reason for this? Is it because of the small cache in the Vivante GPU? I'm fairly new to OpenGL, so any suggestions are welcome. If you need more info for debugging, I can provide it as well.
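To put the 3 ms figure in perspective, here is a rough back-of-the-envelope calculation as a small C sketch. The helper names are made up for this example, and the total assumes the cost scales linearly with the number of fetches, which is an assumption and not something that has been verified.

```c
#include <stdint.h>

/* Pixels shaded per frame for a full-screen quad at the given output size. */
static uint64_t pixels_per_frame(uint32_t width, uint32_t height) {
    return (uint64_t)width * height;
}

/* Texel fetches per second implied by one fetch per pixel costing cost_ms per frame. */
static double implied_fetches_per_second(uint32_t width, uint32_t height, double cost_ms) {
    return (double)pixels_per_frame(width, height) / (cost_ms / 1000.0);
}

/* Estimated fetch time per frame in ms, assuming the cost scales linearly. */
static double estimated_fetch_time_ms(unsigned fetches, double cost_ms) {
    return fetches * cost_ms;
}
```

At 1920x1080 there are 2,073,600 pixels per frame, so 3 ms per fetch works out to roughly 691 million texel fetches per second, and 12 fetches would add up to about 36 ms per frame under the linear assumption.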


Additional info :


No. of texture2D calls per frame | glDrawArrays + glFinish time (ms) for full HD frame
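A glDrawArrays + glFinish time like the one in the header above could be measured roughly as in the following C sketch. This is only an assumed approach, not necessarily how the numbers were actually taken: time_draw_ms and fake_draw are hypothetical names, and fake_draw is an empty stand-in for the real draw call so that the sketch compiles without a GL context.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Callback that issues the draw and waits for the GPU to finish. */
typedef void (*draw_fn)(void);

/* Hypothetical stand-in; the real callback would be something like:
 *     glDrawArrays(GL_TRIANGLE_FAN, 0, 4);
 *     glFinish();
 */
static void fake_draw(void) {
}

/* Wall-clock time in milliseconds spent inside one call to draw(). */
static double time_draw_ms(draw_fn draw) {
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    draw();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec) * 1000.0
         + (double)(end.tv_nsec - start.tv_nsec) / 1.0e6;
}
```

Calling time_draw_ms with a callback that wraps the real glDrawArrays and glFinish calls gives the per-frame time in ms. glFinish is needed inside the timed region because glDrawArrays on its own may return before the GPU has finished rendering.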


Board : i.MX6 SabreLite Quad from Boundary Devices.

kernel : 3.0.35

Galcore version :