
OpenCL strange results (i.MX6QP)

Question asked by Hiroshi Uchino on Nov 17, 2016

Hello

 

I'm testing OpenCL on i.MX6QP, and I get some strange results.
Board : MCIMX6QP-SDB
OS : Linux version 4.1.15-1.2.0+g77f6154

 

OpenCL Kernel:
--------------------------------
float dot24(float8 d0, float8 d1, float8 d2, float8 k0, float8 k1, float8 k2)
{
    return (dot(d0.lo, k0.lo) + dot(d0.hi, k0.hi) +
            dot(d1.lo, k1.lo) + dot(d1.hi, k1.hi) +
            dot(d2.lo, k2.lo) + dot(d2.hi, k2.hi));
}

__kernel void test(__global float* buf)
{
    if (get_global_id(0) == 0) {
        float8 d0 = vload8(0, buf);
        float8 d1 = vload8(1, buf);
        float8 d2 = vload8(2, buf);
        float8 d3 = vload8(3, buf);
        float8 d4 = vload8(4, buf);
        float8 k0 = vload8(5, buf);
        float8 k1 = vload8(6, buf);
        float8 k2 = vload8(7, buf);

        float8 tmp0 = (float8)(d0.s0, d0.s1, d0.s2, d0.s3, d0.s4, d1.s0, d1.s1, d1.s2);
        float8 tmp1 = (float8)(d1.s3, d1.s4, d2.s0, d2.s1, d2.s2, d2.s3, d2.s4, d3.s0);
        float8 tmp2 = (float8)(d3.s1, d3.s2, d3.s3, d3.s4, d4.s0, d4.s1, d4.s2, d4.s3);

        *(buf + 8*8)     = dot24(tmp0, tmp1, tmp2, k0, k1, k2);
        *(buf + 8*8 + 1) = dot24((float8)(d0.s0, d0.s1, d0.s2, d0.s3, d0.s4, d1.s0, d1.s1, d1.s2),
                                 (float8)(d1.s3, d1.s4, d2.s0, d2.s1, d2.s2, d2.s3, d2.s4, d3.s0),
                                 (float8)(d3.s1, d3.s2, d3.s3, d3.s4, d4.s0, d4.s1, d4.s2, d4.s3),
                                 k0, k1, k2);

        vstore8((float8)(1.0f), 9, buf);
    }
}
--------------------------------

 

After the kernel test() has run, buf contains the following (eight floats per row):
--------------------------------
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 // <- d0
8.0 9.0 10.0 11.0 12.0 13.0 14.0 15.0 // <- d1
16.0 17.0 18.0 19.0 20.0 21.0 22.0 23.0 // <- d2
24.0 25.0 26.0 27.0 28.0 29.0 30.0 31.0 // <- d3
32.0 33.0 34.0 35.0 36.0 37.0 38.0 39.0 // <- d4
40.0 41.0 42.0 43.0 44.0 45.0 46.0 47.0 // <- k0
48.0 49.0 50.0 51.0 52.0 53.0 54.0 55.0 // <- k1
56.0 57.0 58.0 59.0 60.0 61.0 62.0 63.0 // <- k2
23146.0 37278.0 66.0 67.0 68.0 69.0 70.0 71.0 // <- result
1.0 73.0 74.0 75.0 76.0 77.0 78.0 79.0 // <- (float8)(1.0f)
--------------------------------

 

Two things look wrong to me:
1. The two dot24() calls receive identical arguments, so their results should be the same.
    The correct value is 23146, so the result of the second dot24() call (37278) is wrong.
2. vstore8((float8)(1.0f), 9, buf); should write eight 1.0f values, but only the first value is written.
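
The expected values in both points can be cross-checked on the host. Below is a minimal plain C sketch of what work-item 0 should do. It assumes buf was initialised to 0.0, 1.0, ..., 79.0 before the kernel ran (inferred from the untouched values in the dump, since the host code is not shown). The names BUF_LEN, dot24_ref, fill_input and test_ref are made up for this sketch and are not part of the original program.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 80  /* ten rows of eight floats, as in the dump above */

/* Host-side equivalent of dot24(): sum of 24 element-wise products. */
static float dot24_ref(const float d[24], const float k[24])
{
    float sum = 0.0f;
    for (size_t i = 0; i < 24; ++i)
        sum += d[i] * k[i];
    return sum;
}

/* ASSUMPTION: the input buffer holds its own index, 0.0 .. 79.0. */
static void fill_input(float buf[BUF_LEN])
{
    for (size_t i = 0; i < BUF_LEN; ++i)
        buf[i] = (float)i;
}

/* Emulates the stores that work-item 0 of test() is expected to perform. */
static void test_ref(float buf[BUF_LEN])
{
    assert(buf != NULL);

    /* tmp0..tmp2 are the first five elements of rows d0..d4 in order,
       truncated to 24 values (d4.s4 is not used). */
    float d[24];
    size_t n = 0;
    for (size_t row = 0; row < 5; ++row)
        for (size_t col = 0; col < 5 && n < 24; ++col)
            d[n++] = buf[8 * row + col];

    /* k0, k1, k2 are rows 5..7, i.e. buf[40..63]. */
    const float *k = buf + 8 * 5;
    float r = dot24_ref(d, k);

    buf[8 * 8]     = r;          /* first dot24() call  */
    buf[8 * 8 + 1] = r;          /* second call, same arguments */

    /* vstore8(v, 9, buf) writes buf[72..79]. */
    for (size_t i = 0; i < 8; ++i)
        buf[8 * 9 + i] = 1.0f;
}
```

Calling fill_input(buf) and then test_ref(buf) leaves 23146.0 in both buf[64] and buf[65], and 1.0 in all of buf[72] to buf[79]. All products and partial sums are integers below 2^24, so the float result is exact regardless of summation order.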

 

[Question]
Are these bugs in the GPU driver, or is there something wrong with my code or my OpenCL environment?

 

Best regards,
Hiroshi
