i.MX6: clFinish() not working with out-of-order command queues

bthebaudeau
Contributor I

Hi,

In the Vivante GPU package from i.MX6 BSP 3.0.35-4.1.0, some queued commands may still be pending after clFinish() returns if the command queue is out-of-order. Calls to clGetEventInfo() on the kernels' completion events confirm that these commands are still pending.

Steps to reproduce:

- Create a command queue with clCreateCommandQueue(), setting the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property.

- Enqueue a few kernels (large enough to take some time) into this command queue, requesting a completion event.

- Call clFinish() for this command queue.

Expected result:

clGetEventInfo() should report either CL_COMPLETE or an error code for all queued kernels.

Actual result:

clGetEventInfo() reports CL_QUEUED, CL_SUBMITTED or CL_RUNNING for some of the queued kernels.
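For reference, the check behind these results can be sketched roughly as follows. This is only an illustration, not the exact test code: the helper names status_name() and count_pending() are hypothetical, and the status constants are repeated from CL/cl.h only so that the sketch compiles without the OpenCL headers. In a real program each status would come from clGetEventInfo(evt, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(cl_int), &status, NULL), where a negative value means the command terminated with an error.

```c
#include <stddef.h>

/* Execution-status values as defined in CL/cl.h (OpenCL 1.1).
   Repeated here only so the sketch builds without the OpenCL headers. */
#ifndef CL_COMPLETE
#define CL_COMPLETE  0x0
#define CL_RUNNING   0x1
#define CL_SUBMITTED 0x2
#define CL_QUEUED    0x3
#endif

/* Hypothetical helper: printable name for an execution status. */
const char *status_name(int status)
{
    if (status < 0)
        return "error";            /* negative values are error codes */
    switch (status) {
    case CL_COMPLETE:  return "CL_COMPLETE";
    case CL_RUNNING:   return "CL_RUNNING";
    case CL_SUBMITTED: return "CL_SUBMITTED";
    case CL_QUEUED:    return "CL_QUEUED";
    default:           return "unknown";
    }
}

/* Hypothetical helper: count the commands that are neither complete nor
   failed. After a conforming clFinish() this should be zero. */
size_t count_pending(const int *statuses, size_t n)
{
    size_t pending = 0;
    for (size_t i = 0; i < n; ++i)
        if (statuses[i] > CL_COMPLETE)
            ++pending;
    return pending;
}
```

With statuses collected right after clFinish(), a non-zero count_pending() result corresponds to the behaviour described above.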

This is a bug since, according to https://www.khronos.org/registry/cl/specs/opencl-1.1.pdf § "5.13 Flush and Finish":

The function

cl_int clFinish (cl_command_queue command_queue)

blocks until all previously queued OpenCL commands in command_queue are issued to the associated device and have completed. clFinish does not return until all queued commands in command_queue have been processed and completed.

The workaround that I've found is:

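cl_event end_evt;

/* The marker completes only after all previously enqueued commands have completed. */
clEnqueueMarker(command_queue, &end_evt);

/* Submit all queued commands to the device. */
clFlush(command_queue);

/* Block on the marker event instead of relying on clFinish(). */
clWaitForEvents(1, &end_evt);

clReleaseEvent(end_evt);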

I have not tested this issue with the Yocto release, but I have found nothing in the release notes indicating that such a bug has been fixed. Does anyone know?


xchangfeng
Contributor I

I used clFinish() at first, but found that the output data was incomplete and the measured time was suspiciously short.

Then I switched to clWaitForEvents(): the data is now complete, though the measured time is almost double. This is the right solution.
