Hi,
In the Vivante GPU package from i.MX6 BSP 3.0.35-4.1.0, some queued commands may still be pending after clFinish() returns if the command queue is an out-of-order queue. The pending state of these commands is confirmed by calls to clGetEventInfo().
Steps to reproduce:
- Create a command queue with clCreateCommandQueue(), setting the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property.
- Enqueue a few kernels (large enough to take some time) into this command queue, requesting a completion event for each.
- Call clFinish() for this command queue.
Expected result:
clGetEventInfo(), queried with CL_EVENT_COMMAND_EXECUTION_STATUS, should report either CL_COMPLETE or an error code for all queued kernels.
Actual result:
clGetEventInfo() reports CL_QUEUED, CL_SUBMITTED or CL_RUNNING for some of the queued kernels.
This is a bug since, according to https://www.khronos.org/registry/cl/specs/opencl-1.1.pdf § "5.13 Flush and Finish":
The function
cl_int clFinish (cl_command_queue command_queue)
blocks until all previously queued OpenCL commands in command_queue are issued to the associated device and have completed. clFinish does not return until all queued commands in command_queue have been processed and completed.
The workaround that I've found is:
cl_event end_evt;
clEnqueueMarker(command_queue, &end_evt);
clFlush(command_queue);
clWaitForEvents(1, &end_evt);
clReleaseEvent(end_evt);
I have not tested this issue with the Yocto release, but I have found nothing in the release notes indicating that such a bug has been fixed. Does anyone know?
I used clFinish() at first, but found that the output data was incomplete and the measured time was very short.
Then I changed to clWaitForEvents(): the data is now complete, though the measured time is almost double. But this is the right solution.