I cannot find anything about this online, and there is no adequate documentation for either the tools or the hardware, so I decided to ask in this forum, even though I am not sure this is the right place, since it is more of a Vivante GPU issue than an NXP one.
I am working on an OpenCL 1.2 project using the Vivante GPU on an i.MX8 board.
My project consists of a few hand-optimized OpenCL kernels, all precompiled to binaries on my development PC with the OpenCL compiler vcCompiler from the VTK, which you can find in the recent software section for the i.MX8QXP board. Fortunately their runtime performance is very good, which frees the CPU to do other work.
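For context, here is a minimal sketch of how such a precompiled binary could be read into memory before it is handed to the OpenCL runtime. The helper name read_kernel_binary and any file name passed to it are hypothetical and only illustrate the idea; this is not necessarily how the project does it. The standard OpenCL calls that would consume the buffer are mentioned only in comments so that the sketch compiles without OpenCL headers.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file (for example a kernel binary
 * produced offline) into a malloc'd buffer. Returns NULL on any error.
 * On success *size_out holds the number of bytes read and the caller
 * must free() the returned buffer. */
unsigned char *read_kernel_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    /* Determine the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }

    if (fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);

    *size_out = (size_t)len;
    return buf;
    /* The buffer and its size would then typically be passed to
     * clCreateProgramWithBinary, followed by clBuildProgram and
     * clCreateKernel (standard OpenCL 1.2 calls, not shown here). */
}
```

Reading the file this way is normally fast, so it is unlikely to account for the startup delay by itself; it is shown only to make the loading path explicit.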
Unfortunately, at program startup I need to call clCreateKernel, and even though the cl_program objects are pre-built as binaries, all 7 clCreateKernel calls take about 30 to 40 seconds. This is not only very annoying for debugging, it eventually renders the whole software unusable. It needs to be working 1-2 seconds after the device starts, otherwise the use case is simply not there.
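To pin down which calls the time is spent in, each startup stage can be timed separately. Below is a minimal sketch, assuming a POSIX clock_gettime is available on the target (true on Linux; on INTEGRITY this is an assumption). The names now_seconds, time_stage and example_stage are hypothetical helpers, and example_stage is only a stand-in for a real stage such as one clCreateKernel call.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: runs one startup stage, prints how long it took
 * and returns the elapsed time in seconds. */
double time_stage(const char *label, void (*stage)(void *), void *arg)
{
    double t0 = now_seconds();
    stage(arg);
    double elapsed = now_seconds() - t0;
    printf("%-28s %.3f s\n", label, elapsed);
    return elapsed;
}

/* Stand-in stage so the sketch compiles without OpenCL headers. In the
 * real program the body would be, for example, the clCreateKernel call
 * for one kernel. Here it only counts how often it was invoked. */
void example_stage(void *arg)
{
    int *calls = arg;
    ++*calls;
}
```

Wrapping the program creation, the build step and each of the 7 clCreateKernel calls in time_stage would show whether the delay is spread evenly across kernels or concentrated in one call.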
The issue looks the same on both INTEGRITY and Linux.
So my question: Is there a way to store or precompile OpenCL 1.2 kernels to speed up clCreateKernel, maybe even some native galcore functions that can be called to store the cl_kernel objects to disk after clCreateKernel has been called?
Thanks in advance for any help!