This guide is about how to use EVIS to create user nodes and kernels in OpenVX to implement image processing on NPU(i.MX8MP)/GPU(i.MX8QM). Take gaussian filter as an example. It is tested on i.MX8QM and i.MX8MP. User Node Creation from User Kernel 1. Define a user node Register a user kernel by its ID or name For example, #define VX_KERNEL_NAME_GAUSSIAN "com.nxp.extension.gaussian" #define VX_KERNEL_ENUM_GAUSSIAN 100 Get the kernel reference by the ID or name For example, vx_kernel kernel = vxGetKernelByName(context, VX_KERNEL_NAME_GAUSSIAN); vx_kernel kernel = vxGetKernelByEnum(context, VX_KERNEL_ENUM_GAUSSIAN ); Create a user node vx_node node = vxCreateGenericNode(graph, kernel); Set input/output node parameters For example, vx_status status = vxSetParameterByIndex(node, index++, (vx_reference)in_image); status |= vxSetParameterByIndex(node, index++, (vx_reference)out_image); 2. Create InputValidator/OutputValidator functions for the node The validators are only used for graph verification. For example, static vx_status VX_CALLBACK vxGaussianInputValidator(vx_node node, vx_uint32 index) static vx_status VX_CALLBACK vxGaussianOutputValidator(vx_node node, vx_uint32 index, vx_meta_format metaObj) ToDo: a. InputValidator: Get the reference to the parameter object vx_parameter paramObj = NULL; vx_image imgObj = NULL; paramObj=vxGetParameterByIndex(node, index); vxQueryParameter(paramObj, VX_PARAMETER_REF, &imgObj, sizeof(vx_image)); Check meta-data restriction vxQueryImage(imgObj, VX_IMAGE_FORMAT, &imgFmt, sizeof(imgFmt)); Check consistency with other parameters if (VX_DF_IMAGE_U8==imgFmt) status = VX_SUCCESS; else status = VX_ERROR_INVALID_VALUE; b. OutputValidator Set the meta_format object with expected meta-data for the output status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_FORMAT, &imgFmt, sizeof(imgFmt)); status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_WIDTH, &width, sizeof(width)); status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_HEIGHT, &height, sizeof(height)); 3. Create Initializer function for the node. The initializer is used to specify workdim, global work size and local work size for the user kernel. These parameters are similiar to that in OpenCL. For example, /* workdim, globel offset, globel scale, local size, globel size */ vx_kernel_execution_parameters_t shaderParam = {2, {0, 0, 0}, {0, 0, 0}, {0, 0, 0}, {0, 0, 0}}; vx_status VX_CALLBACK vxGaussianInitializer(vx_node nodObj, const vx_reference *paramObj, vx_uint32 paraNum) Set attribute to the node vxSetNodeAttribute(nodObj, VX_NODE_ATTRIBUTE_KERNEL_EXECUTION_PARAMETERS, &shaderParam, sizeof(vx_kernel_execution_parameters_t)); Note: The links below are guides about OpenCL on GPU, which are helpful to understand OpenVX implemented on GPU/NPU. OpenCL Work Item Ids: Global/Group/Local OpenCL Programming Guide OpenCL Resources Introduction to OpenCL 4. Create Deinitializer function for the node (Optional) It is used to de-allocate memory allocated at initializer. User Kernel on NPU/GPU Creation 1. Create description of a user kernel For example, vx_kernel_description_t vxGaussianKernelVXCInfo = { VX_KERNEL_ENUM_GAUSSIAN, VX_KERNEL_NAME_GAUSSIAN, nullptr, vxGaussianKernelParam, (sizeof(vxGaussianKernelParam)/sizeof(vxGaussianKernelParam[0])), vxGaussianValidator, nullptr, nullptr, vxGaussianInitializer, nullptr }; 2. Register the new kernel For example, static vx_kernel_description_t* kernels[] = { &vxGaussianKernelVXCInfo, }; 3. Write kernel source implemented on NPU/GPU For example, char vxcKernelSource[] = { "#include \ \n\ \n\ \n\ __kernel void gaussian\n\ ( \n\ __read_only image2d_t in_image, \n\ __write_only image2d_t out_image \n\ ) \n\ { \n\ int2 coord = (int2)(get_global_id(0), get_global_id(1)); \n\ int2 coord_out = coord; \n\ vxc_uchar16 lineA, lineB, lineC, out;\n\ int2 coord_in1 = coord + (int2)(-1, -1);\n\ VXC_OP4(img_load, lineA, in_image, coord_in1, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\ int2 coord_in2 = coord + (int2)(-1, 0);\n\ VXC_OP4(img_load, lineB, in_image, coord_in2, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\ int2 coord_in3 = coord + (int2)(-1, 1);\n\ VXC_OP4(img_load, lineC, in_image, coord_in3, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\ int info = VXC_MODIFIER_FILTER(0, 13, 0, VXC_FM_Guassian, 0);\n\ VXC_OP4(filter, out, lineA, lineB, lineC, info); ;\n\ VXC_OP4_NoDest(img_store, out_image, coord_out, out, VXC_MODIFIER(0, 13, 0, VXC_RM_TowardZero, 0)); \n\ }\n\ " }; Note: the source is written by EVIS instructions with less latency. But the EVIS instructions are limited. These fucntions defination can be found in "cl_viv_vx_ext.h" located at "/usr/include/CL/cl_viv_vx_ext.h". Read back the processed data by GPU/NPU to check if the operations are correct. For example, status = vxCopyImagePatch(vx_out_image, &rect, 0, &addressing, data2, VX_READ_ONLY, VX_MEMORY_TYPE_HOST); 4. Build the NPU/GPU source code runtime For example, programObj = vxCreateProgramWithSource(ContextVX, 1, programSrc, &programLen); vxBuildProgram(programObj, "-cl-viv-vx-extension"); 5. Add kernel to the program For example, ... kernelObj = vxAddKernelInProgram(programObj, kernels[i]->name, kernels[i]->enumeration, kernels[i]->numParams, kernels[i]->validate, kernels[i]->initialize, kernels[i]->deinitialize ); ... for(vx_uint32 j=0; j < kernels[i]->numParams; j++) { status = vxAddParameterToKernel(kernelObj, j, kernels[i]->parameters[j].direction, kernels[i]->parameters[j].data_type, kernels[i]->parameters[j].state ); 6. Finalize the kernel creation For example, status = vxFinalizeKernel(kernelObj); Exercise The example is attached. You can build and test it on i.MX8QM or i.MX8MP. Results on i.MX8QM: References: Khronosdotorg/resources.md at master · KhronosGroup/Khronosdotorg · GitHub Further Reading: OpenVX Vision Image Extension API Introduction - Basic API OpenVX Vision Image Extension API Introduction - DP Dot Products
查看全文