This guide is about how to use EVIS to create user nodes and kernels in OpenVX to implement image processing on NPU(i.MX8MP)/GPU(i.MX8QM). Take gaussian filter as an example. It is tested on i.MX8QM and i.MX8MP.
User Node Creation from User Kernel
1. Define a user node
Register a user kernel by its ID or name
For example,
#define VX_KERNEL_NAME_GAUSSIAN "com.nxp.extension.gaussian"
#define VX_KERNEL_ENUM_GAUSSIAN 100
Get the kernel reference by the ID or name
For example,
vx_kernel kernel = vxGetKernelByName(context, VX_KERNEL_NAME_GAUSSIAN);
vx_kernel kernel = vxGetKernelByEnum(context, VX_KERNEL_ENUM_GAUSSIAN );
Create a user node
vx_node node = vxCreateGenericNode(graph, kernel);
Set input/output node parameters
For example,
vx_status status = vxSetParameterByIndex(node, index++, (vx_reference)in_image);
status |= vxSetParameterByIndex(node, index++, (vx_reference)out_image);
2. Create InputValidator/OutputValidator functions for the node
The validators are only used for graph verification.
For example,
static vx_status VX_CALLBACK vxGaussianInputValidator(vx_node node, vx_uint32 index)
static vx_status VX_CALLBACK vxGaussianOutputValidator(vx_node node, vx_uint32 index, vx_meta_format metaObj)
ToDo:
a. InputValidator:
vx_parameter paramObj = NULL;
vx_image imgObj = NULL;paramObj=vxGetParameterByIndex(node, index);
vxQueryParameter(paramObj, VX_PARAMETER_REF, &imgObj, sizeof(vx_image));
vxQueryImage(imgObj, VX_IMAGE_FORMAT, &imgFmt, sizeof(imgFmt));
if (VX_DF_IMAGE_U8==imgFmt) status = VX_SUCCESS;
else status = VX_ERROR_INVALID_VALUE;
b. OutputValidator
status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_FORMAT, &imgFmt, sizeof(imgFmt));
status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_WIDTH, &width, sizeof(width));
status |= vxSetMetaFormatAttribute(metaObj, VX_IMAGE_HEIGHT, &height, sizeof(height));
3. Create Initializer function for the node.
The initializer is used to specify workdim, global work size and local work size for the user kernel. These parameters are similiar to that in OpenCL.
For example,
/* workdim, globel offset, globel scale, local size, globel size */
vx_kernel_execution_parameters_t shaderParam = {2, {0, 0, 0}, {0, 0, 0}, {0, 0, 0}, {0, 0, 0}};vx_status VX_CALLBACK vxGaussianInitializer(vx_node nodObj, const vx_reference *paramObj, vx_uint32 paraNum)
Set attribute to the node
vxSetNodeAttribute(nodObj, VX_NODE_ATTRIBUTE_KERNEL_EXECUTION_PARAMETERS, &shaderParam, sizeof(vx_kernel_execution_parameters_t));
Note: The links below are guides about OpenCL on GPU, which are helpful to understand OpenVX implemented on GPU/NPU.
4. Create Deinitializer function for the node (Optional)
It is used to de-allocate memory allocated at initializer.
User Kernel on NPU/GPU Creation
1. Create description of a user kernel
For example,
vx_kernel_description_t vxGaussianKernelVXCInfo =
{
VX_KERNEL_ENUM_GAUSSIAN,
VX_KERNEL_NAME_GAUSSIAN,
nullptr,
vxGaussianKernelParam,
(sizeof(vxGaussianKernelParam)/sizeof(vxGaussianKernelParam[0])),
vxGaussianValidator,
nullptr,
nullptr,
vxGaussianInitializer,
nullptr
};
2. Register the new kernel
For example,
static vx_kernel_description_t* kernels[] =
{
&vxGaussianKernelVXCInfo,
};
3. Write kernel source implemented on NPU/GPU
For example,
char vxcKernelSource[] =
{
"#include \ \n\
\n\
\n\
__kernel void gaussian\n\
( \n\
__read_only image2d_t in_image, \n\
__write_only image2d_t out_image \n\
) \n\
{ \n\
int2 coord = (int2)(get_global_id(0), get_global_id(1)); \n\
int2 coord_out = coord; \n\
vxc_uchar16 lineA, lineB, lineC, out;\n\
int2 coord_in1 = coord + (int2)(-1, -1);\n\
VXC_OP4(img_load, lineA, in_image, coord_in1, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\
int2 coord_in2 = coord + (int2)(-1, 0);\n\
VXC_OP4(img_load, lineB, in_image, coord_in2, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\
int2 coord_in3 = coord + (int2)(-1, 1);\n\
VXC_OP4(img_load, lineC, in_image, coord_in3, 0, VXC_MODIFIER(0, 15, 0, VXC_RM_TowardZero, 0));\n\
int info = VXC_MODIFIER_FILTER(0, 13, 0, VXC_FM_Guassian, 0);\n\
VXC_OP4(filter, out, lineA, lineB, lineC, info); ;\n\
VXC_OP4_NoDest(img_store, out_image, coord_out, out, VXC_MODIFIER(0, 13, 0, VXC_RM_TowardZero, 0)); \n\
}\n\
"
};
Note: the source is written by EVIS instructions with less latency. But the EVIS instructions are limited. These fucntions defination can be found in "cl_viv_vx_ext.h" located at "/usr/include/CL/cl_viv_vx_ext.h".
Read back the processed data by GPU/NPU to check if the operations are correct.
For example,
status = vxCopyImagePatch(vx_out_image, &rect, 0, &addressing, data2, VX_READ_ONLY, VX_MEMORY_TYPE_HOST);
4. Build the NPU/GPU source code runtime
For example,
programObj = vxCreateProgramWithSource(ContextVX, 1, programSrc, &programLen);
vxBuildProgram(programObj, "-cl-viv-vx-extension");
5. Add kernel to the program
For example,
...
kernelObj = vxAddKernelInProgram(programObj,
kernels[i]->name,
kernels[i]->enumeration,
kernels[i]->numParams,
kernels[i]->validate,
kernels[i]->initialize,
kernels[i]->deinitialize
);...
for(vx_uint32 j=0; j < kernels[i]->numParams; j++)
{
status = vxAddParameterToKernel(kernelObj,
j,
kernels[i]->parameters[j].direction,
kernels[i]->parameters[j].data_type,
kernels[i]->parameters[j].state
);
6. Finalize the kernel creation
For example,
status = vxFinalizeKernel(kernelObj);
Exercise
The example is attached. You can build and test it on i.MX8QM or i.MX8MP.
Results on i.MX8QM:
References:
Khronosdotorg/resources.md at master · KhronosGroup/Khronosdotorg · GitHub
Further Reading:
OpenVX Vision Image Extension API Introduction - Basic API
OpenVX Vision Image Extension API Introduction - DP Dot Products