Regarding OpenCL support on i.MX8 family processors


3,198 views
turker
Contributor IV

Hello all,

I was wondering about the current state of OpenCL support on i.MX8 processors, specifically for machine learning applications using the NXP eIQ framework.

According to "NXP eIQ Machine Learning Software Development for i.MX Application Processors", "OpenCL is currently not supported in the L4.14.98_2.0.0 and L4.14.78_1.0.0 Yocto configurations" for OpenCV. Has this changed in newer releases? According to the example "eIQ Sample Apps - Object Recognition using OpenCV DNN", OpenCL can be used with OpenCV.

Also, do any of the samples in the NXP eIQ framework have hardware acceleration support? The "eIQ™ ML Software Development Environment | NXP" landing page states that only Arm NN and TensorFlow Lite have GPU support; however, it also states that "Arm NN does not currently support the i.MX 8 GPUs due to the Arm NN OpenCL requirements, which are not met by i.MX 8 GPUs."

I would be grateful if somebody could clarify these.

Thanks in advance,

Tahsin

Labels (3)
0 Kudos
4 Replies

2,433 views
davidvescovi
Contributor V

same issue

0 Kudos

2,695 views
bwasim123
Contributor I

Is there a specific reason why there is no OpenCL support for the i.MX8M Mini? We have a custom board based on this SoC and would like to use OpenCL to accelerate our AI engine. Is there an ETA for when this support may become available?

0 Kudos

3,071 views
Bio_TICFSL
NXP TechSupport

Hello Tahsin,

OpenCL is supported on all i.MX8 platforms except the i.MX8M Mini. However, you can still use OpenCV. The current BSP is L5.4.3, where everything works.

The OpenCL version on i.MX8 is 1.2, and Arm NN does not work with this version.

Regards
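
A quick way to see what OpenCV itself reports on a given board is to query its OpenCL state from Python. This is a minimal sketch, assuming the OpenCV Python bindings (cv2) are installed on the target; the function name opencv_opencl_status and the force_cpu parameter are hypothetical and only for illustration. When OpenCL is unavailable or switched off, OpenCV runs on the CPU.

```python
def opencv_opencl_status(force_cpu=False):
    """Report whether OpenCV can see an OpenCL device and intends to use it."""
    try:
        import cv2  # assumption: OpenCV Python bindings are present on the board
    except ImportError:
        # No OpenCV bindings available: nothing to query.
        return {"opencv": False, "have_opencl": False, "use_opencl": False}

    if force_cpu:
        # Ask OpenCV to skip its OpenCL code paths and run on the CPU.
        cv2.ocl.setUseOpenCL(False)

    return {
        "opencv": True,
        "have_opencl": bool(cv2.ocl.haveOpenCL()),  # an OpenCL runtime was found
        "use_opencl": bool(cv2.ocl.useOpenCL()),    # OpenCL paths are enabled
    }


print(opencv_opencl_status())
```

Calling opencv_opencl_status(force_cpu=True) before running a pipeline is one way to compare CPU-only behavior against the OpenCL path.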

0 Kudos

3,071 views
fjpmbb
Contributor II

Hi Bio_TICFSL,

I am using an i.MX8M Nano with BSP L5.4.3, but I get an OpenCL error: "OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54)". How can I fix this? Thanks.

My program log is shown below:

Start program

Set to 640x480@30fps

Start Capture

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=768x120x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=640x480x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=512x60x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=320x240x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x30x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=160x128x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x15x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=80x64x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x8x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=48x32x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=80x64x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=160x128x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=320x240x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=640x480x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('morph', dims=2, globalsize=640x368x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('morph', dims=2, globalsize=640x368x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('reduce', dims=1, globalsize=512x1x1, localsize=512x1x1) sync=true

(0)FScore: 127.00 Mean: 0.00 Dev: 0.00 State: - rZ: 0

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=768x120x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=640x480x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=512x60x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=320x240x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x30x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=160x128x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x15x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=80x64x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=256x8x1, localsize=256x1x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=48x32x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=80x64x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=160x128x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=320x240x1, localsize=16x16x1) sync=false

OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=640x480x1, localsize=16x16x1) sync=false
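
Because the log repeats the same failures on every frame, it can help to reduce it to the distinct failing launches. The following is a rough sketch, assuming the log lines keep exactly the format shown above; the names LAUNCH_RE, summarize, and SAMPLE are illustrative, and SAMPLE holds only a few lines copied from the log.

```python
import re
from collections import Counter

# Matches the OpenCV error lines shown in the log above.
LAUNCH_RE = re.compile(
    r"OpenCL error (?P<err>\w+) \((?P<code>-?\d+)\) during call: "
    r"clEnqueueNDRangeKernel\('(?P<kernel>\w+)', dims=(?P<dims>\d+), "
    r"globalsize=(?P<g>\d+x\d+x\d+), localsize=(?P<l>\d+x\d+x\d+)\)"
)


def parse_sizes(text):
    """Turn '640x480x1' into (640, 480, 1)."""
    return tuple(int(n) for n in text.split("x"))


def summarize(log_text):
    """Count failures per distinct (kernel, global size, local size)."""
    counts = Counter()
    for m in LAUNCH_RE.finditer(log_text):
        counts[(m["kernel"], parse_sizes(m["g"]), parse_sizes(m["l"]))] += 1
    return counts


SAMPLE = """\
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrDown', dims=2, globalsize=768x120x1, localsize=256x1x1) sync=false
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('pyrUp', dims=2, globalsize=640x480x1, localsize=16x16x1) sync=false
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('morph', dims=2, globalsize=640x368x1, localsize=16x16x1) sync=false
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('morph', dims=2, globalsize=640x368x1, localsize=16x16x1) sync=false
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('reduce', dims=1, globalsize=512x1x1, localsize=512x1x1) sync=true
"""

for (kernel, global_size, local_size), n in sorted(summarize(SAMPLE).items()):
    items = local_size[0] * local_size[1] * local_size[2]
    print(f"{kernel}: global={global_size} local={local_size} "
          f"work-items/group={items} failures={n}")
```

Feeding the full log to summarize gives one row per distinct launch, which makes it easier to see which kernels and work-group shapes are rejected.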

The output of /opt/viv_samples/cl11/UnitTest/clinfo on my i.MX8M Nano board is:

>>>>>>>> ./clinfo Starting...

Available platforms: 1

Platform ID: 0
CL_PLATFORM_NAME: Vivante OpenCL Platform
CL_PLATFORM_PROFILE: FULL_PROFILE
CL_PLATFORM_VERSION: OpenCL 1.2 V6.4.0.p2.234062
CL_PLATFORM_VENDOR: Vivante Corporation
CL_PLATFORM_EXTENSIONS: cl_khr_icd


Available devices: 1

Device ID: 0
Device Ptr: 0x0b26e300
CL_DEVICE_NAME: Vivante OpenCL Device GC7000UL.6203.0000
CL_DEVICE_VENDOR: Vivante Corporation
CL_DEVICE_TYPE: GPU
CL_DEVICE_OPENCL_C_VERSION: OpenCL C 1.2
CL_DEVICE_VENDOR_ID: 0x00564956
CL_DEVICE_PLATFORM: 0xab112020
CL_DEVICE_VERSION: OpenCL 1.2
CL_DEVICE_PROFILE: FULL_PROFILE
CL_DRIVER_VERSION: OpenCL 1.2 V6.4.0.p2.234062
CL_DEVICE_MAX_COMPUTE_UNITS: 1
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES[0]: 512
CL_DEVICE_MAX_WORK_ITEM_SIZES[1]: 512
CL_DEVICE_MAX_WORK_ITEM_SIZES[2]: 512
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 600 MHz
CL_DEVICE_IMAGE_SUPPORT: Yes
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_IMAGE2D_MAX_WIDTH: 8192
CL_DEVICE_IMAGE2D_MAX_HEIGHT: 8192
CL_DEVICE_IMAGE3D_MAX_WIDTH: 8192
CL_DEVICE_IMAGE3D_MAX_HEIGHT: 8192
CL_DEVICE_IMAGE3D_MAX_DEPTH: 8192
CL_DEVICE_MAX_SAMPLERS: 16

CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store
cl_khr_gl_sharing
cl_khr_fp16
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics

CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_SHORT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_INT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_LONG: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT: 4
CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE: 0
CL_DEVICE_MAX_PARAMETER_SIZE: 1024
CL_DEVICE_MEM_BASE_ADDR_ALIGN: 2048
CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE: 128
CL_DEVICE_SINGLE_FP_CONFIG:
CL_FP_DENORM: No
CL_FP_INF_NAN: Yes
CL_FP_ROUND_TO_NEAREST: Yes
CL_FP_ROUND_TO_ZERO: Yes
CL_FP_ROUND_TO_INF: No
CL_FP_FMA: No
CL_FP_SOFT_FLOAT: No
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_GLOBAL_MEM_SIZE: 256 MByte
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: Read/Write
CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64
CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 8192
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_LOCAL_MEM_TYPE: Global
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_MAX_CONSTANT_ARGS: 9
CL_DEVICE_ERROR_CORRECTION_SUPPORT: Yes
CL_DEVICE_QUEUE_PROPERTIES:
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE:Yes
CL_QUEUE_PROFILING_ENABLE: Yes
CL_DEVICE_HOST_UNIFIED_MEMORY: Yes
CL_DEVICE_PROFILING_TIMER_RESOLUTION: 1000
CL_DEVICE_ENDIAN_LITTLE: Yes
CL_DEVICE_AVAILABLE: Yes
CL_DEVICE_COMPILER_AVAILABLE: Yes
CL_DEVICE_EXECUTION_CAPABILITIES:
CL_EXEC_KERNEL: Yes
CL_EXEC_NATIVE_KERNEL: No

>>>>>>>> Creating CLInfo context...


Context Properties:
Context Ptr: 0x0b26e570
CL_CONTEXT_REFERENCE_COUNT: 1
CL_CONTEXT_NUM_DEVICES: 1
CL_CONTEXT_DEVICES: 0x0b26e300
CL_CONTEXT_PROPERTIES: 0x00001084
0xab112020
0x00000000


>>>>>>>> Creating CLInfo command queue...


Command Queue Properties:
CL_QUEUE_CONTEXT: 0x0b26e570
CL_QUEUE_DEVICE: 0x0b26e300
CL_QUEUE_REFERENCE_COUNT: 1
CL_QUEUE_PROPERTIES:
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE:Yes
CL_QUEUE_PROFILING_ENABLE: Yes

>>>>>>>> Creating CLInfo program...

>>>>>>>> Building CLInfo program...


Program Properties:
CL_PROGRAM_CONTEXT: 0x0b26e570
CL_PROGRAM_REFERENCE_COUNT: 1
CL_PROGRAM_NUM_DEVICES: 1
CL_PROGRAM_DEVICES: 0x0b26e300
CL_PROGRAM_SOURCE: (Size:60)

------------------ BEGIN --------------------
__kernel void hello() { size_t i = get_global_id(0); }
------------------- END ---------------------

CL_PROGRAM_BINARY_SIZES[0]: 451
CL_PROGRAM_BINARIES:
Device Number 0:

------------------ BEGIN --------------------
SHDR ' p b CL / , i hello
------------------- END ---------------------


Program Build Properties:
CL_PROGRAM_BUILD_STATUS: 0
CL_PROGRAM_BUILD_OPTIONS: ""
CL_PROGRAM_BUILD_LOG: ""

>>>>>>>> Creating CLInfo kernel...


Kernel Properties:
CL_KERNEL_FUNCTION_NAME: "hello"
CL_KERNEL_CONTEXT: 0x0b26e570
CL_KERNEL_PROGRAM: 0x0b26bf20
CL_KERNEL_NUM_ARGS: 0
CL_KERNEL_REFERENCE_COUNT: 1


Kernel Workgroup Properties:
CL_KERNEL_WORK_GROUP_SIZE: 512
CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE: 8
CL_KERNEL_COMPILE_WORK_GROUP_SIZE: 0
0
0
CL_KERNEL_LOCAL_MEM_SIZE: 0
CL_KERNEL_PRIVATE_MEM_SIZE: 0

>>>>>>>> Releasing CLInfo kernel...

>>>>>>>> Releasing CLInfo program...

>>>>>>>> Releasing CLInfo command queue...

>>>>>>>> Releasing CLInfo context...

>>>>>>>> Exiting...
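
The failing launches from the program log can be compared with the limits that clinfo reports (CL_DEVICE_MAX_WORK_GROUP_SIZE of 512 and CL_DEVICE_MAX_WORK_ITEM_SIZES of 512 per dimension). The sketch below does that with plain arithmetic; it does not call OpenCL, and the names launch_problems and LAUNCHES are illustrative. Under these device-wide limits every launch looks valid, so one possible explanation, not confirmed in this thread, is that the driver applies a lower per-kernel limit to the OpenCV kernels than the 512 it reports for clinfo's trivial hello kernel. That per-kernel value can be read with clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE.

```python
from math import prod

# Device-wide limits copied from the clinfo output above.
DEVICE_MAX_WORK_GROUP_SIZE = 512
DEVICE_MAX_WORK_ITEM_SIZES = (512, 512, 512)

# Distinct failing launches copied from the program log: (kernel, global, local).
LAUNCHES = [
    ("pyrDown", (768, 120, 1), (256, 1, 1)),
    ("pyrDown", (512, 60, 1), (256, 1, 1)),
    ("pyrDown", (256, 30, 1), (256, 1, 1)),
    ("pyrDown", (256, 15, 1), (256, 1, 1)),
    ("pyrDown", (256, 8, 1), (256, 1, 1)),
    ("pyrUp", (640, 480, 1), (16, 16, 1)),
    ("pyrUp", (320, 240, 1), (16, 16, 1)),
    ("pyrUp", (160, 128, 1), (16, 16, 1)),
    ("pyrUp", (80, 64, 1), (16, 16, 1)),
    ("pyrUp", (48, 32, 1), (16, 16, 1)),
    ("morph", (640, 368, 1), (16, 16, 1)),
    ("reduce", (512, 1, 1), (512, 1, 1)),
]


def launch_problems(global_size, local_size,
                    max_group=DEVICE_MAX_WORK_GROUP_SIZE,
                    max_items=DEVICE_MAX_WORK_ITEM_SIZES):
    """Return a list of reasons a launch would break the given limits."""
    problems = []
    for dim, (g, l) in enumerate(zip(global_size, local_size)):
        if l == 0 or g % l != 0:
            problems.append(f"dim {dim}: global {g} not divisible by local {l}")
        if l > max_items[dim]:
            problems.append(f"dim {dim}: local {l} exceeds item limit {max_items[dim]}")
    if prod(local_size) > max_group:
        problems.append(f"{prod(local_size)} work-items exceed group limit {max_group}")
    return problems


for kernel, global_size, local_size in LAUNCHES:
    print(kernel, global_size, local_size,
          launch_problems(global_size, local_size) or "within device limits")
```

Passing a smaller max_group, for example a value returned by CL_KERNEL_WORK_GROUP_SIZE for a specific kernel, shows which launches that kernel-level limit would reject.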