i.MX 95 NNStreamer C++ Demo NPU Issues

Environment

| Component | Detail |
|-----------|--------|
| Board | i.MX 95 |
| BSP | lf-6.12.49-2.2.0 |
| Kernel | Linux ... 6.12.49-lts-next-gbf3cf0324593 #1 SMP PREEMPT Tue Jun 16 03:46:26 UTC 2026 aarch64 |
| NNStreamer | 2.4.2 |
| TensorFlow Lite | 2.19.0 |
| Neutron Delegate | libneutron_delegate.so reports v1.0.0-f24d08e5, non-zerocp, built Nov 12 2025 |
| nnstreamer-examples | v1.6 (SRCREV 062ebd1) and v1.9 (SRCREV 37d3d86), same behavior |
| Camera | OV5640 MIPI via libcamera (imx8-isi), camera ID: /base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c |

Binaries are from meta-nxp-demo-experience/recipes-examples/imx-nnstreamer-examples/imx-nnstreamer-examples.bb, installed at /opt/gopoint-apps/scripts/machine_learning/nnstreamer/.

Models

All models are from the Yocto gopoint-base-apps recipe (downloads.json, lf-6.12.49_2.2.0 branch), hosted at:
https://github.com/nxp-imx-support/nxp-demo-experience-assets/raw/lf-6.12.49_2.2.0/models/

Models on target at /opt/gopoint-apps/scripts/machine_learning/nnstreamer/downloads/models/:

| Task | CPU model | NPU model |
|------|-----------|-----------|
| Face Detection | face-detection/ultraface_slim_uint8_float32.tflite | face-detection/ultraface_slim_uint8_float32_neutron.tflite |
| Object Detection | object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess.tflite | object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_neutron.tflite |
| Classification | classification/mobilenet_v1_1.0_224_quant_uint8_float32.tflite | classification/mobilenet_v1_1.0_224_quant_uint8_float32_neutron.tflite |

Metadata from the same source: labels_mobilenet_quant_v1_224.txt, coco_labels_list.txt, box_priors.txt.
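Since the models were downloaded by hand, a quick presence check can rule out a missing or misplaced file before comparing CPU and NPU results. The following is only a sketch: `check_models` is a hypothetical helper name, and the subdirectories used for the three metadata files are taken from the `--labels_path` and `--boxes_path` arguments in the commands in this post. It checks that the files exist, not that their contents are intact.

```shell
#!/bin/sh
# Sketch: check that the model and metadata files named in this post exist.
# check_models is a hypothetical helper, not part of the NXP demo package.
# Usage: check_models [MODELS_DIR]; exit status is 0 only if nothing is missing.
check_models() {
    models_dir="${1:-/opt/gopoint-apps/scripts/machine_learning/nnstreamer/downloads/models}"
    missing=0
    for rel in \
        face-detection/ultraface_slim_uint8_float32.tflite \
        face-detection/ultraface_slim_uint8_float32_neutron.tflite \
        object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess.tflite \
        object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_neutron.tflite \
        object-detection/coco_labels_list.txt \
        object-detection/box_priors.txt \
        classification/mobilenet_v1_1.0_224_quant_uint8_float32.tflite \
        classification/mobilenet_v1_1.0_224_quant_uint8_float32_neutron.tflite \
        classification/labels_mobilenet_quant_v1_224.txt
    do
        if [ -f "$models_dir/$rel" ]; then
            echo "ok       $rel"
        else
            echo "MISSING  $rel"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every listed file was found.
    [ "$missing" -eq 0 ]
}
```

Calling `check_models` with no argument checks the on-target directory given above; passing a directory checks a copy elsewhere, for example on the build host.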
Required Environment Variables

OV5640 on i.MX 95 requires libcamera ISI:

    export CAMERA_BACKEND=libcamera
    export LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c'

---

Issue 1: Face Detection

NPU mode

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/face_detection/example_face_detection_tflite \
      --backend NPU \
      --model_path downloads/models/face-detection/ultraface_slim_uint8_float32_neutron.tflite \
      --display_perf time

Pipeline log:

    INFO: Start app...
    DEBUG: libcamerasrc name=cam_src camera-name=/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c \
      ! video/x-raw,width=640,height=480,framerate=30/1,format=YUY2 ! queue ! tee name=t \
      t. ! queue name=thread-nn max-size-buffers=2 leaky=2 \
      ! imxvideoconvert_g2d name=scale_csc_g2d_0 ! video/x-raw,width=320,height=240,format=RGB \
      ! tensor_converter ! tensor_filter latency=1 framework=tensorflow-lite \
      model=downloads/models/face-detection/ultraface_slim_uint8_float32_neutron.tflite \
      custom=Delegate:External,ExtDelegateLib:libneutron_delegate.so name=face_filter \
      ! tensor_sink name=tsink_fd \
      t. ! queue name=thread-img max-size-buffers=2 leaky=2 ! cairooverlay name=cairooverlay \
      ! fpsdisplaysink name=img_tensor text-overlay=false video-sink=waylandsink
    INFO: NeutronDelegate delegate: 2 nodes delegated out of 49 nodes with 2 partitions.
    INFO: Neutron delegate version: v1.0.0-f24d08e5, non-zerocp.
    [libcamera v0.0.0+6194-lf-6.12.49-2.2.0]
    [ov5640 pipeline: ov5640 -> csidev-4ad30000.csi -> formatter@20 -> crossbar]
    Camera camera.cpp:1215 configuring streams: (0) 640x480-YUYV/Unset
    DEBUG: Pipeline state changed from NULL to READY.
    DEBUG: Pipeline state changed from READY to PAUSED.
    DEBUG: Pipeline state changed from PAUSED to PLAYING.

Result: Bounding boxes garbled.
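When comparing runs, it can help to extract the delegation counts from each captured log so they can be quoted next to the result. The snippet below is a minimal sketch: `delegation_summary` is a hypothetical helper name, and it assumes the demo output has been saved to a file or piped in (for example by appending `2>&1 | tee npu_face.log` to the command, where the file name is only an example).

```shell
#!/bin/sh
# Sketch: pull the Neutron delegation counts out of a captured demo log.
# delegation_summary is a hypothetical helper; the pattern matches the
# "NeutronDelegate delegate: ..." line as it appears in the log above.
# Usage: delegation_summary [LOGFILE...]   (reads stdin when no file is given)
delegation_summary() {
    sed -n 's/.*NeutronDelegate delegate: \([0-9][0-9]*\) nodes delegated out of \([0-9][0-9]*\) nodes with \([0-9][0-9]*\) partitions.*/delegated=\1 total=\2 partitions=\3/p' "$@"
}
```

For the log above this prints `delegated=2 total=49 partitions=2`. The counts describe how the graph was partitioned, not whether the delegated nodes compute correct results, so they are context for the CPU/NPU comparison rather than a diagnosis.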
CPU mode (same pipeline, different model and delegate)

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/face_detection/example_face_detection_tflite \
      --backend CPU \
      --model_path downloads/models/face-detection/ultraface_slim_uint8_float32.tflite \
      --display_perf time

Log:

    INFO: Start app...
    INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
    DEBUG: [same camera pipeline, same imxvideoconvert_g2d]
      ... tensor_converter ! tensor_filter latency=1 framework=tensorflow-lite \
      model=downloads/models/face-detection/ultraface_slim_uint8_float32.tflite \
      custom=Delegate:XNNPACK,NumThreads:6 ...
    DEBUG: Pipeline state changed ... PLAYING.

Result: ✅ Bounding boxes accurate.

Same pipeline, same camera, same imxvideoconvert_g2d YUY2→RGB conversion. Only difference: quantized .tflite + XNNPACK vs _neutron.tflite + neutron delegate.

---

Issue 2: Object Detection (SSD MobileNetV2)

NPU mode

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/object_detection/example_detection_mobilenet_ssd_v2_tflite \
      --backend NPU \
      --model_path downloads/models/object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_neutron.tflite \
      --labels_path downloads/models/object-detection/coco_labels_list.txt \
      --boxes_path downloads/models/object-detection/box_priors.txt \
      --display_perf time

Pipeline log:

    INFO: Start app...
    DEBUG: libcamerasrc name=cam_src camera-name=/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c \
      ! video/x-raw,width=640,height=480,framerate=30/1,format=YUY2 ! queue ! tee name=t \
      t. ! queue name=thread-nn max-size-buffers=2 leaky=2 \
      ! imxvideoconvert_g2d name=scale_csc_g2d_0 ! video/x-raw,width=300,height=300,format=RGB \
      ! tensor_converter ! tensor_filter latency=1 framework=tensorflow-lite \
      model=downloads/models/object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_neutron.tflite \
      custom=Delegate:External,ExtDelegateLib:libneutron_delegate.so name=detection_filter \
      ! tensor_decoder name=tensor_decode_bounding_boxes_1 mode=bounding_boxes option1=mobilenet-ssd \
      option2=downloads/models/object-detection/coco_labels_list.txt \
      option3=downloads/models/object-detection/box_priors.txt \
      option4=640:480 option5=300:300 ! imxvideoconvert_g2d ! mix.sink_0 \
      t. ! queue name=thread-img max-size-buffers=2 leaky=2 ! mix.sink_1 \
      imxcompositor_g2d name=mix sink_0::zorder=2 sink_1::zorder=1 latency=20000000 min-upstream-latency=20000000 \
      ! cairooverlay name=perf ! fpsdisplaysink name=img_tensor text-overlay=false video-sink=waylandsink
    INFO: NeutronDelegate delegate: 1 nodes delegated out of 26 nodes with 1 partitions.
    INFO: Neutron delegate version: v1.0.0-f24d08e5, non-zerocp.
    Camera camera.cpp:1215 configuring streams: (0) 640x480-YUYV/Unset
    DEBUG: Pipeline state changed ... PLAYING.

Result: Camera feed area completely black. Bounding boxes inaccurate.

CPU mode (same pipeline, different model and delegate)

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/object_detection/example_detection_mobilenet_ssd_v2_tflite \
      --backend CPU \
      --model_path downloads/models/object-detection/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess.tflite \
      --labels_path downloads/models/object-detection/coco_labels_list.txt \
      --boxes_path downloads/models/object-detection/box_priors.txt \
      --display_perf time

Log:

    INFO: Start app...
    INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
    DEBUG: [same pipeline, same imxvideoconvert_g2d, same imxcompositor_g2d, XNNPACK delegate]
    DEBUG: Pipeline state changed ... PLAYING.

Result: Camera feed area still completely black (same as NPU).
Bounding boxes correct.

The camera feed is black in both CPU and NPU modes. Face detection and classification, which do not use imxcompositor_g2d, display a normal camera feed.

---

Issue 3: Classification (MobileNetV1)

NPU mode

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/classification/example_classification_mobilenet_v1_tflite \
      --backend NPU \
      --model_path downloads/models/classification/mobilenet_v1_1.0_224_quant_uint8_float32_neutron.tflite \
      --labels_path downloads/models/classification/labels_mobilenet_quant_v1_224.txt \
      --display_perf time

Pipeline log:

    INFO: Start app...
    DEBUG: libcamerasrc name=cam_src camera-name=/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c \
      ! video/x-raw,width=640,height=480,framerate=30/1,format=YUY2 ! queue ! tee name=t \
      t. ! queue name=thread-nn max-size-buffers=2 leaky=2 \
      ! imxvideoconvert_g2d name=scale_csc_g2d_0 ! video/x-raw,width=224,height=224,format=RGB \
      ! tensor_converter ! tensor_filter latency=1 framework=tensorflow-lite \
      model=downloads/models/classification/mobilenet_v1_1.0_224_quant_uint8_float32_neutron.tflite \
      custom=Delegate:External,ExtDelegateLib:libneutron_delegate.so name=classification_filter \
      ! tensor_decoder name=tensor_decode_labeling_1 mode=image_labeling \
      option1=downloads/models/classification/labels_mobilenet_quant_v1_224.txt ! overlay.text_sink \
      t. ! queue name=thread-img max-size-buffers=2 leaky=2 \
      ! textoverlay name=overlay font-desc="Sans, 24" valignment=baseline halignment=center \
      ! imxvideoconvert_g2d ! cairooverlay name=perf \
      ! fpsdisplaysink name=img_tensor text-overlay=false video-sink=waylandsink
    INFO: NeutronDelegate delegate: 1 nodes delegated out of 4 nodes with 1 partitions.
    INFO: Neutron delegate version: v1.0.0-f24d08e5, non-zerocp.
    Camera camera.cpp:1215 configuring streams: (0) 640x480-YUYV/Unset
    DEBUG: Pipeline state changed ... PLAYING.

Result: Camera feed normal.
Label consistently wrong.

CPU mode (same pipeline, different model and delegate)

    CAMERA_BACKEND=libcamera \
    LIBCAMERA_CAM_DEVICE='/base/soc/bus@42000000/i2c@42530000/ov5640_mipi@3c' \
    /opt/gopoint-apps/scripts/machine_learning/nnstreamer/classification/example_classification_mobilenet_v1_tflite \
      --backend CPU \
      --model_path downloads/models/classification/mobilenet_v1_1.0_224_quant_uint8_float32.tflite \
      --labels_path downloads/models/classification/labels_mobilenet_quant_v1_224.txt \
      --display_perf time

Log:

    INFO: Start app...
    INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
    DEBUG: [same pipeline, same imxvideoconvert_g2d, XNNPACK delegate]
    DEBUG: Pipeline state changed ... PLAYING.

Result: ✅ Camera feed normal. Label correct and responsive to camera scene.

---

Summary

In all three demos, CPU and NPU use the same camera pipeline, the same imxvideoconvert_g2d YUY2→RGB conversion, and the same GStreamer pipeline structure. The only differences are the model and the delegate:

| Variable | CPU test | NPU test |
|----------|----------|----------|
| Model | *.tflite (quantized) | *_neutron.tflite |
| Delegate | XNNPACK (Delegate:XNNPACK) | neutron (Delegate:External,ExtDelegateLib:libneutron_delegate.so) |

Results:

| Demo | CPU (quantized .tflite + XNNPACK) | NPU (_neutron.tflite + neutron) |
|------|:---:|:---:|
| Face Detection: bounding boxes | ✅ correct | ❌ garbled |
| Object Detection: bounding boxes | ✅ correct | ❌ inaccurate |
| Classification: label | ✅ correct | ❌ wrong |

Object detection also shows a black camera feed in both CPU and NPU modes. This demo uses imxcompositor_g2d for display; face detection and classification use cairooverlay/textoverlay and display normally.

Re: i.MX 95 NNStreamer C++ Demo NPU Issues

Hi @Chavira,

The Board Support Package (BSP) version is 6.12.49. I built nxp-nnstreamer-examples (SRCREV: 062ebd1) separately via Yocto, then deployed the compiled debs onto the target board.
Instead of launching the demos via the GoPoint application, I manually downloaded the model files according to the accompanying downloads.json and ran the programs using the commands mentioned earlier.

Re: i.MX 95 NNStreamer C++ Demo NPU Issues

Hi @BIG_FLY,
Thanks for the information regarding the GoPoint demos. Could you please clarify how you are running the demos?
Are you running them directly using the GoPoint application, or did you cross-compile them yourself? Could you describe step by step how you are executing the demos on your side? Are you using BSP version 6.12.49? Which board are you using?
This information will help us better understand your setup and identify any potential issues.

Best regards,
Chavira