Hi team,
We are currently evaluating NPU inference and profiling on the MCM-i.MX8M-Plus platform from CompuLab.
We had initially raised this query with CompuLab support, and they advised us to contact NXP directly regarding nnshark usage and NPU validation.
We attempted to integrate nnshark into the image and verified that the recipe builds successfully. However, no standalone nnshark executable is available on the target, and only library files such as libgstsharktracers.so and libgstshark.so are present. We would like clarification on whether nnshark is intended to be used only through GStreamer tracers/plugins or if a standalone utility is expected.
Additionally, we tested TensorFlow Lite inference using the VX delegate with profiling enabled. While the delegate loads successfully, we would like guidance on validating proper NPU utilization and understanding the expected CPU usage behavior during inference.
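One rough way to sanity-check offload, independent of what the pipeline logs print, is to time the same model with and without the VX delegate and compare mean latency. The sketch below is only an illustration under assumptions: the delegate path `/usr/lib/libvx_delegate.so` is assumed to match the BSP in use, the model path is passed in by the caller, and the helper names (`mean_latency_ms`, `make_interpreter`, `compare`) are hypothetical. A large latency gap suggests offload but is not conclusive proof on its own.

```python
import time

# Assumed delegate location on the target; adjust to the BSP in use.
VX_DELEGATE_PATH = "/usr/lib/libvx_delegate.so"


def mean_latency_ms(invoke, runs=20, warmup=1):
    """Call invoke() `warmup` times untimed, then return mean wall time (ms) over `runs` calls."""
    for _ in range(warmup):
        invoke()  # the first delegate run is usually much slower, so it is excluded
    start = time.perf_counter()
    for _ in range(runs):
        invoke()
    return (time.perf_counter() - start) * 1000.0 / runs


def make_interpreter(model_path, use_vx_delegate):
    """Build a TFLite interpreter, optionally with the VX delegate attached."""
    # Imported lazily so the timing helper works without tflite_runtime installed.
    import tflite_runtime.interpreter as tflite

    delegates = [tflite.load_delegate(VX_DELEGATE_PATH)] if use_vx_delegate else []
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    return interpreter


def compare(model_path):
    """Return (cpu_ms, delegate_ms) mean latencies for the same model."""
    cpu = make_interpreter(model_path, use_vx_delegate=False)
    npu = make_interpreter(model_path, use_vx_delegate=True)
    return mean_latency_ms(cpu.invoke), mean_latency_ms(npu.invoke)


if __name__ == "__main__":
    import sys

    cpu_ms, npu_ms = compare(sys.argv[1])  # model path given on the command line
    print(f"CPU: {cpu_ms:.2f} ms, VX delegate: {npu_ms:.2f} ms")
```

The inputs are left unset here because only timing is of interest; a real accuracy check would feed actual data and compare outputs as well.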
I have attached the detailed procedure followed for nnshark integration and inference testing for reference.
Could you please help clarify:
Thank you for your support.
Hi,
Please find the attached log file and NNShark screenshot. Initially, I updated the MMC boot arguments by interrupting U-Boot during startup. After the system booted, I verified the BSP release version and executed the GStreamer pipeline after enabling NNShark profiling through environment variables.
Thank you.
Hi @cris_m
While running the pipeline with the TensorFlow Lite VX delegate, the logs still report “accl = cpu”, which is causing confusion regarding whether inference is actually being offloaded to the NPU or not.
>>>Please share your log file, including the commands and methods you used to run the model.
Whether nnshark is expected to provide a standalone executable or only GStreamer tracer libraries
>>>NNShark is a GstShark-based analysis tool used to monitor multiple pipeline metrics for evaluating SoC hardware utilization.
>>>On the i.MX8M Plus, NNShark is primarily used for real-time profiling and performance validation of AI pipelines via the GStreamer/NNStreamer tracer. This is typically done by setting the GST_TRACERS and GST_DEBUG environment variables before running the GStreamer pipeline. You can find more details at the link below:
https://github.com/nxp-imx/nnshark
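As a minimal sketch of the environment-variable approach described above, the snippet below sets GST_TRACERS and GST_DEBUG and then launches a pipeline with gst-launch-1.0. The tracer names (`cpuusage`, `proctime`) are taken from upstream GstShark and are an assumption here, since the tracers NNShark actually exposes on a given BSP may differ. The test pipeline is a placeholder rather than the object detection pipeline from the thread, and the helper names are hypothetical.

```python
import os
import shlex
import subprocess


def build_tracer_env(tracers, base_env=None):
    """Return a copy of the environment with GStreamer tracing enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["GST_TRACERS"] = ";".join(tracers)  # tracers are separated by semicolons
    env["GST_DEBUG"] = "GST_TRACER:7"       # print tracer output to the debug log
    return env


def run_pipeline(pipeline, tracers):
    """Run a gst-launch-1.0 pipeline description with the given tracers active."""
    cmd = ["gst-launch-1.0"] + shlex.split(pipeline)
    return subprocess.run(cmd, env=build_tracer_env(tracers), check=False).returncode


if __name__ == "__main__":
    # Placeholder pipeline; substitute the real NNStreamer pipeline on the target.
    run_pipeline("videotestsrc num-buffers=100 ! fakesink", ["cpuusage", "proctime"])
```

This matches the library-only install seen on the target: the tracers are loaded by GStreamer from the shared objects, so no separate nnshark binary is invoked.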
How to conclusively verify NPU utilization during inference execution
>>>Which BSP version are you using?
B.R
Thank you for your response and for sharing the reference document.
I would like to mention that I have already followed the procedure described in Chapter 8.1 (Object detection pipeline example) and have shared the same details in the attached NPU_test.txt document earlier, along with screenshots/logs from the GStreamer pipeline execution.
While running the pipeline with the TensorFlow Lite VX delegate, the logs still report “accl = cpu”, which is causing confusion regarding whether inference is actually being offloaded to the NPU or not.
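Additionally, my earlier query regarding nnshark usage was not addressed. Specifically, I would like clarification on whether nnshark is expected to provide a standalone executable or only GStreamer tracer libraries, and on how to conclusively verify NPU utilization during inference execution.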
Please also let me know if I may have missed any step during the setup or testing procedure.
Could you please review the attached procedure/logs and help clarify these points?
Thank you.
Hi @cris_m
Please refer to Chapter 8 (Vision Pipeline with NNStreamer) in the attachment. It also describes how to run machine learning applications on the i.MX 8M Plus with NPU acceleration.
B.R
Hello,
Could you please review the logs shared earlier and let us know if there are any issues with the procedure that was followed, or if any additional steps are required?
Thank you.
Hi @cris_m
Please run the attached .sh file. I have tested it on my i.MX8MP EVK board without any problems.
B.R
Yes. It worked.
Thank you.
Best regards