Hello,
I'm working on a custom computer vision model that uses Conv2D layers for feature extraction and a Dense (fully connected) layer for classification.
Here's what I've done so far:
Quantized the model to INT8
Converted it using the Neutron Converter for i.MX95 NPU
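For context, the INT8 quantization step maps FP32 values to int8 through an affine scale/zero-point mapping. Below is a minimal NumPy sketch of standard per-tensor affine quantization, just to illustrate the step, not the converter's exact implementation (the helper name `quantize_int8` is mine):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Per-tensor affine quantization of FP32 values to INT8.

    Returns (q, scale, zero_point) such that x ~= (q - zero_point) * scale.
    """
    qmin, qmax = -128, 127
    x_min, x_max = float(x.min()), float(x.max())
    # Extend the range to include zero so it is exactly representable
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

weights = np.random.randn(64, 128).astype(np.float32)
q, scale, zp = quantize_int8(weights)
dequant = (q.astype(np.float32) - zp) * scale  # reconstruction for error check
print(q.dtype, scale)
```

The bias tensor is handled differently (INT32, with scale = input_scale * weight_scale), which is why the manual lists separate constraints for it.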
After conversion, I noticed that:
Only Conv2D layers are offloaded to the NPU
The fully connected layer remains on the CPU
A SLICE operator appears in the Neutron graph
I reviewed the i.MX95 User Manual, which lists the following constraints for fully connected layers:
Input tensor must be INT8
Weight tensor must be INT8 and constant
Bias tensor must be INT32 and constant
Output tensor must be INT8
Input and output channels must be multiples of NUM_MACS
(Otherwise, the converter adds PAD or SLICE operators)
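To make the alignment rule concrete, here is a small sketch of how I understand the PAD/SLICE insertion; `NUM_MACS = 8` is an assumed illustrative value (check the actual value for your Neutron configuration), and the pad-input/slice-output split is my reading of the manual, not confirmed behavior:

```python
# Sketch of the channel-alignment rule for Neutron FC offload.
# NUM_MACS = 8 is an assumed illustrative value, not a documented constant.
NUM_MACS = 8

def aligned(channels: int, num_macs: int = NUM_MACS) -> int:
    """Round a channel count up to the next multiple of num_macs."""
    return -(-channels // num_macs) * num_macs  # ceiling division

def converter_fixups(in_ch: int, out_ch: int, num_macs: int = NUM_MACS):
    """List the PAD/SLICE operators the converter would need to insert."""
    ops = []
    if in_ch % num_macs:
        # Input channels padded up to a multiple of NUM_MACS
        ops.append(("PAD", in_ch, aligned(in_ch, num_macs)))
    if out_ch % num_macs:
        # Output computed at the aligned width, then sliced back down
        ops.append(("SLICE", aligned(out_ch, num_macs), out_ch))
    return ops

# Example: a classifier head with 1280 input channels and 10 classes
print(converter_fixups(1280, 10))  # → [('SLICE', 16, 10)]
```

With a typical classifier head, 1280 is already aligned but a 10-class output is not, so a SLICE from 16 back to 10 would be expected even when the layer itself is eligible for offload.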
Since a SLICE operator appears in the graph, the output-channel alignment seems to be handled, yet the fully connected layer is still not offloaded to the NPU.
My Questions:
Why is the fully connected layer not being converted to NPU execution?
Is the SLICE operator related to this issue?
Am I missing any other constraints for fully connected layers?
I’ve attached the model file and screenshots of the Neutron graph for reference.
Thanks in advance for your help!
