Mismatch between CPU and NPU in simple Conv2D on i.MX 8M Plus


1,134 Views
deneriz
Contributor II

Hi,

We have been working with the VX delegate to execute TFLite models on the NPU of the i.MX 8M Plus, which is a VeriSilicon VIPNano-SI+. In doing so, we have found mismatches between running the model on the CPU and on the NPU, even for a model consisting of a single Conv2D layer with a 3x3 kernel and 'same' padding. This plot shows the distribution of the mismatch.

deneriz_0-1724239987095.png
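For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of how such a comparison could be made. It is not the exact script behind the plot. It assumes a fully quantized (int8/uint8) model with a single input and a single output, and it assumes the VX delegate library is at `/usr/lib/libvx_delegate.so`, which may differ on your BSP. The helper names `run_model` and `mismatch_stats` are made up for illustration.

```python
import numpy as np

# Assumed location of the VX delegate; adjust for your BSP image.
VX_DELEGATE_PATH = "/usr/lib/libvx_delegate.so"


def run_model(model_path, input_data, delegate_path=None):
    """Run a TFLite model once: on the NPU if delegate_path is given, else on the CPU."""
    # Imported lazily so that mismatch_stats() can be used without tflite_runtime.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    delegates = [load_delegate(delegate_path)] if delegate_path else None
    interpreter = Interpreter(model_path=model_path, experimental_delegates=delegates)
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    interpreter.set_tensor(input_detail["index"], input_data.astype(input_detail["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(interpreter.get_output_details()[0]["index"])


def mismatch_stats(cpu_out, npu_out):
    """Summarise element-wise differences between two quantized outputs."""
    # Widen to int32 so that int8/uint8 subtraction cannot wrap around.
    diff = npu_out.astype(np.int32) - cpu_out.astype(np.int32)
    values, counts = np.unique(diff, return_counts=True)
    return {
        "histogram": dict(zip(values.tolist(), counts.tolist())),
        "mismatch_ratio": float(np.mean(diff != 0)),
        "max_abs_diff": int(np.max(np.abs(diff))),
    }
```

On the target, one would call `run_model(path, x)` for the CPU result and `run_model(path, x, VX_DELEGATE_PATH)` for the NPU result with the same input `x`, then pass both outputs to `mismatch_stats`. The `histogram` entry corresponds to the kind of distribution shown in the plot above.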

 

Moreover, these mismatch errors propagate through successive layers of the model. The attached file contains a model with 20 Conv2D layers decomposed into 20 models, each adding one layer to the previous one, which allows the mismatch to be measured after each layer. The following plot shows this propagation across the model.

deneriz_1-1724240059865.png
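The per-layer measurement described above could be sketched roughly as follows. This is only an illustration under assumptions: `run_cpu` and `run_npu` are hypothetical callables that take a model path and an input array and return the output tensor (for example, thin wrappers around a TFLite interpreter without and with the VX delegate), and the list of model paths is whatever naming the decomposed models actually use.

```python
import numpy as np


def propagation_curve(model_paths, run_cpu, run_npu, input_data):
    """Return (depth, mean_abs_diff, max_abs_diff) for each prefix model.

    model_paths is ordered by depth: the first model has one layer,
    the second has two, and so on. run_cpu and run_npu are hypothetical
    runner callables supplied by the caller.
    """
    curve = []
    for depth, path in enumerate(model_paths, start=1):
        # Widen to int32 so that differences of quantized values cannot overflow.
        cpu_out = np.asarray(run_cpu(path, input_data)).astype(np.int32)
        npu_out = np.asarray(run_npu(path, input_data)).astype(np.int32)
        diff = np.abs(npu_out - cpu_out)
        curve.append((depth, float(diff.mean()), int(diff.max())))
    return curve
```

Feeding the same input to every prefix model keeps the comparison consistent across depths, so any growth in the mean or maximum difference reflects accumulation through the layers rather than a change of input.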

 

Is there a way to avoid this mismatch?

We are using TFLite Runtime 2.9.1.1 with delegate version lf-5.15.71_2.2.0.

@robertkalmar 

6 Replies

566 Views
hgaiser
Contributor I

Has there been any update on this? I believe I am experiencing the exact same issue with the same chip. I am using lf-6.1.55-2.2.2 and TensorFlow 2.17.0 on a Compulab board with the same i.MX 8M Plus chip.


541 Views
deneriz
Contributor II

Hi @hgaiser

Unfortunately, we were unable to find a viable solution to this issue. Despite reaching out to both NXP and VeriSilicon (see this issue, where they acknowledged that "NPU integer math is not bit-accurate compared to the TFLite CPU implementation"), no resolution was provided. As a result, we have decided to discontinue working with this processor.


536 Views
hgaiser
Contributor I
Hey, thanks for your reply, much appreciated. Sorry to hear your issue wasn't resolved. My use case is semantic segmentation, so similar to your case there is an embedding representation that gets modified too much by these bit inaccuracies. The result is that nothing gets segmented when executing on the NPU.

Fingers crossed for a way to resolve this, but I'm afraid it'll be difficult.

1,078 Views
brian14
NXP TechSupport

Hi @deneriz

Thank you for contacting NXP Support!

I am currently researching this issue. However, I noticed that the BSP version tested is 5.15.71_2.2.0. Have you tried our latest BSP version?
Do you see the same behavior there?

Have a great day!


1,037 Views
deneriz
Contributor II

Hi @brian14,

We have to use version 5.15.71_2.2.0 because we are using the i.MX 8M Plus through the SolidRun System on Module, and this is the latest version they support.

Thanks for your reply, we would appreciate any help on this.


983 Views
brian14
NXP TechSupport

Hi @deneriz,

I couldn't find a documented issue for Conv2D on the i.MX 8M Plus. However, since you are using a board and software from another vendor, I suggest contacting your vendor's support.
On my side, I will let you know if I find more details about this issue.

Have a great day!

 
