<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic "Re: Problems execute quant models on NPU with ONNXRuntime" in i.MX Processors</title>
    <link>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1385130#M184292</link>
    <description>&lt;P&gt;NPU tensors do not support float input/output. The NPU supports &lt;STRONG&gt;8/16-bit integer&lt;/STRONG&gt; tensor data formats and an 8-, 16-, and 32-bit integer operations pipeline.&lt;/P&gt;</description>
    <pubDate>Sat, 11 Dec 2021 05:51:14 GMT</pubDate>
    <dc:creator>Zhiming_Liu</dc:creator>
    <dc:date>2021-12-11T05:51:14Z</dc:date>
    <item>
      <title>Problems execute quant models on NPU with ONNXRuntime</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1384605#M184238</link>
      <description>&lt;P&gt;Hello there,&lt;/P&gt;&lt;P&gt;I have a problem executing quantized models on the NPU with ONNXRuntime. I downloaded the models mobilenet_v2_1.0_224.tflite, mobilenet_v2_1.0_224_quant.tflite, inception_v3.tflite and inception_v3_quant.tflite from the Machine Learning User's Guide and converted them with the eIQ model converter.&lt;/P&gt;&lt;P&gt;For all the models I get correct results when running them with the CPU_ACL EP. When I run the non-quantized models with the Vsi_Npu EP, I get correct results too. But when I run the quantized models with the Vsi_Npu EP, I get wrong results.&lt;/P&gt;&lt;P&gt;I also tried the following: convert the mobilenet_v2.tflite model to a mobilenet_v2.onnx model and then quantize it with float as the input and output data type. In that case I get wrong results even when I run the model on the CPU.&lt;/P&gt;&lt;P&gt;Is there a problem with the Vsi_Npu EP when running quantized models? Or is there a problem with my converted models (I have attached them here)?&lt;BR /&gt;&lt;BR /&gt;Thanks for your help!&lt;/P&gt;&lt;P&gt;If anyone needs more information, please ask.&lt;/P&gt;&lt;P&gt;Kind regards, Chris&lt;/P&gt;</description>
      <pubDate>Fri, 10 Dec 2021 07:03:48 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1384605#M184238</guid>
      <dc:creator>christhi</dc:creator>
      <dc:date>2021-12-10T07:03:48Z</dc:date>
    </item>
    <item>
      <title>Re: Problems execute quant models on NPU with ONNXRuntime</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1385130#M184292</link>
      <description>&lt;P&gt;NPU tensors do not support float input/output. The NPU supports &lt;STRONG&gt;8/16-bit integer&lt;/STRONG&gt; tensor data formats and an 8-, 16-, and 32-bit integer operations pipeline.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 05:51:14 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1385130#M184292</guid>
      <dc:creator>Zhiming_Liu</dc:creator>
      <dc:date>2021-12-11T05:51:14Z</dc:date>
    </item>
    <item>
      <title>Re: Problems execute quant models on NPU with ONNXRuntime</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1385143#M184298</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.nxp.com/t5/user/viewprofilepage/user-id/151788"&gt;@Zhiming_Liu&lt;/a&gt;, thanks for your help.&lt;/P&gt;&lt;P&gt;That may be correct, but when I run non-quantized models with float input/output on the NPU (using VsiNpu or NNAPI from OnnxRuntime), I get correct results. The NPU doesn't support the layers, but execution falls back to the CPU to calculate the results. The models I attached above are quantized with uint8_t input/output and should be supported by the NPU. Why do I get wrong results for them? I mean, if a layer in one of the models isn't supported, it should fall back to the CPU and produce correct results too.&lt;BR /&gt;With ArmNN and TFLite I don't have problems like that. With both, everything works fine for quantized models with uint8_t input and for non-quantized models with float input. So is there maybe a problem in OnnxRuntime? Or is there a problem in the conversion step from tflite to onnx format? Could you please have a look at the models? Maybe there is something wrong with them.&lt;/P&gt;&lt;P&gt;Thanks for your help.&lt;/P&gt;&lt;P&gt;Kind regards, Chris&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 08:31:22 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/Problems-execute-quant-models-on-NPU-with-ONNXRuntime/m-p/1385143#M184298</guid>
      <dc:creator>christhi</dc:creator>
      <dc:date>2021-12-11T08:31:22Z</dc:date>
    </item>
  </channel>
</rss>

