i.MX95 GoPoint DMS Demo: Graph differences and PTQ process for face_detection_ptq.tflite

akashhalli · 05-29-2026

Hello,

I am currently running the GoPoint Driver Monitoring System (DMS) demo on an i.MX95 evaluation kit using the lf-6.12.49_2.2.0 release.

While examining the source code, I found comments for the face detection model noting that the demo uses Google's MediaPipe BlazeFace (short-range) model:

Original Google Model: face_detection_short_range.tflite (MediaPipe Assets URL)
Model Card: BlazeFace Model Card
License: Apache-2.0

However, the demo's downloads.json points to an NXP-hosted, quantized variant:

NXP Asset URL: https://github.com/nxp-imx-support/nxp-demo-experience-assets/raw/lf-6.12.49_2.2.0/models/face_detec...

The Issue / Discrepancy:

When I load and compare both the original Google MediaPipe TFLite model and NXP's face_detection_ptq.tflite in netron.app, I notice that the model graphs are structurally different. They do not look like a simple 1:1 quantization of the exact same network topology.

I have two questions regarding how NXP prepared this asset for the i.MX95 NPU / eIQ stack:

1. Graph Discrepancies: Why are the model graphs structurally different in Netron? Did NXP modify the network architecture, strip custom MediaPipe TFLite operations (like custom anchors/detections), or substitute certain layers to optimize compatibility with the i.MX95 NPU / eIQ inference engine?
2. PTQ Implementation Pipeline: If NXP optimized and converted the original Google model, how was Post-Training Quantization (PTQ) applied? Typically, standard TensorFlow optimization pipelines require the original frozen graph (.pb), saved model format, or floating-point Keras/TF definitions to run calibration datasets. Since Google distributes MediaPipe models directly as .tflite files, did NXP apply PTQ directly onto a floating-point .tflite file (e.g., using the tf.lite.TFLiteConverter.from_saved_model pipeline or eIQ tools), or was the model reconstructed from scratch?
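
For reference, the standard TensorFlow PTQ flow mentioned in the second question usually looks roughly like the sketch below. It starts from a SavedModel, which is exactly what is missing when only a .tflite file is distributed. This is not a confirmed NXP workflow; `saved_model_dir`, `samples`, and both helper names are hypothetical.

```python
def make_representative_dataset(samples):
    """Wrap calibration samples in the callable the TFLite converter expects.

    Each sample should be a float32 array shaped like the model input,
    including the batch dimension.
    """
    def representative_dataset():
        for sample in samples:
            # One list per calibration step, one entry per model input.
            yield [sample]
    return representative_dataset


def quantize_saved_model(saved_model_dir, samples):
    """Full-integer PTQ of a SavedModel; returns the .tflite flatbuffer bytes."""
    # Imported lazily so the helper above has no TensorFlow dependency.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = make_representative_dataset(samples)
    # Restrict to int8 builtin kernels so conversion fails rather than
    # silently keeping float operators.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    return converter.convert()
```

Note that `from_saved_model` takes a SavedModel directory, not a .tflite file, so this path alone cannot quantize the MediaPipe asset as distributed.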

Any insight into the exact optimization and quantization workflow used for this demo asset would be highly appreciated!

Thanks in advance.

JosephAtNXP · 06-01-2026

Hi,

Thank you for your interest in NXP Semiconductor products.

1. Models often contain operators that the NPU does not support, and these have to be replaced with a combination of supported operators. The differences you are observing may be the result of such substitutions, made to optimize NPU usage.

https://www.nxp.com/docs/en/user-guide/UG10166.pdf
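
One way to see which operators actually differ between the two files, beyond eyeballing them in Netron, is to compare operator counts. The following is a minimal sketch, assuming TensorFlow is installed. `list_ops` and `diff_op_counts` are hypothetical helper names, and `_get_ops_details()` is an underscore-prefixed (non-public) method of `tf.lite.Interpreter`, so it may change between TensorFlow releases. A model that contains unregistered custom operators may also fail to load this way.

```python
from collections import Counter


def list_ops(model_path):
    """Return the operator names found in a .tflite file."""
    # Imported lazily so diff_op_counts() works without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    # Non-public API (assumption: present in the installed TF version).
    return [op["op_name"] for op in interpreter._get_ops_details()]


def diff_op_counts(ops_a, ops_b):
    """Map op name -> (count in A, count in B) for ops whose counts differ."""
    count_a, count_b = Counter(ops_a), Counter(ops_b)
    return {
        name: (count_a[name], count_b[name])
        for name in sorted(set(count_a) | set(count_b))
        if count_a[name] != count_b[name]
    }
```

Calling `diff_op_counts(list_ops("face_detection_short_range.tflite"), list_ops("face_detection_ptq.tflite"))` would then list only the operators that were added, removed, or changed in number.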

2. PTQ can be applied to a TFLite model using the eIQ Toolkit, before the model is converted to a Vela or Neutron model.

https://docs.nxp.com/bundle/EIQTUG/page/topics/quantization.html
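
To check what quantization a given .tflite asset ended up with, one option is to read the input and output tensor types and their quantization parameters. This is a rough sketch that assumes TensorFlow is installed; it does not show the eIQ Toolkit workflow itself. `dequantize` and `describe_io` are hypothetical helper names. TFLite's affine scheme maps a stored integer `q` to a real value as `scale * (q - zero_point)`.

```python
def dequantize(q, scale, zero_point):
    """TFLite affine quantization: real_value = scale * (q - zero_point)."""
    return scale * (q - zero_point)


def describe_io(model_path):
    """Return (name, dtype, (scale, zero_point)) for each model input and output."""
    # Imported lazily so dequantize() works without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    details = interpreter.get_input_details() + interpreter.get_output_details()
    # Unquantized (float) tensors report quantization (0.0, 0).
    return [(d["name"], d["dtype"].__name__, d["quantization"]) for d in details]
```

If `describe_io` reports `int8` or `uint8` tensors with a non-zero scale, the model uses integer inputs and outputs; `float32` with `(0.0, 0)` indicates float I/O, which can still wrap quantized internal layers.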

Regards