Hello,
We bought an evaluation board for the “iMX8m plus” to run a custom post-training quantized (INT8 PTQ) TFLite model that performs semantic segmentation of RGB pictures.
Our custom model is built with TF 2.4 and uses the standard TRANSPOSE_CONV operator, which seems to prevent execution…
For information, we use the provided TFLite delegate (i.e., NNAPI), which should be able to handle this kind of operation (https://android.googlesource.com/platform/hardware/interfaces/+/refs/heads/master/neuralnetworks/1.2... ).
The error message is explicit:
WARNING: Operator TRANSPOSE_CONV (v3) refused by NNAPI delegate: OP Version different from 1
Applied NNAPI delegate.
So, it seems our model generates a v3 TRANSPOSE_CONV rather than the v1 expected by the delegate…
We identified in the TFLite source code the block responsible for generating the version number (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/versioning/op_version.cc#...
case BuiltinOperator_TRANSPOSE_CONV: {
  if (op_sig.inputs.size() == 4 &&
      op_sig.inputs.at(3).type != kTfLiteNoType) {
    return 3;
  }
  // If the op takes int8 input, it is version 2.
  if (op_sig.inputs.at(1).type == kTfLiteInt8) {
    return 2;
  }
  return 1;
}
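If we read this block correctly, the int8 check on inputs.at(1) already forces version 2 for a quantized model, and the presence of a bias as 4th input bumps it to version 3, which would explain the v3 we see. To check which versions actually end up in a generated file, here is a small sketch (it relies on TensorFlow's internal tensorflow.lite.python.schema_py_generated flatbuffer bindings, which is an assumption on our side and not part of the scripts below):

# Sketch: dump the builtin operator codes and versions stored in a .tflite file.
from tensorflow.lite.python import schema_py_generated as schema_fb

def dump_op_versions(path):
    with open(path, "rb") as f:
        model = schema_fb.Model.GetRootAsModel(f.read(), 0)
    # Reverse map from BuiltinOperator enum values to their names.
    names = {v: k for k, v in vars(schema_fb.BuiltinOperator).items() if not k.startswith("_")}
    for i in range(model.OperatorCodesLength()):
        opcode = model.OperatorCodes(i)
        # Newer schemas split the code between BuiltinCode and DeprecatedBuiltinCode.
        deprecated = getattr(opcode, "DeprecatedBuiltinCode", lambda: 0)()
        code = max(opcode.BuiltinCode(), deprecated)
        print(names.get(code, code), "version", opcode.Version())

dump_op_versions("model.tflite")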
Our first question is pretty simple: “How can we generate a valid (v1) operation?”
And if we can’t generate a v1 operation: “Is it possible to run an INT8 quantized TRANSPOSE_CONV on the iMX8m plus?”
To help you reproduce the behavior, before posting this message we reduced our code to the simplest script that generates a PTQ TFLite file containing a TRANSPOSE_CONV:
import tensorflow as tf
import numpy as np

# Representative dataset used by the converter to calibrate the quantization ranges.
def representative_data_gen():
    for _ in range(16):
        yield [tf.convert_to_tensor(np.random.rand(1, 16, 16, 1), dtype="float32")]

def main():
    # Smallest possible model containing a single TRANSPOSE_CONV (Conv2DTranspose).
    keras_model = tf.keras.models.Sequential([
        tf.keras.Input(name="model_input", shape=(16, 16, 1), dtype=tf.float32),
        tf.keras.layers.Conv2DTranspose(filters=2, kernel_size=3, strides=2, padding="SAME"),
    ])
    # Fix the batch dimension to 1 so the converter sees a fully static shape.
    keras_model.input.set_shape((1,) + keras_model.input.shape[1:])

    # Full-integer post-training quantization (INT8 kernels, UINT8 model I/O).
    converter: tf.lite.TFLiteConverter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.target_spec.supported_types = [tf.int8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8

    with open("model.tflite", "wb") as file:
        file.write(converter.convert())

if __name__ == '__main__':
    main()
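Right after the conversion, the resulting model can be sanity-checked with the standard tf.lite.Interpreter API (a small sketch, not part of the script above), e.g. to confirm that the input and output are really uint8 as requested:

import tensorflow as tf

# Inspect the converted model before copying it to the board.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
for detail in interpreter.get_input_details() + interpreter.get_output_details():
    # 'quantization' holds the (scale, zero_point) pair of the tensor.
    print(detail["name"], detail["dtype"], detail["shape"], detail["quantization"])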
To run the generated TFLite model on the iMX8m plus, we use the following code:
import numpy as np
import tflite_runtime.interpreter as tflite

interpreter: tflite.Interpreter = tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# Fill the (1, 16, 16, 1) uint8 input tensor with zeros and run one inference.
input0 = interpreter.get_input_details()[0]
get_input = interpreter.tensor(input0['index'])
get_input()[:, :, :] = np.zeros(shape=(16, 16, 1), dtype="uint8")
interpreter.invoke()
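For completeness, the output could be read back and dequantized after invoke() along these lines (a minimal sketch that continues the script above; it is not part of what we run on the board):

# Read back the uint8 output tensor and dequantize it manually.
output0 = interpreter.get_output_details()[0]
scale, zero_point = output0['quantization']  # per-tensor quantization parameters
raw = interpreter.get_tensor(output0['index'])
dequantized = (raw.astype(np.float32) - zero_point) * scale
print(raw.shape, dequantized.min(), dequantized.max())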
When we run these scripts, we always obtain the following messages:
# INFO: Created TensorFlow Lite delegate for NNAPI.
# WARNING: Operator TRANSPOSE_CONV (v3) refused by NNAPI delegate: OP Version different from 1
# Applied NNAPI delegate.
Kind regards,
Guillaume SCHLOTTERBECK
Hello,
Thank you for your answer.
I am using the NXP Yocto BSP from 2021-07-19 by Karo (based on NXP5.10.9_1.0.0):
https://karo-electronics.github.io/docs/yocto-guide/nxp/index.html
I installed the specific Yocto image for machine learning provided by Karo. It ships tflite_runtime 2.4.0 (__git_version__ lf-5.10.y-1.0.0-rc3 according to the repository you sent me).
Indeed, according to the documentation, TRANSPOSE_CONV should be supported, but my simple TF network (with only one TRANSPOSE_CONV) uses v3 of this op, and NNAPI accepts only v1.
I have attached the scripts you asked for and the simple model.
---
I also ran a test with my own model (with FULLY_CONNECTED, CONV_2D and TRANSPOSE_CONV).
I got the TRANSPOSE_CONV warning together with one about FULLY_CONNECTED:
Operator FULLY_CONNECTED (v5) refused by NNAPI delegate: keep_num_dims == true not supported
Operator TRANSPOSE_CONV (v3) refused by NNAPI delegate: OP Version different from 1
After profiling with /usr/bin/tensorflow-lite-2.4.0/examples/benchmark_model, I noticed that all FULLY_CONNECTED and TRANSPOSE_CONV operators run on the CPU instead of the NPU, which is consistent with the warnings I got.
I tried to update tflite_runtime with pip according to the documentation (https://karo-electronics.github.io/docs/software-documentation/tx8/coral.html ):
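On the FULLY_CONNECTED side, my understanding (an assumption on my part, not something I have verified on the board) is that keep_num_dims == true is produced when a Dense layer is applied to an input of rank greater than 2; flattening or reshaping to 2-D before the Dense layer should make the converter emit a plain FULLY_CONNECTED. A minimal Keras sketch of what I mean:

import tensorflow as tf

# Hypothetical example: a Dense applied directly to a rank-4 tensor tends to be converted
# with keep_num_dims == true; a Flatten in front keeps the FULLY_CONNECTED input 2-D.
model = tf.keras.models.Sequential([
    tf.keras.Input(shape=(16, 16, 8)),
    tf.keras.layers.Flatten(),   # collapse to (batch, 2048) before the Dense layer
    tf.keras.layers.Dense(64),
])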
pip3 install --extra-index-url https://google-coral.github.io/py-repo/ pycoral
After the update, inference on the NPU is no longer possible: inference time is longer, profiling shows that all operations run on the CPU, and the warnings disappeared.
Kind regards,
Guillaume SCHLOTTERBECK
Hello GuillaumeSchlo,
It seems some attachments are missing - please share the simplified model with TRANSPOSE_CONV, the scripts you mention, and the error message.
Also, could you clarify whether you are using the NXP Yocto BSP and eIQ, and if so, which version? It sounds like you are trying to implement custom TFLite support; in that case, following the eIQ tensorflow-imx implementation should help: https://source.codeaurora.org/external/imx/tensorflow-imx/tree/?h=lf-5.10.y_1.0.0. The ML User's Guide lists TRANSPOSE_CONV as supported (Table 8).
Regards
Hello,
It is planned to add initial support for the TRANSPOSE_CONV operator in LF5.10.52-2.1.0.
Regards