I have quantized an LSTM model by setting the input and output inference types to uint8, using the following script:
import tensorflow as tf
import numpy as np
from tqdm import tqdm
import random

dat = np.load("data_train.npy")
num_samples = dat.shape[0]  # renamed from `len` to avoid shadowing the builtin

def representative_data_gen():
    for i in tqdm(range(10000)):
        p = random.choice(dat)
        da = np.expand_dims(p, axis=0)
        yield [np.float32(da)]

model = tf.saved_model.load("./model_dir")
concrete_func = model.signatures[
    tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
concrete_func.inputs[0].set_shape([1, ?, ?])
concrete_func.outputs[0].set_shape([1, ?, ?, ?])

converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Note: this second assignment overwrites the INT8-only setting above.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter._experimental_lower_tensor_list_ops = False

tflite_quant_model = converter.convert()
with open("./quant_model.tflite", "wb") as f:
    f.write(tflite_quant_model)
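For reference, the representative dataset generator above just yields one batch-1 float32 sample per step. A minimal self-contained sketch of that pattern (using random stand-in data in place of data_train.npy, with an illustrative shape, since the real array is not shown):

```python
import numpy as np

# Stand-in for np.load("data_train.npy"); the (100, 20, 8) shape is
# purely illustrative, not the real dataset's shape.
dat = np.random.rand(100, 20, 8).astype(np.float32)

def representative_data_gen(num_steps=10):
    # Each yielded item is a list containing one batch-1 float32 array,
    # which is the shape of input TFLiteConverter.representative_dataset expects.
    rng = np.random.default_rng(0)
    for _ in range(num_steps):
        p = dat[rng.integers(len(dat))]
        yield [np.expand_dims(p, axis=0).astype(np.float32)]
```

Each yielded sample has shape (1, 20, 8) here, i.e. the per-sample shape with a leading batch dimension of 1.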
Now, when I execute quant_model.tflite with the benchmark tool:
USE_GPU_INFERENCE=0 /usr/bin/tensorflow-lite-2.6.0/examples/benchmark_model --graph=quant_model.tflite --num_runs=50 --external_delegate_path=/usr/lib/libvx_delegate.so
I am facing this error:
-----
-----
ERROR: Fallback unsupported op 32 to TfLite
Explicitly applied EXTERNAL delegate, and the model graph will be partially executed by the delegate w/ 13 delegate kernels.
ERROR: Regular TensorFlow ops are not supported by this interpreter. Make sure you apply/link the Flex delegate before inference.
ERROR: Node number 5 (FlexTensorListFromTensor) failed to prepare.
ERROR: Failed to apply the default TensorFlow Lite delegate indexed at 0.
Failed to allocate tensors!
Benchmarking failed.