Hi, I am testing a quantized model on the i.MX 8M Plus using the VX delegate. When running the model, I receive the following two warnings:
W [HandleLayoutInfer:291]Op 162: default layout inference pass.
W [HandleLayoutInfer:291]Op 56: default layout inference pass.
Could you provide any documentation or information regarding what Op 162 and Op 56 refer to?
The model runs faster on the CPU than on the NPU, which I suspect may be related to these warnings. What does "default layout inference pass" mean in practice?
Hello, I'm experiencing the same issue and wanted to ask if you were able to resolve it.
If you found a solution for the HandleLayoutInfer:257 warning, I would really appreciate it if you could share how you fixed it.
For reference, I noticed that the log contains the message:
W [HandleLayoutInfer:257] Op 18: default layout inference pass.
This occurs while following Section 2.7.1 "Running the example on the i.MX 8 platform hardware accelerator" in the i.MX 8M+ User Guide.
Thanks in advance for any help!
Hello,
There is no documentation for this warning. One possible cause is that the interpreter is being created each time inside the function: you need to initialize the interpreter once and then pass it into the function, rather than creating it anew on every call.
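A minimal sketch of the pattern described above. The names `benchmark`, `invoke`, `runs`, and `warmup` are illustrative only (not part of any NXP or TFLite API); in practice `invoke` would be the bound `interpreter.invoke` of an interpreter created once, outside the timing loop, so that one-time costs such as delegate graph compilation are not paid (or timed) on every call:

```python
import time

def benchmark(invoke, runs=5, warmup=1):
    """Time a zero-argument inference callable.

    `invoke` stands in for a bound method like `interpreter.invoke`;
    the interpreter itself is created once, outside this function.
    """
    # Warm-up runs absorb one-time costs (e.g. NPU graph compilation,
    # memory allocation) that would otherwise skew the first timing.
    for _ in range(warmup):
        invoke()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        timings.append(time.perf_counter() - start)
    return timings

# Usage with a dummy callable standing in for interpreter.invoke:
times = benchmark(lambda: None, runs=3)
print(len(times))  # 3
```

The key point is that only the per-inference work sits inside the timed loop; interpreter construction and tensor allocation happen once, beforehand.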
Regards
I'm not sure I understand, I am testing the model like this:
import time

import numpy as np
import tflite_runtime.interpreter as tflite

model_path = "model.tflite"
ext_delegate = [tflite.load_delegate("/usr/lib/libvx_delegate.so")]
interpreter = tflite.Interpreter(
    model_path=str(model_path),
    experimental_delegates=ext_delegate,
    num_threads=1,
)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

for _ in range(5):
    start = time.time()
    # Set all inputs, then run a single inference per timed iteration.
    for ins_meta in input_details:
        ins = np.random.randint(0, 256, size=ins_meta["shape"], dtype=np.uint8)
        interpreter.set_tensor(ins_meta["index"], ins)
    interpreter.invoke()
    out_dict = {o["name"]: interpreter.get_tensor(o["index"]) for o in output_details}
    print("Time:", time.time() - start)
Is that incorrect?