Hello Zu,
If you want to reduce memory consumption, I suggest you look into model quantization. TensorFlow supports several quantization methods.
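As a rough illustration, here is a minimal sketch of post-training integer quantization with the TF Lite converter. The saved-model path "my_model", the input shape, and the random calibration data are placeholders you would replace with your own model and a representative dataset:

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a handful of samples that resemble your real input data;
    # the converter uses them to calibrate activation ranges.
    # Placeholder shape (1, 96, 96, 1) - adjust to your model's input.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("my_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full integer quantization so the model runs on int8-only
# kernels, which many microcontroller targets require.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

Full integer quantization typically shrinks the model to roughly a quarter of its float32 size and also reduces the RAM needed for activations.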
Furthermore, NXP supports Glow and TensorFlow Lite for Microcontrollers (TF Lite Micro), both of which offer options for optimizing memory consumption and performance.
Keep in mind that memory consumption depends heavily on the model you use in your application. The model, including its weights, has to be stored in flash, while the intermediate results (activations) must be held in RAM during inference.
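If it helps, here is a rough way to gauge both footprints from the host side, assuming the quantized file produced above. Note that summing all tensor buffers over-counts the RAM requirement, since it includes weight tensors (which stay in flash) and TF Lite Micro reuses buffers via its tensor arena, so treat it only as a crude upper bound:

```python
import os
import numpy as np
import tensorflow as tf

# Flash footprint is roughly the size of the .tflite flatbuffer.
print("flash (model file):", os.path.getsize("model_int8.tflite"), "bytes")

interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()

# Crude upper bound on tensor memory: sum of all tensor buffer sizes.
total = 0
for detail in interpreter.get_tensor_details():
    shape = detail["shape"]
    dtype = np.dtype(detail["dtype"])
    total += int(np.prod(shape)) * dtype.itemsize
print("upper bound on tensor memory:", total, "bytes")
```

For an accurate RAM figure on the target, the tensor arena size reported by TF Lite Micro at runtime is the number to watch.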
If your original issue was resolved, please mark this thread as resolved. If you run into new problems or have new questions, feel free to create a new thread.
Good luck with your development,
David