Hi,
What is your RT derivative?
It will depend on many parameters, such as your screen resolution and framebuffer encoding (24bpp and 8bpp with a LUT do not have the same memory footprint). You can reduce your 24bpp graphics data to 16bpp offline without much loss in quality if you apply spatial dithering (https://en.wikipedia.org/wiki/Dither ), for instance with ImageMagick (https://legacy.imagemagick.org/Usage/quantize/ ).
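To give an idea of what spatial dithering does during the 24bpp to 16bpp reduction, here is a minimal sketch in C of ordered (Bayer) dithering from packed RGB888 to RGB565. It is only an illustration, not what ImageMagick does internally. The function names, the 4x4 matrix size and the assumed byte order (R, G, B in memory) are my own choices for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4x4 Bayer threshold matrix, values 0..15 (assumed matrix size). */
static const uint8_t bayer4[4][4] = {
    { 0,  8,  2, 10},
    {12,  4, 14,  6},
    { 3, 11,  1,  9},
    {15,  7, 13,  5},
};

/* Reduce one 8-bit channel to `bits` bits. A position-dependent bias is
 * added before truncation so the rounding error is spread over
 * neighbouring pixels instead of producing visible banding. */
static uint8_t quantize(uint8_t v, unsigned bits, uint8_t threshold)
{
    unsigned dropped = 8u - bits;                       /* 3 for R/B, 2 for G */
    unsigned bias = ((unsigned)threshold << dropped) >> 4; /* 0..2^dropped-1 */
    unsigned sum = (unsigned)v + bias;
    if (sum > 255u) {
        sum = 255u;                                     /* clamp, no wrap */
    }
    return (uint8_t)(sum >> dropped);
}

/* Hypothetical helper: src is packed RGB888 (R first), dst is RGB565. */
void rgb888_to_rgb565_dither(const uint8_t *src, uint16_t *dst,
                             size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            const uint8_t *p = &src[(y * width + x) * 3u];
            uint8_t t = bayer4[y & 3u][x & 3u];
            uint16_t r = quantize(p[0], 5u, t);
            uint16_t g = quantize(p[1], 6u, t);
            uint16_t b = quantize(p[2], 5u, t);
            dst[y * width + x] = (uint16_t)((r << 11) | (g << 5) | b);
        }
    }
}
```

In practice you would run this kind of conversion on the PC when preparing the assets, so the target only stores and displays the 16bpp result.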
You may need to optimize your GUI. For instance, on i.MXRT1170, encoding with a LUT saves a lot of memory, and keeping the camera input in YUV format and letting the display controller convert it will help. In addition, it is recommended to execute in place (XIP) from QSPI/Octal NOR flash to save RAM space, etc.
Graphics is a question of memory space and memory bandwidth, so you have to use basic tricks to reduce them.
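As a rough way to put numbers on this, the sketch below computes the size of one framebuffer and the read bandwidth needed just to refresh the panel. The helper names are hypothetical, and the 480x272 panel at 60 Hz used in the comments is only an assumed example, not your actual setup.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one framebuffer. Only whole-byte pixel formats
 * (8, 16, 24, 32 bpp) are handled in this sketch. */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height,
                           uint32_t bits_per_pixel)
{
    assert(bits_per_pixel % 8u == 0u);
    return width * height * (bits_per_pixel / 8u);
}

/* Bytes per second the display controller must read to refresh the
 * panel from one framebuffer. Rendering traffic comes on top of this. */
uint64_t scanout_bytes_per_second(uint32_t fb_bytes, uint32_t refresh_hz)
{
    return (uint64_t)fb_bytes * refresh_hz;
}

/* Assumed example, 480x272 panel:
 *   24bpp: 480 * 272 * 3 = 391680 bytes
 *   16bpp: 480 * 272 * 2 = 261120 bytes
 *    8bpp: 480 * 272 * 1 = 130560 bytes (plus the palette for the LUT)
 *   16bpp at 60 Hz: 261120 * 60 = 15667200 bytes/s of scan-out reads
 */
```

With double buffering the footprint doubles, which is why going from 24bpp to 16bpp or to 8bpp with a LUT makes such a difference on parts with limited on-chip RAM.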
BR
V.