Hi there,
I have a few questions about using the i.MX 8M Plus to run LLMs and CV models.
1. The 8M Plus offers 2.3 TOPS of AI performance, which is generally just enough to run CNNs, small models like BERT, or CV models like MobileNet. Judging by the TOPS figure alone, is it really possible to run a Q4-quantized model like DeepSeek R1 1.5B, which is almost 1 GB in Q4 GGUF format? (Even a TFLite conversion would be quite large, I believe.)
2. The conversion process for LLMs like DeepSeek R1 1.5B is also not straightforward and gives errors. That makes it hard for me to believe the model can be converted successfully at all. Has anyone done this before?
3. It looks like only devices offering 50+ TOPS can realistically be considered for running these models with acceptable inference performance.
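To make the concern in question 1 concrete, here is a rough back-of-envelope sketch in Python. It is only an estimate, not a benchmark. It uses the common rule of thumb of about 2 operations per weight per generated token, and it assumes every weight is read from memory once per token. The utilization and memory bandwidth values are hypothetical placeholders that I picked for illustration, not measured or datasheet figures for the i.MX 8M Plus, and the sketch says nothing about whether the NPU can execute a Q4 model at all.

```python
# Back-of-envelope token-rate estimate. Not a benchmark.

PARAMS = 1.5e9          # nominal parameter count of DeepSeek R1 1.5B
MODEL_BYTES = 1.0e9     # ~1 GB Q4 GGUF file, as mentioned in question 1
PEAK_OPS_PER_S = 2.3e12 # 2.3 TOPS peak, as mentioned in question 1

# ASSUMPTIONS (hypothetical placeholders, replace with real numbers):
ASSUMED_UTILIZATION = 0.10     # fraction of peak TOPS actually achieved
ASSUMED_BANDWIDTH_BPS = 4.0e9  # effective memory bandwidth in bytes/s


def ops_per_token(params: float) -> float:
    """Rule of thumb: one multiply and one add per weight per token."""
    return 2.0 * params


def compute_bound_tokens_per_s(params: float, ops_per_s: float,
                               utilization: float) -> float:
    """Upper bound on tokens/s if only arithmetic throughput mattered."""
    return ops_per_s * utilization / ops_per_token(params)


def bandwidth_bound_tokens_per_s(model_bytes: float,
                                 bytes_per_s: float) -> float:
    """Upper bound on tokens/s if every weight is read once per token."""
    return bytes_per_s / model_bytes


if __name__ == "__main__":
    compute = compute_bound_tokens_per_s(
        PARAMS, PEAK_OPS_PER_S, ASSUMED_UTILIZATION)
    memory = bandwidth_bound_tokens_per_s(MODEL_BYTES, ASSUMED_BANDWIDTH_BPS)
    print(f"compute-bound estimate:   {compute:.1f} tokens/s")
    print(f"bandwidth-bound estimate: {memory:.1f} tokens/s")
    # The lower of the two is the more realistic ceiling.
    print(f"estimated ceiling:        {min(compute, memory):.1f} tokens/s")
```

If this reasoning holds, the TOPS figure alone may not be the limiting factor, since the memory-bound estimate can be lower than the compute-bound one. I would appreciate corrections if these assumptions are off.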
Please help me with this.
IMX8MPLUS