Hi, I am working on an i.MX515 (Cortex-A8) processor. We have ported an image processing algorithm to it, but it runs very slowly, so I would appreciate some basic ideas about optimization.
1. From what I have read, the Cortex-A8 has a 13-stage pipeline, but I would like more detailed pipeline information.
The GCC compiler doesn't give any information about this.
2. I believe it has an SDMA controller. How do I use the SDMA? I would like to transfer data from external memory to internal memory.
3. I tried NEON, but since my code doesn't have any serial execution, it doesn't give good performance.
If anyone has any ideas regarding the above questions, please reply at your convenience.