Help with MCF5475 speed problem.

dimepu
Contributor I

Hi, I'm using a ColdFire V4e MCF5475 and I have a problem. It seems like the core isn't running as fast as it should.
I've worked with several ColdFire MCUs, including the MCF51QE128, MCF5213, and MCF5249, and this is the only time such a problem has occurred.

I'm using only the ColdFire core and the DMA controller as masters of the XL bus; the FlexBus, PCI, and other peripherals are disabled. The CF core is the primary master of the XL bus and the DMA acts as the secondary master. All code and user variables are loaded into the core SRAM, and the clock ratio is left at its default of 1:4.

The code is very simple: it drives a GPIO signal high and low to check the speed, and it also runs a very simple filtering routine to check how fast operations execute. But the results are very poor. The GPIO only reaches 1 MHz, while on the MCF51QE128 it reaches 4 MHz, and the filter routine takes 17 us, while on the MCF5249 it takes 7 us. I'm using exactly the same code on all the MCUs.

 

I hope you can help me find out whether I'm doing something wrong or whether there's something I haven't considered yet.

I'd appreciate any kind of help.

aersek
Contributor I

Toggling a GPIO pin isn't a good way to benchmark speed on ColdFire processors. On V2 processors, every access to a GPIO register costs, I think, 12 wait states. On V4 it may be even worse. This is poorly documented in every reference manual, so for a benchmark you should use an internal timer to measure the execution time of the code.
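
A minimal sketch of the timer-based measurement, assuming a free-running 32-bit up-counter is available. The names `read_counter_fn`, `measure_ticks`, `ticks_to_us`, and the `sim_*` stand-ins are hypothetical and not taken from any ColdFire header; on the target, the read hook would return a timer count register instead of the simulated value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook: returns the current value of a free-running 32-bit
   up-counter. On real hardware this would read a timer count register. */
typedef uint32_t (*read_counter_fn)(void);

/* Ticks elapsed while work() runs. Unsigned subtraction stays correct
   across a single counter wraparound. */
static uint32_t measure_ticks(read_counter_fn read_counter, void (*work)(void))
{
    assert(read_counter != NULL && work != NULL);
    uint32_t start = read_counter();
    work();
    return read_counter() - start;
}

/* Convert ticks to microseconds for a known counter frequency. */
static double ticks_to_us(uint32_t ticks, uint32_t counter_hz)
{
    assert(counter_hz > 0);
    return (double)ticks * 1e6 / (double)counter_hz;
}

/* Host-side stand-ins (assumptions for illustration only) so the sketch
   can be exercised off-target. */
static uint32_t sim_count;
static uint32_t sim_read_counter(void) { return sim_count; }
static void sim_work(void) { sim_count += 1700u; } /* pretend 1700 ticks pass */
```

With a counter clocked at an assumed 100 MHz, 1700 ticks correspond to 17 us. Because the counter is read by the core without touching a GPIO register, the slow GPIO access path does not distort the result.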

 

Best regards

 

Andrija

dimepu
Contributor I

Thanks for your answer. I was toggling the GPIO only as a preliminary test. Anyway, I found the problem: I only needed to enable the cache, which is disabled by default on the 5475. The code for enabling the cache is in the Reference Manual Rev. 5, section 7.13, page 7-30.

The only thing I had to do was copy that code into my starcf.c file at the beginning of the _startup section, replacing the flash address with the internal RAM address, and that's all. The ColdFire 5475 is now running as fast as it should. I guess the disabled cache was causing the code to be stored in one of the external memories, and the CPU spent more time reading the code from there.

 

Thanks again for your interest.

 

Diego

TomE
Specialist II

 


dimepu wrote: I guess the disabled cache was causing the code to be stored in one of the external memories, and the CPU spent more time reading the code from there.

The point about I/O pin speed is very important. Last year I was working on a Marvell PXA320. To get to an I/O pin takes TWO HUNDRED WAIT STATES on that chip!

 

So you program an internal timer, and then monitor it with code and wave an I/O pin around at 1/65536 of the timer rate. Then measure that to check your clock and CPU speed programming. Once that's set up it should remain OK.
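
One way to read the 1/65536 figure, as a sketch: if the pin simply follows bit 15 of a free-running counter, it completes one full high/low cycle every 65536 counts. The function names below are hypothetical, and the edge counter exists only to check the idea on a host; on the target, a polling loop would copy `pin_level_from_counter()` of the timer value to the pin.

```c
#include <assert.h>
#include <stdint.h>

/* Pin level that follows bit 15 of a free-running counter. Bit 15 goes
   through one full high/low cycle every 65536 counts, so the pin runs
   at 1/65536 of the counter rate. */
static int pin_level_from_counter(uint32_t count)
{
    return (int)((count >> 15) & 1u);
}

/* Recover the counter clock from the pin frequency measured externally
   (scope or frequency counter). */
static double counter_hz_from_pin_hz(double pin_hz)
{
    assert(pin_hz > 0.0);
    return pin_hz * 65536.0;
}

/* Host-side check: count rising edges of the pin over a span of counts. */
static uint32_t count_rising_edges(uint32_t first_count, uint32_t n_counts)
{
    uint32_t edges = 0;
    int prev = pin_level_from_counter(first_count);
    for (uint32_t i = 1; i <= n_counts; ++i) {
        int now = pin_level_from_counter(first_count + i);
        if (!prev && now) {
            ++edges;
        }
        prev = now;
    }
    return edges;
}
```

The pin changes state only once every 32768 counts, so the slow GPIO access barely affects the measured frequency, and multiplying that frequency by 65536 gives the timer clock.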

 

The cache setting can't control what memory the code is loaded into. That's up to where you link it.

 

My experience is with the MCF5329, but there should be similarities. I'm listing what it does in case that helps you to find equivalent sections in your reference manual.

 

1 - The external FLASH we have on the 5329 has to be set up with the right number of wait states - it defaults to a very slow rate otherwise - 63 wait states at 80MHz, or 189 times slower than the CPU.

 

2 - We then configure the external FLASH to run at NINE wait-states at 80MHz, so it is running at 1/27 of the CPU clock rate. Code running from there is glacial.

 

3 - For the SRAM we have to set up RAMBAR: enable it, set the address, and allow CPU direct access to the SRAM. Without the latter the CPU gets to the SRAM through the "Crossbar Back Door", very slowly. You also have to program the Crossbar if you have one. There's even a register somewhere to allow "burst access" to the memory, which defaults to off.

 

4 - Set up the cache properly. We're using "Write-through" and find that faster with our application mix than "Write Back".


Then you'll have fun trying to get any DMA working as that either has to have all of its buffers in uncached SRAM or you need to set up "uncached buffer regions" or use cache flushes in the right places.
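
The wait-state figures in items 1 and 2 can be reproduced with a little arithmetic. This sketch assumes a 240 MHz core on an 80 MHz bus, a 3:1 ratio that is inferred from 189 / 63 = 3 rather than stated above, and it counts only the wait states, not the base cycles of the access itself. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Core clocks spent in wait states for one external access. Wait states
   are counted in bus clocks, so they scale by the core-to-bus ratio. */
static uint32_t wait_state_core_clocks(uint32_t wait_states,
                                       uint32_t core_hz,
                                       uint32_t bus_hz)
{
    /* Assumes the core clock is an integer multiple of the bus clock. */
    assert(bus_hz > 0 && core_hz % bus_hz == 0);
    return wait_states * (core_hz / bus_hz);
}
```

Under those assumptions, 63 wait states cost 189 core clocks and 9 wait states cost 27, matching the "189 times slower" and "1/27" figures.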

 

Have fun.

 
