PPC 8270: Running out of IMMR RAM (in order to run with ECC enabled)

Karl_H · ‎12-02-2009

I am copying a short program into IMMR RAM and trying to execute it there. It doesn't work. No opcodes are executed and the program counter runs backward (as if there were a "b *-4" instruction in every location).

Is there a way to make this happen properly, or is running code out of IMMR RAM completely verboten?

The reason I am attempting such an unusual thing is this: I am trying to run with the ECC feature turned on using our development system under CodeWarrior. The program runs out of SDRAM, which is on the same chip select that ECC affects. The program load is done with ECC disabled. Even if I enable ECC in the config file, the JTAG program load still does not set the syndrome bits correctly. So as soon as I enable ECC, the code-space is exposed to incorrectly set syndrome bits, and the HW starts trying to "correct" my opcodes. Within a few instructions of enabling ECC, I get check-stopped for an illegal opcode.

So what I'm trying to do is run a short program out of IMMR RAM (which is not affected by ECC) that will copy all of my code-space to itself, reading with ECC disabled and writing with ECC enabled. The intent is that this will set all the syndrome bits in the code-space correctly. But without the ability to do instruction fetches out of IMMR RAM, this plan is not very useful.
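For reference, here is a minimal sketch of the copy-to-itself idea, written so it can be tried on a host first. The names ecc_set(), ecc_rewrite_region() and CHUNK_WORDS are made up for illustration, and ecc_set() is only a placeholder: on the real part it would have to program the memory controller for the SDRAM bank, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_WORDS 256u   /* assumed 1024-byte bounce buffer in IMMR RAM */

/* Placeholder for the real ECC on/off control (hypothetical).
 * On hardware this would write the memory controller registers. */
static int g_ecc_on;
static void ecc_set(int enable) { g_ecc_on = enable; }

/* Rewrite a region in place: read each chunk with ECC off,
 * then write the same data back with ECC on. */
static void ecc_rewrite_region(volatile uint32_t *region, size_t words,
                               uint32_t *bounce)
{
    for (size_t off = 0; off < words; off += CHUNK_WORDS) {
        size_t n = words - off;
        if (n > CHUNK_WORDS)
            n = CHUNK_WORDS;

        ecc_set(0);                          /* read raw data, no correction */
        for (size_t i = 0; i < n; i++)
            bounce[i] = region[off + i];

        ecc_set(1);                          /* write back with ECC enabled */
        for (size_t i = 0; i < n; i++)
            region[off + i] = bounce[i];
    }
}
```

On the target, this routine and the bounce buffer would both have to live in IMMR RAM, since the SDRAM contents cannot be trusted as code while the rewrite is in progress.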

If anybody can think of a plan B to run out of SDRAM with ECC enabled, please let me know.

Karl

Karl_H · ‎12-04-2009

Got ECC working yesterday afternoon. There was a jumper on the board that was set incorrectly for ECC to work. Once I fixed that, the ECC worked exactly as expected.

Karl


genuap · ‎12-03-2009

You should be able to run out of IMMR RAM. It probably has to do with your MMU settings - if you have the MMU turned on.

As for syndrome bits, the typical way to get this going is to enable ECC in your bootloader. Then write a pattern to the entire memory region (i.e. clear the entire memory with syndrome enabled). This will set the syndrome bits for the memory region. You could use DMA to do this. Then write your application over to memory.
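A rough sketch of the "write a pattern to the entire region" step, assuming ECC has already been enabled for the bank. The function name ecc_scrub() is hypothetical. Note also that on a 32-bit core the compiler may split each 64-bit store into two 32-bit stores, so real init code would more likely use double-word stores, cache-line operations, or DMA as mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a region with a pattern, one 64-bit double word at a time,
 * so that every ECC double word is written with ECC enabled.
 * Assumes base is 8-byte aligned; any tail shorter than 8 bytes
 * is left untouched. */
static void ecc_scrub(volatile uint64_t *base, size_t bytes, uint64_t pattern)
{
    size_t n = bytes / sizeof(uint64_t);
    for (size_t i = 0; i < n; i++)
        base[i] = pattern;
}
```

After the region has been scrubbed this way, the application image can be written over it with ECC still enabled.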

In your case, you're:

1. running out of DRAM
2. then switch to IMMR
3. then copy from DRAM to temp space (where??)
4. turn on ECC
5. then write to DRAM

The problem could be where the temp space is. Or it could be that you have things in the cache or core pipeline that point to the old space. Try this all with caches disabled. Also make sure you do an isync before you do any jumps.

... Paul

Karl_H · ‎12-02-2009

I got the program to run out of IMMR RAM by disabling instruction caching. But I am still not achieving the bigger picture of getting the syndrome bits set correctly. I still see illegal opcodes in my code as soon as I enable ECC.

Is there any way to get past this?

Karl

Karl_H · ‎12-03-2009

It now appears that the eval board we got from Freescale has a defect in its ECC. Below is the email I sent to the FAE on this subject ...

ECC Whack-a-mole

I think the eval board has a stuck data bit (or some other defect) in the ECC syndrome RAM. Here's the evidence.

I am running this code from IMMR RAM with the instruction cache turned off. With ECC off, it copies data from the SDRAM to a buffer in IMMR RAM (1024 bytes). Then it turns ECC on and copies the data from the IMMR RAM buffer back to SDRAM 32 bits at a time. Most of the words get copied correctly. Somewhere between 5% and 10% have 1-bit errors. But here's the interesting part. As you know, the ECC is done on double words (64 bits). Where a bad word occurs on write-back, we write the hi word first. At that point I suddenly see a 1-bit error appear in the lo word, while the hi word is correct. Then when it writes the lo word, a 1-bit error appears in the hi word and the lo word becomes correct.

I can repeat this operation on the same double word and the pattern persists -- the last word written is always correct but its partner has the 1-bit error.

When ECC is turned off again, any 1-bit errors that appeared during the copy operation persist.

Not only that, but whether or not there is an error in a double word depends on the data, not on the address. That is, if I have a troublesome double word that exhibits this error pattern, and I write that same data to a new double word boundary, it still has the same 1-bit error at the new address.
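To make the "same data, same bad bit" comparison repeatable, a check along these lines could be run on each double word after write-back. This is only a sketch: the helper name single_bit_diff() is invented here, and it just compares the value written against the value read back.

```c
#include <stdint.h>

/* Compare what was written with what was read back.
 * Returns -1 if they match, the bit index (0 = least significant)
 * if exactly one bit differs, and -2 if more than one bit differs. */
static int single_bit_diff(uint64_t expected, uint64_t actual)
{
    uint64_t diff = expected ^ actual;

    if (diff == 0)
        return -1;
    if (diff & (diff - 1))        /* more than one bit set */
        return -2;

    int bit = 0;
    while ((diff >> bit) != 1)    /* locate the single set bit */
        bit++;
    return bit;
}
```

Logging the returned bit index together with the data value and the address would show directly whether the failing bit follows the data or the location.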

BTW the eval board we have (PQ2FADS-ZU) is marked "prototype" and its layout is not consistent with the diagram in the manual (that is, the various jumpers are not where they are shown to be on page 8 of the manual). So it seems entirely possible that this board might have defects, since prototypes often have problems.