
TPMxCNT coherency mechanism behavior in CW10.4

Question asked by Doug Paulsen on Sep 12, 2018


I found some odd behavior when reading the MC9S08QE32 micro's TPMxCNT register in a CodeWarrior 10.4 project that might be of interest to others.


The idea was to use the low-order 5 bits of the free-running TPM3CNT counter register to obtain a "random" number for a delay function.


In the code a local variable was defined as:


                volatile word              Delay100US;


Later, the following line was executed:


                Delay100US = TPM3CNT & 0x1F;


That is, one is reading a 16-bit register, ANDing for the low bits, and writing the result to an unsigned 16-bit variable.

This resulted in erratic readings from TPM3CNT a millisecond or so later (post delay) in the code.  To make a VERY long story short, the code line quoted above disassembles to this:


  449:       Delay100US = TPM3CNT & 0x1F;

  00000048 B601        LDA    TPM3CNT

  0000004A A41F        AND    #0x1F

  0000004C 8C            CLRH  

  0000004D 97            TAX   

  0000004E 9EFF05   STHX   5,SP


Note the CW10 code uses the 8-bit LDA instruction to read TPM3CNT.


Section 16.3.2 of the MC9S08QE32 user manual states the following regarding the TPM counter register coherency mechanism:


“Reading either byte (TPMxCNTH or TPMxCNTL) latches the contents of both bytes into a buffer where they remain latched until the other half is read.”


In our case the next TPM3CNT read came some interval later, to measure an operation's elapsed time.  The MSByte, however, appears to have been latched by the coherency mechanism on the earlier 8-bit read (the one for the delay interval) and stayed latched from then on, so the TPM3CNT value obtained afterwards was quite bogus.
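To make the failure mode concrete, here is a small host-side toy model of the manual sentence quoted above. It is only a sketch of that one sentence, not of the real TPM hardware, which may well differ in detail. The type `TpmModel` and the `tpm_model_*` functions are hypothetical names invented for this illustration.

```c
#include <stdint.h>

/* Toy model of the quoted coherency rule only; NOT the real TPM. */
typedef enum { HALF_NONE, HALF_HIGH, HALF_LOW } Half;

typedef struct {
    uint16_t counter;     /* free-running count (advanced by the caller) */
    uint16_t latch;       /* buffered copy of both bytes */
    Half     first_read;  /* which half engaged the latch, if any */
} TpmModel;

/* Read the low byte: engage the latch, or release it if the high
 * byte was the half that engaged it. */
uint8_t tpm_model_read_low(TpmModel *t)
{
    if (t->first_read == HALF_NONE) {
        t->latch = t->counter;
        t->first_read = HALF_LOW;
    } else if (t->first_read == HALF_HIGH) {
        t->first_read = HALF_NONE;
    }
    return (uint8_t)(t->latch & 0xFFu);
}

/* Read the high byte: same rule, mirrored. */
uint8_t tpm_model_read_high(TpmModel *t)
{
    if (t->first_read == HALF_NONE) {
        t->latch = t->counter;
        t->first_read = HALF_HIGH;
    } else if (t->first_read == HALF_LOW) {
        t->first_read = HALF_NONE;
    }
    return (uint8_t)(t->latch >> 8);
}

/* A 16-bit read modeled as a high-byte read followed by a low-byte read. */
uint16_t tpm_model_read_word(TpmModel *t)
{
    uint8_t hi = tpm_model_read_high(t);
    uint8_t lo = tpm_model_read_low(t);
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

In this model, a lone low-byte read at count 0x1234 followed much later by a word read at count 0x5678 returns 0x1278: the high byte is the stale latched one and only the low byte is fresh.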


Through trial and much error, we found that, contrary to the manual's description, it actually took two or three TPM3CNT reads to get a proper count out of the register again. Whatever that coherency mechanism is, leaving it engaged for an extended period (>1 ms) really gummed up the TPM count register internals.  You could see three consecutive TPM3CNT reads (on three lines of code) jump by huge counts on each read.  Only by the third read did the elapsed interval calculated from the TPM3CNT values finally match the 'scoped interval measurements.


Early on, we found the following code solved the problem, but didn’t inspect the assembly code until much later to realize what was going on:


    451:       Delay100US = TPM3CNT;

  00000051 5500           LDHX   TPM3CNT

  00000053 9EFF05      STHX   5,SP

    452:       Delay100US &= 0x1F;

  00000056 9EFE05     LDHX   5,SP

  00000059 9F             TXA   

  0000005A A41F        AND    #0x1F

  0000005C 8C           CLRH  

  0000005D 97           TAX   

  0000005E 9EFF05   STHX   5,SP


By breaking the original, single line of C code into two lines as shown here, the compiler was made to emit a full 16-bit read of TPM3CNT (the LDHX instruction), so the coherency mechanism never came into play.


It looks a bit like the compiler's optimization outsmarted itself in the original code, at least for the special case of reading the TPM counter register with its coherency mechanism in place.


The lesson learned is to make sure all TPM count register reads are a full 16 bits.
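One way to keep the compiler from narrowing the access might be to funnel every counter read through a helper that always touches both bytes. The following is only a sketch under assumptions: `tpm_read_count16` and `tpm_low5` are hypothetical names, and the two pointer parameters stand in for the device header's TPM3CNTH and TPM3CNTL registers, so that the code can also be compiled and checked on a host.

```c
#include <stdint.h>

/* Read a 16-bit counter as two explicit byte reads, high then low.
 * cnth/cntl are hypothetical parameters; on the target they would be
 * the addresses of TPM3CNTH and TPM3CNTL (assumed header names).
 * Both pointers are volatile, so neither read may be optimized away. */
uint16_t tpm_read_count16(volatile const uint8_t *cnth,
                          volatile const uint8_t *cntl)
{
    uint8_t hi = *cnth;  /* per the manual, the first read latches both bytes */
    uint8_t lo = *cntl;  /* reading the other half releases the latch */
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Low-order 5 bits of the count, taken from a full 16-bit read. */
uint8_t tpm_low5(volatile const uint8_t *cnth,
                 volatile const uint8_t *cntl)
{
    return (uint8_t)(tpm_read_count16(cnth, cntl) & 0x1Fu);
}
```

On the target the call would presumably look like `Delay100US = tpm_low5(&TPM3CNTH, &TPM3CNTL);`. It is still worth checking the generated assembly to confirm that both bytes are actually read.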


Are there other registers which one should take similar care with?