Testing for an overflow condition

Beaker
Contributor I
How do you infer the BVS (Branch if overflow set) instruction using C?
 
Example:
I have two signed 16 bit variables that I need to add together.  If, however, the addition will cause the result to overflow a 16 bit value, the code needs to do something different.  Currently my code looks something like:
 
INT16  temp1, temp2;
 
if (((INT32)(temp1 + temp2) > 32767) || ((INT32)(temp1+ temp2) < -32768))
{
}
else
{
}
 
The code compiles without warnings, but it generates excessive assembly instructions.  I want my C code to effectively compile down to an ADD instruction followed by a BVS instruction.
 
The reason I care is that I am porting an Intel 196 design written in assembly to a 9S12E design written in C.  So far, I'm getting quite a bit of code bloat associated with the conversion to C.  I know that I could just inline some assembly instructions, but this defeats our goal of code portability.  My manager is also assembly-code phobic and complains that it is hard to understand.  I, on the other hand, enjoy the control and efficiency that assembly affords.
 
Since my C programming expertise is minimal, I thought I'd ask for help.
 
Thanks -
 
A reluctant C programmer 
Accepted solution:
kef
Specialist I
About BVS/BVC: the CPU12 manual tells us when the ADDD instruction sets V and when it clears it.
 

$V = D_{15} \cdot M_{15} \cdot \overline{R_{15}} + \overline{D_{15}} \cdot \overline{M_{15}} \cdot R_{15}$

 

That is, V is set if both arguments are negative but the result is positive, or if both arguments are positive but the result is negative (bit 15 is the sign bit). This translates directly to something like this:

  if( (a<0 && b<0 && (a+b)>=0) || (a>=0 && b>=0 && (a+b)<0) ) {
   }


 
I tried compiling it in CW4.5, and the output looks much lighter than for "if( ((long)a +b) > 32767 || ((long)a+b)<-32768)".

Here's the disassembled code:

   5:     if( (a<0 && b<0 && (a+b)>=0) || (a>=0 && b>=0 && (a+b)<0) ) {
  0002 fc0000       [3]     LDD   a
  0005 2c0d         [3/1]   BGE   *+15 ;abs = 0014
  0007 fe0000       [3]     LDX   b
  000a 2c08         [3/1]   BGE   *+10 ;abs = 0014
  000c f30000       [3]     ADDD  b
  000f 8c0000       [2]     CPD   #0
  0012 2c12         [3/1]   BGE   *+20 ;abs = 0026
  0014 fc0000       [3]     LDD   a
  0017 2d10         [3/1]   BLT   *+18 ;abs = 0029
  0019 fe0000       [3]     LDX   b
  001c 2d0b         [3/1]   BLT   *+13 ;abs = 0029
  001e f30000       [3]     ADDD  b
  0021 8c0000       [2]     CPD   #0
  0024 2c03         [3/1]   BGE   *+5 ;abs = 0029


 
Actually, "if(a<0)" and "if(a>=0)" could be compiled using BMI/BPL branches: BLT/BGE after LDD behave the same as BMI/BPL, because LDD clears V. After ADDD, using BMI/BPL would also eliminate the need for CPD #0:

if ((a+b) < 0)

LDD   a
ADDD  b
BPL   sumispositive     ; branch to >=0 case
BMI   sumisnegative     ; branch to <0 case
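
The V-flag rule quoted at the start of this reply can also be written with bit operations instead of three comparisons. The following is only a sketch under stated assumptions, not what CodeWarrior generates: the function name `add16_overflows` is hypothetical, it uses `<stdint.h>` types rather than the thread's INT16, and it performs the addition in unsigned arithmetic so that the wraparound is well defined in C.

```c
#include <stdint.h>

/* Hypothetical helper (not from the thread): returns 1 if a + b does not
   fit in a signed 16-bit value, 0 otherwise.
   Mirrors the V-flag rule: overflow happens only when both operands have
   the same sign and the 16-bit result has the opposite sign. */
static int add16_overflows(int16_t a, int16_t b)
{
    uint16_t ua = (uint16_t)a;
    uint16_t ub = (uint16_t)b;
    uint16_t ur = (uint16_t)(ua + ub);   /* wraps modulo 2^16 */

    /* Bit 15 of (ua ^ ur) & (ub ^ ur) is set only when the sign of the
       result differs from the sign of both operands. */
    return (((ua ^ ur) & (ub ^ ur)) & 0x8000u) != 0;
}
```

For example, `add16_overflows(32767, 1)` and `add16_overflows(-32768, -1)` report an overflow, while `add16_overflows(-1, 1)` does not. Whether a given compiler turns this into fewer instructions than the comparison form would have to be checked in the disassembly.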


 

Beaker
Contributor I
Thanks for your help as well.
 
Your suggestion does compile down a little nicer than my original C code, but I'm still not happy with the amount of code that is generated.  For this reason, I just decided to inline some assembly instructions and add some good comments.
 
My code now looks something like this:
__asm{
   LDD  temp2      // put temp2 into the D accumulator
   ADDD temp1      // add temp1 to temp2
   BVC  noOverFlow // did the addition cause an overflow?
   CLRB            // yes, so defer this addition until temp2 gets smaller
   RTS             //   the calling routine checks the B accumulator
noOverFlow:        //   Therefore CLRB followed by RTS is the same as a return(0) C instruction
   STD  temp2      // no, so store the result into temp2
}
 
I'm guessing that a number of the assembly instructions that check status bits are never used by the compiler when writing in C.
 
Thanks again -
CompilerGuru
NXP Employee
There is no direct, portable construct which maps to the overflow condition in C.
Therefore your options are:
- write something in C which does the same thing.
  Without longs it would run a bit more efficiently, but then it is not so straightforward anymore.
- use HLI (high-level inline assembly). I know this is not portable, but if you want to be as efficient as assembly, this is a case where you will have to use it.

First, note that in the code you show, the overflow test does not actually work (I assume your INT16 is signed). The reason is that (temp1 + temp2) is computed in 16 bits, and only the already overflowed result is then converted to a long. So the result of (temp1 + temp2) is always in the 16-bit range.
So a simple (and slow) version could be:
INT16 temp1, temp2;
INT32 r = (long)temp1 + (long)temp2;
if (r >= 0x8000L || r < -0x8000L) ....
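
The effect of the cast placement can be reproduced on a desktop compiler, where int is usually 32 bits, by modelling the 16-bit wraparound explicitly. In this sketch the helper names `wrap16`, `check_after_add`, and `check_before_add` are hypothetical, and `wrap16` is only a stand-in for what a 16-bit add leaves behind on a target with 16-bit int. It illustrates the argument, it is not the code CodeWarrior generates.

```c
#include <stdint.h>

/* Model a 16-bit two's-complement add, as happens when int is 16 bits wide. */
static int16_t wrap16(int16_t a, int16_t b)
{
    uint16_t ur = (uint16_t)((uint16_t)a + (uint16_t)b);  /* wraps modulo 2^16 */

    /* Reinterpret the 16-bit pattern as signed without relying on an
       implementation-defined narrowing conversion. */
    return (ur & 0x8000u) ? (int16_t)((int32_t)ur - 65536) : (int16_t)ur;
}

/* Widening AFTER the 16-bit add: the sum has already wrapped. */
static int check_after_add(int16_t a, int16_t b)
{
    int32_t r = (int32_t)wrap16(a, b);
    return r > 32767 || r < -32768;      /* can never be true */
}

/* Widening BEFORE the add: the sum is computed in 32 bits. */
static int check_before_add(int16_t a, int16_t b)
{
    int32_t r = (int32_t)a + (int32_t)b;
    return r > 32767 || r < -32768;
}
```

With inputs 32767 and 1, `check_after_add` returns 0 because the wrapped sum is -32768, while `check_before_add` returns 1.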

A faster version would do the operation in 16 bits and then detect the overflow some other way.
For example, if temp1 and temp2 have different signs, the addition cannot overflow.
If the signs are the same, the positive/positive case is simple with unsigned arithmetic; in the negative/negative case, take care not to miss the special -0x8000 + -0x8000 overflow.
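
One way to read the case split above is sketched here. This is an interpretation, not CompilerGuru's own code: the function name `add16_overflows_by_sign` is hypothetical, `<stdint.h>` types are assumed in place of INT16, and only 16-bit-wide values are used.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a + b overflows a signed 16-bit value. */
static int add16_overflows_by_sign(int16_t a, int16_t b)
{
    uint16_t ur;

    if ((a < 0) != (b < 0)) {
        return 0;                 /* opposite signs: cannot overflow */
    }

    ur = (uint16_t)((uint16_t)a + (uint16_t)b);   /* wraps modulo 2^16 */

    if (a >= 0) {
        return ur > 0x7FFFu;      /* positive + positive left the positive range */
    }
    /* negative + negative: overflow if the wrapped pattern looks non-negative.
       This also catches -0x8000 + -0x8000, which wraps to 0. */
    return ur < 0x8000u;
}
```

The unsigned addition keeps the wraparound well defined, and the -0x8000 + -0x8000 case falls out of the final comparison without a separate test.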

What I usually write (not looking at efficiency, just for simplicity):
INT16 temp1, temp2;
INT32 r = (long)temp1 + (long)temp2;
if (r != (INT16)r) {
...
}


Oh, and BTW, here is a snippet for unsigned arithmetic I once wrote.
It implements saturating addition for unsigned arithmetic. The unsigned case is a bit simpler, as it can only overflow in one direction.

unsigned int AddWithSaturation(unsigned int a, unsigned int b, unsigned int maxN) {
   unsigned int res;
   if (maxN <= b || maxN - b < a) {
      res = maxN;
   } else {
      res = a + b;
   }
   return res;
}


Daniel

Beaker
Contributor I
Thank you for your detailed reply.
 
You are correct about my example code not working.  You are also correct that INT16 is defined as a signed 16-bit word (a.k.a. short).  When I originally disassembled it, I thought I was OK since every instruction was using the D accumulator.  I forgot, however, that the D (a.k.a. double) accumulator is the concatenation of two 8-bit accumulators, not two 16-bit accumulators.
 
Now with the code looking like:
 
if ((((INT32)temp1 + (INT32)temp2) > 32767) || (((INT32)temp1 + (INT32)temp2) < -32768))
 
It literally compiles down to 40 assembly instructions.
 
I think I'll try some of your suggestions and see how efficient they are.  This is quite silly, as I think some assembly code similar to this would work:
 
LDD   temp1
ADDD  temp2
BVS   x
 
I was hoping that the compiler would be able to figure out that I was testing for an overflow condition by looking at the magic numbers 32767 and -32768, and therefore use the BVS instruction.
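
For comparison, some other toolchains let the programmer state the overflow test directly. The sketch below assumes GCC 5 or later, or a recent Clang, which is not the CodeWarrior compiler discussed in this thread; `__builtin_add_overflow` is a built-in of those compilers, and the wrapper name `add16_checked` is hypothetical. Whether a given target ends up with an add followed by a branch on the overflow flag would still have to be checked in the disassembly.

```c
#include <stdint.h>

/* Hypothetical wrapper; requires GCC 5+ or a recent Clang.
   Stores a + b in *sum. Returns 0 if the sum fits in 16 bits, or 1 if it
   overflowed (in which case *sum holds the wrapped result). */
static int add16_checked(int16_t a, int16_t b, int16_t *sum)
{
    return __builtin_add_overflow(a, b, sum) ? 1 : 0;
}
```

This is not portable C either, so it trades one toolchain dependency for another.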
 
Thanks again for your help -