Assembly programmers


11,899 Views
mke_et
Contributor IV
I was just wondering how many people here are using assembly programming as opposed to C or other languages.
 
The reason I ask, is that there seems to be a 'divide' between the C people and their forum needs and the 'hardware' people that isn't quite filling the needs of assembly programmers.
 
Ok, for example... When I first started with the 9S12 it was with the '9S12 Badge' which came equipped with a 9S12DP256B part. I got one at a 'seminar' on the Star12. At that time, the 'rep' warned us about certain 'issues' with the Star12 series and 'tricks' we had to use to do certain things. (I remember one of them was with lost com interrupts.) Anyway, I wrote my routines in assembly and I have yet to see the problem alluded to in the seminar. So, I can't help but wonder if the problem is really that the 'C' code was written to how the part was SUPPOSED to work during development, while my assembly was written to how the part actually DOES work according to the spec sheets. Does that make sense?

Also, I don't use ANY of the templates or other tools in CodeWarrior.  For that matter, I'm running CodeWarrior SE.  The 'early' one that came with the BDM pod.  Originally I was running 1.0, and the ONLY reason I upgraded to 3.x was to get support for the USB-BDM pod.  It also sorta forced me to upgrade my CW 2.0 that I was using for some KX8 stuff.  Well, I 'could' have stayed with CW2.0 for the KX8, but I took the opportunity to upgrade that as well, even though I still have to use the LPT interface for the MON08-MultiLink. 
 
Another issue is interfacing. I did my own flash and NVRAM routines. In assembly of course. And even I2C routines. (I'm using a different version of the part on another project, and when it came time to do the I2C, I just wrote my own routines for it so I could develop the original code with my old badge wire-wrapped to a prototype.)
 
Bottom line is that I think there's a bit of a different 'mindset' for the assembly people than the C people, and just wondered how many other people do assembly programming here.
 
Mike
 
0 Kudos
50 Replies

421 Views
mke_et
Contributor IV
Rhino said...

"Ok, C is absolutely un-equivocably(sp?) better than assy for arithmetic. No question, you win that one. I really detest having to do arith in assy."

Well, that may be true. But have you tried interfacing to the 32 bit floating math routines (HC11FP Copyright 1986 by Gordon Doughman) that used to be provided on the Motorola web site for the 68HC11? It took just a little bit of work to get them converted over to the Star12 and make them build correctly in CW, but once they were set up they work fine. There were quite a few gotchas in the conversion, but the work was worth it.

They were public domain inasmuch as they could be used in products for sale as object code. Distribution of source code for money is prohibited.
0 Kudos

421 Views
bigmac
Specialist III
Hello,
 
I realise that assembler is most suited to some tasks, and C most suited to other tasks. I have attached some assembly code where I think neither method would have a distinct advantage, and where there is no dependency on hardware or peripherals. The routines require many cycles, so they might be a good comparative test of coding efficiency.
 
The code consists of two sub-routines associated with generating POCSAG paging codewords.  The first sub-routine computes the BCH (31,21) check bits required, and the second routine computes a final even parity bit for the 32 bit codeword.  The attached code is written in HC705 assembly.
 
The challenge for those interested is to adapt these two routines to HC08 assembler, HC12 assembler, or C code compiled for HC08 and/or HC12.  We can then compare bytes consumed and cycles required.
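As a starting point for the C side of this comparison, the two routines could be sketched roughly as below. This is not a translation of the attached HC705 code (the attachment is not reproduced here). It assumes the standard POCSAG generator polynomial x^10 + x^9 + x^8 + x^6 + x^5 + x^3 + 1 (0x769) and a codeword layout of 21 information bits in bits 31..11, 10 check bits in bits 10..1, and the parity bit in bit 0. The function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed generator polynomial for BCH(31,21): 0x769 */
#define POCSAG_GEN 0x769ul

/* Takes 21 information bits (right-justified) and returns the codeword
   with the 10 BCH check bits filled in; the parity bit (bit 0) is left 0. */
uint32_t pocsag_bch(uint32_t data21)
{
    uint32_t cw   = (data21 & 0x1FFFFFul) << 11;  /* info bits to 31..11 */
    uint32_t rem  = cw;                           /* running remainder */
    uint32_t gen  = POCSAG_GEN << 21;             /* generator aligned to bit 31 */
    uint32_t mask = 0x80000000ul;
    int i;

    /* Long division over GF(2), one information bit per step */
    for (i = 0; i < 21; i++) {
        if (rem & mask)
            rem ^= gen;
        mask >>= 1;
        gen  >>= 1;
    }
    return cw | rem;                              /* remainder sits in bits 10..1 */
}

/* Sets bit 0 so that the whole 32-bit codeword has even parity.
   Expects bit 0 of cw to be 0 on entry. */
uint32_t pocsag_parity(uint32_t cw)
{
    uint32_t p = cw;

    p ^= p >> 16;   /* fold the word down to a single parity bit */
    p ^= p >> 8;
    p ^= p >> 4;
    p ^= p >> 2;
    p ^= p >> 1;
    return cw | (p & 1ul);
}
```

Feeding the upper 21 bits of the POCSAG idle codeword 0x7A89C197 through both functions should reproduce that codeword, which makes a handy self-check for any port.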
 
Regards,
Mac
 
0 Kudos

421 Views
rhinoceroshead
Contributor I
Okay, here is a skeleton program and an ISR for an SCI receive with buffer and XOFF.  I did not handle framing or noise errors and I'm assuming that the transmitter is not busy (I polled the transmitter ready bit, which I wouldn't normally do in an ISR).
 
Code:
#include <hidef.h> /* for EnableInterrupts macro */
#include "derivative.h" /* include peripheral declarations */

#define RX_BUF_SIZE 16
#define RX_TOP_THRESH 14
#define RX_BOT_THRESH 2

void init(void);
void rxISR(void);

unsigned char rxBuffer[RX_BUF_SIZE];             // receive buffer
unsigned char rx_read_index, rx_write_index;     // read & write indexes
unsigned char rx_buff_rem = RX_BUF_SIZE;         // free buffer remaining
unsigned char xon_xoff;                          // xon/xoff status

void main(void) {
  init();
  EnableInterrupts;

  while (1) {
    __RESET_WATCHDOG();
    asm wait;
  }
}

void init() {
  SCC1 = 0x40;                                   // enable SCI
  SCC2 = 0x2C;                                   // enable rx, tx, rx int
  SCBR = 0x03;                                   // set BAUD
}

#pragma TRAP_PROC
void rxISR() {
  SCS1;                                          // dummy read SCRF flag to clear
  rxBuffer[rx_write_index++] = SCDR;             // write incoming byte to buffer
  rx_buff_rem--;                                 // decrement buffer remaining
  if (rx_write_index == RX_BUF_SIZE)             // if end of buffer is reached
    rx_write_index = 0;                          // then start at beginning
  if (rx_buff_rem == RX_BOT_THRESH) {            // if buffer size reaches xoff threshold
    while((SCS1 & 0x80) == 0x00);                // wait for Tx ready
    SCDR = 0x13;                                 // send XOFF signal
  }
}

 And here is the disassembled HC08 code for the ISR:  (KX8)
 
Code:
  0000 8b       [2]             PSHH
   34:    SCS1;                                          // dummy read SCRF flag to clear
  0001 b600     [3]             LDA   _SCS1
   35:    rxBuffer[rx_write_index++] = SCDR;             // write incoming byte to buffer
  0003 be00     [3]             LDX   rx_write_index
  0005 b600     [3]             LDA   _SCDR
  0007 8c       [1]             CLRH
  0008 e700     [3]             STA   @rxBuffer,X
  000a 5c       [1]             INCX
  000b bf00     [3]             STX   rx_write_index
   36:    rx_buff_rem--;                                 // decrement buffer remaining
  000d 3a00     [4]             DEC   rx_buff_rem
   37:    if (rx_write_index == RX_BUF_SIZE)             // if end of buffer is reached
  000f a310     [2]             CPX   #16
  0011 2602     [3]             BNE   L15 ;abs = 0015
   38:      rx_write_index = 0;                          // then start at beginning
  0013 3f00     [3]             CLR   rx_write_index
  0015          [5]     L15:
   39:    if (rx_buff_rem == RX_BOT_THRESH) {            // if buffer size reaches xoff threshold
  0015 b600     [3]             LDA   rx_buff_rem
  0017 a102     [2]             CMP   #2
  0019 2606     [3]             BNE   L21 ;abs = 0021
  001b          [5]     L1B:
   40:      while((SCS1 & 0x80) == 0x00);                // wait for Tx ready
  001b 0f00fd   [5]             BRCLR 7,_SCS1,L1B ;abs = 001b
   41:      SCDR = 0x13;                                 // send XOFF signal
  001e 6e1300   [4]             MOV   #19,_SCDR
  0021          [5]     L21:
   42:    }
   43:  }
  0021 8a       [2]             PULH
  0022 80       [7]             RTI

I wouldn't call myself an HC08 assembly expert, but this looks pretty good to me.  I count 35 bytes, and in the most common case there will be 57 CPU cycles (counting the 16 interrupt overhead cycles - so really 41 cycles).

I'm now expecting a severe lashing from the Freescale community. :-)

0 Kudos

421 Views
imajeff
Contributor III
P.S. Sorry, I forgot to ask before posting this: which target MCU is it? I forgot that maybe the HC08 families are named differently, and I've ported mine for MC9S12. That would nullify some of my concern below...

rhinoceroshead, I guess your C is written in CodeWarrior(?)
Why do they have such alien names for hardware registers? This is what I would write:

SCICR1 = 0x40; // enable SCI
SCICR2 = 0x2C; // enable rx, tx, rx int
SCIBD = 500000/9600; // set BAUD (8MHz bus)

Your same code is certainly not using the register names even close to the device guides from Freescale. Are they trying to confuse us, or just make it non-portable to other compilers?

SCC1 = 0x40; // enable SCI
SCC2 = 0x2C; // enable rx, tx, rx int
SCBR = 0x03; // set BAUD

..BR instead of ..BD for SCIBD? It's like they are trying real hard to complicate the learning curve.
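For reference, the 500000/9600 expression in the S12 snippet comes from the SCI baud formula, baud = bus clock / (16 x divisor). A small helper that rounds to the nearest divisor might look like this; the function name is hypothetical, and the 8 MHz bus clock is the same assumption as in the snippet.

```c
#include <stdint.h>

/* Nearest integer divisor for: baud = bus_hz / (16 * divisor).
   Adding half the denominator before dividing rounds to nearest. */
uint16_t sci_baud_divisor(uint32_t bus_hz, uint32_t baud)
{
    uint32_t denom = 16ul * baud;

    return (uint16_t)((bus_hz + denom / 2ul) / denom);
}
```

With an 8 MHz bus and 9600 baud this gives 52, the same value the integer division 500000/9600 produces.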

Message Edited by imajeff on 2006-06-08 11:53 AM

0 Kudos

421 Views
bigmac
Specialist III

Hello Jefferson,

The register names are correct for use with HC08 assembler.  So they are different than the HC9S12 register names, but so are the assembly instructions.  We just need to learn the differences.

Regards,
Mac

Message Edited by bigmac on 2006-06-09 04:08 AM

0 Kudos

421 Views
rhinoceroshead
Contributor I

Yes.  This was CodeWarrior and the register names are consistent with the HC08 KX8 datasheet.  Rocco asked for it to be HC08 assembly so he could compare it with something he already had written.  Sorry about posting HC08 code in the 16 bit forum, but the topic itself could have gone in either place.

I didn't finish writing a full implementation of XON/XOFF either.  To be a fair comparison I should really do the whole thing.  From what is there, however, I think I would have done something very similar to what the compiler did - had I done it in assembly.
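A rough idea of what the missing half could look like is sketched below: the reader side pulls bytes out of the buffer and sends XON once enough room has been freed. This is only a hardware-independent model, not the finished implementation. The struct, the function names, and the `last_ctl` field (standing in for a write to SCDR) are assumptions made for illustration, and using RX_TOP_THRESH as the XON threshold is a guess at what that unused define was meant for.

```c
#include <stdint.h>

#define RX_BUF_SIZE   16
#define RX_TOP_THRESH 14    /* free slots at which XON is sent again (assumed) */
#define RX_BOT_THRESH 2     /* free slots at which XOFF is sent */
#define XON           0x11
#define XOFF          0x13

typedef struct {
    uint8_t buf[RX_BUF_SIZE];
    uint8_t rd, wr;         /* read and write indexes */
    uint8_t free_slots;     /* free buffer remaining */
    uint8_t paused;         /* nonzero once XOFF has been sent */
    uint8_t last_ctl;       /* stand-in for the byte written to SCDR */
} rx_fifo;

void rx_init(rx_fifo *f)
{
    f->rd = 0;
    f->wr = 0;
    f->free_slots = RX_BUF_SIZE;
    f->paused = 0;
    f->last_ctl = 0;
}

/* Producer side: what the receive ISR does with each incoming byte. */
int rx_put(rx_fifo *f, uint8_t c)
{
    if (f->free_slots == 0)
        return -1;                      /* overrun: byte is dropped */
    f->buf[f->wr] = c;
    f->wr = (uint8_t)((f->wr + 1) % RX_BUF_SIZE);
    f->free_slots--;
    if (!f->paused && f->free_slots <= RX_BOT_THRESH) {
        f->paused = 1;
        f->last_ctl = XOFF;             /* real code: wait for Tx ready, SCDR = XOFF */
    }
    return 0;
}

/* Consumer side: called from the main loop; returns -1 when empty.
   On real hardware this would run with the SCI interrupt masked,
   because it shares state with the ISR. */
int rx_get(rx_fifo *f)
{
    uint8_t c;

    if (f->free_slots == RX_BUF_SIZE)
        return -1;
    c = f->buf[f->rd];
    f->rd = (uint8_t)((f->rd + 1) % RX_BUF_SIZE);
    f->free_slots++;
    if (f->paused && f->free_slots >= RX_TOP_THRESH) {
        f->paused = 0;
        f->last_ctl = XON;              /* real code: wait for Tx ready, SCDR = XON */
    }
    return c;
}
```

The gap between the two thresholds gives hysteresis, so the link is not toggled between XOFF and XON on every byte.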

0 Kudos

421 Views
imajeff
Contributor III
Thanks for clearing that up for me. I wasn't sure who was going to change their target to make it even. I think the 8-bit is best since that's where there are fewer resources, and that's where I think familiarity with Assembly pays off for most. Since I knew Asm fluently for the HC05, I never needed any more MCU power until the supply of windowed chips was diminishing.

Anyway, I got the rhino example ported for GCC/S12. It demonstrates some of my problems with C compilers.

They say C is portable, but I had to change syntax to work on the GCC compiler. In large general-purpose computers with common device drivers, C can be portable. Just not across the diversity of embedded systems.

They may say it's great that one can copy/paste C source to another environment at least for the same target (like from CW to GCC), but it will likely compile differently. Since 8-bit are more limited, it makes more difference that compiled code takes more bytes. I say if you can't afford to be ignorant about it, ya might as well write it in Asm.

The question might be, rhinoceroshead, is this code bloating because it is suddenly not well-written anymore, or because one cannot rely on just any C compiler? So while C programmers are paying top-dollar for the most efficient C compiler suites, the Asm programmers can use the free one but still get max efficiency.

00004038 :
    4038: 18 0b 40 00  movb #64, ca <__bss_size+0xb1>
    403c: ca
    403d: 18 0b 2c 00  movb #44, cb <__bss_size+0xb2>
    4041: cb
    4042: 18 03 00 34  movw #34 <__bss_size+0x1b>, c8 <__bss_size+0xaf>
    4046: 00 c8
    4048: 3d           rts

00004049 :
    4049: 18 01 ae 10  movw 1002 <_.tmp>, 2,-SP
    404d: 02
    404e: 18 01 ae 10  movw 1004 <_.z>, 2,-SP
    4052: 04
    4053: 18 01 ae 10  movw 1006 <_.xy>, 2,-SP
    4057: 06
    4058: f6 00 cc     ldab cc <__bss_size+0xb3>
    405b: cd 10 09     ldy #1009
    405e: f6 10 1a     ldab 101a
    4061: 19 ed        aby
    4063: 18 09 40 00  movb cf <__bss_size+0xb6>, 0,Y
    4067: cf
    4068: 72 10 1a     inc 101a
    406b: 73 10 00     dec 1000 <__data_section_start>
    406e: f6 10 1a     ldab 101a
    4071: c1 10        cmpb #16
    4073: 26 03        bne 4078
    4075: 79 10 1a     clr 101a
    4078: f6 10 00     ldab 1000 <__data_section_start>
    407b: c1 02        cmpb #2
    407d: 26 0a        bne 4089
    407f: f6 00 cc     ldab cc <__bss_size+0xb3>
    4082: 2a fb        bpl 407f
    4084: 18 0b 13 00  movb #19, cf <__bss_size+0xb6>
    4088: cf
    4089: 18 05 b1 10  movw 2,SP+, 1006 <_.xy>
    408d: 06
    408e: 18 05 b1 10  movw 2,SP+, 1004 <_.z>
    4092: 04
    4093: 18 05 b1 10  movw 2,SP+, 1002 <_.tmp>
    4097: 02
    4098: 0b           rti
0 Kudos

421 Views
rhinoceroshead
Contributor I

Yes, the C has suddenly become poorly written :-)

Seriously, what did you have to change to get it to compile?  There are 3 pushes to the stack that happen at the beginning of the ISR and 3 pulls at the end that seem to serve no purpose.  The rest of it is bloated and messy for sure.  I'm not impressed.

The complaint about the compilers requiring slightly different syntax is a fair one - but I thought that all of the ANSI C compilers should use exactly the same syntax.  I'm not sure if the code I posted is ANSI C or not, but I thought it was.  Where did you get the GCC compiler?  I was not aware of it.

0 Kudos

421 Views
imajeff
Contributor III

rhinoceroshead wrote:

Yes, the C has suddenly become poorly written :-)


Seriously, what did you have to change to get it to compile? There are 3 pushes to the stack that happen at the beginning of the ISR and 3 pulls at the end that seem to serve no purpose. The rest of it is bloated and messy for sure. I'm not impressed.



Specifically, poorly written for current compiler.

Welcome to the world of multiple C compilers. Each has its own personality. Here are some things that typically differ between compilers:
  • how to define interrupts
  • specify certain class/type of memory to use
  • built-in functions or device-support functions
  • other device-dependent aspects
  • some areas are stronger/weaker for optimization
To elaborate on the last item, this compiles much cleaner if I don't use indexed arrays. Using char pointers rules in GCC. Although CW seems better in that case, it seems worse in other cases or compared to yet another compiler. Too often people are limited by their tool, and spend too much time changing or learning tools instead of improving applications.
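To illustrate the pointer style mentioned here, the buffer store from the posted ISR could be written with a write pointer instead of an index, roughly as follows. This is a sketch only: the names are invented, and whether a given compiler actually generates tighter code for it has to be checked in the listing.

```c
#include <stdint.h>

#define RX_BUF_SIZE 16

uint8_t  rx_buffer[RX_BUF_SIZE];    /* receive buffer */
uint8_t *rx_wr = rx_buffer;         /* next slot to write */

/* Store one byte and wrap the pointer at the end of the buffer.
   Equivalent to: rx_buffer[index++] = c; if (index == RX_BUF_SIZE) index = 0; */
void rx_store(uint8_t c)
{
    *rx_wr++ = c;
    if (rx_wr == rx_buffer + RX_BUF_SIZE)
        rx_wr = rx_buffer;
}
```

The idea is that the compiler can keep the address in an index register and skip the base-plus-offset calculation on every access.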

rhinoceroshead wrote:

The complaint about the compilers requiring slightly different syntax is a fair one - but I thought that all of the ANSI C compilers should use exactly the same syntax. I'm not sure if the code I posted is ANSI C or not, but I thought it was. Where did you get the GCC compiler? I was not aware of it.



Syntax is a big one, no matter how ANSI it is, because embedded devices are not the system C had in mind. Is there a standard for how to specify which interrupt vector, or which Flash bank to use? If so, this demonstrates that C compilers don't always use it. GCC claims certain standard compliance, but I admit this one has had trouble satisfying some common expectations. It needs more help developing.

The extra pushes you wonder about are because GCC was made to use significantly more hardware registers than HC11/12 has. That's why people don't even want to port it to HC08. It defined "softreg" variables to solve it. Then since ISR could need the temporary softregs, it decided to push/pop them and that created the mess. I've apparently convinced the author to use local stack space instead of other softregs _.d1 to _.d32 (16-bits each). These were considered optional but did not compile if it didn't know any other way. One reason that certain of these are still pushed even if not used in any function is likely for consistency. Some applications require to know how many bytes were pushed on the stack for the ISR call. I would like to eliminate all these softregs, but it would of course make even more YAVs (yet another variation).
*See docs
*YAV: Yet Another Variation

But all in all, I like this C compiler once I learn to avoid these hurdles because it's just like you said. C code has to be "well written" to be practical. Hope you learned a lot you didn't know about "C" compilers, and there's more yet ;-)

Here is the link for GCC

Message Edited by imajeff on 2006-06-09 08:37 AM



* added "device-support" item to list

Message Edited by imajeff on 2006-06-09 08:42 AM

Message Edited by imajeff on 2006-06-09 08:44 AM

0 Kudos

421 Views
pittbull
Contributor III
imajeff wrote:
But all in all, I like this C compiler once I learn to avoid these hurdles because it's just like you said. C code has to be "well written" to be practical.

GCC is not the best choice for embedded devices. It is one of those GNU tools that are primarily designed to support Unix platforms and Unix-like clones such as Linux.
Some scary examples:
- The GCC for Windows (MinGW) generates big and slow running executables, much bigger than Microsoft's Visual C does.
- GCC for NEC V850 generates code that is nearly twice the size of what Green Hills' MULTI generates.

C compilers for MCU's must be well designed for those specific platforms and must make no compromises to be 'ANSI' compliant with might and main. In my opinion only commercial tools like CodeWarrior, Keil, Cosmic, IAR ... have high-class code generators that allow programmers to do most things without having to use assembly.

BTW: Also avoid the ICC12. The code it generates is nearly 1.5 times bigger than what CodeWarrior generates. It is a variation of LCC, a retargetable hobbyist compiler project. In some cases it generates wrong code for 'array[index++] = x;' that crashes immediately. One way to get around this is to split the statement into two parts: 'array[index] = x; index++;'

Cheers,
pittbull
0 Kudos

421 Views
glork
Contributor I
All.
The last several posts on this thread have made it absolutely clear to me (although it was already) why I disregard C or other 'high-levels' for the work I do. All of this esoteric discussion of which C compiler is best leaves me cold. The goal is to program the application (move something, measure something, cut something, whatever), test it, deliver it, get paid for it and go fishing.

A well-written assembler is the 'natural' language of the cpu. Except for $%^&* arithmetic there is no improving it (well, you can improve the instruction set).
ron
0 Kudos

421 Views
rhinoceroshead
Contributor I
Well I stand corrected then.  I've used only the CodeWarrior compiler and I've always been pleased with the program footprints and run times that it gives - with the exception of when I've tried to use floats, but that's understandable.
 
I recently took a computer architecture class taught by a compiler programmer who worked for Cray, and he did his fair share of bragging up compilers.  Perhaps I took what he said without a grain of salt.  He said C compilers used to be written very poorly but nowadays even the best assembly programmers can rarely beat them because the compilers have gotten so good.  He wasn't talking about embedded processors but I assumed that a compiler for a machine with only a few registers would be easier to write than one with say, 32 registers.  I suppose writing assembly with 32 registers could get confusing really fast though.
 
I got a sense that there was some assembly elitism happening here when this thread started, and perhaps I was mistaken. I certainly don't believe that programming in assembly makes one a superior programmer. Assembly or C, we are still communicating our will to a machine and must think like a machine to be effective. There is a level of translation occurring when you use C, but as long as we know and trust our translator, we can accomplish the same tasks in a similar manner. However, we may communicate faster, and occasionally with slight misinterpretation.
0 Kudos

421 Views
imajeff
Contributor III

rhinoceroshead wrote:
. . . He said C compilers used to be written very poorly but nowadays even the best assembly programmers can rarely beat them because the compilers have gotten so good.

Well here is an important reason why that is incorrect or at least misleading. The purpose of C (or other high level languages) is so that you can define the general process flow quickly, according to how the C definition intended it. It tends to take away special low level capability of the individual CPU, or other process flow that was not compatible with C. Many are sort of brainwashed into thinking only "like C", and don't realize what kind of power they give up by choosing C for their app. Or next they write special "C" which is not fully C because it adds syntax to control special features. More to learn, more complication.

Typically the assembly programmer is capable of rewriting the application taking advantage of low-level, i.e.
  • carry and overflow bits other than in typical mathematics
  • incompatible program flow such as special composition of conditional code
  • special handling and starting of a relocatable function, such as copying to RAM and executing on the stack
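To make the first item concrete: an assembly routine can chain a multi-byte add through the carry flag with an add-with-carry instruction, whereas portable C has no access to the flag and has to reconstruct it. A sketch of that reconstruction, with made-up function names, might be:

```c
#include <stddef.h>
#include <stdint.h>

/* Adds b and the incoming carry (0 or 1) to a.
   Updates *carry with the carry out of bit 7. */
uint8_t add8_with_carry(uint8_t a, uint8_t b, uint8_t *carry)
{
    unsigned int wide = (unsigned int)a + b + *carry;  /* widen so bit 8 survives */

    *carry = (uint8_t)(wide >> 8);
    return (uint8_t)wide;
}

/* dst += src for n-byte little-endian integers; returns the final carry. */
uint8_t add_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    uint8_t carry = 0;
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = add8_with_carry(dst[i], src[i], &carry);
    return carry;
}
```

Whether a compiler folds this back into a plain add-with-carry sequence depends on the compiler; in assembly the flag is simply there.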

They are probably not doing so because they don't see that the added benefit would outweigh the cost of developing the whole app twice (I agree). Then the questions to ask,
  • Did it really cost less building with C limitations than if it were built only in Asm by someone familiar with Asm applications?
  • Has the developer selected a more expensive part (overkill) just to ensure room to program in C, increasing the cost of every copy of the software sold?

I've been wondering in the last few posts which one you were arguing for: C, or Asm. I had forgotten to add the third option, "or just CodeWarrior" :-) I think it's pretty clear now that you are saying "pay them what it takes and rely on CW". We as a society are good at creating "Microsoft-like" companies.

Really, I'm not trying to attack any one C compiler (that was not the subject)... however, note that it is no excuse for GCC to not improve optimization just because it is geared toward implementing a major OS on the target MCU. More so, this makes optimization more necessary. So I think the problem there is that the optimizations are just not done. Who to trust, and for what?
0 Kudos

421 Views
rocco
Senior Contributor II

rhinoceroshead wrote:
. . . He said C compilers used to be written very poorly but nowadays even the best assembly programmers can rarely beat them because the compilers have gotten so good.
He is somewhat correct, depending on the architecture.

I had an FAE pitch a high-end, 32-bit RISC architecture a few years back. It was so heavily pipelined, that when an instruction loads a register, the data would not be available for use in that register for three more instructions. Also, the instruction following a branch instruction would always be executed, whether the branch was taken or not.

The C compiler knew of the intricacies of the pipeline, and interleaved the generated C code to make the most efficient use of it. As a programmer, that is not the type of detail I want to be involved with.

As someone coming from a Cray background, I would bet that was your instructor's perspective.
0 Kudos

421 Views
rocco
Senior Contributor II
Hi,

Well, this might not be as basic as it should be, but it's what I got. If I get the time, I may strip it down some.

There is a 100 byte transmit buffer and a 32 byte receive buffer. XON/XOFF is supported for the transmitter. The routines to place data into the transmit buffer, and to remove data from the receive buffer are included. The buffers assume data that is line-terminated (by a carriage-return, [CR]), and the parser is scheduled when a [CR] is detected.

The receiver has routines to get characters, and also has an un-get, similar to C. You can also test if a full command exists in the buffer.

There are three ISRs, one for the transmitter, one for the receiver, and one for errors.
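For readers without the attachment, the receive-side interface described above (get, un-get, and a test for a complete [CR]-terminated command) could be modelled in C along these lines. This is not the attached assembly code; every name and the exact un-get rule are assumptions made for the sketch.

```c
#include <stdint.h>

#define RXQ_SIZE 32
#define CR       0x0D

typedef struct {
    uint8_t data[RXQ_SIZE];
    uint8_t head;       /* next byte to read */
    uint8_t tail;       /* next free slot */
    uint8_t count;      /* bytes currently queued */
    uint8_t lines;      /* complete [CR]-terminated commands queued */
    uint8_t can_unget;  /* set by a successful get, cleared by un-get */
} rx_queue;

void rxq_init(rx_queue *q)
{
    q->head = q->tail = q->count = q->lines = q->can_unget = 0;
}

/* Called by the receive ISR; returns -1 if the buffer is full. */
int rxq_put(rx_queue *q, uint8_t c)
{
    if (q->count == RXQ_SIZE)
        return -1;
    q->data[q->tail] = c;
    q->tail = (uint8_t)((q->tail + 1) % RXQ_SIZE);
    q->count++;
    if (c == CR)
        q->lines++;     /* a full command is now available to the parser */
    return 0;
}

/* Returns the next character, or -1 if the buffer is empty. */
int rxq_get(rx_queue *q)
{
    uint8_t c;

    if (q->count == 0)
        return -1;
    c = q->data[q->head];
    q->head = (uint8_t)((q->head + 1) % RXQ_SIZE);
    q->count--;
    q->can_unget = 1;
    if (c == CR)
        q->lines--;
    return c;
}

/* Pushes back the character most recently returned by rxq_get.
   Fails if nothing was read or the freed slot has been reused. */
int rxq_unget(rx_queue *q)
{
    if (!q->can_unget || q->count == RXQ_SIZE)
        return -1;
    q->head = (uint8_t)((q->head + RXQ_SIZE - 1) % RXQ_SIZE);
    q->count++;
    q->can_unget = 0;
    if (q->data[q->head] == CR)
        q->lines++;
    return 0;
}

/* Nonzero when at least one complete command is waiting. */
int rxq_has_command(const rx_queue *q)
{
    return q->lines != 0;
}
```

On a real part, the get and un-get calls would need the receive interrupt masked while they touch the shared indexes.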
0 Kudos

421 Views
imajeff
Contributor III
Concerning both the mentioned SCI errata and Assembly programming, here is something I think is cool. I program in GCC (http://m68hc11.serveftp.org/). I have a good Assembler startup which I've uploaded to "community files", and I'll try attaching here.

It is experimental code to exercise the SCI port fairly fast using interrupts. It happens to show the 1K79X errata immediately after transmitting the first byte. It can work either with a physical serial loopback, or with a config option in the source code to enable the internal LOOPS.

Some people have tried and failed to use GCC (gas), but I think this setup works well. It even lets me debug visually with NoICE.

(hey, I attached the file before, but I think "Preview" removed it. trying again..)

Message Edited by imajeff on 06-01-2006 03:28 PM

0 Kudos

421 Views
eeetee
Contributor I
hi:
 
me too for assembly programming! i'm not real big on code warrior as it's way too huge of a development tool for what i require. been using mini-ide as a tool, but it doesn't have the xgate support. is there an alternative that includes xgate support? basically i just want to type code and run it.
 
i must confess that a buddy got a reasonably well coded (in c) multi serial port comm's program up and running (yeah, it did run on the bench but it was never tested for error detection/correction etc.) in an hour.
 
regards,
 
ed
0 Kudos

421 Views
peg
Senior Contributor IV

Hi,

I always use assembly too.

I am always amused by the many posts to this forum that start with lines like:

I am having trouble implementing this in C or trouble with processor expert, beans, initialisation blah blah blah to do such and such.

At this point I usually mumble to myself:

Well I could show you how to do it in a few lines of assembler, but I would not want to slow down/corrupt your rapid, abstract development methodology by dirtying it with horrible low level assembly language.

Regards David

 

0 Kudos

425 Views
Frousouna
Contributor I

Thank you,

    I would also say I'm old school, and by way of hardware designer.  I have been using the 68HC912 for a few years now, and had always written in assembly.  I just need to feel comfortable about instruction versus action.  I got the DEMO9S12NE64 with the notion that I would be up and running in a couple of days.  I must confess, I still haven't been able to load code to toggle an I/O, oh yeah, using CodeWarrior.

   I would like to open the browser and have the target communicate through http.  Seems to me that this would ease the programming effort on the PC.  I'll go through the TCP/IP stack learning curve, and stick with assembly.

George

  

0 Kudos

425 Views
glork
Contributor I
Hi Frousouna.
I also have the NE64 kit (which I haven't fired up yet). I have an application coming up soon. I don't expect any serious trouble with the application itself till I get to the end, at which time I have to 'ethernet-enable' it.

In any case the only significant learning curve will be the ethernet part, and I hope to just license a tcp/ip stack that I can integrate into my main ASSEMBLY program. Any thoughts on this?
ron
0 Kudos