Assembly Language From Scratch

 
Author: Peter Holmes
Published in EUG #49

Having driven a computer before, you will be aware that something exists called 'machine code' and another thing called 'assembler'. Quite simply, machine code is the language that the microprocessor chip in your computer understands, while assembly language is just the same thing in a more readable form. As an example, take a simple addition sum - adding 12 and 72. In English, you would say:
      Add twelve to seventy two - what's the answer?

In BASIC, you might say something like:

      10 A=12
      20 B=72
      30 C=A+B
      40 PRINT C

In 6502 machine code, you could say:

      A2 0C
      8E AC 0D
      A9 48
      6D AC 0D
      8D 00 64

This is pretty well unintelligible, isn't it? Well, that's why we use assembly language. The same problem is given below in assembly language, along with a brief comment on each line.

      LDX#  12           Load X register with a '12'.
      STX   3500         Store the content of X in memory location 3500.
      LDA#  72           Load the accumulator with a '72'.
      ADC   3500         Add to the accumulator, the contents of memory
                         location 3500.
      STA   25600        Store contents of accumulator in 25600 (i.e. on
                         screen).

It's easier to read that than the machine code, isn't it? With an assembler you can enter your program in assembly language and be able to read and understand it readily. All the assembler does is change the assembly language into machine code. Thus, when it sees 'LDX#', it changes it into an 'A2' and puts this into memory in the right place. Of course, it is possible to enter machine code directly into memory!
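
If you want to check how those bytes relate to the numbers in the assembly version, BASIC will do the conversion for you, since it treats anything starting with '&' as hexadecimal. The lines below are only a quick sketch, typed in immediate mode. The one point not spelled out above is that the 6502 holds a two-byte address low byte first, which is why 3500 appears as AC 0D.

      PRINT &0C : REM operand of A2 0C, gives 12
      PRINT &48 : REM operand of A9 48, gives 72
      PRINT &AC+256*&0D : REM AC 0D is low byte then high byte, gives 3500
      PRINT &00+256*&64 : REM 00 64 likewise, gives 25600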

The heart of the 6502 is the accumulator (or 'A' for short), through which almost all of your data has to flow. It is basically an eight-bit store that can hold a number from 0 up to 255. 6502 instructions allow you to write directly into this store using the instruction "LoaD a number into the Accumulator using the immediate mode". (This is a different sort of 'immediate' mode from that which you're used to in BASIC; it just means load 'the number immediately following'.) The mnemonic for this is LDA# ('LDA hash').

LDA# LoaD a number into the Accumulator using immediate mode

Another instruction allows you to transfer a number from this store to any specified memory location. If this memory location is between 24576 and 32767, the number taken from the store will be displayed on screen. The instruction is:

STA STore contents of Accumulator in the address specified

You should note that all the STore and LoaD commands you will meet should really be thought of as COPY commands as they create a second copy of the data, leaving the source unaltered.

Let's have a go then at running a machine code program!

This program will put a number into the accumulator and then transfer it to a position near the top of the screen, i.e. location 25600.

One word about the assembler in the Electron machine. This is a part of the Electron's BASIC interpreter and can be entered simply by putting a square bracket '[' as a BASIC statement. The end of the assembly is marked by a right-hand square bracket ']'. The assembler needs to be told, when you assemble the program, where you want it to be put in the memory. You do this by setting the variable P% to the 'start address'. The memory locations starting at 8192 are used for the examples in this series.

Our assembly programs will need to be embedded within a BASIC program and hence will be written in the form:

       10 P%=8192
       20 [
       30 ......assembly statements
        . ......assembly statements
        . ......assembly statements
      100 ......assembly statements
      110 ]

N.B. You should use Mode 6 unless instructed otherwise.

When you have put in your assembly language program you must first 'assemble' it, and this is done by RUNning the BASIC program containing the assembly code. A listing of your assembly program will appear on the screen, with each statement preceded by its machine code counterpart. This machine code is called the 'object program', and its constituent parts the 'object code'. This object code could be entered directly into memory and would yield the same results as the program you typed in; the assembly language only helps you to compile the program in the first place. The program should end in the normal way by displaying the usual BASIC prompt '>'.

On typing...

      CALL 8192   (8192 is the first memory location of the machine
                  code program)

the machine code will be executed. At the end of any machine code program we always tell the 6502 to return to BASIC. The command that does this is ReTurn from machine code Subroutine, or RTS. (In order to avoid confusion, the word 'run' is used in connection with a BASIC program and 'execute' in connection with a machine code program.) If you find this explanation confusing, don't worry, just follow through the steps one by one.

Right now, to write our first program, we must:-

  • Tell the assembler that the start address is 8192, i.e. P%=8192
  • Tell the assembler to start assembly, i.e. [
  • LoaD number '255' into accumulator A using immediate mode. The mnemonic for this is LDA# + number to be entered, i.e. LDA#255
  • STore at a specified Address, the contents of the Accumulator. The mnemonic for this is STA. After this we must tell the 6502 what the address is, i.e. STA 25600
  • ReTurn from the machine code Subroutine to BASIC, i.e. RTS
  • Tell the assembler to finish assembly, i.e. ]

Or

      10 P%=8192
      20 [
      30 LDA#255
      40 STA 25600
      50 RTS
      60 ]
      70 PRINT"ASSEMBLY COMPLETED"                         (Program 1.1)

Now, to enter the assembly program just type in the above. (You should be in Mode 6 - if you are not, type MODE6 <RETURN> now!)

Run this program and in the twinkle of an eye, your assembly code will be converted into a machine code program starting at 8192. (If you make a mistake you should get a helpful error message, just like BASIC! You can edit your program, just as you would with any BASIC program, using the <COPY> key, and use LIST to list your program.)

To execute the machine code program, we just use the CALL statement in immediate mode. (Before you execute this you must clear the screen using CLS <RETURN>.) Now type:

      CALL 8192 <RETURN>   (8192 is the location of the start of the program)

Near the top left of your screen will appear a dash. If it does not appear instantly, press <BREAK> then type OLD <RETURN>, LIST <RETURN>. Your program should be listed - if it isn't, I'm afraid you will have to retype it. Sorry! The reason your program did not work will be a fault in your assembly code - check that the numbers were correct and that the letter combinations (mnemonics) were correct. Check that you did call 8192 and not some other address; check that you did not forget RTS in the program.

So far then the achievement has been to load a number '255' into the 6502's on-chip memory (the accumulator) and then to store this somewhere on the screen. As it happened, we chose a spot near the top left but it could have been anywhere. As an experiment, try changing the address specified in line 40 but you must keep it between 24576 and 32767 or you will not be on the screen at all! (You will not be in screen memory but in some other part of memory.)
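
It was mentioned earlier that object code can be put straight into memory without the assembler. As a rough sketch of this, the lines below place the six bytes of Program 1.1 at 8192 using BASIC's '?' operator, which stores one byte at the address given. The byte values are hand-assembled for this sketch (A9 FF for LDA#255, 8D 00 64 for STA 25600 and 60 for RTS), so do compare them with the listing the assembler gives you when you run Program 1.1.

      ?8192=&A9:?8193=&FF
      ?8194=&8D:?8195=&00:?8196=&64
      ?8197=&60
      CLS
      CALL 8192

If the bytes have gone in correctly, the same dash should appear as before.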

Let's have a look at another instruction and use this in a program. As stated earlier, the accumulator is the repository of most 'answers' and this new instruction 'does a sum' and leaves the answer in the accumulator.

ADC ADd with Carry, the contents of the specified memory location to the accumulator

To use ADC, we have to add two extra lines to the program that get the 6502 ready for adding. Don't worry what they mean for now - just type them in and follow the instructions! Oh yes, and don't worry about the 'with Carry' bit yet. That will become clear later on - honest!

One other point about jargon! The term 'instruction' is used to describe an executable machine code statement. Thus, it could consist of LDA#65 or just RTS. However, the term is also used to refer to the mnemonic alone, as when one says 'the 6502 instruction set'. In this series, the term 'command' is used to refer to the mnemonic part of an instruction when this precision is required. For instance, in the instruction LDA#65, the LDA part may be referred to as the 'command'.

Let's look at the stages:

       10 P%=8192      Gives address for beginning of program
       20 [            Starts assembling
       30 CLD          } Gets the 6502
       40 CLC          } ready for adding
       50 LDA#1        LoaD '1' into the Accumulator in immediate mode
       60 STA 25607    STore the contents of Accumulator (1) in 25607
       70 LDA#128      LoaD '128' into the Accumulator in immediate mode
       80 ADC 25607    ADd with Carry, the contents of 25607 (i.e. 1) to
                       the accumulator (i.e. 128)
       90 STA 25600    STore the contents of Accumulator (now 129) in
                       25600
      100 RTS          ReTurn from machine code Subroutine
      110 ]            End assembly                        Program 1.2

Right then, type it in!

Now run what you have just typed in and your assembly program will be assembled. (Don't forget to CLS!)

Type CALL 8192 <RETURN> and your screen will display a single dot and a pair of dots at the screen positions corresponding to 25607 and 25600 respectively. If they do not appear instantly, press <BREAK> then type OLD <RETURN>, LIST <RETURN>. Your program should appear.

Another way of looking at the two lines:

       50 LDA#1
and    60 STA 25607

is as a way of putting a '1' into memory or of putting a single dot on the screen. (The special 8-bit code for 1 is 00000001 in binary.) Note that the first dot pattern has only one dot, which corresponds to the '1' in the binary number 00000001. The second dot pattern has two dots, which correspond to the two '1's in the binary for 129, i.e. 10000001. These will be covered in great detail later in the course.
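
To see in advance which dots a given number will light, it helps to print out its eight bits. The short BASIC program below is a sketch of one way to do this. The variable names N% and M% are arbitrary choices, and the number is assumed to lie between 0 and 255. Type NEW before entering it (having saved anything you want to keep), as its line numbers would otherwise get mixed up with the program already in memory.

      10 N%=129
      20 M%=128
      30 REPEAT
      40 IF N% AND M% THEN PRINT "1"; ELSE PRINT "0";
      50 M%=M% DIV 2
      60 UNTIL M%=0
      70 PRINT

With N%=129 this prints 10000001. Changing line 10 to N%=1 gives 00000001, matching the single dot.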

The 6502 has two index registers in addition to its accumulator and these are referred to as Index registers X and Y, and can each store one 8-bit number. The arrangement of these or, as the jargon has it, the architecture of the 6502, is shown below (in part) in Figure 1.1.

                                      ________________
                                     |                |           DATA
________  ______________  ___________|___  ______ ____|_________________
    ____||____      ____||____      _|___||_     |    |
   |          |    |          |    |        |   | \  / |          BUS
   | X        |    | Y        |    | ACCUM- |   |  \/  |
   | REGISTER |    | REGISTER |    | ULATOR |   | ALU  |
   |__________|    |__________|    |________|   |______|      Figure 1.1

In this figure, the X and Y registers are shown identically, although they do differ slightly. Nevertheless, they are both index registers and can thus have the value stored in them incremented or decremented (increased or decreased) in steps of 1. To the right of the figure is the ALU, or Arithmetic Logic Unit, in which arithmetic and logical operations are carried out. It has two inputs for the data that it manipulates and the output from the operation is fed into the accumulator. For this reason, almost all data flows through the accumulator, making this a key feature of the 6502. Data flows between various registers along the data bus, which is a common trackway for communication within the 6502. For talking to devices beyond the chip, this data bus is extended to access memory also.

In the remainder of this part, we will look at these registers and the ways that data can be fed in, out and between them.

First of all we'll have a go at using the X-register - so to load this we use the instruction:

LDX LoaD index register X with the data in the specified address

Thus, LDX 9000 means LoaD index register X with the data in memory location 9000. LDX differs from the earlier LDA# (apart from one loading the accumulator and one the X-register) in that the LDA# command is an immediate (mode) load command. When the 6502 sees this, it looks for what's immediately following the instruction and loads that - as data - into the accumulator. With the new command, LDX, the 6502 looks for what follows and treats it as the address of the data.

Thus, with the instruction:

      LDX 9000

the 6502 goes to memory location 9000 to extract a copy of the data stored there and then loads that into the X-register. This instruction (like all the register instructions) is really a copy, as the data put into the X-register is copied to it and the original data remains in the original memory location.
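
The difference between the two kinds of load shows up in the object code as well. The sketch below is not one of the numbered programs, just two loads placed side by side so that you can compare them in the assembly listing; the text after each '\' is a comment, which the assembler ignores. Worked out by hand, line 30 should assemble to A2 0C (two bytes, with the data itself following the opcode) and line 40 to AE 28 23 (three bytes, the opcode followed by the address 9000, low byte first).

      10 P%=8192
      20 [
      30 LDX#12      \ X is loaded with the number 12 itself
      40 LDX 9000    \ X is loaded with a copy of the byte held at 9000
      50 RTS
      60 ]

If you CALL 8192, nothing appears on screen, since nothing is stored; X simply ends up holding whatever happened to be in location 9000.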

To store the data we may use the instruction:

STX STore the contents of register X in the specified address.

Thus, STX 25600 means STore contents of X-register in memory location 25600. Right, here goes. Here's the program!

       10 P%=8192      Gives address for beginning of program
       20 [            Start assembly
       30 LDA#85       Load 85 into the accumulator
       40 STA 25600    Store contents of accumulator in 25600
       50 LDX 25600    Load into X-register, contents of memory location
                       25600 (i.e. 85)
       60 STX 25607    Store contents of X-register in 25607 (i.e. 85)
       70 RTS          Return from machine code subroutine
       80 ]            End assembly                        Program 1.3

Now run this program and CALL 8192 in the usual way (remembering to clear the screen first, of course!). The screen should display two rows of four dots, one below the other. The upper row corresponds to 25600 and the lower row to 25607. (The special binary code for 85 is 01010101. The four ones correspond to the four dots.)

By now you should be able to write simple programs so, as an exercise try the following:-

Exercise 1.1

Load the accumulator directly with 81 and display this at 25600. The answer appears at the bottom of this article.

Don't forget to put in the RTS at the end. If you do forget, then the 6502 will run on to see what it can find and try to execute this. If you're unlucky it will find something that crashes the system. Also make sure you clear the screen before you call machine code programs or you may find that things appear at lower or higher places on the screen than you expected, due to the Electron's method of scrolling text.

Exercise 1.2

Display two dashes one above the other like an equals sign (Try addresses 26833 and 26837). One possible answer appears at the end of this article.

The Load and Store instructions that we have met so far are complemented by the corresponding Y-register instructions.

LDY LoaD register Y with data at specified address
LDY# LoaD register Y with data specified
STY STore the data in the Y-register at the address specified

You should now know, or be able to interpret, the following:

      LDA             ADC             STX             STY
      LDA#            ADC#            LDY             RTS
      STA             LDX             LDY#          
                      LDX#
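
ADC# is the one entry in this table that has not yet appeared in a program. By analogy with LDA#, it adds the number immediately following rather than the contents of a memory location. As a sketch, the earlier addition of 1 and 128 can be rewritten without first storing the 1 in memory. The text after each '\' is a comment for explanation only.

      10 P%=8192
      20 [
      30 CLD
      40 CLC
      50 LDA#128     \ A holds 128
      60 ADC#1       \ add the number 1 itself, so A holds 129
      70 STA 25600
      80 RTS
      90 ]

After CLS and CALL 8192, the pair of dots for 129 should appear at 25600, but this time there is no single dot at 25607.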

For many operations but not all, the X and Y registers can be treated interchangeably; for instance, Program 1.3 could be written:

       10 P%=8192                         10 P%=8192
       20 [                               20 [
       30 LDA#85                          30 LDA#85
       40 STA 25600                       40 STA 25600
       50 LDX 25600                       50 LDY 25600
       60 STX 25607                       60 STY 25607
       70 RTS           Program 1.3       70 RTS           Program 1.3
       80 ]      (using X register)       80 ]      (using Y register)

Because of this interchangeability, and the need to swap data rapidly between registers during the run of a program, several instructions exist to do this automatically. They are typified by:

TAX Transfer the contents of the Accumulator into the index register X

Using this command in Program 1.3 (to produce Program 1.4) shortens it, while maintaining the same function.

       10 P%=8192
       20 [
       30 LDA#85
       40 STA 25600
       50 TAX
       60 STX 25607
       70 RTS
       80 ]                                                Program 1.4

When run, the screen should display two dot patterns, one corresponding to 25600 and the other to 25607. The descriptions of the codes so far have been spelled out in detail. However, as you are getting more used to the jargon, it is reasonable now to begin to abbreviate. From now on, instead of "the contents of the X-register", we will just refer to X, and similarly to Y for the Y-register and A for the accumulator. Thus, a summary of the transfer instructions is:-

TAX Transfer A into X.
TAY Transfer A into Y.
TXA Transfer X into A.
TYA Transfer Y into A.
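
As a sketch of two of these transfers in use, the program below passes 85 from X to A and then on to Y, storing it on screen along the way. It is not one of the numbered programs, and the comments after each '\' are only there for explanation.

      10 P%=8192
      20 [
      30 LDX#85      \ X holds 85
      40 TXA         \ A now holds 85 as well
      50 STA 25600
      60 TAY         \ Y now holds 85 as well
      70 STY 25607
      80 RTS
      90 ]

The display should be the same as for Program 1.3, since 85 ends up in both 25600 and 25607.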

Next month, we'll move onto machine code subroutines, conditional jumps, the program counter and certain machine code flags.

Solutions To Part 1 Exercises

Exercise 1.1

      10 P%=8192
      20 [
      30 LDA#81
      40 STA 25600
      50 RTS
      60 ]

Exercise 1.2

      10 P%=8192
      20 [
      30 LDA#255
      40 STA 26833
      50 STA 26837
      60 RTS
      70 ]

(C) Honeyfold, Standfast House, Bath Place, Barnet, LONDON