Acorn User
1st September 1983
Author: Paul Beverley
Published in Acorn User #014
Paul Beverley finds the Electron is a lot slower than the BBC Micro, but has some ideas on the problem
Speed: The Big Difference
Benchmarks on the Electron show up one of its major differences from the BBC Micro: the timings vary between graphics modes. Table 1 shows that the benchmarks take considerably longer to execute in the higher resolution graphics modes (0 to 3). The same table gives the results of a machine code benchmark which is simply a set of nested delay loops. As you can see, in the worst case, the Electron takes 4.3 times as long as the BBC Micro to run the same program.
The differences in speed between the two computers are accounted for by various factors. First, although the 6502A processor is capable of running at 2MHz, it is only working at full speed when accessing ROM. As soon as it accesses the RAM, it effectively slows down to 1MHz. The reason for this is that the read/write memory is arranged in four 64K by 1 bit chips, each of which therefore contains two bits of information for each byte.
Therefore, to get the full eight bits, you have to take two sets of four bits from RAM, which means two accesses for each read or write operation. This 1MHz RAM access is confirmed by the timing of the machine code program in modes 4, 5 and 6, which is exactly twice that of the BBC Micro (55.0s against 27.5s). The further loss of speed in the other modes is caused by the constraints of generating the high resolution video display.
To get video information out of RAM and turn it into colour information for the colour monitor, you need to be continuously accessing the information which is in RAM and serialising it. That is, the information in each byte has to be sent out to the VDU as a series of dots which appear on the screen as the cathode ray tube beam scans across it. The rate at which this information needs to be sent out to the screen depends on the density of these dots.
Let us consider mode 0 which, being a two-colour mode, is the simplest to understand. Since it only needs one bit of memory to represent each screen dot, and since there are 640 dots per line, 640 bits of information have to be sent out during each line scan which takes 40 microseconds. Now 640 bits is 80 bytes of information, which means you have to get information out at the rate of 80/40 bytes per microsecond, or 2MHz.
In modes 1 and 2, although there is a smaller number of dots across the screen, the information still has to be accessed from RAM at the same rate since each dot is represented by a larger number of bits to give the colour information. In mode 3, there are blank lines in between the rows of characters which contain no video information, and therefore the situation is not quite so bad as in modes 0, 1 and 2. In modes 4, 5 and 6, with half the amount of video information on each line, the speed of access which the video processor in the ULA has to make to RAM is only 1MHz and so the processing speed is not affected.
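The same sum can be repeated for every mode. The short Python sketch below is only an illustration: the dots per line and bits per dot are the usual figures for each mode (only the mode 0 figures are quoted in this article, so treat the rest as assumptions), and `fetch_rate_mhz` is a name invented for this example.

```python
# Rate at which video bytes must be fetched from RAM in each Electron mode.
# Each entry is (dots per line, bits per dot). Only the mode 0 figures are
# quoted in the article; the others are assumed standard mode figures.
ACTIVE_SCAN_US = 40  # active part of each line scan, in microseconds

MODES = {
    0: (640, 1),
    1: (320, 2),
    2: (160, 4),
    3: (640, 1),
    4: (320, 1),
    5: (160, 2),
    6: (320, 1),
}


def fetch_rate_mhz(mode):
    """Bytes needed per microsecond during the active line scan."""
    dots, bits_per_dot = MODES[mode]
    bytes_per_line = dots * bits_per_dot // 8
    return bytes_per_line / ACTIVE_SCAN_US


for mode in sorted(MODES):
    print(f"Mode {mode}: {fetch_rate_mhz(mode):.0f}MHz")
```

Under these assumptions modes 0 to 3 all come out at 2MHz and modes 4 to 6 at 1MHz, which matches the split in the benchmark timings.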
How, then, does the Electron cope with putting out information at 2MHz? In the lower resolution modes (4, 5 and 6), where only 1MHz is needed, the RAM access of the video processing section of the ULA is interleaved with the access of the 6502A processor. In other words, during one phase of the system clock cycle the 6502A accesses the RAM, and during the other half cycle, the video processor does its accessing. However, because of the higher speed needed, in modes 0 to 3 the ULA has to take over the RAM entirely during the active portion of the line scan - that is, for 40µs out of every 64µs. The result is that for 40µs of each displayed line the processor is stalled and does no processing.
This has an implication for interfacing. Although there are address, data and control lines available on the external edge connector, it is, in Acorn's own words, 'a non-trivial interfacing problem'. The reason is that the clock signal available on the edge connector will sometimes be 2MHz, sometimes 1MHz, and sometimes totally stalled for 40µs.
You will notice from the benchmark timings that mode 3 is not as bad as modes 0, 1 and 2. The reason for this is that the processor can, in fact, continue processing during the inactive lines between the rows of characters. To explain this a little further, if you change to mode 3 and execute a VDU 19,0,4,0,0,0 to change the background colour to blue, you will see the screen appears as a set of blue lines on a black background. If you type in some characters, you will see that they only appear in the blue lines, and not in the interleaving black areas. Therefore, while the beam is scanning these black lines, there is no information being taken from RAM, and the processor can continue processing.
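A rough feel for these figures can be had from a very simple model, sketched here in Python. It is not Acorn's own calculation. The 40µs stall in every 64µs line and the halving of speed for RAM access come from the text above, but the frame figures are assumptions made for this sketch: 312 scan lines per frame, 256 displayed lines in modes 0 to 2, and 200 displayed lines in mode 3. The name `slowdown` is likewise invented for the example.

```python
# Rough estimate of how much slower machine code runs than on a 2MHz BBC Micro.
# From the article: 64us per line, 40us stall on displayed lines in modes 0-3,
# and RAM access at 1MHz instead of 2MHz.
# ASSUMED for this sketch: 312 lines per frame, 256 displayed lines in
# modes 0-2, 200 displayed lines in mode 3.
LINE_US = 64
STALL_US = 40
LINES_PER_FRAME = 312
RAM_PENALTY = 2.0


def slowdown(displayed_lines, stalls):
    """Estimated run time as a multiple of the BBC Micro's."""
    frame_us = LINES_PER_FRAME * LINE_US
    stalled_us = displayed_lines * STALL_US if stalls else 0
    available = (frame_us - stalled_us) / frame_us  # fraction of time CPU runs
    return RAM_PENALTY / available


print(f"Modes 0-2: x{slowdown(256, True):.2f}")   # x4.11
print(f"Mode 3:    x{slowdown(200, True):.2f}")   # x3.34
print(f"Modes 4-6: x{slowdown(256, False):.2f}")  # x2.00
```

These estimates fall a little short of the measured machine code factors in Table 1 (x4.29 to x4.35, x3.42 and x2.00), so the model is clearly incomplete, but it does reproduce the ordering of the modes and the advantage mode 3 gains from its blank lines.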
This is all very interesting to the technically minded, but how does it help if you want to improve the speed of a program run on the Electron? If you have a program which uses the higher resolution graphics modes but which has a large amount of calculation to be done, there is a simple method of improving its speed. It is shown in Figure 1 as two procedures, one which switches to a fast mode of processing and the other which returns you to a slow mode of processing. To achieve this fast mode, what you do is switch the ULA into mode 6 by poking a number into one of its registers. (Yes, I know, I'm a complete hypocrite after all I've said about using the OSBYTE calls! But then, there are no calls to do this as far as I am aware.)
The effect of this poke is to produce a rather strange effect on the screen since the information in RAM is arranged for whichever of the higher modes of graphics you are using, whilst the upper 8K of that information is being displayed by the ULA as if it were in Mode 6. However, this means the processing speed is the maximum of which the computer is capable, and although the display is distorted, it is simply a matter of using PROCslow to switch back to the original mode of graphics which will restore the display to normal. The register in the ULA used to set the mode of graphics (&FE07) is a write-only register, so the Operating System has to keep a copy of what it has put in there for testing by various OSBYTE routines. This is kept in memory location &0282, and therefore PROCslow simply takes the contents of that location and puts it into &FE07.
If you want to do any drawing on the screen, and yet still want to work in the fast processing mode, there is no problem. What happens when you do the drawing is that the Operating System looks at &0282 to find out which mode of graphics it is in and then changes the contents of RAM for the appropriate draw or plot. Therefore, the drawing or plotting continues normally, even though it produces a rather strange effect on the screen display which is apparently mode 6, but as soon as you execute PROCslow, the display returns with all the lines you drew displayed normally.
To give an idea of how much this speeds things up, the Persian program given in the BBC and Electron manuals takes 34.1s to run on the BBC whereas it takes 105.1s on the Electron (3.1 times as long). However, if you add the PROCfast and PROCslow commands, it reduces the time to 50.8s.
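To put those three timings side by side, here is a small Python sketch that works out the ratios. The times are the ones quoted above; the function name `times_as_long` is made up for the example.

```python
# Timings for the Persian program, in seconds, as quoted in the article.
BBC_S = 34.1
ELECTRON_S = 105.1
ELECTRON_FAST_S = 50.8  # with PROCfast and PROCslow added


def times_as_long(slower, faster):
    """How many times as long the slower run takes."""
    return slower / faster


print(f"Electron v BBC:              x{times_as_long(ELECTRON_S, BBC_S):.1f}")       # x3.1
print(f"Electron with PROCfast v BBC: x{times_as_long(ELECTRON_FAST_S, BBC_S):.1f}")  # x1.5
print(f"Gain from PROCfast:           x{times_as_long(ELECTRON_S, ELECTRON_FAST_S):.1f}")  # x2.1
```

So the two procedures roughly halve the running time, leaving the Electron about one and a half times slower than the BBC Micro on this program.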
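10000 DEF PROCfast
10010 ?&FE07=&B0 : REM Make the ULA display as mode 6; RAM is left untouched
10020 ENDPROC
10030
10040 DEF PROCslow
10050 ?&FE07=?&0282 : REM Restore the mode value the Operating System keeps at &0282
10060 ENDPROC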
Figure 1. Two simple procedures for switching speeds by putting the display into mode 6 without clearing RAM, and then restoring the mode as recognised by the MOS
Test | BBC Micro (any mode) | Electron mode 0 | Electron mode 1 | Electron mode 2 | Electron mode 3 | Electron modes 4, 5, 6 |
BM1 | 0.6 | 1.8 | 1.8 | 1.8 | 1.4 | 0.8 |
BM2 | 2.7 | 7.6 | 7.6 | 7.7 | 6.2 | 3.7 |
BM3 | 7.8 | 22.2 | 22.3 | 22.5 | 18.1 | 10.8 |
BM4 | 8.4 | 23.8 | 23.9 | 24.1 | 19.2 | 11.4 |
BM5 | 8.8 | 24.9 | 25.0 | 25.2 | 20.1 | 11.9 |
BM6 | 13.2 | 37.7 | 37.8 | 38.2 | 30.5 | 18.1 |
BM7 | 20.7 | 57.9 | 58.1 | 58.7 | 47.1 | 28.0 |
BM8 | 5.0 | 14.9 | 14.9 | 15.1 | 12.0 | 7.1 |
BM7+8 | 25.7 | 72.8 | 73.0 | 73.8 | 59.1 | 35.1 |
Factor | | x2.83 | x2.84 | x2.87 | x2.30 | x1.37 |
MC loop | 27.5 | 118.0 | 118.2 | 119.5 | 94.3 | 55.0 |
Factor | | x4.29 | x4.30 | x4.35 | x3.42 | x2.00 |
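Table 1. PCW benchmarks and machine code loop timings, in seconds, in different graphics modes compared with the BBC timings. Each Factor row gives the Electron time divided by the BBC Micro time for the row above it. (Timings for modes 4, 5 and 6 are virtually identical.)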
Bookshelf
The Electron comes complete with two books - a user guide and a programming guide. The two books are designed to complement each other, the second being an easy introduction which makes extensive reference to the first.
The programming book starts off with a simple introduction to the Electron and goes on to cover sound, graphics, arithmetic, problem solving, games and most of the techniques needed to write programs.
Its style is chatty, illustrated with cartoons, and is most definitely on the side of structured programming - not a single GOTO in sight! Procedures and functions abound with long variable names, and all listings have been dumped using LISTO7 on a daisywheel.
Four listings take up the final 21 of the book's 138 pages. These are a turtle graphics program, which links up with a maze solver, a greeting program, and Rivergame - the old chicken, fox, grain problem.
The publishers, Addison-Wesley, are to release the book to the general public at £6.95. The author, Masoud Yazdani, also appears to have several follow-up books in store which will no doubt transfer onto the BBC machine.