Comparison of Forth and C for the Commodore 64 (was Poll: What topic should come next?)

If nobody presses the vote button, a C++ random generator is started which decides on its own which subject sounds interesting. The source code is:

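#include <cstdlib>
#include <ctime>
#include <iostream>

class Randomgenerator {
public:
  // Seed the C library generator once from the current time.
  Randomgenerator() {
    srand((unsigned) time(0));
  }
  // Return a pseudo-random integer in the closed range [min, max].
  int get_integer(int min, int max) {
    return min + rand() % ((max + 1) - min);
  }
};

int main() {
  Randomgenerator myrand;
  // The original listing ends here; the range 1..3 is only an assumed example.
  std::cout << myrand.get_integer(1, 3) << std::endl;
}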

Update 2018-10-10
The poll is closed. “Forth on the Commodore 64” has won. Here is the article:

The 8-bit Commodore home computer from the 1980s is a great study object for explaining the differences between programming languages. In contrast to the modern IBM PC, the system was programmed in assembly language by default, so it is a good starting point for explaining the details down to machine-level instructions. All the major programming languages are available for the C64, but today I have selected only two of them: Super-Forth-64 and Super-C.

Both are available in source code and come with a great handbook. Let us read through the manuals to work out their strengths and weaknesses. The good news is that Forth and C are very different. Super-Forth-64 works with a so-called stack, and the idea is to combine assembly instructions into symbolic words. Programming in Forth is similar to using a macro assembler. The user creates a new word, writes, say, 10 assembly commands into it, and from then on can execute the Forth word instead of typing in the assembly commands again. This allows the user to create complex software which comes close to the speed of assembly language but is easier to maintain and needs less memory.
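
To make the idea of a stack and of user-defined words more concrete, here is a minimal sketch in C++. It is not how Super-Forth-64 is implemented; the names MiniForth, define and execute are made up for this illustration, and the primitives are ordinary C++ lambdas standing in for short machine-code routines.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical miniature "Forth": a data stack plus a dictionary of words.
struct MiniForth {
  std::vector<int> stack;                                   // the data stack
  std::map<std::string, std::function<void()>> dictionary;  // word name -> action

  MiniForth() {
    // Primitive words: stand-ins for short machine-code routines.
    dictionary["DUP"] = [this] { push(top()); };
    dictionary["+"] = [this] { int b = pop(); int a = pop(); push(a + b); };
    dictionary["*"] = [this] { int b = pop(); int a = pop(); push(a * b); };
  }

  // The lambdas capture `this`, so copying the object is disabled.
  MiniForth(const MiniForth&) = delete;
  MiniForth& operator=(const MiniForth&) = delete;

  void push(int value) { stack.push_back(value); }
  int top() const { assert(!stack.empty()); return stack.back(); }
  int pop() { int value = top(); stack.pop_back(); return value; }

  // Define a new word as a sequence of existing words,
  // comparable to the Forth line ": SQUARE DUP * ;".
  void define(const std::string& name, std::vector<std::string> body) {
    dictionary[name] = [this, body] {
      for (const auto& word : body) execute(word);
    };
  }

  // Look the word up by name and run it (throws if the word is unknown).
  void execute(const std::string& word) { dictionary.at(word)(); }
};
```

After `define("SQUARE", {"DUP", "*"})`, pushing 7 and executing `SQUARE` leaves 49 on the stack. The new word is used exactly like a primitive, which is the point of the macro assembler comparison.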

The Super-C language has a different perspective. C comes with a library of predefined commands for graphical output and disk access. The idea behind the C programming language is that the user ignores the underlying computer and programs for the C compiler. That means the source code is the important work area, and it is the source code that is right or wrong. From a technical point of view, the RAM requirement is higher. Most users agree that C is easier to understand than Forth.

The reason why C became successful in computer history but Forth did not is that the C language is standardized. That means the library is the same on each computer system, and the user does not need to understand assembly language. Forth users are always assembly programmers too. If a Forth word produces an error, debugging means analyzing the word at assembly level. It is not possible to program in Forth without understanding the underlying hardware. Forth can be seen as the more intellectually demanding language. If someone understands Forth, he is the better programmer. That means he knows more about the Commodore 64 than a C programmer.

Compiler friendly CPU

Super-C was not used very often for programming the Commodore 64. The reason was that the language was slower than assembly and was not optimized for the 6502 CPU. It is interesting to see the rise of the C language in the 1990s. What changed was not C itself; instead, the hardware manufacturers were forced to create new generations of CPUs. The MOS 6502 had only a small number of registers (the accumulator and the X and Y index registers), and it was hard to write a compiler for it. Later, Intel-based CPUs were designed to be C-compiler friendly by default. In the beginning, these kinds of machines were called Unix processors, to make clear that they did not only support assembly language but were designed especially to run operating systems and the C programming language. In contrast, the 8-bit CPU in the Commodore 64 can be called a Unix-unfriendly and C-unfriendly CPU.

What we have seen is that software developers have taken control over hardware manufacturing. Software was no longer programmed for a given CPU; instead, the hardware followed the needs of OS developers and application programmers. That means, because the 6502 doesn't work very well together with a C compiler, the CPU has lost the race. Hardly anybody uses it for new mainstream computers anymore.

Let us describe the situation from the bottom up. Suppose a certain piece of hardware is given. That means the instruction set is fixed, and the aim of the programmer is to use the Forth language to create a meaningful program on top of the hardware. Some programmers are able to do so, others are not. C programmers describe the world from a different perspective. They have their C compiler first and then ask for a CPU which is able to run the code. And if the code doesn't run well, it is not the C language that is wrong, but the hardware.

Forth made easy for beginners

Understanding Forth is difficult for most C programmers. They see the language and read the manual, but they don't grasp the idea behind it. To understand Forth more quickly, it is a good idea to look at the Commodore 64. This machine was programmed mostly in assembly language. The UNIX C dialect was never a great success on 8-bit home computers. And a focus on machine language is a good starting point for grasping Forth.

Suppose we are programming the C64 in assembly language. We type in a hello world program from a computer magazine and assemble the code into binary code. Is it possible to simplify the task? This is where Forth comes into play. A look into the Durexforth implementation (a modern Forth for the Commodore 64) shows us that a Forth command executes low-level assembly code at runtime. That means a Forth word is interpreted like a BASIC command and gets executed by the Forth interpreter.

Can we increase the speed of a Forth program a bit? Yes, the answer is called threaded code or, to be more specific, “a virtual machine”. That is a technique for increasing the runtime speed. From Java, the term “just-in-time compiler” is well known, and exactly the same technique can be used to create the assembly code faster. On GitHub there is a just-in-time compiler for Forth available. It does not run on the Commodore 64 but on the PC. The main advantage of Firmforth (the name of the software) over the existing Gforth interpreter is that it is faster. That means the assembly code is generated in a different way which is more efficient.
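
The idea of threaded code can be sketched in a few lines of C++. This is a simplified model under my own assumptions, not code from Durexforth, Gforth or Firmforth: a compiled word is stored as a list of addresses of primitive routines, and a small loop (the inner interpreter) follows these addresses one after another. The names Machine, Primitive, Thread and run are invented for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical machine state: only a data stack.
struct Machine {
  std::vector<int> stack;
};

// A primitive is a routine that works on the machine state.
using Primitive = void (*)(Machine&);

// DUP: duplicate the top stack item.
void do_dup(Machine& m) {
  assert(!m.stack.empty());
  m.stack.push_back(m.stack.back());
}

// *: replace the two top stack items by their product.
void do_mul(Machine& m) {
  assert(m.stack.size() >= 2);
  int b = m.stack.back(); m.stack.pop_back();
  int a = m.stack.back(); m.stack.pop_back();
  m.stack.push_back(a * b);
}

// The "thread": a compiled word is just a list of routine addresses.
using Thread = std::vector<Primitive>;

// Inner interpreter: no names are parsed or looked up at run time,
// the loop only fetches the next address and calls it.
void run(const Thread& thread, Machine& m) {
  for (Primitive step : thread) step(m);
}
```

A word like SQUARE would be compiled once into `Thread{do_dup, do_mul}`. Running it with 5 on the stack leaves 25. The saving compared to a text interpreter is that the name lookup happens at compile time and not on every execution.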

Forth vs C on the Commodore 64

C is not the best language for the Commodore 64. In most cases a cross compiler is needed, and even a modern one like GCC-6502 is not able to generate fast assembly code. The reason is that the 6502 CPU was designed to be programmed directly in assembly code. The better way to use the hardware efficiently is to use low-level opcodes in assembler or, more comfortably, a C64 Forth like Volksforth or Durexforth.

But what exactly is the reason that the C programming language doesn’t run very well on the Commodore 64? C was developed with the programmer’s needs in mind. There is a language specification which defines what C is and which statements are available, and the compiler then generates the assembly code for it. A C programmer ignores the underlying hardware. He is not interested in programming for the 6502 CPU or for an Intel x86 processor, but is focused only on the source code itself. That means a C programmer thinks in subroutines, variables and arrays, not in registers, stack pointers or the needs of the hardware.

Forth is different. Forth is a bottom-up language which has its roots in hardware. A Forth programmer thinks first and foremost about the CPU and about assembly language, and builds the code from below so that it uses the processor efficiently. Forth doesn’t provide a built-in array type or C-style functions with named parameters, and Forth source code is not tailored to the programmer’s needs. Instead, the more important question is which needs a certain CPU has, and the programmer has to obey them. Forth works very well together with an 8-bit CPU, because the programmer is in charge of providing the appropriate source code. If he can’t do so, he has a problem, not Forth. In contrast, a C programmer who is not able to compile his C code for the 6502 CPU comes to the conclusion that his computer is wrong. He will ask for a better CPU which is 32-bit, has more registers and is called compiler friendly, because the C source code compiles very well for that platform.


4 thoughts on “Comparison of Forth and C for the Commodore 64 (was Poll: What topic should come next?)”

  1. One could also couple the random generator to the loading of the start page (the random seed would then be computed from the date, the time and a few pieces of HTTP information from the request). Maybe that way one could determine a kind of subconscious will (or non-will) of the readers.
    Should the esoteric detector work, one will presumably and unfortunately not be able to tell, nor whether, if it works, it has detected will or non-will, unless it works so well that you get a huge rush of readers along with songs of praise, or the site gets hacked out of anger. :-)


    • At the moment, I’m not afraid that hundreds of people will press the button. But I know from the history of the blog that times can change. A while ago, some guy posted a link to this blog in a German forum, and the next day around 300 people visited the website. Some of them had a different opinion than me. It is hard to predict this kind of traffic increase, but in the case of Forth-related blog posts it is certain that the number of visitors will stay small. Even postings on Stackoverflow with the tag Forth are not able to motivate people to write something. Apropos Stackoverflow: the website was down yesterday for 10 minutes, at least in the U.S. For many hardcore users the outage of the Stackexchange network was a major problem. One of them posted an SOS signal to the Twitter stream, other users reported that the end of the world is near.


  2. On the article:
    Is the 6502/10 really that unfriendly to C/Pascal/... compilers when compared with programming in assembler?
    It depends, I would say.
    Back then, people liked to use dirty tricks for faster code, that is, they did things that are actually forbidden in clean code, because the clean version would have been too slow or possibly too large on the 6502/10 at around 1 MHz.
    In clean assembler one had, I think, quite similar problems to the C compiler, AND one should not forget that on the 6502/10 the zero page was also understood as a kind of register file. So there was the slow memory, the somewhat faster zero page (because of the shorter addresses) and the few but even faster real registers.
    Depending on the case, one also could not avoid building a larger stack, which then had to be managed “manually”, because the measly 256 bytes of the original stack could get rather tight, unless parameters were passed to the routines through global variables (zero page or not). ... in FORTH with its return stack anyway.
    I think the management overhead required by the C/Pascal compilers was simply too high; the resources would have been needed for urgently required optimizations, but could not be spared for them because otherwise the compiler itself would not have had enough, so a catch-22.
    Perhaps a multi-pass compiler with correspondingly frequent disk swapping could even have produced good code (fast and/or small), but the people in front of the machine were probably already too spoiled or too impatient for that. It would have demanded really careful program planning and so on, if one had to budget not just one but perhaps even a few hours for a compilation... and then also two expensive disk drives (otherwise it gets even more dreadful).
    Comparing the 6502/10 with the 8086/88 is somewhat unfair, because the latter not only has more registers but, as a 16-bit CPU, also needs considerably fewer instructions for the same 16-bit calculation, i.e. even if the 8086/88 had had only three registers + SP, it would have been easier to handle [1], quite apart from the additional memory. The old x86 TurboPascal up to the 3.x versions had 64 KB for the source text, 64 KB for the compiled code, 64 KB for the compiler and 64 KB for bookkeeping (symbol table and all that stuff) available (if one did not compile to disk anyway and did not also use overlays); a C64 compiler can of course only dream of that. :-)

    I could even warm to the idea of “FORTH as a macro assembler”; so far I have seen FORTH more as an (uncomfortable) high-level language in which one avoids assembler as far as possible in order to ease a certain portability.
    With many (most?) FORTH assemblers, however, two things in particular annoy me:
    1. the FORTH-typical reversed order of instruction and operands
    2. this is even more annoying: the way jumps HAVE to be done, i.e. that most(?) FORTH assemblers force structured code and that one cannot write spaghetti code with labels, at least not without extra effort. If one could do it as in other assemblers, it would make porting “ordinary” assembler code easier, and possibly also the automatic generation of such assembler code.
    Don’t get me wrong (damn, my space key does not always respond! pasta sauce from yesterday probably got in?): the EXISTENCE of the structure words in the FORTH assembler is fine, only the lack of an alternative that does without them is not. Sure, dyed-in-the-wool hardcore FORTH programmers who on principle only think in a FORTH-like way may see that differently. :-)

    [1] Example:
    add ax, globalvar
    takes 4 bytes.
    In the following, AX is mapped to X and Y to keep the comparison reasonably fair:
    adc globalvar
    adc globalvar + 1
    would already be a proud 12 bytes, although the usual case looks more like this
    lda ax ; zero page
    adc globalvar
    sta ax
    lda ax + 1 ; zero page
    adc globalvar + 1
    sta ax + 1
    which would already be 15 bytes.
    To save space, this can, depending on the case, be implemented as a routine, but that makes it slower.
    And the following is not possible on the 6502/10 either:
    add globalvar, cx ; cl standing in for the 6502 X register
    adc globalvar + 1, ch ; ch standing in for the 6502/10 Y register
    would be only 8 bytes in contrast to the former 12 bytes, even though on the 8086/88 the opcode byte is followed by a mod-reg-rm byte, at least in the usage above.
    One should also not forget the suitability of the bit widths, i.e. what nice things one can do with 8, 16, 32, 64 bits available:
    Bit width : what it can hold in everyday use
    1 : nothing at all... everything has to be done serially...
    2 : same, really...
    4 : uh, well, pfff... mini instruction codes, hex digits and such... :-)
    8 : text characters, coarse color codes, largely intelligible audio samples, ...
    16 : memory addresses, 16-bit Unicode, house numbers, people’s weights, ...
    32 : amounts of money on a private/small-business scale (+ pennies), intermediate values in signal processing, ...
    64 : is already enough for almost everything
    Things only really get interesting at 32 bits (beyond C64 games and garage door controllers), and 128 would probably almost be too much again, apart from the ever-growing memory addresses and some astronomical calculations or, God forbid, the world computer, which probably cannot have enough bit width, if at all...
    I have drifted off a bit in the footnote... oh well.
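
The footnote's point that an 8-bit CPU needs two add-with-carry steps for one 16-bit addition can be modelled in C++. This is only a sketch under simplifying assumptions: Adder8 and add16_bytewise are names made up here, the model covers binary mode only, and all flags except carry are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an 8-bit add-with-carry, in the spirit of the
// 6502 ADC instruction.
struct Adder8 {
  bool carry = false;

  std::uint8_t adc(std::uint8_t a, std::uint8_t operand) {
    unsigned sum = a + operand + (carry ? 1u : 0u);
    assert(sum <= 0x1FF);                    // 0xFF + 0xFF + 1 is the largest sum
    carry = sum > 0xFF;                      // carry out of bit 7
    return static_cast<std::uint8_t>(sum);   // keep the low 8 bits
  }
};

// Adds two 16-bit values the way an 8-bit CPU has to do it:
// low bytes first, then high bytes plus the carry from the low bytes.
std::uint16_t add16_bytewise(std::uint16_t x, std::uint16_t y) {
  Adder8 alu;
  alu.carry = false;  // corresponds to a CLC before the first ADC
  std::uint8_t lo = alu.adc(static_cast<std::uint8_t>(x & 0xFF),
                            static_cast<std::uint8_t>(y & 0xFF));
  std::uint8_t hi = alu.adc(static_cast<std::uint8_t>(x >> 8),
                            static_cast<std::uint8_t>(y >> 8));
  return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

For example, `add16_bytewise(0x00FF, 0x0001)` yields 0x0100: the low-byte addition overflows, and the carry is picked up by the high-byte addition. A 16-bit CPU does the same work in a single instruction.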


    • To summarize the information a bit: there are at least 6 possible ways of programming the Commodore 64:

      1. Forth

      2. Assembly language, which means the usage of the 6502 instruction set plus zeropage

      3. C compiler, PASCAL compiler

      4. BASIC Interpreter

      5. Crosscompiler cc65

      6. C++ with crosscompiler and instruction converter

      The good news is that with today’s knowledge it is easier to investigate each possibility in detail. The sad news is that it is unclear which of them is the perfect one. My guess is that option 1 (Forth) is the advanced way of programming the machine. But I’m not sure how exactly Forth differs from an assembler.

      As far as I know, an assembler converts mnemonics into machine code. But does that mean that the CPU is hardwired at runtime like an FPGA chip? I mean, what is the difference between hardware and software?
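
What "converting mnemonics into machine code" means can be shown with a toy assembler in C++. It is a sketch only: the Instruction format and the assemble function are invented for this example and cover just three 6502 instructions (LDA immediate, STA absolute, RTS). The output is nothing more than a sequence of bytes in memory. The CPU itself is not rewired; it fetches these bytes and decodes them with its fixed circuitry.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One source line, already split into mnemonic and operand
// (hypothetical format; "LDA#" means LDA with an immediate operand).
struct Instruction {
  std::string mnemonic;
  std::uint16_t operand;  // ignored for instructions without an operand
};

// Translate a few 6502 instructions into their machine-code bytes.
std::vector<std::uint8_t> assemble(const std::vector<Instruction>& program) {
  std::vector<std::uint8_t> bytes;
  for (const Instruction& ins : program) {
    if (ins.mnemonic == "LDA#") {        // LDA immediate: opcode $A9 + 1 byte
      bytes.push_back(0xA9);
      bytes.push_back(static_cast<std::uint8_t>(ins.operand));
    } else if (ins.mnemonic == "STA") {  // STA absolute: opcode $8D + address,
      bytes.push_back(0x8D);             // low byte first
      bytes.push_back(static_cast<std::uint8_t>(ins.operand & 0xFF));
      bytes.push_back(static_cast<std::uint8_t>(ins.operand >> 8));
    } else if (ins.mnemonic == "RTS") {  // RTS: opcode $60, no operand
      bytes.push_back(0x60);
    } else {
      assert(false && "mnemonic not covered by this sketch");
    }
  }
  return bytes;
}
```

Assembling `LDA #$01`, `STA $D020`, `RTS` (which sets the C64 border color) gives the six bytes A9 01 8D 20 D0 60. That byte sequence is the software; the hardware stays the same for every program.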

