Benchmarking Forth

That Forth is a superior programming model is obvious. But what exactly is the economic benefit of using Forth in real applications? That is very hard to estimate. Let us first look at a case in which Forth offers no advantage and even slows the computer down. Anyone who runs gforth on a normal x86 computer will notice that an algorithm runs slower than the same algorithm built with a C compiler. If a program written in C and compiled to machine code needs 100 seconds, the same algorithm implemented in gforth takes at least 500 seconds on the same hardware. So it makes no sense to switch to Forth for performance reasons alone.

Now we repeat the experiment with a setup that is fairer to Forth. We build a dedicated Forth chip in an FPGA and test its performance. Reportedly, the pure Forth version is 6 times faster than the same algorithm written in C and compiled to machine code. That is a lot: if the compiled version takes 100 seconds to run, the Forth version would take only about 17 seconds. And it is possible to extend the advantage further. Because Forth chips are easier to develop than an x86 CPU, the factor of 6 can serve as a mere starting point for a more efficient CPU that uses a parallel architecture, an optimizer and so forth. This would be a long-term goal, because such advanced CPUs are not available today.
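The two ratios quoted above can be checked with a few words of integer arithmetic. This is only a sketch for a standard Forth such as gforth. The word names slowdown and speedup are made up for this illustration, and the factors 5 and 6 are simply the figures from the text, not new measurements.

```forth
\ Sketch only: the word names are hypothetical, the numbers come from the text.

\ Runtime on a slower system: C runtime times the slowdown factor.
: slowdown ( seconds factor -- seconds' )  * ;

\ Runtime on a faster system: C runtime divided by the speedup factor,
\ rounded to the nearest second (add half the divisor before dividing).
: speedup ( seconds factor -- seconds' )  dup 2/ rot + swap / ;

100 5 slowdown .   \ gforth on x86: prints 500
100 6 speedup .    \ Forth chip in an FPGA: prints 17
```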

So much for the benefit of Forth, but there are also disadvantages. First, a Forth engine can only compute what a normal C program can compute. Both are Turing-complete machines. That means a poorly written algorithm in Forth is no improvement over the same algorithm in C. The second problem is that no good tutorials are available, and the books that describe the inner workings of Forth are hard for beginners to understand. The third problem is that programming in Forth is more difficult than programming in C. Around the C programming language there is a huge library of ready-to-run programs developed over many years, for example operating systems, compilers and user applications. In theory the same is possible with Forth, but it would have to be built from scratch. Past efforts to promote Forth, for example the “ACM SIGForth newsletter”, were not very successful, and today’s Forth communities, like SVFIG in Silicon Valley, are unknown to the computing mainstream. It is doubtful that Forth can really decrease costs; perhaps in the long-term future, but not today.

Jupiter Ace

Judged by the raw numbers the Jupiter Ace is amazing; it is technically a superior machine. The Ace failed commercially for three reasons: software, software and software. After the machine was turned on, only a Forth interpreter was shown. A user who wanted to do something useful with it had to program the system in Forth. The other option was to buy commercial software written in Forth, which was not available. So it is no surprise that nobody wanted to buy the Ace. Other home computers of that time showed a BASIC interpreter after power-on, which is very easy to program because it has variables, and commercial software was available for those willing to spend money.

Today the problem is the same. From a technical point of view and from a computer science perspective, systems like the GA144 chip are dream machines. They are Forth in hardware and run extremely well. But here, too, the software is missing. Commercial software in Forth is not available, and programming an operating system from scratch is out of reach for most users.

Let us look at a remarkable hybrid system, the “Java Optimized Processor” (JOP). It is a stackmachine designed to run the Java virtual machine. Java is a language widely used in mainstream computing, and the JOP runs it faster. Surprisingly, the JOP was also a failure; nobody is interested in the chip. The problem is that apart from Java sourcecode the CPU cannot run any other program, for example one written in C++. Writing a C++-to-Java compiler is a hard task. It is the same problem as writing a compiler for a Forth CPU: the two systems are very different. A C compiler usually needs lots of registers, which a stackmachine does not provide.

Let us go a step back. What is the problem? After turning a computer on, the user wants to do something with the machine, for example play a game or surf the Internet. That is only possible with software. A quick look at GitHub suggests that about 99% of all programs are written in C, C#, C++ or Java and only 1% or less in a stack-based programming language. Why is the amount of sourcecode in Forth not higher? This has to do with education. Programmers are not familiar with Forth because it isn’t taught at university. So the general question is: how do we write software in Forth?

Running software that is already written is easy. There are many Forth machines and lots of Forth CPUs out there, but without sourcecode they are not useful. As far as I know, there is no commercial company like Microsoft that programs operating systems, games and office applications in Forth. And even Forth experts, who can write sourcecode in this language, prefer other languages for writing code.

Let’s look in detail at why programming in Forth is hard. The first fact is that a Forth program has as many lines of code as a C program. Forth is a high-level language for implementing an algorithm. If a jump’n’run game takes 1000 lines of code in C++, then the same game also takes 1000 lines in Forth. The reason is that all the routines, like GUI, key input and game engine, must be present in both programs. Reducing the number of lines is possible in theory, but the average programmer is not able to do so. Even though Forth is a powerful language, it has no magic trick for expressing the same thing in fewer lines of code.

A famous Forth game called “Darkstar” is available on the Internet, and it is very well programmed. If somebody ports it to C, the resulting sourcecode has the same length. That means that in terms of lines of code, Forth and C are equal. Somebody who is fluent in Forth would need the same amount of time to code the game. This is perhaps the main reason why most programmers do not use Forth: the language gives them no advantage. And in case of doubt, Forth programming is more complicated because it is different.

The elegance and simplicity of Forth is its biggest weakness. Programming in Forth means programming a computer. Most users don’t want this. They want to program software, that is, to program a virtual machine that exists in their imagination. They use the C language to communicate with that machine: not with a real CPU or a real computer, but with an interface provided by C/C++. This interface is called structured programming or object-oriented programming. Doing without it is technically possible, but harder.

In a previous blogpost I compared Forth with a fixed-gear bike, and the comparison holds. A fixie is the superior bicycle, the state of the art in mechanics. But 99% of all people are not interested in fixed gear. They perhaps understand the difference, but they do not want to ride a bike, they want to get around. Some time ago they learned to use a bike, and that learning implied that every bike has a freewheel, so that the feet can rest on the pedals. Somebody may argue that they learned the wrong thing, but in reality it is not possible to learn riding on a fixed-gear bike. It is an economic problem, and in some countries it is forbidden to use fixies in real traffic.

Forth migration project

Let us talk about a Forth migration project. The Windows 10 operating system has around 40 million lines of code, mostly written in C and C++. If we want the same software in Forth, the same number of lines is necessary, which gives a Forth project of 40 million lines of code. The average programmer writes around 10 lines of code per day. This adds up to about 11000 man-years of programming. The average programmer costs 50k US$ per year, which makes a total of 0.55 billion US$. The new Forth project wouldn’t run very well on normal CPUs, so we would need new hardware, which also costs a lot of money. And what is the benefit? Right, there is no benefit, because the new Forth operating system would have the same problems as today’s operating systems.
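The estimate can be reproduced with integer arithmetic in a standard Forth such as gforth. This is a sketch, and all word names are made up for it. One assumption is needed that the text does not state: the figure of about 11000 man-years only comes out if every one of the 365 days of a year counts as a working day.

```forth
\ Sketch only: word names are hypothetical, the figures come from the text.

40000000 constant windows-loc     \ lines of code in Windows 10
10       constant loc-per-day     \ productivity of one programmer
365      constant days-per-year   \ assumption: every calendar day is a working day
50000    constant usd-per-year    \ cost of one programmer per year

: person-days  ( -- n )  windows-loc loc-per-day / ;
: person-years ( -- n )  person-days days-per-year / ;
: total-usd    ( -- n )  person-years usd-per-year * ;

person-years .   \ prints 10958, roughly 11000 man-years
total-usd .      \ prints 547900000, roughly 0.55 billion US$
```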

I have described two major bottlenecks above. First, Forth sourcecode needs the same number of lines as C sourcecode, and second, the productivity of a programmer is the same at 10 LoC/day. To my knowledge, there is no way to overcome these problems. Writing more lines of code per day is not possible; in reality, programming in Forth would be a bit slower. And it is not possible to get the same software with fewer lines of code. It is true that today’s Forth programs are very small, but not because they are so well written. It is because their authors have no time to program bigger applications with GUIs and similar things.

I see the future of Forth not in the commercial sector but in education. It makes sense to use Forth for teaching about compilers, stackmachines and Turing machines, perhaps in the same way Pascal was used for creating the PL/0 virtual machine. That means Forth is well suited to a clean environment in which the details are left out. But I doubt that it is possible to use Forth in huge projects for programming games or operating systems.


Definition of “Unix processor”

A “UNIX processor” is a CISC CPU, for example the Motorola 68000. In contrast to LISP CPUs like the “TI Explorer” or Forth CPUs like the J1, it has a very complex design with lots of registers, stackframes and microcode. UNIX CPUs are sometimes called “compiler friendly” because it is easy to write a C or C++ compiler for that hardware. A C++ compiler, for example, needs many registers, which the Forth language and LISP do not need. The machine code of CISC CPUs usually follows a two-address scheme, which means that a typical assembly instruction looks like “mov $04, $02”.

Was Tanenbaum right?

In 1992 Tanenbaum started a flamewar on Usenet under the title “Linux is obsolete”, in which he criticized the Linux kernel as outdated. The mainstream media has come to the conclusion that Tanenbaum was right, because Linux is programmed in the C programming language and runs only on UNIX processors. The main problem with the CISC architecture is that it is something that has to be overcome. For decades it was a nice tool, but a modern computer system looks different. Also, the technique of implementing Linux in C looks like amateur work compared to what the Forth community is doing.

Nothing against the Linux/Unix project, it is a wonderful community, but it belongs to the past. It slows development down. The future looks different from the Harvard architecture. The future is based on stackmachines programmed in Forth only. What exactly the future of computing will look like is unclear, but it is important to determine which people are leading the revolution and which are not. Linux is too slow for future computing; it is a failed project. Linux has lost the race. In contrast, the Forth community is in front. Forth is, according to its name, the 4th computer generation, which is better than the previous one. It is a new kind of hardware, software and programming style with increased productivity.

Multitasking in Forth

While trying to improve my knowledge of Forth, I came across a topic that can be called controversial: “Multitasking in Forth”. In mainstream programming languages and CPUs multitasking is no big thing. CISC CPUs like the Motorola 68000 have support for it built in, and C compilers and Unix operating systems support the feature. Multitasking is one of the things that make sourcecode big and programming in C so interesting.

But we do not want to program in C. Implementing multitasking in Forth seems a bit more complicated. What I’ve found so far is this statement:

“Multi-tasking needs one stack per task (eh, two in Forth: a data and a return stack)”,

Which is obvious, because after executing the parallel task the CPU wants to go back to the original task. The problem is that the textbook model of a two-stack pushdown automaton has only the standard pair of datastack and return stack. That means the total number of stacks is 2, not more.

Let us look at how the Forth community realizes multitasking. They have the word PAUSE, which is written in assembly language and saves the state of the registers. The sourcecode of PAUSE does what one would expect: it saves the current stack to memory and switches the task. The PAUSE word works only on Forth systems that run on conventional register CPUs such as x86, because such hardware has enough memory to save the stack. But we don’t want to run Forth on CISC CPUs, we want a Forth CPU.

How multitasking can be realized there is not clear. As far as I know, the minimal J1 Forth CPU has no memory, only the datastack and the return stack. How can multitasking be realized there?

The sourcecode of the PAUSE command in Mecrisp suggests that the stacks are saved in a variable in memory.

Let us investigate how variables are used in Forth. Writing to a memory cell is done with “23 i !”, and reading the variable i is done with “i @”. What we need to do is to save the stack to a variable: “ i !”. Then we jump to the parallel task.
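The store and fetch operations, and the idea of parking the stack contents in memory, can be tried out in a standard Forth such as gforth. This is a toy sketch and not the way Mecrisp or any other system implements PAUSE. The names counter, saved-stack, save-stack and restore-stack and the buffer size of 16 cells are made up for the illustration. The variable is called counter rather than i, because i is normally the loop index word in Forth.

```forth
\ Sketch only: names and buffer size are hypothetical.

variable counter          \ a variable must be declared before use
23 counter !              \ store 23 in the memory cell
counter @ .               \ fetch it again: prints 23

\ A single cell cannot hold a whole stack, so the copy goes into a buffer:
\ cell 0 holds the number of saved items, cells 1..15 hold the items.
\ There is no overflow check for more than 15 items.
create saved-stack 16 cells allot

: save-stack ( i*x -- )            \ move the whole data stack into the buffer
  depth dup saved-stack !          \ remember how many items there are
  0 ?do  saved-stack i 1+ cells + !  loop ;

: restore-stack ( -- i*x )         \ push the saved items back, oldest first
  saved-stack @
  begin dup 0> while
    dup cells saved-stack + @ swap 1-
  repeat drop ;
```

After “1 2 3 save-stack” the data stack is empty, and “restore-stack” brings 1 2 3 back in the original order.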

The multitasking feature is usually part of a real-time operating system. That means we need a Forth-based RTOS that runs on a stackmachine.


We can now describe what multitasking looks like in Forth. First, there is no physical stack, only a section of RAM that is reserved as a stack. 10 tasks running in parallel need 10×2 stacks. If each stack is 10 cells deep, we need around 200 cells of memory. Which of the stacks is in use right now is determined by the stackpointer. A context switch is done with the PAUSE word, which moves the stackpointer to another position.
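The bookkeeping behind this layout can be modelled in a standard Forth such as gforth. The sketch only computes addresses and a task number; it does not switch the real stacks of the running Forth system, which would need system-specific words. All names are made up for the illustration, and the numbers are the ones from the text.

```forth
\ Sketch only: all names are hypothetical, the numbers come from the text.

10 constant num-tasks         \ tasks running in parallel
2  constant stacks-per-task   \ datastack and return stack
10 constant cells-per-stack   \ depth of each stack

: stack-cells ( -- n )  num-tasks stacks-per-task * cells-per-stack * ;
stack-cells .                 \ prints 200

\ The stacks are just a reserved section of RAM.
create stack-area  stack-cells cells allot

\ Start address of the datastack and return stack region of a task.
: task-datastack   ( task# -- addr )
  stacks-per-task * cells-per-stack * cells stack-area + ;
: task-returnstack ( task# -- addr )
  task-datastack cells-per-stack cells + ;

\ The "stackpointer" of this model: the number of the running task.
variable current-task   0 current-task !

\ Round-robin context switch: only the task number changes, no data is copied.
: switch-task ( -- )  current-task @ 1+ num-tasks mod current-task ! ;
```

Calling “current-task @ task-datastack” after each “switch-task” gives the region the next task would use, which is the pointer adjustment described above.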

Can we still call this setup a stackmachine? I have doubts. It is a normal CISC-based system that uses lots of cache. Implementing this on a typical Forth CPU like the J1 is not possible, because the J1 has only two physical stacks and no access to a larger amount of RAM for creating virtual stacks.

In reality, most Forth programmers do not use Forth CPUs. Instead they have conventional register-based CPUs, like the one in an Arduino, which they program in Forth. So they can either use the context switch of that CPU or program their own PAUSE command in Forth. But this setup won’t work on a J1 CPU because, as I mentioned above, task switching amounts to using different stacks for each task.

Game-loop based multitasking

A possible alternative to real multitasking is a concept used in game programming. Usually there is a main loop that iterates at 30 frames per second. To call a function once every second inside the game loop, an if statement is used: “if (frame%30==0) then subtask”. That means the game loop stops and the subtask is executed. Running this program results in a multitasking-like workflow: two programs appear to run at the same time.

Surprisingly, there is no context switch. New data is simply stored on top of the old stack, because the subtask is called like a normal routine.

: sleep ms ;
: main 10 0 do i . 1000 sleep loop ;

The Forth code is a for loop (the game loop) that calls the subroutine sleep. From a certain point of view, both routines work in parallel.
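The frame counter check described at the start of this section can also be written directly in a standard Forth such as gforth. This is a sketch: the names subtask, subtask-runs and game-loop are made up, there is no real timing, and the subtask only counts how often it was called.

```forth
\ Sketch only: names are hypothetical, no real frame timing is done.

variable subtask-runs       \ counts how often the subtask was called

: subtask ( -- )  1 subtask-runs +! ;

: game-loop ( frames -- )
  0 subtask-runs !
  0 ?do
    i 30 mod 0= if subtask then    \ every 30th frame, i.e. once per second
  loop ;

90 game-loop  subtask-runs @ .     \ 3 seconds of frames: prints 3
```

The subtask is an ordinary call from inside the loop, so it runs on the same datastack and return stack as the loop itself.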

Stackmachines are hidden register machines

A look at the problem of context switching on stackmachines reveals a remarkable fact. In reality, all so-called stackmachines use a stack buffer, an area in memory that is referenced by a pointer. When the operating system switches between tasks (context switching), the pointer to the stack buffer is adjusted, so the new task gets its own datastack and return stack. What does that mean in practice? It means that real Forth CPUs have far more than a single datastack and return stack; they have a cache with many stacks.

The literature sometimes says that stackmachines are in reality normal register machines, and here is the reason why. If a stackmachine really provides only two stacks, for data and for return addresses, the machine is not capable of a context switch. Let us investigate the problem on real Forth implementations. Nearly all x86-based Forth virtual machines implement the PAUSE command. PAUSE uses the capabilities of a register CPU for context switches: it saves the current stack to memory and retrieves the new one. That is one of the reasons why Forth is slow.

If Forth is used not on x86 machines but on stackmachines, context switching works differently. There are two options. Either the programmer never notices the problem because he runs only one program at a time, or he needs context switching, and then the underlying CPU must support it. How exactly this is done is a bit like voodoo magic. But in reality, Forth CPUs like the GA144 have such a feature integrated, which means they are not really stack computers but register machines that call themselves stackmachines.

Apart from real machines, we can study the problem in theory. Suppose we construct a two-stack pushdown automaton with pen and paper. This machine is able to run Forth, but it can’t switch between tasks. To do so, we must first extend the pushdown automaton to a von Neumann architecture, which means it is no longer a Forth CPU. Only then can we switch between tasks.

What I want to explain is that stackmachines are an illusion. It is not possible to build them in reality.

Limits of Forth

Contrary to public assumption, Forth is not as powerful as it looks. To describe the weaknesses of Forth it helps to observe projects dedicated to writing an operating system in Forth. A good example I’ve found is a project called “Forth OS”. Its feature list shows everything that Forth usually does not have and that was implemented in Forth OS:

– Floating point

– Harddrive access

– Graphics support

– Multitasking

– USB access

– Audio output

The Forth OS project is great. The user can easily draw a line on the screen and can even start many programs at once. But all the features are implemented in sourcecode. To get access to them, the Forth OS software has to be downloaded first, which is around 1.4 MB in size and consists of many assembly programs. This contradicts the story of Forth as a small and beautiful language; instead the user gets a MenuetOS-like bloatware system. And it is very likely that future iterations of the Forth OS project will increase the file size further. So in 2 years the occupied disk space will perhaps be around 3 MB.

Let us compare Forth OS with a normal Forth. But what is a normal Forth? Perhaps retroforth, gforth or VFX Forth? In reality, most Forth projects are more than a two-stack machine; they can be classified as operating systems with an integrated language. Where is the difference between Forth and a mini operating system written in C? I don’t know, but let us investigate a minimalistic Forth. Does this incarnation have a multitasking feature or high-resolution graphics? No, it doesn’t. That is why the Forth OS project was founded: to extend Forth with the needed features.

I’m not sure what this tells us about Forth in general, but it seems that a minimalistic, Zen-like Forth works only in theory, as long as no real computer is used. In reality, most Forth users run Forth on top of an existing operating system, install mega-IDEs like VFX Forth, or write a Forth OS like the project cited above.

Let us investigate the sourcecode in detail. The Forth OS project has around 200 kB of assembly code for graphics, multitasking, USB and all the other features. In the comments such a system is called “Forth”, but in reality it is more an operating system written in assembly that has an integrated Forth scripting language. The first question is: has the maintainer of the project done something wrong? I don’t think so. The code looks efficient and it works great. I do not see how to get the same features with less sourcecode.

It seems that a demand for many features results in a huge operating system. That is the case not only for Forth but for other projects too. I think the problem is something else. Descriptions of Forth usually present a minimalistic stackmachine that can be implemented in around 2 kB of ROM on tiny computers. But I doubt that such a Forth is useful in reality. Real Forth projects like gforth, Forth OS and VFX have more features than just dup and +, and as a consequence the occupied disk space is much higher. I would guess that a good Forth project that does something useful is nearly the same as a C project for programming an operating system. That means there is no difference between C and Forth, except that C programmers know from the beginning that their system is bloatware and needs a register machine.

Let us describe the vision behind Forth. The idea of Chuck Moore was to program tiny software of not more than 500 bytes. And indeed it is possible to write Forth code of that size. But the same is true for assembly, Brainfuck or C. If the aim is to program a real application, then Forth has no advantage: it uses 1 MB or more, like a mainstream programming language.

Is it fair to describe Forth as a stack-based programming language? I think this description is wrong. Stack-based programming languages are something taught in computer science courses at university. They are based on a theoretical computer, the pushdown automaton with two stacks, which is indeed very simple and elegant. But realizing this machine in practice is not possible. A Forth operating system for beginners is around 1 MB in size, and a professional Forth IDE around 10 MB. And in most cases such systems don’t even run on Forth CPUs; they need Intel x86 machines.

The misconception is the newbie’s hope that he can program a minimalistic Forth CPU without an operating system, using only the language itself, and that this results in highly efficient code. This assumption is not practical. In reality there are no Forth languages, only Forth operating systems: projects like retroforth, gforth or eforth.

Let us take a look at other Forth implementations. They are usually 500 kB in size. Is that a mistake? Are the Forth environments in reality much smaller, with most of the sourcecode being documentation? No, the only thing that is wrong is the comparison between reality and vision. The vision of Forth is that it is something no bigger than 1 kB, or at most 2 kB with add-ons. The reality is that existing Forth projects are similar to operating systems. They are mostly written in assembly language, and after booting them on a PC or a microcontroller, a Forth-like language is used only as a user interface, like the UNIX shell.

Let us compare Forth with Unix. Is Unix equal to the Bourne shell? Does Unix consist of pipes connected together and for loops for iterating? No, that is only the shell. Unix in reality is an operating system that is at least 200 kB in size, and much bigger in newer versions. The same is true for Forth. A real Forth like the bigforth project consists of around 50 MB of sourcecode. Forth as a user interface and shell is only a small part of it.

Let us look up the definition of a shell in Wikipedia. A shell is a command-line interpreter for executing external programs. For example, somebody can call the grep tool or the df program. A shell without external tools is nothing. The combination of shell and programs is called an operating system. Now let us look at the definition of Forth. The mainstream definition is that Forth is a concatenative, stack-based programming language. This definition is wrong. Forth is only the command-line interpreter in a much bigger Forth operating system that consists of a multitasker, a graphics interface and a harddrive driver.

Let us describe what we can do with the Forth language itself. If it is executed under a normal operating system, the Forth language can access existing C routines. If the Forth interpreter is executed inside a Forth operating system, it can execute the code written there. But if the Forth language runs alone, without an operating system, it can execute nothing. What the user can do is extend a bare-metal Forth into an operating system, but then he too will end up with an operating system that is around 1 MB in size and contains subroutines for graphics, harddrive and GUI.

Advantages of Forth

The literature about Forth is huge. The main question for a beginner is why he should look at it at all if he can already program fluently in C/C++. The question can be answered on an academic level. Let us start our journey to Forth with a Turing machine.

In the busy beaver challenge, the Turing machine consists of a tape, which is the memory of the computer, and a program, which is stored in the head of the machine. The head moves along the tape and reads 1s and 0s. So far, so good. Data is separated from code: the data is on the tape, and the next instruction comes from the head.

But what if we want to store data and program on the same tape? Then we get a so-called “two-stack pushdown automaton”. According to the figure at the top of the blogpost, the tape contains not only the data on which the program operates but also the program itself. The program is written in a Forth dialect: it pushes 3 onto the stack, then 4, and then adds the two. The contrast to a normal Turing machine is that there is no separate program stored in the head; there is only one tape, which holds all the information.
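The small program described here can be typed into any standard Forth such as gforth. The sketch wraps it in a word with the made-up name three-plus-four so that the state of the stack can be noted after each step.

```forth
\ Sketch only: the word name is hypothetical.

: three-plus-four ( -- n )
  3      \ stack: 3
  4      \ stack: 3 4
  + ;    \ stack: 7

three-plus-four .   \ prints 7
```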

Lost at Forth? Maybe boolean algebra is the answer

Many people have problems understanding the Zen of Forth. They are not convinced that a minimal computer needs a huge 10-byte stack or even a program counter. What they need is a smaller device that is easier to program. Perhaps boolean algebra is the right direction for them. It is not a computer; it is something that can be extended to a computer. The main difference is that boolean algebra can’t be programmed; instead it works like a neural Turing machine with hardwired logic. A good introduction to the topic provides a diagram of a full adder and (very important) also the boolean expression for that device.
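The boolean expression of a full adder can be tried out in a standard Forth such as gforth. The sketch uses the usual textbook form, sum = a xor b xor cin and cout = (a and b) or (cin and (a xor b)), which is not necessarily written the same way in the introduction mentioned above. The word name full-adder is made up, and the inputs are assumed to be single bits, 0 or 1.

```forth
\ Sketch only: the word name is hypothetical, inputs must be 0 or 1.

: full-adder ( a b cin -- sum cout )
  >r 2dup xor        \ a b a^b               (cin waits on the return stack)
  dup r@ xor         \ a b a^b sum           sum = a xor b xor cin
  swap r> and        \ a b sum (a^b)&cin
  2swap and          \ sum (a^b)&cin a&b
  or ;               \ sum cout              cout = (a and b) or (cin and (a xor b))

1 1 1 full-adder . .   \ prints the carry, then the sum: 1 1
1 0 1 full-adder . .   \ prints the carry, then the sum: 1 0
```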

Explaining memory with boolean logic is a bit more complicated. One reference shows on page 13 a diagram of the SR flipflop. The idea is that the flipflop starts in state 0, and a certain input signal (1 0) brings it to 1. To flip the device back to zero, another input signal (0 1) is used. The concept behind this is called a “sequential circuit with a clock”.
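The behaviour of the flipflop can be modelled in a standard Forth such as gforth. The sketch is a software model, not a gate-level circuit: the names q and sr-step are made up, the clock is not modelled (each call of sr-step counts as one step), and the input (1 1), which the text does not cover, is treated as a reset here as an assumption.

```forth
\ Sketch only: names are hypothetical, the clock is not modelled.

variable q   0 q !          \ stored bit, starts in state 0

: sr-step ( s r -- )
  if  drop 0 q !  exit then \ r=1: reset (also taken for the 1 1 input)
  if  1 q !  then ;         \ s=1: set; the input 0 0 keeps the old state

1 0 sr-step  q @ .          \ set: prints 1
0 0 sr-step  q @ .          \ hold: prints 1
0 1 sr-step  q @ .          \ reset: prints 0
```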