The best way to understand Forth is to think about programming a virtual machine interpreter. A virtual machine (VM) can be realized in many ways. Such a system is needed whenever somebody wants to interpret a computer language. Usually the VM provides commands such as print, goto, for and if. The VM takes a program and executes it. Before it can do so, the VM needs some helper variables, notably an instruction pointer that references the current position in the program and a register set for storing return addresses and loop variables. The funny thing is that in most cases a VM is constructed according to the Forth principle, or at least ends up very similar to Forth. That means the VM provides the following elements:
– opcodes which can be called from the outside
– instruction pointer
– data stack, return stack
The VM interpreter gets the source code and executes the commands in linear order. After each step the instruction pointer is increased. If a jump to a subfunction is needed, the return address is stored on the return stack. That means a minimal virtual machine interpreter is equal to a Forth system.
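The elements listed above can be put together in a few lines of Python. This is only a minimal sketch and not a real Forth: the opcode names (push, add, print, call, ret, halt) and the tuple format of the program are assumptions made for the illustration.

```python
def run(program):
    """Execute a list of (opcode, argument) tuples and return the printed values."""
    data_stack = []     # operands and results
    return_stack = []   # return addresses of subfunction calls
    output = []         # values emitted by the "print" opcode
    ip = 0              # instruction pointer: index of the next instruction

    while ip < len(program):
        op, arg = program[ip]
        ip += 1                         # increase the pointer after each fetch
        if op == "push":
            data_stack.append(arg)
        elif op == "add":
            b = data_stack.pop()
            a = data_stack.pop()
            data_stack.append(a + b)
        elif op == "print":
            output.append(data_stack.pop())
        elif op == "call":
            return_stack.append(ip)     # remember where to continue afterwards
            ip = arg                    # jump to the subfunction
        elif op == "ret":
            ip = return_stack.pop()     # go back to the caller
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


# Hypothetical example program: the subfunction at index 5 adds two numbers.
program = [
    ("push", 4),       # 0
    ("push", 5),       # 1
    ("call", 5),       # 2: jump to index 5, return address 3 goes on the return stack
    ("print", None),   # 3
    ("halt", None),    # 4
    ("add", None),     # 5: subfunction body
    ("ret", None),     # 6
]
result = run(program)
print(result)
```

The whole interpreter is one loop over the program. The only state it carries is the instruction pointer and the two stacks, which is why such a VM ends up so close to Forth.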
The open question is what useful things can be done with such a virtual machine. The answer is that a VM is a powerful concept and a trivial concept at the same time. A VM by itself doesn’t solve a problem, it creates one. Suppose we have created a VM for executing a language. The follow-up problem is to construct the hardware for executing the VM. Without a computer, a VM can’t be executed. On the software side a VM also creates many new open problems. Suppose we have a VM which can handle a small program. What should be included in the program, and which subfunctions are needed? This question remains open as well; the VM itself can’t answer it.
Virtual machine interpreters in general and Forth in particular are not an answer to a problem, they are the question mark. Both are fascinating because it’s unclear what somebody can do with them. A VM is comparable to a C++ compiler. It’s possible to create anything or nothing with it: an operating system, a game or whatever. The difference between a C++ compiler and Forth is that Forth presents this problem more directly. That means if somebody writes a small VM in 100 lines of code, that person will have lots of trouble, because the next questions are about the hardware, the software, the algorithms and so on.
The main reason why Forth is ignored by mainstream programmers is that it doesn’t provide answers but new questions. For example, the Linux operating system provides lots of answers. It explains to the user how to use the hard drive and how to establish an internet connection. The Python programming language also provides lots of answers. It comes with a powerful library, object-oriented features and variable types which simplify programming. In contrast, Forth doesn’t provide anything. It’s the VM and nothing else. Forth is a kind of computer literacy test: is somebody able to program the operating system from scratch and invent the algorithms from scratch? Somebody who is not familiar with computers will be lost with Forth.
What we can say about Forth for sure is that it’s a kind of virtual machine definition. Any Forth consists of opcodes, an instruction pointer and a stack for the return address. A virtual machine is needed if a computer language should be executed. A program in such a language can’t run without a virtual machine of some kind.
It’s possible to create a virtual machine interpreter which isn’t Forth. The modification is to replace the stack with registers; everything else remains the same. And voilà, it can no longer be called a Forth system. The funny thing is that the performance of such a VM remains roughly the same, with only minor differences. Programming a register-based VM also looks very similar to programming a stack-based version. Let us give an example. A stack-based virtual machine would provide an opcode like:
4 5 add
The result will be 9, because both numbers are put onto the stack and then the add operator gets executed. Now the add opcode in a register-based virtual machine:
The syntax of the opcode looks more like the C programming language. As in the stack-based example, it’s possible to give the function parameters, but this time each parameter has a certain type. Parsing register-based opcodes is a bit more complicated, which means the parser needs more lines of code. In exchange, the programmer will have less trouble in daily use, because the syntax looks similar to Python and C. Another example makes the difference clearer. Suppose we want to give an opcode a parameter which isn’t a number but an array. In a minimalistic stack-based virtual machine, the opcode would look like this:
$1000 5 print
The virtual machine parser will read the stack and interpret the first parameter as the starting address of the string; then comes the length and last the opcode. After executing the command the string is shown on the screen. And now the print opcode in a register-based virtual machine which isn’t Forth:
print ‘hello’
This statement looks similar to Python syntax. It’s easier to write for the beginning programmer but harder for the virtual machine to parse. Writing a VM which is able to execute the second command is a harder task. It’s possible, but the number of code lines is larger. The reason why C and not Forth is the dominant programming language is that most programmers like the second syntax more.
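The difference in parser size can be made visible with a small Python sketch. The postfix line “4 5 add” is taken from the text. The opcode-first forms “add 4 5” and “print 'hello world'” are an assumed syntax, chosen only to show what a typed, opcode-first parser has to do. The function names run_postfix and run_prefix are hypothetical as well, and the sketch only compares the parsing, it doesn’t model registers.

```python
import re


def run_postfix(line):
    """Stack style: split on blanks and handle every token from left to right."""
    stack = []
    for token in line.split():
        if token == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            stack.append(int(token))    # everything else is a number
    return stack


def run_prefix(line):
    """Opcode-first style: tokenize, type the arguments, then dispatch."""
    # A quoted string has to stay one token even if it contains blanks.
    tokens = re.findall(r"'[^']*'|\S+", line)
    name, raw_args = tokens[0], tokens[1:]

    args = []
    for raw in raw_args:
        if len(raw) >= 2 and raw.startswith("'") and raw.endswith("'"):
            args.append(raw[1:-1])      # string literal without its quotes
        else:
            args.append(int(raw))       # integer literal

    if name == "add":
        if len(args) != 2 or not all(isinstance(a, int) for a in args):
            raise TypeError("add expects two integers")
        return args[0] + args[1]
    if name == "print":
        if len(args) != 1 or not isinstance(args[0], str):
            raise TypeError("print expects one string")
        return args[0]
    raise ValueError(f"unknown opcode: {name}")


print(run_postfix("4 5 add"))
print(run_prefix("add 4 5"))
print(run_prefix("print 'hello world'"))
```

The postfix version needs no tokenizer beyond split() and no type checks. The opcode-first version needs both, and the gap grows as more argument types are added.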
What we can say for sure is that the “print ‘hello’” statement is not allowed in Forth. There are two reasons against it. First, the print command has to come last and not first, otherwise the parser is not able to execute the correct opcode. Second, it is not allowed to put a string on the stack; instead a pointer to the string must be put on the stack. Neither constraint can be undone by better programming; they are built into Forth. The only way to bypass the bottleneck is to use a different kind of virtual machine interpreter, for example a register-based VM with a more elaborate parser which can handle strings as well.
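The pointer convention behind “$1000 5 print” can be sketched in Python as well. The flat memory array, its size and the helper name op_print are assumptions for the illustration. “$1000” is read here as the hexadecimal address 0x1000, which is how many Forth systems treat the $ prefix.

```python
# Flat byte memory of the hypothetical VM; the size is an arbitrary choice.
memory = bytearray(0x2000)

# Place the five characters of the string at address 0x1000.
text = b"hello"
memory[0x1000:0x1000 + len(text)] = text


def op_print(stack, memory):
    """Pop length and address, then return the bytes found there as text."""
    length = stack.pop()     # top of stack: number of characters
    address = stack.pop()    # below it: start address of the string
    return memory[address:address + length].decode("ascii")


# Equivalent of the line "$1000 5 print": two numbers first, the opcode last.
stack = []
stack.append(0x1000)
stack.append(5)
shown = op_print(stack, memory)
print(shown)
```

The stack only ever holds two integers. The string itself stays in memory, which is the constraint described above.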
According to Forth’s own self-definition, the language is extremely flexible because of the absence of any standard. In reality there is a standard, called minimalism. This principle forbids anybody from extending Forth with libraries, creating new libraries or writing a different kind of parser. I want to give an example. In theory it’s possible to transform a stack-based VM into one which can parse the “print ‘hello’” statement, so that the programmer doesn’t have to change his programming style. The reason is that any program can be created in Forth, so it’s also possible to create a Python interpreter in Forth. It would work like a normal Python interpreter. But in reality no such project has been done, because the result would no longer look like Forth but like Python. The same problem exists for object-oriented extensions to the Forth language. Many of them are available and allow source code to be written in OOP style. But in reality nobody uses mini-oop.fs and similar extensions, because they violate the standard that no standard is allowed.
The Forth rule can be explained with minimalism: anything is allowed which reduces complexity, but anything is forbidden which increases complexity. The open question is what can be done with Forth if everything that would make life easier is forbidden. The answer to the question is given by the Forth community. They are exploring things inside the Forth standard which look very esoteric but make sense from a certain point of view.