The virtual machine interpreter in “Karel the robot”

The “Karel the robot” software is an interesting project. It was developed in the 1980s and many clones have been made since. The original program was written in Pascal, but versions are available in JavaScript, C#, C and many other languages. In recent years not only 2D versions have been implemented; a 3D version exists as well. But what exactly is “Karel the robot”? It is more than a teaching environment for learning programming, it is an environment for Artificial Intelligence.

Let us focus on the inner core of the software, the so-called virtual machine interpreter. That is a module which takes a user-created program and executes it step by step. The interesting feature is that the VM can be stopped while the “Karel the robot” application itself keeps running. It is a kind of box within a box. Perhaps we should take a step back first.

A game engine is known from implementing computer games. A game engine takes a command from the player, for example “jump”, and updates the internal state of the game. That means the engine reacts to the command by adjusting the absolute coordinates of the player and redrawing the player sprite on the screen. What a game engine cannot do is execute longer programs. So there is a difference between a game engine and a virtual machine. A virtual machine can be seen as a full-blown interpreter: it accepts a single command, but it can also execute a complete sequence of commands. The VM has an instruction pointer, a data stack and a return stack. The return stack is needed for jumps and subroutine calls in the program.
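The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration only; the opcode names and program layout are assumptions, not code from any actual Karel implementation:

```python
# Minimal sketch of a VM with an instruction pointer, a data stack
# and a return stack. Opcode names are illustrative assumptions.

def run(program, max_steps=100):
    ip = 0        # instruction pointer
    data = []     # data stack (unused in this tiny demo, kept for completeness)
    ret = []      # return stack, needed for subroutine calls
    trace = []    # records the robot commands as they execute
    for _ in range(max_steps):
        if ip >= len(program):
            break
        op, arg = program[ip]
        if op == "MOVE":          # a robot primitive
            trace.append("move")
            ip += 1
        elif op == "CALL":        # jump to a subroutine
            ret.append(ip + 1)    # remember the return address
            ip = arg
        elif op == "RET":         # return from a subroutine
            ip = ret.pop()
        elif op == "HALT":
            break
    return trace

# Address 0 calls the subroutine at address 2, which moves twice.
prog = [("CALL", 2), ("HALT", None),
        ("MOVE", None), ("MOVE", None), ("RET", None)]
print(run(prog))   # ['move', 'move']
```

Because the loop advances one instruction per iteration, single stepping falls out naturally: run the loop body once per user action instead of inside a for loop, and the VM can be paused while the surrounding application keeps running.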

“Karel the robot” doesn’t provide only a game engine but a sophisticated virtual machine interpreter, and this makes the software unique. There is a difference between a VM and an Artificial Intelligence: a VM needs a program, while an Artificial Intelligence can run on its own. Or to put it more directly: the program which is fed into the VM is the Artificial Intelligence. That means, if the user of “Karel the robot” has written a sophisticated control program, he has created an AI. He can start the program in the interpreter and the robot will move on the screen. The user’s program can also fail: if the program doesn’t make sense, the robot will not reach the goal in the maze. The question is now: what does the program which controls “Karel the robot” look like?

This is not answered by the project; figuring it out means learning something about Artificial Intelligence. If somebody is able to deliver a program that solves a task, he has understood what Artificial Intelligence is. And that makes “Karel the robot” an interesting game.

While reading through the literature, I found many promising techniques for creating a program which controls Karel. The first one is manual coding: the programmer types in some commands and observes the result, then modifies the code until the robot does what it should. But there are many other alternatives for creating new programs, for example AI planning, hierarchical reinforcement learning, genetic programming or partial-order planning; the list is endless. A general technique for developing working code for “Karel the robot” is to focus on the vocabulary list. That means reading carefully through the existing commands and inventing new, more powerful ones. If such a list of words is available, it is much easier to put them together into a working program, either manually or by automatic planning. In contrast, if the grammar consists only of low-level actions like “moveleft, moveup, movedown, moveright”, it is harder to build a powerful program on top of these words.
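The point about vocabulary can be made concrete with a small sketch. The primitives and the grid model below are invented for this illustration; the idea is that a single powerful word like goto replaces long sequences of low-level moves:

```python
# Sketch: building a powerful vocabulary word from low-level actions.
# The primitives and the grid model are assumptions for illustration.

pos = [0, 0]

def moveleft():  pos[0] -= 1
def moveright(): pos[0] += 1
def moveup():    pos[1] += 1
def movedown():  pos[1] -= 1

def goto(x, y):
    """A new, more powerful vocabulary word composed of primitives."""
    while pos[0] < x: moveright()
    while pos[0] > x: moveleft()
    while pos[1] < y: moveup()
    while pos[1] > y: movedown()

# With only low-level words, reaching (3, 2) takes five explicit steps;
# with the richer vocabulary it is a single word:
goto(3, 2)
print(pos)   # [3, 2]
```

A planner that searches over sequences of goto calls has a far smaller state space to explore than one that searches over sequences of single-cell moves.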

Graph grammar

What I want to explain is that the user has to invent his own “Karel the robot” mini-language. He takes the predefined grammar and extends it with new action commands. In the figure, the default grammar is shown as a graph structure; it contains action and sensor words. Sometimes the grammar is called an API (application programming interface) because it provides a library of functions. But the term API is focused on the implementation in source code; the literature calls it a graph grammar because it defines a language. The words of the language can be used together to solve problems: they are put into a program and extended by default keywords like if, for, while and goto. As mentioned above, the number of words in the grammar determines how powerful the grammar is. A more complex grammar allows more advanced programming.


Why “Karel the robot” is the best introduction to Artificial Intelligence

According to its self-description, “Karel the robot” was invented as a programming game to help students become familiar with languages like Java. But in reality, hardly anybody has problems writing programs in Java, Forth or C, because programming itself consists only of some well-known statements which are put together into functions and classes. The reason why “Karel the robot” is interesting lies elsewhere. First and foremost, it is a domain-specific language for robot control. In its standard package, Karel comes with predefined words like move, turnleft and stop. But the user can create additional subfunctions, for example “walk-on-a-line” or “walk-in-a-cycle”. The ability to extend the vocabulary with new functions is the best-practice method for realizing artificial intelligence. The concept can be extended to control any kind of robot, for example biped walking robots or pick&place grasping arms.
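Such subfunctions might look like the following sketch. The primitives mimic Karel’s move/turnleft, but the implementation is an assumption for illustration, not the real API of any Karel clone:

```python
# Hypothetical Karel-style primitives; real clones differ in their API.
log = []

def move():     log.append("move")
def turnleft(): log.append("turnleft")

def walk_on_a_line(steps):
    """New vocabulary word: move straight ahead for a number of steps."""
    for _ in range(steps):
        move()

def walk_in_a_cycle(side):
    """New vocabulary word: walk a closed square of the given side length."""
    for _ in range(4):
        walk_on_a_line(side)
        turnleft()

walk_in_a_cycle(2)
print(len(log))   # 12 commands: 4 sides of 2 moves plus 4 turns
```

The new words are ordinary functions, which is exactly why the vocabulary-extension approach carries over to any robot whose primitives can be wrapped in function calls.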

Let us take a step back and describe the idea behind “Karel the robot”. It is an environment for running a program, together with a list of predefined motion primitives. The interesting aspect is what Karel is not: it is not a classical AI. That means, after the game is started, nothing happens on its own. It is only the IDE for creating a program, and the user has to decide which statements are put into the file and in which sequence they run. In other words, the user can create the AI from scratch.

The famous SHRDLU program by Terry Winograd can be seen as a predecessor of “Karel the robot”. Like Karel it provides a text-to-animation interface. The difference is that SHRDLU is not able to run a complete program but works only in interactive mode: the user can enter a single command. In Karel he can enter a complete program including loops, which makes Karel the more powerful environment. The limitation of Karel is that the world is restricted to a 2D maze; no physics engine runs in the background and no 3D projection is available.

Which kind of technique can be used to realize a concrete robot program? It consists of two parts: a vocabulary list and the program itself. The vocabulary list is what computer scientists call a grammar. It is a list like “movetotarget”, “moveleft”, “opendoor” and so on. Upon this grammar a computer program has to be constructed. At first glance this seems a complicated way to realize a robot, but in reality it is surprisingly easy. It is a kind of interactive computing: the user needs to open the Karel simulator and try some examples to become familiar with the technique.
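The two parts, vocabulary list and program, can be sketched directly. The words and their effects below are invented for illustration; the pattern is a simple dispatch table:

```python
# The grammar as a dispatch table: each word maps to an effect on the world.
# Words and state variables are illustrative assumptions.
state = {"x": 0, "door_open": False}

vocabulary = {
    "moveleft":     lambda: state.update(x=state["x"] - 1),
    "movetotarget": lambda: state.update(x=5),
    "opendoor":     lambda: state.update(door_open=True),
}

def run_program(words):
    """The program itself is just a sequence of vocabulary words."""
    for word in words:
        vocabulary[word]()      # an unknown word would raise a KeyError

run_program(["movetotarget", "moveleft", "opendoor"])
print(state)   # {'x': 4, 'door_open': True}
```

Adapting the grammar to a new domain then means nothing more than replacing the entries of the dispatch table.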

Solving simple tasks, like a robot walking through a maze, can be done with a small vocabulary list of not more than 20 words. If the task is more complex, for example a soccer-playing robot which works in a team, the word list needs to be larger. Creating such a vocabulary is the core task in robot programming: the grammar has to be adapted to a certain domain.

Artificial Intelligence with informed search methods

Algorithms in Artificial Intelligence can be separated into uninformed search, which amounts to blind brute-force search in the state space, and informed search. Uninformed algorithms are easier to implement; well-known examples are breadth-first and depth-first search, while techniques like A* and RRT already add heuristic or sampling-based guidance. The best example is a TicTacToe solver which tests all the possibilities a player has and then determines the best decision.
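The TicTacToe example can be sketched with a plain minimax that enumerates every continuation, which is the textbook form of uninformed, exhaustive search:

```python
# Sketch of uninformed search: a TicTacToe solver that enumerates
# every possible continuation (plain minimax, no heuristics).

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for i, j, k in WIN_LINES:
        if board[i] != " " and board[i] == board[j] == board[k]:
            return board[i]
    return None

def minimax(board, player):
    """Value of a position: +1 if X forces a win, -1 for O, 0 for a draw."""
    w = winner(board)
    if w == "X": return 1
    if w == "O": return -1
    if " " not in board: return 0
    other = "O" if player == "X" else "X"
    scores = []
    for i, cell in enumerate(board):
        if cell == " ":
            board[i] = player            # try the move ...
            scores.append(minimax(board, other))
            board[i] = " "               # ... and undo it
    return max(scores) if player == "X" else min(scores)

def best_move(board, player):
    other = "O" if player == "X" else "X"
    best, best_score = None, None
    for i, cell in enumerate(board):
        if cell == " ":
            board[i] = player
            score = minimax(board, other)
            board[i] = " "
            if best is None or (score > best_score if player == "X"
                                else score < best_score):
                best, best_score = i, score
    return best

# X to move with two in a row: brute force finds the winning square.
print(best_move(list("XX OO    "), "X"))   # 2
```

Note that the solver contains no knowledge about TicTacToe beyond the rules; it simply tries everything, which is exactly why the approach stops working when the state space grows.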

If the aim is to solve only toy problems like TicTacToe or chess, uninformed search works fine. All that is needed is a fast CPU and a simple game, and the solver will generate the plan on its own. The problem arises when the available CPU isn’t very fast and the game is more serious, for example a robot control system. There, blind brute-force search fails.

The better alternative is informed search, better known as heuristic search. Unfortunately this topic isn’t researched very well because it is more complicated. Explaining how a brute-force search over a graph works is easy, but explaining how to formalize domain knowledge in an algorithm is much harder. A good starting point for creating an informed sampling algorithm is a domain-specific language. In some literature this is called a mini-language, and a famous example is “Karel the robot”: a programming environment in which a robot walks through a maze. The interesting features are twofold. First, an integrated runtime environment is there, so the user can run his program in single-step mode. And second, a robot-behavior language is available which contains words like “walk”, “takeobject”, “moveleft”. This robot language is a domain-specific language: it doesn’t consist of assembly commands or generic “print and goto” statements, but of words that have to do with the domain of robot movement.
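A small sketch shows what “informed” means in practice. The maze and the heuristic below are invented for this example; instead of expanding cells blindly, the search always expands the cell that a piece of domain knowledge, here the Manhattan distance, estimates to be closest to the goal:

```python
import heapq

# Sketch of informed (heuristic) search: greedy best-first in a grid
# maze, using Manhattan distance as domain knowledge about the goal.

def greedy_best_first(start, goal, walls, width, height):
    def h(p):                          # the heuristic: a distance estimate
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), start, [start])]
    seen = {start}
    while frontier:
        _, (x, y), path = heapq.heappop(frontier)
        if (x, y) == goal:
            return path
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls and nxt not in seen):
                seen.add(nxt)
                heapq.heappush(frontier, (h(nxt), nxt, path + [nxt]))
    return None                        # goal unreachable

path = greedy_best_first((0, 0), (4, 4), {(1, 1), (2, 2)}, 5, 5)
print(path[0], path[-1])   # (0, 0) (4, 4)
```

The only difference to blind search is the priority queue ordered by h; that single line of domain knowledge is what steers the search toward the goal instead of flooding the whole grid.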

The reason why informed search is so powerful has to do with reducing the state space. If the available options are meaningful, even random actions make sense. Let’s take an example. A random generator creates a plan in the “Karel the robot” mini-language and comes up with the sequence: walk, movenorth, takeobject, walk. This sequence has to do with the task. Even if it is not correct and guides the robot in the wrong direction, the action sequence is at least about the task. In contrast, an action sequence like “print "hi", goto 10, print "world", for i in range(2)” has nothing to do with moving a robot in a maze. Like the first example it is a randomly generated program, but it is far away from a desired robot movement. The reason is that in the second example the underlying language is wrong: actions like “print” and “goto” are not domain-specific.
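The argument can be made concrete with a tiny sketch (the word lists are illustrative assumptions): any random program drawn from a domain vocabulary is at least about the task, while programs drawn from generic words never are.

```python
import random

# Random programs over two vocabularies. Every plan drawn from the
# domain words concerns robot movement; plans from the generic
# words never do. (Word lists are illustrative assumptions.)

DOMAIN_WORDS = ["walk", "movenorth", "takeobject", "moveleft"]
GENERIC_WORDS = ['print "hi"', "goto 10", 'print "world"', "for i in range(2)"]

def random_plan(words, length, seed=None):
    rng = random.Random(seed)           # seeded for reproducibility
    return [rng.choice(words) for _ in range(length)]

plan = random_plan(DOMAIN_WORDS, 4, seed=42)
print(plan)   # a four-word sequence drawn only from the domain vocabulary
```

A random-restart or evolutionary search over DOMAIN_WORDS therefore samples only candidate robot behaviors, which is the state-space reduction the paragraph above describes.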

Informed search algorithms are created with a domain-specific language. This language is used to create PID controllers and other high-level functions. These functions are arranged, manually or by a solver, into a program. The programming technique is similar to STRIPS planning systems but more flexible. The idea is that without domain-specific knowledge it makes no sense to try to solve a problem: as long as no mini-language or STRIPS description is available, the solver can’t be started. Informed search means abstracting the problem instead of trying to solve it on the first attempt.

Definition of Artificial Intelligence

A modern definition of AI would go in the direction of an intelligent machine. But the classical definition from the 1950s is more interesting. At that time, Artificial Intelligence was not a technology but a philosophical debate. The main idea was that ordinary people had no access to a computer and no access to written papers about Artificial Intelligence, so there was a need to communicate to the public what AI is. Defining AI in the 1950s was done by withholding information. For example: “I have heard that somewhere computers are available. How these wonderful machines work internally is unclear. These machines are able to think. The programmers call it Artificial Intelligence. How this thinking works technically is a secret; only a few people worldwide know about it.”

For sure, this is not a definition of AI; it is more like a sermon from a priest to the population, and in the 1950s this was quite common. As mentioned in the beginning, most people had never seen a computer and would believe anything, and an AI software running on top of a computer went far beyond the imagination of the ordinary man. So there was a need to mediate between both sides. That means, in the 1950s AI was communicated in a certain way without mentioning the details. The technology of that time was not very advanced: normal IBM mainframe computers were used, and Lisp software which was able to search a game tree ran on top of these machines. The interesting aspect is that these details were not communicated in the 1950s. Instead the public debate around AI was dominated by a different kind of language, a language for people who were unfamiliar with computers.

The definition of AI depends on the target audience. If we want to explain to somebody who is not familiar with computers what Artificial Intelligence is, we would use a certain vocabulary. If the target audience are programmers who can already write software but have no experience with robotics, the definition would be different. The most accurate definition is created if the debate is held within a peer group, that is, if one AI expert explains to another AI expert what AI is. Such definitions are rare, because the number of AI experts who debate the details in public is small.

The last AI Winter was right

The last big AI winter started at the beginning of the 1990s. It was a counter-movement to the AI boom of the 1980s, notably the rise of LISP machines and Prolog. The AI winter since the early 1990s can be summarized as: Artificial Intelligence didn’t work, LISP machines are broken, and AI has to be ignored. From today’s perspective this was the best decision ever. Let us take a look at what the AI experts of the 1980s wanted to sell to the customer.

The AI prophets of the 1980s wanted to make clear that the future would consist of expert systems, neural networks and machine learning. With newly developed programming systems like Prolog it would become possible to create a new type of software able to do intelligent things. This is the general idea behind the publications of the late 1980s in which LISP machines were marketed to the customer. The only sensible reaction to this effort of promoting AI was the AI winter: the customers declared everything that had to do with AI as unwanted and tried out something less developed.

The interesting decision of the early 1990s was that the mainstream computing business took a step backward. Instead of LISP machines and Prolog compilers, the customers bought more traditional tools: 32-bit CPUs, GUI tools in C++ and computer games without any Artificial Intelligence inside. The AI technology of the early 1990s was much more highly developed, but the customers decided they were not interested.

Let us observe what the outdated technology of the 4th computer generation can do. With Unix servers it is possible to host websites, with C++ GUIs it is possible to program business applications, and with Turbo Pascal it is possible to educate people in structured programming. None of these techniques has anything to do with neural networks and Artificial Intelligence. It wasn’t the 5th generation but only the 4th. But in summary it was a powerful toolset which started the internet revolution and the expansion of the home-computer market. So-called non-AI technology is mainstream today: a standard PC which can do everything except thinking is widely accepted as a useful tool. The rise of computer technology in the 1990s and 2000s was the result of ignoring Artificial Intelligence. The customers and the large tech companies focused on less advanced but useful standard computers. The problems they solved had nothing to do with expert systems and Prolog but with normal computer programming, for example how to connect a printer to an operating system, how to program a small computer game, and how to use MS Word for writing a letter.

Since the 1990s, Artificial Intelligence has been the elephant in the room: everybody knows about it, but it is ignored. If we take a look at recently published books, and at what companies are doing all day, we will find that it has to do with computer hardware, software and programming languages, but not with Artificial Intelligence. Mankind collectively ignored the Artificial Intelligence thing and set different priorities.

Since the 1990s, the AI topic has been put in last place, while other topics like ordinary databases, internet connectivity and GUI applications got the most attention. They were discussed in universities, they were requested by the customers, and the economy was built around them. From today’s perspective this was a great decision, because it allowed computing in general to move forward. AI would have needed too much energy. In the 1990s it was not possible to advance normal computing and robotics at the same time. If resources are limited, a decision has to be made about what is more important, and putting Artificial Intelligence in last place was a great decision.

Ray Solomonoff vs. Lego Mindstorms

Ray Solomonoff (1926-2009) was a mathematician, computer scientist and Artificial Intelligence visionary. His working area was probability theory, which includes inductive inference. He was involved in university education and gave many lectures at conferences.

For the younger generation, Solomonoff is hard to understand. His mathematics comes from a time before the invention of the first computers. He was fascinated by analog computers, and most of his work was done on a theoretical basis without using any computer. And perhaps the most interesting aspect is that an alternative to the writing and thinking of Solomonoff is possible. The opposite is simply called Lego Mindstorms. It also has to do with mathematics and Artificial Intelligence, but takes a different approach. The main idea behind Mindstorms and robotics for students is that no previous knowledge of stochastics is needed; what is needed to start programming is the robot itself and a laptop to program the device.

I would call the approaches of Ray Solomonoff and Lego Mindstorms antagonistic. If Solomonoff goes north, Mindstorms goes south; after a while they don’t meet at the same point but move away from each other. From the perspective of computing history, Solomonoff and his ideas represent the past, while Lego Mindstorms and similar projects are the future. I wouldn’t call Solomonoff’s work wrong; it is simply no longer relevant for current education. From one perspective Lego Mindstorms is the same, because it has to do with thinking machines; at second glance, it is completely new.

But what exactly is the difference between Solomonoff’s induction theory and Lego Mindstorms robotics? The answer is that from the subject of Artificial Intelligence only the part is used which has to do with games and programming, and this part is used in the context of robotics. All the other parts, which have to do with understanding mathematics itself, with Greek symbols, literature references and theoretical models, are ignored. The funny thing is that game-based AI works great. A game is a kind of sandbox in which newbies and experts can try out what they want, and everything which is too complicated is not part of the game. The result is not only games but easy-to-understand games. And this allows non-mathematicians and non-scientists to become familiar with advanced technology.

AI discussion forums online

On the internet, many AI-related discussion forums are available: (4200 questions) (2200 questions) (5000 questions) (4000 questions)

All of these forums are part of the Stack Exchange network, but the questions are distributed across many sites. The main one is SE.AI, but the other forums also provide lots of content. On Stack Overflow itself, AI questions are mostly downvoted as off-topic. The problem is that AI topics require lots of academic background knowledge, and it is rare that a simple code snippet can answer the question. In the Cross Validated forum (stats.stackexchange) many AI-related questions are available; it is a kind of mainstream machine-learning forum. The tag ML has 11000 questions, neural-network has 4000 and deep-learning has 1800 questions.

Overall I would estimate that around 30k-50k questions are available in the Stack Exchange network. That is small compared with the 16 million questions on Stack Overflow itself, but it is still a huge amount of content.