Tools for robotics programming

Recently, the number of posts dedicated to robotics has been small. The reason was the transition from German to English and, as a result, the lower overall output. Writing in English is still a bit slower for me than writing the same text in German. So I have an excuse for why the main topic of this blog, artificial intelligence, is currently in the background.

Today I want to remedy this with an introduction to general techniques for programming a robot. I am not sure whether the following tools are widely known, but repetition is always good. I want to start with a survey of programming languages. To make it short: C++ is the best programming language. The syntax is easy to read, the execution speed is extremely high, the language can be used for low-level and high-level tasks, and, last but not least, the object-oriented style is fully supported. I admit that it is not easy for beginners to learn. A typical tutorial is rather long and covers everything a C++ expert needs, from bit manipulation up to STL templates. For most beginners such a tutorial is perhaps a nightmare, and I understand everybody who prefers Python over C++. On the other hand, it is possible to use only a subset of the language that is similar to normal Python code, so that in practice even newbies will see little difference between the two languages.

Here is a short example of a minimal C++ program which looks friendly:

#include <string>
#include <vector>
#include <SFML/Graphics.hpp>

class GUI {
public:
  void run() {
    window.create(sf::VideoMode(800, 600), "SFML");
    while (window.isOpen()) {
      sf::Event event;
      while (window.pollEvent(event))
        if (event.type == sf::Event::Closed)
          window.close();
      out();
    }
  }
private:
  void out() {
    window.clear(sf::Color::White); // clear the frame
    guimessage.push_back("selecttool " + selecttool);
    window.display();
  }
  sf::RenderWindow window;
  std::string selecttool = "left";
  std::vector<std::string> guimessage;
};

int main() {
  GUI mygui;
  mygui.run();
  return 0;
}
I have not tested the code extensively and perhaps it will produce errors, but at first impression it is no more complicated than Java, Python or C#. At the bottom is the main function, and above it a GUI class which consists of some methods. Thanks to SFML, most routines for drawing windows are already implemented, so it is possible to describe C++ as an easy-to-learn programming language. Building a first application from scratch is no more complicated than with any other language.

What makes programming hard is not the syntax of a concrete language but the correct usage of external libraries. For example, if not SFML but a 3D framework is needed, plus a physics engine and some kind of animation routines, the result would be more complicated software. But that is also true for non-C++ languages. In conclusion, my advice for novices is to switch to C++ as fast as possible, because there is no language out there which is more powerful.

I have read some online discussions about what the best programming language is. Most arguments, for example in “Perl vs. Python”, are valid, and it is fun to follow such discussions. One detail in these comparisons is remarkable: until now, I have never read a comparison between C++ and another language in which C++ ended up at a disadvantage. Sometimes the discussion is framed as “C++ vs. C#”, but the arguments for C# are weak. Even hardcore C# users, in the context of Unity3D, are not so bold as to claim that C# is superior. So it seems that C++ is still the queen of all languages, and every other language accepts this. I would say it is 99.9% certain that in the next ten years nobody will start a thread in which he seriously doubts the dominance of Bjarne Stroustrup’s brainchild.

Apropos the most dominant alpha tool: another piece of software which is known as nearly perfect is the version control system Git. Even programmers who use something different are secretly fans of it. And it is not only used in Linux environments; under macOS and Windows, too, Git is the most powerful software out there. Explaining its benefits is not as easy as it looks. I would describe the software as a backup tool in which every snapshot gets a comment in natural language. Especially the multi-user features of Git make it indispensable for every serious programmer.

After this short introduction to programming software in general, now comes the part which has to do with robotics. What is robotics? The best answer to this question was given by Brosl Hasslacher and Mark W. Tilden. Their essay “Living machines” from 1995 is the definitive introduction to the subject. What they left out is a detailed description of the Nv neurons which control the BEAM robots. An Nv neuron looks similar to a positronic brain in Star Trek; it is a kind of science fiction which is not clearly defined in science. Some people say a nervous network is a neural network like the LSTM network invented by Jürgen Schmidhuber; others say it is a robot control system like ROS from Willow Garage. Perhaps the answer is somewhere in between?

One thing is sure: with a neural network plus a robot control system plus a robot development environment it is possible to master robotics. That starts with walking robots, continues with flying robots, working robots and household robots, and ends with social robots which look very cute. The discipline as a whole is fascinating because it is so widespread. Realizing a robot consists of endless subtasks which span all academic topics: mechanics, electronics, programming, language, the humanities, algorithms, human-brain interfaces and even biology are all necessary. So robotics and artificial intelligence form a meta-science which consists of everything. In the 1950s this ambition went under the term cybernetics, which was a synonym for everything and nothing at the same time. Today’s robotics has the same goal in mind, but this time it works.

At the end I want to reveal what dedicated newbies can do who have no experience with robotics or programming but want to create something. A good starting point is to program an aimbot with the AutoIt language. How this can be done is explained in YouTube clips. The interesting aspect is that aimbots are normally not recognized as real robots, but they are. I would call them highly developed examples of artificial intelligence, because it is possible to extend an aimbot easily into an advanced robot.


In computing, and especially in artificial intelligence, the dominant form of feedback has negative connotations. That means what a programmer strives for is a bug or an error; it is the only way to learn. Writing a program normally means producing a compiler error. And if the compiler says the program is working, then you need at least one feature problem, so that the programmer can write a bug report. Or, to describe the dilemma colloquially: nobody at Stack Overflow wants to read about a working project; what the guys there are interested in is a robot which doesn’t walk, a for-loop which doesn’t work, or a problem which is unsolved.


Recent progress in robotics

A search query at Google Scholar for the well-known “LSTM neural network” shows that in the last one to two years the number of papers has exploded. More than 10k papers were published, and perhaps even more, because some of them are behind paywalls. But LSTM isn’t the only hot topic in AI; another interesting subject is “language grounding”. Both topics combined realize nothing less than the first working artificial intelligence. This kind of software is capable of controlling a robot.

But why is the community so fascinated by LSTM and language grounding? First, the LSTM network is the most developed neural network to date; it is more powerful than any other neural network. LSTMs are not as exact as manually programmed code, and some problems, like finding prime numbers, are difficult to formulate with LSTMs, but for many everyday problems like speech recognition, image recognition and event parsing, LSTM is good enough. An LSTM is not the same as a neural Turing machine, so it is not a wonder weapon for solving all computer science problems, but it is possible to do remarkable things with it.

The main reason why LSTM networks are often used together with language grounding is that with natural language it is possible to divide complex problems into smaller ones. I want to give an example: if, in a research project, a robot should be trained with a neural network to grasp a ball, move around an obstacle and put the ball into a basket, the project will probably fail, because it takes too many training steps and the problem space is too big for finding the right LSTM parameters. But with language grounding it is possible to solve each task separately. The network must only learn how to grasp, how to move and how to ungrasp, and then the modules can be combined. Sometimes this concept is called multi-modal learning.

Another side effect is that the system, even if it was trained with neural networks, remains under the control of the human operator. Without any natural-language commands, the network does nothing. Only if the operator types in “grasp” is the neural network activated. So the system is not a truly autonomous one which must be stopped with the red emergency button; instead it can communicate with the operator via the language the LSTM network has learned. That makes it easier to find problems. And if one subtask is too complex to master with an LSTM network, that part can be programmed manually in normal C++.
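As a minimal sketch of this idea (all names and the stubbed sub-policies are my own invention, not code from any actual research project): each typed command activates exactly one separately trained module, and an unknown command leaves the robot idle.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical sketch: each natural-language command triggers one
// separately trained sub-policy (stubbed here as plain functions
// that just report what the real module would do).
std::string graspPolicy()   { return "closing gripper"; }
std::string movePolicy()    { return "driving around obstacle"; }
std::string ungraspPolicy() { return "opening gripper"; }

// The robot stays idle until the operator types a known command.
std::string dispatch(const std::string& command) {
    static const std::map<std::string, std::function<std::string()>> policies = {
        {"grasp",   graspPolicy},
        {"move",    movePolicy},
        {"ungrasp", ungraspPolicy},
    };
    auto it = policies.find(command);
    if (it == policies.end())
        return "idle";  // no command, no action
    return it->second();
}
```

The point of the design is that the human stays in the loop: the dispatch table, not the network, decides when a learned policy runs.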

“Beating Atari with Natural Language Guided Reinforcement Learning”, 2017

In the last paper (Beating Atari) a project is described which is capable of solving the game “Montezuma’s Revenge”, which in former deep-learning projects was not solvable by AI. What the researchers did is combine a neural network with language grounding, and voilà, they got a well-trainable and highly intelligent bot.

Open questions about Robothespian

The Robothespian robot is an amazing piece of hardware. It looks like “Titan the Robot”, except that no man is inside the machine. Robothespian cannot walk, but currently the engineers are working on a walking prototype. Perhaps they use the Simbicon controller or something similar to enable the gait pattern. But the main mystery of Robothespian is the question whether open hardware was used or not. As I remember, there is a video on YouTube which shows a fab where the Robothespian models were developed, and for two seconds an interesting machine was shown which can produce the Robothespian face shape out of a white liquid. If the machine has a name, perhaps “molecular assembler” is the right one.

Why is that so important? Because without 3D-printing the parts of Robothespian, the price tag of the final machine is very high. From the Atlas robot which was used in the DARPA Robotics Challenge, it is known that the hardware alone costs about 1 million US$. If it is possible to reduce this …

There are two open questions which cannot be answered with today’s information: 1. which software is used on Robothespian? 2. how much does the hardware cost?

If the software is advanced and the reproduction costs of the hardware are low, then the invention is not just called a robot; instead it is called a clone army. That sort of technology is capable of running not only without a human in the loop, but also without money in the loop.

Semi-autonomous Robot

It is unclear whether entertainment robots like “Nox the Robot” or “Titan the Robot” are real robots or not. But if it is really true that inside Nox there is no real person, then that person must be outside. Every good robot is remote controlled; there is no alternative, even if the technology is highly advanced and makes use of neuroevolution.

A simple example of a semi-autonomous entertainment robot is given in the video. It is a hobby project, a biped battle droid which can destroy balloons with a laser beam. And the important fact is that this is not an autonomous robot; instead a classical joystick is used.

At first impression it seems to make no difference whether the movements of the robot are preprogrammed or not. In parts of the video you see only the robot and not the human operator, so it seems not to matter whether there is a human in the loop or not. The difference is that without a human, the robot’s behaviour would not be robotic. That is a paradoxical way to express the fact that a good robot acts like a human would.

The hypothesis is that at first a semi-autonomous control must be used, and every other technology decision follows from that. Why? Because in the worst case it is not possible to create sophisticated software which fulfils the project goals, and if the remote control is missing, then a lack of artificial intelligence will result in a failed project. On the other hand, if a human-in-the-loop is used by default, the system will work under every condition, even if some parts of the electronics break or the software is not perfect.

In a second step it is useful to improve the software up to a given level. Easy programming means using “scripting AI”, which results in behaviour like that in the video with the laser bot. The algorithm is called “procedural animation”, which means that the motion trajectories for the feet are calculated by a PID controller. For higher development needs, the procedural-animation system can be replaced by something capable of learning, like neuroevolution. The overall design of the robot stays identical: a human controls the robot. But this time the gait pattern is not given by a programmer; it is evolved as a neural network.

The main reason for preferring a semi-autonomous robot is that you get a “fake robot” by default. Like this one here:

In that video a toy robot, called Batman, is used. It is absolutely low-tech; some people would call it “crap”. But this toy robot works. That means it is possible to put the robot in a scene, to fake something, to build an illusion. If the remote control is broken, the robot itself is broken; the illusion of a “real” robot is gone. Which means it is possible to stage a robot.

This is a little bit complicated to understand, I know. But without the possibility of using a fake robot (which means a robot that is not truly autonomous) there is no real robot. Real means that the robot does not act like a machine which is programmed; real means that the robot acts like a person.

The difference is that with a remote-control system it is possible to accomplish complex tasks (doing something useful, reacting to the environment and much more). An example of a teleoperated biped robot is given here:

The main idea of that video is that it is not a real robot but a fake robot, which means that this model is teleoperated. But, on the other hand, the robot is socially accepted as a real robot, which means that the other persons try to speak with the system, and they know that the system will understand them. There is another interesting aspect of robotics in the video: one person tries to kick the robot. If this robot were only a toy made of electric wires, then the idea of kicking it would make no sense, because robots have no feelings; robots don’t know what kicking means. But under the precondition that it is a fake robot, “kick the machine” makes sense: the operator understands the gesture.

If somebody kicks a real autonomous robot with a sophisticated AI written in Lisp, nothing will happen. The machine will be broken, and that’s all. It is only a joke to say that the robot will one day kick you back. Even if the robot understands that it was kicked by a human, it will never have the goal of taking revenge. But if a remote-controlled robot is kicked by a human, even if the robot is not very advanced, it may eventually kick back one day. The result is that kicking a robot is, under some circumstances, a social act which is like a real crime. Not in the sense that a person was kicked, but in the sense that it is a hate crime against a real person.

Actroid Robots running a neural network

The Japanese Actroid robots are also known as Kokoro’s dream. The latest model is the Actroid DER3, which can be seen on YouTube. The accompanying website looks like the “Hubo Market” from “Real Humans”. The remaining question is: what software is running on these machines?

The answer is simple: the same software which also runs on HRP-4C, on HRP-2, on Honda’s Asimo and on iCub, namely a cognitive architecture. It is a combination of many different neural networks like MTRNN, CNN, SOM, SOINN and many more. An example of this kind of artificial intelligence can be downloaded from SourceForge under the name “Aquila 2.0 Cognitive Robotics”. The software package is around 1 MB in size.

How powerful such software is can be verified by some demonstrations. The HRP-3 robot, for example, is capable of using tools; other machines can resist pushes from the side while continuing to walk.

How exactly the Aquila 2.0 software must be configured to get it running on an Actroid DER3 robot is unclear. There is no documentation, but probably the neural network must first be trained with the help of an Nvidia GPU cluster.

Thoughts on the new Atlas robot

The new video from Boston Dynamics is out; on the official channel alone it has been viewed more than 7.8 million times, and the number is still rising. Added to that are the numerous views of the copies which have been uploaded to the internet since yesterday. Nearly all newspapers, and the tech portals anyway, report on the robot and mention that it can now also do a somersault. I always thought NASA had some magical way of getting people to watch its videos, but Boston Dynamics is apparently even better at generating attention.

But let us look a bit more at the content. What we see is a manoeuvre which was described, at least as a computer animation, in a paper from 1993: “M. H. Raibert: Animation of legged maneuvers: jumps, somersaults, and gait transitions”. Unfortunately Boston Dynamics does not tell us how it was solved in software. But from the available information one can guess: probably the ROS software was used (lidar for environment perception), plus a behaviour planner like the one used in the JACK software. That is hand-programmed software with which textual commands can be translated into animations. It contains commands for every detail of the domain, including repositioning the foot for balance control, jump preparation, jump execution and so on. My thesis is that Boston Dynamics reprogrammed the sequence in software and then executes it in real time on the hardware.
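My thesis about such a behaviour planner can be sketched in a few lines (this is pure speculation on my part, not Boston Dynamics’ actual code; the primitive names are invented): a textual command expands into a fixed sequence of motion primitives which the hardware then executes in order.

```cpp
#include <string>
#include <vector>

// Speculative sketch of a hand-programmed behaviour planner:
// one textual command is expanded into an ordered list of
// motion primitives, each handled by its own low-level controller.
std::vector<std::string> planBackflip() {
    return {
        "shift-feet-for-balance",  // reposition feet for balance control
        "crouch",                  // jump preparation
        "jump",                    // jump execution
        "tuck-and-rotate",         // the actual somersault
        "land-and-balance",        // absorb impact, restore balance
    };
}
```

The executor would then simply walk through this list in real time, which matches the impression that the whole routine is scripted rather than learned.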

In the headline of this blog post I linked another YouTube video which does not show a highly developed robot but instead deals with the sport of parkour. It is a form of acrobatics, a mixture of strength and skill, which regularly causes astonishment when performed by professionals. I would love to explain at this point that robots will never reach this level, but the experience with the chess computer in 1997 and the IBM Watson demonstration in 2011 has shown that computers can beat humans in general. So it is only a question of time before robots are better at parkour than the reigning human athletes. I don’t think we really want to know what that will look like. At some point there will be YouTube videos of that, too.

But what does this mean on a philosophical level? That, too, is easy to say. The singularity discourse did not start yesterday; there is an extensive body of literature which pursues exactly this question of what happens when machines become smarter than humans. Smarter means that computers play chess better, walk better and are presumably also better scientists than humans ever were. In the case of a walking and jumping Boston Dynamics robot, the societal consequences are still relatively manageable. Such robots could be used to deliver pizza or in disaster scenarios. It becomes really exciting, however, when the robots are shrunk to nano size. That would open up completely new fields of application which do not exist at all today. Nanorobotics, in turn, opens up the field of medical robotics.

Constructing a simplified biped robot

While experimenting with an animated walking robot, I noticed something: the switch from a car to a walking robot is not as simple as first thought. One possible intermediate step would be to mount a wheel on the chassis and let the robot push itself forward with one leg, but that does not look exactly cool. What about constructing a fish instead of a walking robot? And voilà: in Box2D you only need to reduce gravity to 0 and wiggle the leg, or rather the fin, a little, and you get a robot which has limbs but is still simple to control. The main advantage is that under water you cannot lose your balance. In the worst case the robot swims upside down, but it cannot really tip over like an insect on land.

Let us look at a swimming motion in detail. At first glance it resembles a walk cycle: there is a rhythmic pattern. The difference is that this pattern works more simply. For a biped robot on land you need, in addition to the walk cycle, a footstep planner and a balance planner. Back then Marc Raibert started with a one-leg hopping robot and improved it step by step. But even a one-leg hopping robot is considerably more complicated than a wheel-driven robot, because, as noted, it is not enough to execute a uniform movement; you also have to do optimization. A swimming robot, on the other hand, is technically only marginally more complicated than a wheel-driven robot. Usually it is enough to turn a propeller uniformly in a circle, but you can also construct the robot like a rowing boat, where you have a uniform, monotone movement but use an outrigger or a fin. You thus have the rare combination of inverse kinematics plus the simplicity of a normal monotone pattern.
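The monotone swim pattern can be sketched in a few lines (a minimal illustration of the idea, assuming a single fin joint; amplitude and frequency values are arbitrary): the fin angle is just a sine oscillation over time, with no footstep planner or balance planner needed.

```cpp
#include <cmath>

// Minimal sketch of the monotone swim pattern: the fin angle
// (in degrees) is a plain sine oscillation over time. Compare
// this with a land biped, which would additionally need footstep
// planning and balance control.
double finAngle(double t, double amplitudeDeg, double frequencyHz) {
    const double kPi = 3.14159265358979323846;
    return amplitudeDeg * std::sin(2.0 * kPi * frequencyHz * t);
}
```

Feeding this angle to the fin joint each physics step is already enough for forward propulsion in a zero-gravity Box2D world, which is exactly why the swimming robot sits between the wheeled and the walking robot in difficulty.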

In general, the difficulty increases in this order: wheel-driven robot, swimming robot with propeller, swimming robot with fin, walking robot on land.