Grounding is the central problem in robotics

The screenshot shows a minimal walking robot. It was created in the Box2D physics engine and consists of a few boxes plus joints. Compared to other robots the system is not very complicated: only 4 joints and 5 boxes were used. That means the robot is not realistic compared to real dogs or real humans, who have far more joints and more ways of moving forward. On the other hand, controlling this simple device can be called an advanced challenge. How can it be that a simple articulated figure causes trouble when we want to control it with an algorithm? After all, four joints isn't that much.

Let us describe the situation from the bottom up. The robot was created with the Box2D primitives CreatePrismaticJoint and CreateDynamicBody. Both commands are well documented in the official API, and most game designers are aware of them. To control the joints, we have to set an attribute, for example joint1.motorspeed=-1.2. This feature is also documented very well. That means, from a technical perspective it is an easy task to send a signal to the joints and let the robot walk. But there is a simple problem: producing a walking gait is much more complicated than setting a joint to a certain speed. The joint speed is only the low-level perspective of the Box2D engine; it ignores what walking is. That means it ignores gait patterns, obstacles, dynamics in the air and animation technique. To use this kind of knowledge, something else is needed which isn't formalized in the Box2D primitives.

This “something else” is called grounding. It means connecting low-level actions with high-level natural language. On the low-level side we have the Box2D API, which provides four simple commands:

joint1.motorspeed=-1.2
joint2.motorspeed=-1.2
joint3.motorspeed=-1.2
joint4.motorspeed=-1.2

On the high-level side we have actions like “start”, “walk ahead”, “rotateleg1” and “stop”. Let us take a look at the Python source code to answer the question of how grounding works in software:

class Grounding:
  def __init__(self):
    self.box=Lookuptable({0:'bottom',1:'torso',2:'leftrotating',3:'rightrotating',4:'leftleg',5:'rightleg'})
    self.motor=Lookuptable({0:'leftrotating',1:'rightrotating',2:'leftspring',3:'rightspring'})
    self.legangle=Lookuptable({-90:'side',-5:'forward',0:'normal',5:'backward',90:'side'})
    self.contractleg=Lookuptable({-3:'longer',3:'shorter'})
    self.rotatespeed=Lookuptable({.2:'left',-.2:'right'})
  def get(self,group,value):
    # for example: self.get("legangle",-2.1)
    if group=="box": return self.box.inverse(value)
    if group=="motor": return self.motor.inverse(value)
    if group=="contract": return self.contractleg.inverse(value)
    if group=="legangle": return self.legangle.search(value)
    if group=="rotatespeed": return self.rotatespeed.inverse(value)

I have created a lookup table which stores word-value items. For example, jointid=0 describes the rotating joint on the left side of the robot. Sure, the Box2D engine doesn't need this information, but the programmer takes advantage of it. The lookup table also stores information about speed patterns: a value like “0.2” means rotating a leg to the left at normal speed. The second method in the Grounding class is used for retrieving the information. The programmer can access the class from the outside and make a request to it. A sample request would be: give me the box id for the rotating joint of the right leg. Or the user can ask the grounding class: the leg has an angle of -2 degrees, what is the natural language description of that value? All these questions can be answered by executing the self.get() method. The class works similarly to an SQL database.
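The Lookuptable helper used by the Grounding class is not shown in the listing. A minimal sketch of how it could work is given below; the method names inverse() and search() are taken from the calls in the Grounding class, while the implementation itself is my assumption:

```python
class Lookuptable:
  """Hypothetical helper for the Grounding class. Stores a mapping
  from numeric values to natural language words."""
  def __init__(self, table):
    self.table = table  # dict: numeric value -> word
  def inverse(self, word):
    # exact reverse lookup: word -> numeric value
    for value, name in self.table.items():
      if name == word:
        return value
    return None
  def search(self, value):
    # fuzzy lookup: return the word whose numeric key is closest
    key = min(self.table, key=lambda k: abs(k - value))
    return self.table[key]
```

With this sketch, asking for the description of a leg angle of -2.1 degrees returns “normal”, because 0 is the closest stored key.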

Even though the class doesn't look very complicated, it is the central part of a robot control system. A mapping between numbers and natural language can be used to create advanced robotics controllers, and it simplifies the programming task. Perhaps from a technical perspective the grounding class can be improved. It is also possible to create a dedicated SQLite table, because the posted source code looks a bit messy so far. But in any case a grounding layer is needed to manage the complexity. Let us go into the details.
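As a rough sketch, such a dedicated SQLite table might look like the following; the table layout, the column names and the get_value() helper are my assumptions, not part of the original program:

```python
import sqlite3

# Hypothetical SQLite variant of the grounding layer: one table maps
# (group, numeric value) to a natural language word.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE grounding (grp TEXT, value REAL, word TEXT)")
rows = [("motor", 0, "leftrotating"), ("motor", 1, "rightrotating"),
        ("rotatespeed", 0.2, "left"), ("rotatespeed", -0.2, "right")]
con.executemany("INSERT INTO grounding VALUES (?,?,?)", rows)

def get_value(group, word):
    # reverse lookup: natural language word -> numeric value
    cur = con.execute("SELECT value FROM grounding WHERE grp=? AND word=?",
                      (group, word))
    row = cur.fetchone()
    return row[0] if row else None
```

The advantage over the hand-written class is that standard SQL queries can be used for both lookup directions.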

Suppose no grounding layer is available in the program. If the user wants to let the robot walk, he has to enter the following statements:

joint1.motorspeed=0
joint2.motorspeed=.2
joint3.motorspeed=0
joint4.motorspeed=.2

That means, he sends low-level values to the joints and something will happen on the screen. In most cases the robot will not walk, so the user will change the values a bit. Such software works great from the technical side, but it is hard to maintain a larger amount of code lines, because after a while the programmer has no idea what “.2” on joint #2 means. Now, let us compare this with a grounded control system.

  def rotateleg(self,name,direction):
    boxid=self.mygrounding.get("motor",name)
    speed=self.mygrounding.get("rotatespeed",direction)
    self.setmotor(boxid,speed)
    time.sleep(.5)
    self.setmotor(boxid,0)
  def contractleg(self,name,direction):
    boxid=self.mygrounding.get("motor",name)
    speed=self.mygrounding.get("contract",direction)
    self.setmotor(boxid,speed)
    time.sleep(.1)
    self.setmotor(boxid,0)

This time the motion primitives have names in natural language. If the user wants to rotate a leg, he can enter rotateleg("leftrotating","left"). The advantage is that the user doesn't need to remember joint ids or speed constants; instead he formulates everything in Structured English (https://en.wikipedia.org/wiki/Structured_English).
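The named primitives can then be composed into a higher-level action like “walk”. The following is only a sketch of that idea: the Controller class, the logging setmotor() and the speed table are my assumptions, and the real program would drive the Box2D joints instead of appending to a list:

```python
class Controller:
    """Hypothetical controller that composes named motion primitives
    into a gait cycle. setmotor() only records commands here instead
    of driving Box2D joints."""
    def __init__(self):
        self.log = []
    def setmotor(self, name, speed):
        self.log.append((name, speed))
    def rotateleg(self, name, direction):
        speed = {"left": 0.2, "right": -0.2}[direction]  # assumed speed table
        self.setmotor(name, speed)
        self.setmotor(name, 0)  # stop the joint after the motion
    def walk(self):
        # one gait cycle, written in Structured English style
        self.rotateleg("leftrotating", "left")
        self.rotateleg("rightrotating", "left")

c = Controller()
c.walk()
```

The point is that the gait is now readable: the walk() method reads like a description of the motion, not like a list of magic numbers.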


Increasing automation productivity

Assembly line robots in the food industry are often ignored by Artificial Intelligence researchers. The idea is that the subject of pick&place is well understood and there is no need for further automation. The reality is different: only the mechanical side of fast pick&place robots is well understood. If a company needs a robot, it can search for a used model on the internet and get a standard industrial robot for around 20000 US$. Such a machine demands electricity and is able to pick and place objects. If something on the machine breaks, it can be repaired.

Apart from the mechanical side, nothing is clear. What the software looks like is not researched, the workflow until a robot gets programmed remains open, and even the simple question of which operating system is used for robotics is not answered yet. Sure, most industrial robots were programmed somehow, but in most cases it is only luck that the robot is able to do the task. Let us investigate in detail what pick&place means. The first impression is that it is a boring task which can be automated easily. In reality, food pick&place is difficult to handle. The robot movement consists of substeps like select object, approach object, move gripper down, grasp object, move gripper up, rotate object, approach box, move gripper down, release object and so on.
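The substeps listed above can be written down as a scripted sequence. The following sketch is my illustration of that idea; the run_sequence() helper and its reset-on-failure behavior are assumptions, not part of any real robot controller:

```python
# Hypothetical scripted pick&place sequence, built from the substeps
# named in the text.
PICK_PLACE = [
    "select object", "approach object", "move gripper down", "grasp object",
    "move gripper up", "rotate object", "approach box", "move gripper down",
    "release object",
]

def run_sequence(steps, execute):
    """Run each substep through an executor callback. If a substep
    fails, abort and return its name so the robot can reset itself."""
    for step in steps:
        if not execute(step):
            return step  # failed substep, trigger a reset
    return None  # all substeps succeeded
```

Even this toy version shows why the task is a software project: error handling after a failed grasp is part of the sequence logic, not of the mechanics.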

In most cases the customer has additional demands, for example that the items should be placed in a certain orientation in the box: the 4 items at the bottom are aligned from north to south, while the next 4 items are aligned in the opposite direction. And if something goes wrong with the picking process, the robot must reset itself and not cause trouble on the assembly line.

What I want to explain is that a simple task like picking and placing objects from an assembly line is, from a programming perspective, a large-scale software project which isn't researched very well in the literature. That means there is a need to investigate the problem in detail and write better software than is available today.

The situation is that robot manufacturing companies are great hardware producers. Producing an industrial robot for under 30000 US$ is advanced, and compared to the productivity of such robots the investment is small. On the other hand, the robot companies are poor at creating the software which will drive their machines. That means they simply don't know how to do it right, and additional companies on the market which create only the software don't exist. The current situation is mostly that a control panel for the robot is delivered, which is some kind of stop-motion programming interface. Such a workflow works horribly. It prevents the robot from using its full potential.

What I want to explain is that there is a software crisis for simple pick&place robots. The customers have thousands of robots installed in their food factories, but they have no software to run them well.

Reasons for failure

What are the reasons for the software crisis in industrial robotics? The problem is that writing software for robotics is a hard task which needs support from academia. It is not only about writing a simple C program which controls a servo motor on the low level; robotics software is about Artificial Intelligence. To treat the topic the right way, it is important to understand it as an academic problem. This means that before the software can be written, longer research in the library is needed and lots of papers have to be written about the problem. Most robotics companies don't identify themselves as knowledge-driven organisations like Google but as industrial companies. They are good at providing mechanical devices but weak at cooperating with universities.

Without academic papers, universities, books and theoretical lectures it is not possible to write robotics software for pick&place tasks. It is not a simple task like wiring up an electric circuit; writing software for robots is similar to what social scientists are doing. It has to do with reading a lot, citing existing literature and updating bibliographies.

Best practice workflow

The good news is that there are techniques available to create pick&place robotics software. The list is long, and most topics are advanced software development techniques:

– prototyping with Python and Matlab

– object-oriented software development in C++ and Java for a production-ready binary version

– 3D realtime simulation in a physics engine

– agile software development in a team with git version control

– interactive text commandline for typing in robot actions

– robotics API for the pick&place domain

– requirements engineering which specifies which features should be implemented

– testing and bugtracking

– documentation in an academic paper with references to earlier projects

That means, software development is possible, but only in the terms mentioned before. Ignoring this pipeline and trying to program the control software ad hoc without version control, without 3D simulation and without an API won't work. If somebody is not aware of what git is, or how to program a physics simulation with Bullet, he won't be successful in programming a pick&place robot. Even though the task looks simple, there is a need for a large-scale, academically driven software project.

Designing a household robot

Robots usually fit into one of the following categories: they are either designed for industrial usage, which means they are large and need 10 kilowatts of power, or they are designed with researchers in mind, which means they are small like the Aibo robot and run on a battery. But between both extremes a robot is possible which fits into the middle. Such a robot is called a household robot and will be described in the following blogpost.

Sometimes household robots are called cobots, because they work close to the human. They are a combination of an industry-ready robot and a research project for testing out new software. They need less than 100 watts of electricity but more than 1 watt. In its standard form, such a robot is a single-arm robot for the kitchen. The requirements for such a device are complex. It must be a tool for creativity, because the user has to learn how to write software with the robot, but it must also be ready for productive use, because the user wants to get something for the money he paid.

To make the answer short: a household robot contains a lot of joints, more than any robot before. The reason is that only a lot of joints allows the robot to be folded to a compact size. It is similar to a snake design which can be made very small or very large. Such a kitchen robot doesn't need much room, but can be made larger on demand. This is realized with the joints: not 4 or 6, but 10 up to 20 are needed. This allows the single arm to move in any direction in the workspace while maintaining the constraint of a small footprint. It is similar to a human arm, which also provides a lot of joints. A human arm is mostly a thinking snake: it can reach a distance of up to 1 meter, but can operate in a small free space at the same time. The main task of a household robot is to not take up too much space. That means, if it isn't used, the robot arm should be as small as possible. It has to be folded down like a flexible ruler.

From a mechanical point of view, snake manipulators are low-tech tools. They are built with a wire rope hoist, and at the end a servo motor is mounted. The motor is able to pull or push the cable, and this makes the snake arm move. The more difficult part is how to program such a system. A typical snake manipulator contains 10 joints, sometimes more, and they all need commands in realtime to do a useful task. The reason why snake manipulators are not very common has to do with a lack of software for controlling the device.

In the literature such robots are also called hyper-redundant manipulators. They can be built by amateurs and provide a high amount of flexibility. It is even possible to grasp large objects with such a system. What is common in all these architectures is that the number of degrees of freedom is high: usually 9 DOF are used, sometimes more. In contrast, a typical industrial robot has around 3-4 DOF, which allows it to reach a point in 3D space but prevents it from doing useful tasks with an object. A human arm can also be called a hyper-redundant manipulator. The idea is that more joints are automatically better, because they allow the robot to reach a place in space without taking up too much room.
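The folding argument can be made concrete with the standard planar forward kinematics of a chain of equal links; the function below is my sketch, and the link length of 0.1 meter is an assumed value:

```python
import math

def snake_tip(angles, link_length=0.1):
    """Forward kinematics of a planar hyper-redundant arm:
    compute the tip position from cumulative joint angles (radians)."""
    x = y = 0.0
    total = 0.0
    for a in angles:
        total += a      # each joint angle is relative to the previous link
        x += link_length * math.cos(total)
        y += link_length * math.sin(total)
    return x, y

# fully stretched, a 10-joint arm reaches 10 * 0.1 = 1.0 meter,
# comparable to a human arm; folded back on itself it occupies
# almost no space, which is the whole point of the design.
tip = snake_tip([0.0] * 10)
```

With many joints, the same arm can either stretch out to its full 1 meter reach or fold into a compact package, which a 3-4 DOF arm cannot do.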

Cobots are not the future

So-called collaborative robots are marketed as an innovative idea which brings robotics into the mainstream. The idea is that fully autonomous robots in industrial environments are the past, while man-machine collaboration is the future. Unfortunately this vision doesn't make sense, and I will explain why.

From a technical point of view, two different kinds of automation are possible: fully autonomous systems, in which the human operator is out of the loop, and semi-autonomous systems, which amount to collaboration between man and machine. But which of them is better? To answer this question we have to focus on two different situations: first the development of robotics software, and secondly the usage of a robot in reality.

During the development of robotics software, man-machine interaction is state of the art. The developer implements a new routine, for example the pick&place task, and during the test he will interact with the robot to give additional instructions. It's not possible to develop robotics software without collaboration. But once the software is working, the human operator is no longer needed. That means the version of the software which is delivered to the customer doesn't need any kind of intervention; it can work on its own. A typical deployment situation is to put the robot in a metal box so that the human operator no longer has direct contact with the machine. If he needs to touch the robot arm, he has to press the stop switch first.

What I want to explain is that collaboration in robotics is a sign of an ongoing development process. If the software isn't ready or has bugs, a large amount of human-machine interaction is needed; otherwise the problems can't be fixed. It is the aim of the robot manufacturer to hide this task in its internal workflow. The normal customer shouldn't be forced to program the robot. The reason why some robotics companies are promoting the idea of cobots is that their software isn't highly developed. The robotics API is simply not working, and the customer gets no robot but some kind of remote-controlled machine. A robot is, according to the definition, a fully autonomous system which doesn't need feedback.

The remaining question is how to program software which runs on its own. This is indeed a major concern. There are no books available which explain it. In most cases the process of software development is done interactively. That means the robot programmer starts a subroutine and observes what the robot is doing. If the robot behaves wrongly, he stops the routine. During the development process, the human programmer and the mechanical robot work closely together. But this is not the general work mode for every user; it is only used in the programming step. It can be done by the robot company, and it can be done in a simulated environment.

Let us observe a typical, well-programmed robot. It is mounted on an assembly line and its only job is to pick and place food into boxes. It does so at high speed. No human is able to interact at this speed. The robot has to work alone, otherwise its productivity would be lower. Also, from a medical perspective, humans are not allowed to touch the food. Only the clean robot is allowed to come into contact with it.

Sometimes autonomous industrial robots are seen as something which has to be overcome. No, they are not. Such industrial robots are the future. What has to change is the programming workflow until these robots get programmed. This can be done in simulation, interactively, and with an open source approach.

To understand modern robotics, it is important to identify what is going well and what is not. If we look at industrial robots, the working mode in which they operate is great. It is a fully autonomous, high-speed mode which doesn't need human intervention. This kind of standalone, unobserved mode at the assembly line is the best practice method and can be improved further. On the other hand, the process of writing software for these machines is not very advanced. That means the software is proprietary, not documented, hard to install and not flexible enough. The reason why industrial robots are not used more frequently is that the customer doesn't like the software, and especially not the software development process. This can be improved drastically.

Suppose it were possible to program industrial robots with more features, so that they handled complicated tasks very well and were able to detect problems on their own. This would result in higher capabilities: the robots could be used for more tasks that aren't automated yet. What is important to understand is that modern robots are great from a mechanical perspective, but the software they are using, and the development process until that software is created, is outdated. This lowers their potential.

Software workflow

Let us observe how exactly software is created for industrial robots. The robot itself should pick&place food from an assembly line. How exactly is the software written for this task? Exactly this is the bottleneck in modern industrial robots. Nobody knows. Even the company which built the robot doesn't know. They deliver either no software at all, or an outdated programming interface which is based on MS-DOS and is decades old. Nobody can create advanced robotics software on this basis, and that is the reason why industrial robots are perceived as weak.

Let us describe what was not used in software development in the past. No 3D simulation was used, no open source software libraries, and no books about robotics software development. The reason why is simple: they weren't available. That means, in the past no 3D simulation package for writing robotics software was on the market, neither for sale nor as open source. But without such an environment it is not possible to create advanced software which can handle complicated situations. This results in poorly written software which doesn't work very well. It was some kind of luck how industrial robots got programmed in the past. The workflow was not repeatable, not documented and not researched by academia. The only part of robotics software for industrial purposes which is documented in some books is the inverse kinematics part, that is, moving the robot arm to a certain position. Most robotics software has such a subroutine preprogrammed. The sad news is that an inverse kinematics routine is the only advanced element in industrial robots from the past. As a result, the movements of the robots are primitive. They are based on moving to points in 3D space, because this is the only feature the inverse kinematics routine provides.
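For a planar 2-link arm, the inverse kinematics subroutine mentioned above can be sketched with the textbook law-of-cosines solution; the function below is my illustration, and the default link lengths of 1.0 are assumed values:

```python
import math

def ik_2link(x, y, l1=1.0, l2=1.0):
    """Closed-form inverse kinematics of a planar 2-link arm
    (elbow solution via the law of cosines). Returns the two
    joint angles in radians, or None if the target is unreachable."""
    d2 = x * x + y * y                          # squared distance to target
    c2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        return None                             # target out of reach
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2
```

This is exactly the kind of “move the arm to a point” routine the text describes: useful, but by itself it says nothing about what to do with the object once the arm is there.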

Research vs deployment

It is important to draw a line between robotics research and robotics deployment. Robotics deployment has to be done fully autonomously with high-speed robots. These kinds of systems work without human intervention and repeat the same task over and over. For the task of creating software, another form of robot is needed. These can be called research robots. Good examples of research robots are the PR2 from Willow Garage, or Lego Mindstorms EV3 kits. Both robots are created as interactive robots. That means they do not work in 24/7 mode and they do not work alone. Instead, human-machine interaction is required for creating the software.

It is important to understand that research robots can't be deployed in real-life situations, and vice versa. A Mindstorms EV3 kit was never created to pick&place objects in 24/7 mode. Instead, the robot was created for testing out new software which is adjusted at runtime. A typical interaction with Lego Mindstorms is to write new software, test it on the brick, change the software, test it again and so on. From a development workflow perspective, research robots are equivalent to prototyping. They are used in the software development process.

On the other hand, industrial robots are not a good choice for testing out software. They are large, need a lot of energy and are in most cases dangerous for humans. The robot arm of a typical industrial robot can hurt people seriously if it runs too fast, so humans need to stay away from the machines. In the end, both types of robots are needed: small interactive robots for writing software, and large autonomous robots for executing the software fast and efficiently.

A typical industrial robot needs 10000 volt-amperes, which is equal to 10 kilowatts. Doing any kind of interaction with such a machine isn't recommended, and using it for software development isn't possible. Such a robot has to be used in a production environment, and there is no alternative available. If the robot needed less energy, for example only 10 watts, it would no longer be powerful enough to get things done. That means a fully equipped industrial robot is by itself great. What it asks for is some kind of second robot, called a research robot, to develop the software for the first one. Using a combination of a small research robot which needs 10 watts plus a full-blown industrial robot which takes 10000 watts is the best practice method. On robot 1 the software is tested, while on robot 2 the software gets executed efficiently.

Let us describe another anti-pattern in modern robots. What most industrial robots provide is a programming interface. The idea is that the large 10 kilowatt robot is programmed on the fly by the operator. This won't work; such programming interfaces are not useful. At the other extreme, some research robots are deployed in production environments. That means there are scenarios in which people use the Lego Mindstorms EV3 kit in a real-life deployment to pick&place objects. This won't work either. I think each robot should be used within its original specification.

Creating a PID controller for a ball balancing task

The screenshot shows the running software, which is a fully autonomous PID controller for controlling a beam. The system is able to detect semantic events and will execute actions like rollleft and kicking the ball on the beam. Because the software is a bit complicated, I want to explain it step by step, which makes it easy to reproduce from scratch.
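Before going into the GUI, here is what the core controller could look like. This is a minimal discrete PID sketch of my own; the gain values are placeholders and would have to be tuned against the actual Box2D simulation, and the default time step matches the 20 fps game loop used later:

```python
class PID:
    """Minimal discrete PID controller, e.g. for the ball position
    on the beam. Gains are placeholder values, not tuned constants."""
    def __init__(self, kp=1.0, ki=0.0, kd=0.5, dt=1.0 / 20):  # dt = 20 fps tick
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0
    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # the return value would be fed to the beam joint as a motor speed
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Calling update() once per frame with the desired and measured ball positions yields the motor command for the beam joint.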

First, we need a GUI window for showing verbose information and taking the user's input. It was created with the tkinter package in the lovely Python programming language. The main feature of the GUI window is the input box, which works similar to a text console in UNIX. The user enters a command, for example “kick”, and the robot in the game will act in realtime. If the tkinter package is not installed in your Python installation, this would be the right moment to do so. And because installing new packages with pip is easy and well documented, it makes sense to also install the pygame and box2d packages. We will need them in later steps.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
  def paintobjects(self):
    pass
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    self.tkwindow.after(1000//self.fps, self.gameloop) # call gameloop every tick; after() needs integer milliseconds
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game() 

In the next step we create, in addition to the tkinter window, a pygame window, because we need to display the game itself. In theory it's possible to paint lines in tkinter directly, but pygame has better drawing primitives and it's a dedicated game engine.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math, pygame
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    pygame.init()  # pygame must be initialized before creating the display
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))  
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.tkwindow.after(1000//self.fps, self.gameloop) # call gameloop every tick; after() needs integer milliseconds
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()  

A game without a game engine wouldn't make much sense. For displaying a realistic “ball on a beam” situation, we need a fast underlying physics engine. It's Box2D, which works great in Python together with pygame. For realizing it in source code we need a second class which is stored in a new file. This helps to keep the source code short. The new file “box2d.py” contains everything that is needed to create a Box2D instance, fill it with objects and activate a single joint. The idea is that the user can move the beam left and right, and this will change the ball's position.

## file: main.py ##
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math, pygame, Box2D
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk
import physics

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    pygame.init()  # pygame must be initialized before creating the display
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.myphysics = physics.Physics()
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    self.ppm = 100.0  # pixels per meter, default 100
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))    
    colors = { Box2D.b2_staticBody: (16, 79, 15), Box2D.b2_dynamicBody: (165, 42, 42), }
    for body in (self.myphysics.mybox2d.myworld.bodies):  
      for fixture in body.fixtures:
        shape = fixture.shape
        if hasattr(shape, 'vertices'):  # rectangle
          vertices = [(body.transform * v) * self.ppm for v in shape.vertices]
          pygame.draw.polygon(self.pygamewindow, colors[body.type], vertices, 5)
        else:  # ball
          position = body.transform * shape.pos * self.ppm
          x, y = int(position[0]), int(position[1])
          radius = int(shape.radius * self.ppm)
          pygame.draw.circle(self.pygamewindow, colors[body.type], (x, y), radius, 2)
          x2, y2 = x + radius * -math.sin(body.angle), y + radius * math.cos(body.angle)
          pygame.draw.line(self.pygamewindow, colors[body.type], (x, y), (x2, y2), 2) #top    
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.myphysics.update()
    self.tkwindow.after(1000/self.fps, self.gameloop) # call gameloop every tick    
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()    

## file: physics.py ##
import box2d
class Physics:
  def __init__(self):
    self.mybox2d = box2d.Box2d()
  def update(self): self.mybox2d.update()

## file: box2d.py ##
import Box2D, math

class Box2d:
  def __init__(self):
    self.fps=20
    self.fric = 0.2  # 1.0 friction
    self.dens = 0.1 # 1.0 density
    self.torque = 100 #100
    self.reset()
  def reset(self):
    self.myworld = Box2D.b2World(gravity=(0,10), doSleep=True)  # gravity=(0,10)
    self.body = []
    self.joint = []
    self.createbox("static",(3, 3.3),(3, 0.02)) # bottom 
    self.createbox("static",(2.5, 2.0),(0.05, 0.05)) # fixed
    self.createbox("dynamic",(2, 2),(1, 0.1)) # base
    self.createcircle((2.3, 1.2),0.2) # ball
    self.createrevolutejoint(1,2)
    self.update()
  def update(self):
    timestep = 1.0 / self.fps
    velocityIterations = 8
    positionIterations = 3
    self.myworld.Step(timestep, velocityIterations, positionIterations)  
  def createbox(self,moveable,p,size):
    if moveable=="dynamic":
      self.body.append(self.myworld.CreateDynamicBody(position=p, angle=0))
      self.body[-1].CreatePolygonFixture(box=size, density=self.dens, friction=self.fric)
    if moveable=="static":
      self.body.append(self.myworld.CreateStaticBody(
        position=p,angle=0,
        fixtures=Box2D.b2FixtureDef(
          shape=Box2D.b2PolygonShape(box=size),
          friction=self.fric, density=self.dens
        )
      ))
  def createcircle(self,p,r):
    self.body.append(self.myworld.CreateDynamicBody(
      position=p,
      fixtures=Box2D.b2FixtureDef(
        shape=Box2D.b2CircleShape(radius=r), 
        restitution=0.5, density=self.dens, friction=self.fric), 
    ))   
  def createrevolutejoint(self,id1,id2):
    """ create rotating joint """
    anchorA, anchorB = (0,0), (0,0)
    self.joint.append(self.myworld.CreateRevoluteJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=anchorA, localAnchorB=anchorB,
        enableMotor=True, enableLimit=False,
        lowerAngle=0, upperAngle=1,
        maxMotorTorque=self.torque,
        motorSpeed=0,
    )) 
  def createprismaticjoint(self,id1,id2,axisx,axisy):
    """ create linear joint """
    self.joint.append(self.myworld.CreatePrismaticJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=(0, 0), localAnchorB=(0, 0), axis=(axisx, axisy),
        lowerTranslation=1, upperTranslation=3.2,
        enableMotor=True, enableLimit=False,
        maxMotorForce=self.torque,
        motorSpeed=0.0,
    ))            
  def getpos(self, id):
    box2dpos = self.body[id].position
    return self.box2d_to_pygame_pos(box2dpos)
  def getvelocity(self, id):
    return self.body[id].linearVelocity
  def setspeed(self, id, speed):
    self.joint[id].motorSpeed = speed       
  def getangle(self, id):
    angle = self.body[id].angle
    # angle = (math.degrees(angle)-90) % 360
    angle = math.degrees(angle)
    return angle    
  def box2d_to_pygame_pos(self, box2dpos):
    """ convert coordinates """
    ppm = 100.0  # pixels per meter, default 100
    x,y = int(box2dpos[0] * ppm),int(box2dpos[1] * ppm)
    return (x, y)

AI

The tutorial so far was about a simple Python game which consists of a tkinter GUI window, a pygame window and a box2d class. The simulation can be played manually by a human operator. But such a simulation alone is boring, and many tutorials are available which describe more sophisticated games. The more demanding task is to realize an Artificial Intelligence on top of the game. At first, I want to post the sourcecode itself; it contains the full-blown game plus the integrated AI engine. The code is distributed over five files in around 400 lines of code. The good news is that half of the content is known from the introduction: the tkinter GUI, the pygame window and the box2d engine. But something is new: a parser, a semantic grounding class and physics motion primitives.

### file: main.py ###

#!/usr/bin/env python
#-*- coding: utf-8 -*-
import Tkinter, ScrolledText, pygame, Box2D, math
import ttk  # Python 2 module name; "from tkinter import ttk" is the Python 3 form
import physics

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.myphysics = physics.Physics()
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    self.ppm = 100.0  # pixels per meter, default 100
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    name=self.widgetinput.get()
    self.myphysics.parser(name)
  def inputhandling(self,event):
    if event.keysym=="Left": self.myphysics.parser("rollleft")
    if event.keysym=="Right": self.myphysics.parser("rollright")
    if event.keysym=="Up": self.myphysics.parser("kick")
    if event.keysym=="Down": self.myphysics.parser("kick2")
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # physics message
    temp= self.myphysics.myparser.message
    if temp !="": 
      self.printtext(temp+" ")
      self.myphysics.myparser.clearmessage()
    # wordlist and info
    temp = self.myphysics.myparser.getwords()+"\n\n"
    temp += self.myphysics.getinfo()
    self.widgetmessagelabel.config(text=temp)     
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))    
    colors = { Box2D.b2_staticBody: (16, 79, 15), Box2D.b2_dynamicBody: (165, 42, 42), }
    for body in (self.myphysics.mybox2d.myworld.bodies):  
      for fixture in body.fixtures:
        shape = fixture.shape
        if hasattr(shape, 'vertices'):  # rectangle
          vertices = [(body.transform * v) * self.ppm for v in shape.vertices]
          pygame.draw.polygon(self.pygamewindow, colors[body.type], vertices, 5)
        else:  # ball
          position = body.transform * shape.pos * self.ppm
          x, y = int(position[0]), int(position[1])
          radius = int(shape.radius * self.ppm)
          pygame.draw.circle(self.pygamewindow, colors[body.type], (x, y), radius, 2)
          x2, y2 = x + radius * -math.sin(body.angle), y + radius * math.cos(body.angle)
          pygame.draw.line(self.pygamewindow, colors[body.type], (x, y), (x2, y2), 2) #top    
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.myphysics.update()
    self.tkwindow.after(1000/self.fps, self.gameloop) # call gameloop every tick    
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()    

### file: grounding.py ###
class Lookuptable:
  """ constructor needs a dictionary at first run"""
  def __init__(self,category):
    self.mydict=category
  def search(self,target):
    errormin=9999
    itemmin=-1
    for i in self.mydict.items():
      error=abs(i[0]-target)
      if error<errormin:
        errormin=error
        itemmin=i
    return itemmin[1]

class Grounding:
  def __init__(self):
    self.position=Lookuptable({125:'leftoff',165:'leftcorner',205:'left',245:'middle',
      285:'right',325:'rightcorner',365:'rightoff'})    
    self.velocity = Lookuptable({-.4:'moveleftfast',-.1:'moveleft',
      .0:'standstill',.1:'moveright',.4:'moverightfast'})
    self.distance=Lookuptable({0:'contact',20:'near',40:'away',60:'faraway'})
  def get(self,name,value):
    # for example: self.get("pos",-2.1)
    if name=="pos": return self.position.search(value)
    if name=="vel": return self.velocity.search(value)
    if name=="distance": return self.distance.search(value)
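The nearest-key search in Lookuptable can also be written compactly with Python’s built-in min(); a small standalone sketch of the same idea, reusing the position dictionary from above:

```python
# Sketch of the nearest-key lookup performed by the Lookuptable class:
# return the label whose numeric key is closest to the measured value.
class Lookuptable:
    def __init__(self, category):
        self.mydict = category
    def search(self, target):
        best_key = min(self.mydict, key=lambda k: abs(k - target))
        return self.mydict[best_key]

position = Lookuptable({125: 'leftoff', 165: 'leftcorner', 205: 'left',
                        245: 'middle', 285: 'right', 325: 'rightcorner',
                        365: 'rightoff'})

# a ball at pixel x=250 is closest to key 245, hence "middle"
label = position.search(250)
```

The behavior is identical to the loop-based search in grounding.py; the min() form just makes the "closest key wins" rule explicit.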
    

### file: physics.py ###
import threading,time
import box2d, parser, grounding

class Physics:
  def __init__(self):
    self.mybox2d = box2d.Box2d()
    self.myparser = parser.Parser()    
    self.mygrounding = grounding.Grounding()
    self.mainrunning=False
  def parser(self,name):
    status=self.myparser.getcommand(name)
    if status!="error":
      # run every command as thread
      command="self."+status[0]
      argument=status[1]
      thread1 = threading.Thread(target=eval(command),args=argument)
      thread1.daemon = True  # stops, if mainprogram ends
      thread1.start()
  def update(self): self.mybox2d.update()
  def reset(self): self.mybox2d.reset() 
  def setmotor(self,speed): self.mybox2d.setspeed(0,speed)    
  def getangle(self): return self.mybox2d.getangle(2)
  def getballpos(self): return self.mybox2d.getpos(3)
  def getballspeed(self): return self.mybox2d.getvelocity(3) # b2vec    
  def getsemanticpos(self): return self.mygrounding.get("pos",self.getballpos()[0])
  def getsemanticvel(self): return self.mygrounding.get("vel",self.getballspeed()[0])
  def reachangle(self, target, initspeed):
    """ example execution: 20,2 (angle=20, speed=2) """
    speed=initspeed
    for i in range(100):
      angle=self.getangle()
      diff=angle-target
      #print i, diff, speed
      if diff<0: self.setmotor(speed)  
      if diff>0: self.setmotor(-speed) 
      if abs(diff)<5: speed=abs(diff)/5.0
      if abs(diff)<0.5: 
        self.setmotor(0)
        break
      time.sleep(0.05)    
  def roll(self, direction):
    # direction: -3 strongleft, -1 lowleft, 1 lowright
    angle=2*direction
    self.reachangle(angle,1)
    time.sleep(.5)
    self.reachangle(0,1)
  def kick(self, direction):
    # direction = -1 or 1
    self.setmotor(7*direction)  
    time.sleep(0.1)
    self.setmotor(0)
    time.sleep(0.1)
    self.setmotor(-7*direction)  
    time.sleep(0.1)
    self.setmotor(0) 
    self.reachangle(0,2) # back to init    
  def kick2(self):
    speed=6 #6 
    duration=0.15
    self.setmotor(speed)  
    time.sleep(duration)
    self.setmotor(0)
    time.sleep(0.1)
    self.setmotor(-speed)  
    time.sleep(duration+.1)
    self.setmotor(0) 
    self.reachangle(0,2) # back to init    

  def getinfo(self):
    result=""
    result+="semantic "
    result+=self.getsemanticpos()+" "
    result+=self.getsemanticvel()
    result+=" ballpos="+str(self.getballpos())
    x=round(self.getballspeed()[0],1)
    y=round(self.getballspeed()[1],1) # y component, not x twice
    result+=" speed="+str(x)+" "+str(y)
    result+=" angle="+str(int(self.getangle()))
    result+=" threadrunning="+str(self.mainrunning)
    return result
  def detectdanger(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    if pos=="rightcorner":
      if vel=="moveright": 
        self.myparser.addmessage("rightdanger")
        self.roll(-3)
      if vel=="moverightfast": 
        self.myparser.addmessage("rightdangerhigh")
        self.roll(-5)
    if pos=="leftcorner":
      if vel=="moveleft":
        self.myparser.addmessage("leftdanger")
        self.roll(3)
      if vel=="moveleftfast": 
        self.myparser.addmessage("leftdangerhigh")
        self.roll(5)    
  def detectkick(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    # detect kick
    if pos=="right" and (vel=="moveright" or vel=="moverightfast"):
      self.kick(-1)  
  def detectoff(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    if pos=="leftoff" or pos=="rightoff":
      self.reset()

  def mainstop(self):
    self.mainrunning=False
  def main(self):
    self.mainrunning=True
    for i in range(10000):
      time.sleep(.2)
      self.detectdanger()
      self.detectkick()
      self.detectoff()
      if self.mainrunning==False:
        break
    self.mainrunning=False
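The thread dispatch in Physics.parser builds a string and hands it to eval. The same lookup can be done without eval by using getattr, which resolves the method name directly. A minimal standalone sketch, with an invented Dispatcher class and a stub kick method standing in for the real motion primitives:

```python
import threading

class Dispatcher:
    """Sketch of an eval-free variant of Physics.parser: resolve the
    method by name with getattr, then run it in a daemon thread.
    The class and the kick stub are illustrations, not project code."""
    last = None
    def kick(self, direction):
        self.last = ("kick", direction)   # record the executed primitive
    def run(self, command, args):
        method = getattr(self, command)   # "kick" -> bound method self.kick
        thread1 = threading.Thread(target=method, args=args)
        thread1.daemon = True             # stops if the main program ends
        thread1.start()
        thread1.join()                    # demo only; the game loop would not block

d = Dispatcher()
d.run("kick", [-1])
```

getattr avoids evaluating arbitrary strings and fails with a clear AttributeError if the parser table names a method that does not exist.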


### file: parser.py ###
class Parser:
  def __init__(self):
    self.message = ""
    self.words =[ # ("name","group","command",[parameter])
      ("words","system","words",[]), # dummy
      ("info","system","getinfo",[]), # dummy
      ("reset","system","reset",[]),
      
      ("stop","motor","setmotor",[0]),
      ("left","motor","setmotor",[-0.3]),
      ("right","motor","setmotor",[0.3]),    
      ("reach-20","reach","reachangle",[-20,2]),
      ("reach+20","reach","reachangle",[20,2]),   
      
      ("rollleft","roll","roll",[-2]),
      ("rollright","roll","roll",[2]), 
      
      ("kick","kick","kick",[-1]),
      ("kick2","kick","kick2",[]),
      
      ("main","macro","main",[]),
      ("mainstop","macro","mainstop",[]),
    ]    
  def clearmessage(self):
    self.message=""
  def addmessage(self,text):
    self.message+=text
  def getwords(self):
    result=""
    for i in self.words:
      result+=i[0]+" "
    return result    
  def getcommand(self,name):
    result=[]
    self.addmessage(name)
    pos=self.findword(name)
    if pos!=-1:
      result.append(self.words[pos][2]) # command
      result.append(self.words[pos][3]) # parameter
    else: 
      result="error" 
      self.addmessage("command not found\n")     
    return result
  def findword(self,name):
    pos=-1
    for i in range(len(self.words)):
      if self.words[i][0]==name:
        pos=i
        break
    return pos    


### file: box2d.py ###
import Box2D, math

class Box2d:
  def __init__(self):
    self.fps=20
    self.fric = 0.2  # 1.0 friction
    self.dens = 0.1 # 1.0 density
    self.torque = 100 #100
    self.reset()
  def reset(self):
    self.myworld = Box2D.b2World(gravity=(0,10), doSleep=True)  # gravity=(0,10)
    self.body = []
    self.joint = []
    self.createbox("static",(3, 3.3),(3, 0.02)) # bottom 
    self.createbox("static",(2.5, 2.0),(0.05, 0.05)) # fixed
    self.createbox("dynamic",(2, 2),(1, 0.1)) # base
    self.createcircle((2.3, 1.2),0.2) # ball
    self.createrevolutejoint(1,2)
    self.update()
  def update(self):
    timestep = 1.0 / self.fps
    velocityIterations = 8
    positionIterations = 3
    self.myworld.Step(timestep, velocityIterations, positionIterations)  
  def createbox(self,moveable,p,size):
    if moveable=="dynamic":
      self.body.append(self.myworld.CreateDynamicBody(position=p, angle=0))
      self.body[-1].CreatePolygonFixture(box=size, density=self.dens, friction=self.fric)
    if moveable=="static":
      self.body.append(self.myworld.CreateStaticBody(
        position=p,angle=0,
        fixtures=Box2D.b2FixtureDef(
          shape=Box2D.b2PolygonShape(box=size),
          friction=self.fric, density=self.dens
        )
      ))
  def createcircle(self,p,r):
    self.body.append(self.myworld.CreateDynamicBody(
      position=p,
      fixtures=Box2D.b2FixtureDef(
        shape=Box2D.b2CircleShape(radius=r), 
        restitution=0.5, density=self.dens, friction=self.fric), 
    ))   
  def createrevolutejoint(self,id1,id2):
    """ create rotating joint """
    anchorA, anchorB = (0,0), (0,0)
    self.joint.append(self.myworld.CreateRevoluteJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=anchorA, localAnchorB=anchorB,
        enableMotor=True, enableLimit=False,
        lowerAngle=0, upperAngle=1,
        maxMotorTorque=self.torque,
        motorSpeed=0,
    )) 
  def createprismaticjoint(self,id1,id2,axisx,axisy):
    """ create linear joint """
    self.joint.append(self.myworld.CreatePrismaticJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=(0, 0), localAnchorB=(0, 0), axis=(axisx, axisy),
        lowerTranslation=1, upperTranslation=3.2,
        enableMotor=True, enableLimit=False,
        maxMotorForce=self.torque,
        motorSpeed=0.0,
    ))            
  def getpos(self, id):
    box2dpos = self.body[id].position
    return self.box2d_to_pygame_pos(box2dpos)
  def getvelocity(self, id):
    return self.body[id].linearVelocity
  def setspeed(self, id, speed):
    self.joint[id].motorSpeed = speed       
  def getangle(self, id):
    angle = self.body[id].angle
    # angle = (math.degrees(angle)-90) % 360
    angle = math.degrees(angle)
    return angle    
  def box2d_to_pygame_pos(self, box2dpos):
    """ convert coordinates """
    ppm = 100.0  # pixels per meter, default 100
    x,y = int(box2dpos[0] * ppm),int(box2dpos[1] * ppm)
    return (x, y)

The best idea is to start the description with the physics class. In contrast to the class in the introduction, which was nearly empty, it now has some additional motion primitives. They are used for getting information and for executing actions. An example motion primitive is “rollleft”, which tilts the beam to the left. Most of the motion primitives are threaded, that means they run in the background. To execute the primitives, a dedicated parser is used. The parser takes the user’s input and determines which Python method has to be called. The parser is created around a dictionary.
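The lookup the parser performs can be demonstrated in isolation. A standalone sketch with a shortened word table mirroring the one in parser.py:

```python
# Standalone sketch of the parser's word table and lookup (shortened).
words = [  # ("name", "group", "command", [parameter])
    ("stop", "motor", "setmotor", [0]),
    ("kick", "kick", "kick", [-1]),
    ("rollleft", "roll", "roll", [-2]),
]

def getcommand(name):
    # find the word and return its method name plus parameter list,
    # or the string "error" if the word is unknown
    for word, group, command, parameter in words:
        if word == name:
            return [command, parameter]
    return "error"

result = getcommand("kick")   # -> ["kick", [-1]]
```

The returned pair is exactly what Physics.parser needs: the method to call and the arguments to pass to it.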

At the end another new class is available, called grounding. It is used for converting absolute ball positions into natural language. It also works with a dictionary; an example request would be that the ball position x=245 is equal to “middle”. This kind of mapping helps to create behaviors. Instead of writing “if ballx>230 and ballx<260”, the simpler statement “if ball==middle” can be used. The AI itself is the physics.main() method, a short Python method used for determining the next action. It checks if the ball is on the beam and also checks if a kick is possible.

The physics.main() method can be started from the tkinter GUI window, and it is also possible to interrupt the task on request. The overall system is controlled interactively. That means the user and the AI are working together. The user has the option to control the beam manually and overwrite the AI, or he can choose to deactivate the AI completely with the “mainstop” command.

Perhaps some words about the idea of the Artificial Intelligence. It is very similar to what Rodney Brooks has called behavior-based robotics. The idea is that the user has to create motion primitives and use these primitives for realizing high-level actions. This is called a hierarchical control system. In the case of the domain “ball on beam” the motion primitives have names like “kick”, “rollleft”, “getballposition”. In the case of a different domain, the motion primitives have to be changed. That means it is not a general game-playing architecture, but the software can be adjusted to new domains. So it is some kind of robot construction kit. The box2d engine and the parser remain the same; only the motion primitives have to be adapted to a new domain.

Class diagram

A simple PID controller to reach an angle

So-called PID controllers are widespread in the domain of robotics. Usually they are realized in hardware. Sometimes they utilize analog circuits. It is possible to transfer this idea into a programming subject and realize a PID controller in the Python language.

import time
from box2d import Box2d  # the Box2d class from the box2d.py file above

class Physics:
  def __init__(self):
    self.mybox2d = Box2d()
  def reset(self):
    self.mybox2d.reset()
  def setmotor(self,speed):
    self.mybox2d.setspeed(0,speed)
  def getangle(self):
    return self.mybox2d.getangle(1)
  def reachangle(self, target):
    speed=2
    for i in range(100):
      angle=self.getangle()
      diff=angle-target
      print i, diff
      if diff<0: self.setmotor(speed)  
      if diff>0: self.setmotor(-speed) 
      if abs(diff)<2: 
        self.setmotor(0)
        break
      time.sleep(0.2)

What is given in the sourcecode is a Python class which uses a helper class called Box2d. According to the name, the helper class contains the physics engine itself. It creates a simulation with some objects. The shown Physics class is a layer on top of the Box2d class and provides motion primitives for the controller, for example reset, getangle and reachangle. The reachangle method is the controller itself. It takes a goal parameter and starts a for loop which tries to adjust the motor to reach the goal. According to the sourcecode, the controller is not very advanced; strictly speaking it is a bang-bang controller rather than a full PID, since it consists of if-then statements and leaves the loop once the condition is reached. The advantage is that the sourcecode can be understood and improved on demand. That means, if the controller isn’t working, a new iteration in the sourcecode is needed.
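If the if-then logic turns out to be too crude, it could be replaced by a textbook PID update. A sketch, not taken from the project, with a toy plant standing in for the Box2d simulation so the loop is self-contained:

```python
class PID:
    """Minimal textbook PID controller (illustration, not project code)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0
    def step(self, target, measured):
        error = target - measured
        self.integral += error * self.dt          # accumulated error (I term)
        derivative = (error - self.prev_error) / self.dt  # error slope (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# toy plant: the angle simply integrates the commanded motor speed
pid = PID(kp=2.0, ki=0.0, kd=0.1, dt=0.05)
angle = 0.0
for _ in range(200):
    speed = pid.step(20.0, angle)   # drive the angle toward 20 degrees
    angle += speed * 0.05
```

In the real game, the measured value would come from getangle() and the output would go to setmotor(); the gains kp, ki, kd are made up and would need tuning against the simulation.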

Let us describe the sourcecode in a more elaborate way. It provides a grammar for a domain-specific robotics problem. The words in the grammar are: reset, setmotor, getangle and reachangle. The words call each other; for example, the reachangle word utilizes the setmotor word. The method names make the purpose visible. Even without seeing the sourcecode itself, it is possible to conclude that it is about motor control of a robot. The four words in the grammar form a domain-specific language. In that language certain movements are allowed. The principle is well known from the “Karel the Robot” simulator, which is a programming task for a maze robot. This time, the robot is a servo motor which rotates in a Box2D environment. The steps in programming the robot are the same. At first, the grammar has to be invented. In the example it contains only 4 words, but it can be extended for further needs.

Until now the question remains unanswered in which direction a given grammar has to be extended. For example, if the goal is to add 3 new motion primitives to the existing ones, which kind of motion primitives are the right ones? The answer is called requirements engineering. A requirement is a planning step in the software engineering process. An example requirement could be “the motor should make a complete rotation of 360 degrees”. All the requirements are written down in natural language. They can be transformed into motion primitives which extend the grammar.

Creating a domain-specific robotics API from scratch

A robot control system has to abstract from the original task with natural language primitives. In Forth, they are equal to words; in domain-specific languages they are equal to a dedicated programming language; and in mainstream programming they are created as a library. A Python library is a class which provides methods:

class

– method1

– method2

– method3

If the user wants to get access to the methods, he has to call them with myclass.method1(). Some methods have a parameter while others give a return value back. The interesting thing from a technical perspective is that the principle is the same: Forth words = domain-specific language = Python library = API.

The term API is used by programmers to describe only the outline of the program interface; it is a list of methods together with a short description, but not the sourcecode. For example, an API entry could be “method1() moves the robot arm to the object”. Let us give a more elaborate example of a potential robotics API:

class

– selectobject(id): the objects are numbered from 0 to 9. After selecting an object, it gets a red frame around it.

– moveoverobject(): moves the gripper over the object without touching it

– touchobject(): gets contact with the object by moving the gripper down

– untouchobject(): moves the gripper away from the object

– slideobject(direction=left|right|up|down): the precondition is that the gripper has contact with the object; then it moves the object 1 cm in the direction given by the parameter

So far, we have a robotics API class which provides 5 methods. They have names in natural language and a short description of their purpose. Even though not a single line of code was written, it is obvious what the prototype will do. It is not directly an Artificial Intelligence but a developer interface. That means the programmer can call the methods in his program to control the robot arm. This allows him to build a controller with the help of the motion primitives. For example, he can enter the following sequence:

selectobject(3)

moveoverobject()

touchobject()

slideobject(up)

untouchobject()

END

After executing the program the robot will do something. Perhaps it fails, but this can be fixed. The most important part is that some sort of API is available and the API methods have names which have to do with the domain. Such a robot control system can be realized in Forth, C++, Java, Python or any other programming language. It doesn’t depend on a programming paradigm like scripted vs. compiled language; it is a high-level principle of how to design the system itself. Forth programmers would argue that they have been using this principle since the 1980s, because the motion primitives are equal to words and the parameter passing is done with the stack. But the technique is not Forth-specific, because C++ provides the same abstraction feature too, which is called object-oriented programming. Even in Visual Basic such a robot control system can be implemented, which is called procedural programming there.
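The five hypothetical API methods can be stubbed out in Python before any robot hardware exists. A sketch; the class name, the logging list and the precondition check are invented for illustration:

```python
class RobotArmAPI:
    """Stub of the hypothetical five-method robotics API (illustration only).
    Instead of driving hardware, it records which primitives were executed."""
    def __init__(self):
        self.selected = None
        self.touching = False
        self.log = []                       # record of executed primitives
    def selectobject(self, id):
        self.selected = id                  # real system would draw a red frame
        self.log.append(("select", id))
    def moveoverobject(self):
        self.log.append("moveover")
    def touchobject(self):
        self.touching = True
        self.log.append("touch")
    def untouchobject(self):
        self.touching = False
        self.log.append("untouch")
    def slideobject(self, direction):
        # precondition from the API description: gripper must touch the object
        assert self.touching, "precondition: gripper must touch the object"
        self.log.append(("slide", direction))

# the example sequence from the text
arm = RobotArmAPI()
arm.selectobject(3)
arm.moveoverobject()
arm.touchobject()
arm.slideobject("up")
arm.untouchobject()
```

Writing the stub first pins down the interface; the physics or hardware backend can be swapped in later without changing the controller program.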

But how can we describe the design principle more abstractly? At first, it is not about programming itself, because until now not a single line of code was created and no concrete programming language or operating system was utilized. Instead, it is an example of natural language grounding. The idea is to use English as a communication interface between a robot program and a robot task. Let us take a word like “slideobject()”. For a computer, it is only a string; we could rename it to a random character sequence. But for humans the word has a meaning. That means “slideobject” defines a certain behavior which is defined outside the computer program, in a dictionary. If somebody types the keyword into Google and searches for videos and images, he will find a concrete description of what it means if somebody slides an object. It doesn’t mean that he grasps something, it doesn’t mean to throw away the object; it means to roll it and to push it.

Outside of a computer program a large amount of knowledge is available about manipulation tasks. This knowledge is usually used by humans when they communicate with each other. It is given by dictionary entries and descriptions in books and can be retrieved with Google and similar search engines. It makes sense to program a robot control system as a reference to this existing knowledge. That means, to name the methods in the API with the same vocabulary used before in natural language. This helps to maintain complex domains. For example, the term “slideobject” is not an easy one. It describes a dexterous manipulation task which depends heavily on the robot gripper and the object itself. On the other hand, the name itself is very short and contains only 11 characters. That means it is a combination of a short opcode and a complex behavior connected to that opcode. This kind of default compression should be used by robot control systems from the beginning.

Pitfalls

What could go wrong with the system? For example, if the objects are in clutter the robot gripper will not be able to slide an object. The first impression is maybe that overcoming the problem is possible with advanced Artificial Intelligence, for example with self-improving software which can learn from trial and error. And then the Forth kernel can create new words on the fly which will bypass the issue. Unfortunately such a strategy isn’t very successful, because the problem is located outside of computer science, in the linguistics of the domain itself. That means, if the robot is not able to slide an object, this issue has to be written down in the manual, and the human operator has to read the manual and make sure that the objects are aligned in advance. That means the human operator must observe the situation, and if objects are in clutter he has to stop the robot, align the objects and restart the robot, so that the original program will work again.

A simple wordlist for animating a human leg

Youtube: “Walking robot with Domain specific language”

On the screenshot a model of a human leg is shown. It consists of two joints which can be moved forward and backward. Displaying the leg on the screen or building the model with an Arduino in reality is an easy task. The more demanding challenge is to write the control software for the leg. In the following blogpost, I want to present a small wordset which consists of: leftupperforward, leftupperbackward, leftlowerforward, leftlowerbackward and reach.

Overall the sum of possible words is only 5. With the motion primitives the user can move the joints forward and backward, while with the “reach” statement he can reach a certain pose. “reach” needs as parameters the target angles of joint0 and joint1. The sourcecode for the vocabulary is not very hard:

def action(self,name):
  angle=5 
  if name=="leftupperforward": self.rotatebox(0,angle) 
  if name=="leftupperbackward": self.rotatebox(0,-angle) 
  if name=="leftlowerforward": self.rotatebox(1,angle)
  if name=="leftlowerbackward": self.rotatebox(1,-angle)
def reach(self,angle0,angle1):
  if angle0>self.boxangle[0]:
    self.action("leftupperforward")
  if angle0<self.boxangle[0]:
    self.action("leftupperbackward")
  if angle1>self.boxangle[1]:
    self.action("leftlowerforward")
  if angle1<self.boxangle[1]:
    self.action("leftlowerbackward")

# main
self.myphysics.reach(155,155)

What exactly can the user do with the 5 words? An example is given in the main function. He can script a certain pose of the robot leg by providing the target joint angles. Running many of the reach commands after each other will make the leg walk. The 5 predefined words allow the user to easily script such behavior. He is no longer forced to control the leg directly; instead he provides a table with the joint angles and the leg animates by itself. The interesting aspect is that the vocabulary can be extended. For example, it makes sense to define a subfunction called “walk slow” and a second one called “jump”. In that subfunction the trajectory is given. The newly defined high-level words can be used in a more elaborate script to define a complete story. With every new word, the abstraction level grows. At the end, the author can animate a complex character by a simple sequence of action commands: “standup”, “walk”, “walk”, “jump”, “walk”, “sitdown”.
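A new high-level word such as “walk” can be defined as a sequence of reach poses. A sketch with made-up angle values and a stub leg model that records poses instead of driving a physics engine:

```python
class Leg:
    """Stub leg: records the requested poses instead of moving joints."""
    def __init__(self):
        self.poses = []
    def reach(self, angle0, angle1):
        self.poses.append((angle0, angle1))

def walk(leg, steps):
    """High-level word built from the low-level reach primitive.
    Each step is one swing/stance pair; the angles are invented."""
    for _ in range(steps):
        leg.reach(155, 120)   # swing the leg forward
        leg.reach(170, 155)   # plant the foot and push back

leg = Leg()
walk(leg, 2)   # two steps -> four reach poses
```

In the real animation, each reach call would run until the target pose is held; the walk word hides that loop behind a single vocabulary entry.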

The wordset itself was implemented in Python, because Python is great for prototyping. But the same concept can be realized in C++, Forth, Java and AutoIt as well. The power comes from defining the wordset first and using these primitives for constructing a high-level program.

Mindmap
Perhaps a word about the mindmap. In the picture, 5 actions are given in the mindmap. They are similar to a grammar. The user can execute only actions available in the vocabulary. For example, if we remove the “reach” word, it is no longer possible to give a target angle for the leg; instead the user has to fall back on the 4 remaining motion primitives. In theory, it is possible to bring the leg into a target position only by using the forward/backward commands, but it is not very convenient. The high-level “reach” word was implemented to make things easier.

Before the user can control a robot, he has to figure out which grammar the robot needs. All systems start with primitives for moving the servo motors backward and forward. This feature is given by the electronics or by the animation software itself. That means, if a leg contains a joint, it is possible to move that joint forward and backward. The problem is that with this low-level command alone it is hard to realize a high-level behavior, for example to walk a step forward. Every movement consists of primitive executions, but if the user tries to script a longer sequence he needs more elaborate action words. For a grasping robot, an example of a high-level action might be “grasp()”, while for a car robot a high-level word example is “switchlane”. A carefully constructed vocabulary helps a lot in authoring robot controllers and scripting longer sequences.