Grounding is the central problem in robotics

The screenshot shows a minimal walking robot. It was created in the Box2D physics engine and consists of a few boxes plus joints. Compared to other robots the system is not very complicated: only 4 joints and 5 boxes were used. That means the robot is not realistic compared to real dogs or real humans, which have far more joints and more ways of moving forward. On the other hand, controlling this simple device can be called an advanced challenge. How can it be that a simple articulated figure causes trouble if we want to control it with an algorithm? After all, four joints isn't that much.

Let us describe the situation from the bottom up. The robot was created with the Box2D primitives createprismaticjoint and createdynamicbody. Both commands are well documented in the official API, and most game designers are aware of them. To control the joints, we have to call a method, for example joint1.motorspeed=-1.2. This feature is also documented very well. That means, from a technical perspective it is an easy task to send a signal to the joints and let the robot walk. There is one simple problem: producing a walking gait is much more complicated than setting a joint to a certain speed. The joint speed is only the low-level perspective of the Box2D engine; it ignores what walking is. That means it ignores gait patterns, obstacles, dynamics in the air and animation technique. To use this kind of knowledge, something else is needed which isn't formalized in the Box2D primitives.

This “something else” is called grounding. It means connecting low-level actions with high-level natural language. On the low-level side we have the Box2D API, which provides four simple commands:

joint1.motorspeed=-1.2
joint2.motorspeed=-1.2
joint3.motorspeed=-1.2
joint4.motorspeed=-1.2

On the high-level side we have actions like “start”, “walk ahead”, “rotateleg1”, “stop”. Let us take a look at the Python source code to answer the question of how grounding works in software:

class Grounding:
  def __init__(self):
    # map numeric ids and sensor values to natural language words
    self.box = Lookuptable({0:'bottom', 1:'torso', 2:'leftrotating', 3:'rightrotating', 4:'leftleg', 5:'rightleg'})
    self.motor = Lookuptable({0:'leftrotating', 1:'rightrotating', 2:'leftspring', 3:'rightspring'})
    self.legangle = Lookuptable({-90:'side', -5:'forward', 0:'normal', 5:'backward', 90:'side'})
    self.contractleg = Lookuptable({-3:'longer', 3:'shorter'})
    self.rotatespeed = Lookuptable({.2:'left', -.2:'right'})
  def get(self, group, value):
    # for example: self.get("legangle", -2.1)
    if group == "box": return self.box.inverse(value)
    if group == "motor": return self.motor.inverse(value)
    if group == "contract": return self.contractleg.inverse(value)
    if group == "legangle": return self.legangle.search(value)
    if group == "rotatespeed": return self.rotatespeed.inverse(value)

I have created a lookup table which stores word-value items. For example, jointid=0 describes the rotating joint on the left side of the robot. Sure, the Box2D engine doesn't need this information, but the programmer takes advantage of it. Also stored in the lookup table is information about speed patterns. A value like “0.2” is equal to rotating a leg to the left with normal speed. The second method in the grounding class is used for retrieving the information. The programmer can access the class from the outside and make a request to it. A sample request would be: give me the box id for the rotating joint of the right leg. Or the user can ask the grounding class: the leg has an angle of -2 degrees, what is the natural language description of that value? All these questions can be answered by executing the self.get() method. The class works similarly to an SQL database.
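The Lookuptable helper itself is not shown in the listing above. A minimal sketch of how it could be implemented, assuming inverse() maps a word back to its numeric key and search() returns the word of the numerically nearest key (the class name matches the listing, the method bodies are my guess):

```python
class Lookuptable:
  """Bidirectional word-value store, as assumed by the Grounding class."""
  def __init__(self, table):
    self.table = table                # numeric key -> word
  def forward(self, key):
    return self.table[key]
  def inverse(self, word):
    # word -> numeric key, e.g. inverse('rightrotating') -> 1
    for key, value in self.table.items():
      if value == word:
        return key
  def search(self, value):
    # word of the numerically nearest key, useful for sensor readings
    nearest = min(self.table, key=lambda k: abs(k - value))
    return self.table[nearest]

legangle = Lookuptable({-90:'side', -5:'forward', 0:'normal', 5:'backward', 90:'side'})
print(legangle.search(-4))   # forward  (-4 is closest to the key -5)
```

With search(), a raw angle like -2 degrees lands on the nearest entry (here 'normal'), which is exactly the kind of question the grounding class answers.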

Even though the class doesn't look very complicated, it is the central part of a robot control system. A mapping between numbers and natural language can be used to create advanced robotics controllers. It simplifies the programming task. Perhaps from a technical perspective the grounding class can be improved. It is also possible to create a dedicated SQLite table, because the posted source code looks a bit messy so far. But in any case a grounding layer is needed to manage the complexity. Let us go into the details.

Suppose no grounding layer is available in the program. If the user wants to let the robot walk, he has to enter the following statements:

joint1.motorspeed=0
joint2.motorspeed=.2
joint3.motorspeed=0
joint4.motorspeed=.2

That means he sends low-level values to the joints and something will happen on the screen. In most cases the robot will not walk, so the user will change the values a bit. Such software will work great from the technical side, but it is hard to maintain a larger amount of code lines, because after a while the programmer has no idea what “.2” on joint #2 means. Now let us compare this with a grounded control system.

  def rotateleg(self, name, direction):
    boxid = self.mygrounding.get("motor", name)
    speed = self.mygrounding.get("rotatespeed", direction)
    self.setmotor(boxid, speed)
    time.sleep(.5)
    self.setmotor(boxid, 0)
  def contractleg(self, name, direction):
    boxid = self.mygrounding.get("motor", name)
    speed = self.mygrounding.get("contract", direction)
    self.setmotor(boxid, speed)
    time.sleep(.1)
    self.setmotor(boxid, 0)

This time the motion primitives have names in natural language. If the user wants to rotate a leg, he can enter rotateleg("leftrotating","left"). The advantage is that the user doesn't need to remember joint ids or speed constants; instead he formulates everything in https://en.wikipedia.org/wiki/Structured_English
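To show how grounded primitives compose into a higher-level action, here is a self-contained sketch. The RobotSketch class, the logging setmotor stub and the shortened sleep times are inventions for illustration, not the original project code:

```python
import time

class RobotSketch:
  """Hypothetical stand-in for the simulator's robot class."""
  def __init__(self):
    # word -> joint id / speed, as in the grounding class
    self.motor = {'leftrotating': 0, 'rightrotating': 1}
    self.rotatespeed = {'left': .2, 'right': -.2}
    self.log = []
  def setmotor(self, boxid, speed):
    self.log.append((boxid, speed))   # a real version would call Box2D here
  def rotateleg(self, name, direction):
    boxid = self.motor[name]
    speed = self.rotatespeed[direction]
    self.setmotor(boxid, speed)
    time.sleep(.01)                   # shortened for the sketch
    self.setmotor(boxid, 0)
  def step(self):
    # one grounded walking step: swing the left leg, then the right leg
    self.rotateleg('leftrotating', 'left')
    self.rotateleg('rightrotating', 'right')

robot = RobotSketch()
robot.step()
print(robot.log)   # [(0, 0.2), (0, 0), (1, -0.2), (1, 0)]
```

The log shows the low-level joint commands that the grounded words expand into, which is the whole point of the grounding layer.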


Building a jumping robot, made easy

Regular readers of the trollheaven blog are perhaps familiar with the GUI interface shown in the screenshot above. It is the same one used in the previous project. Instead of balancing a ball, this time a jumping robot is in the middle. The surprising information is that realizing different kinds of robots has nothing to do with Artificial Intelligence directly, but with the preprocessing steps. Let us go into the details.

Before an AI controller for a robot can be realized, some kind of simulation environment is needed. A typical feature of such environments is that they are manually controlled. That means the system has no autonomous mode; the user has to press keys and type in commands for moving the robot. Somebody may think that the simulation environment is not a big thing and be motivated to skip this step, but in reality the environment is more important than the AI controller itself. What we see in the screenshot is a combination of Python/Tkinter, pygame, Box2D and a parser for motion primitives. All elements are important; it is not recommended to leave out the Tkinter GUI or any other part.

Such an environment can be used to realize different kinds of robots. In my first project a ball-on-beam situation was created; now a jumping robot is visible. It consists of a torso, which is the larger box, and two rotating cubes. On the cubes, legs are mounted with a linear joint. That means it is not a lifelike dog, but a simplified version. The interesting feature is that after entering the jump command, the linear joints stretch fast and this lets the robot jump in the Box2D engine. It looks highly realistic.

From a historical perspective, it is a copy of the famous Marc Raibert jumping robots from the beginning of the MIT Leg Lab. I've seen the old videos and rebuilt the structure in my robot simulator. Let us focus on the Tk GUI on the right of the screen. This amazing form is the core of the simulator. It shows information, for example the current frame and the angle of the legs, but it's also possible to enter text commands interactively. The concept is not new; it was used before in the Karel the Robot project, which was a famous education tool for teaching structured programming with Pascal and Java. The idea is that the robot has motion primitives which can be activated, and the user has to enter the primitives in the textbox. For example, if the user enters “rotrback”, the right leg will rotate backwards by a small amount. This allows the user to interact with the robot on the screen.

From a programming perspective the overall simulator isn't very complicated. The project has exactly 400 lines of code so far. The GUI, the pygame part, the Box2D part, the parser and the words are all included in this amount of code lines. What is missing is the automatic mode. That means the AI itself isn't available. Such an AI can be created by putting the words into a program, similar to the “Karel the robot” challenge. For example, the user lets the robot jump, checks in the air whether the angle of the legs is right, and then executes another action. As mentioned before, such a routine is missing right now. But it can be constructed easily.
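A dispatcher for such text commands can be sketched in a few lines. The command names and the handler bodies below are illustrative placeholders, not the original 400-line project:

```python
class CommandParser:
  """Maps words typed into the textbox to motion primitives."""
  def __init__(self):
    self.log = []
    # command word -> handler; the names are made up for this sketch
    self.commands = {
      'jump':     lambda: self.log.append('jump'),
      'rotrback': lambda: self.log.append('rotate right leg back'),
      'rotlback': lambda: self.log.append('rotate left leg back'),
    }
  def execute(self, line):
    handler = self.commands.get(line.strip())
    if handler is None:
      return 'unknown command'
    handler()                  # a real version would drive the Box2D joints
    return 'ok'

parser = CommandParser()
assert parser.execute('rotrback') == 'ok'
assert parser.execute('fly') == 'unknown command'
print(parser.log)   # ['rotate right leg back']
```

An automatic mode would then be a program that calls execute() with a sequence of words instead of waiting for keyboard input.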

What I want to explain is that the robotics simulator is more important than the AI controller which runs in it. If the environment is programmed well, it is easy to build the AI engine on top. Most robotics projects fail because the environment is not working. The funny thing is that a jumping robot is very easy to build, because the jumping is not the result of the controller; the jumping is calculated by the Box2D engine. Let us take a look at how jumping was realized in the robot simulator:

    speed=-3
    self.setmotor(2,speed)
    self.setmotor(3,speed)
    time.sleep(.1)
    speed=0
    self.setmotor(2,speed)
    self.setmotor(3,speed)

In 7 lines of code, the linear joints of the Box2D engine get a signal for a short time and then they stop. As a result, the Box2D engine calculates that the robot jumps into the air. That means it is calculated by the physics engine and not by a highly developed AI. Entering the jump command in the textbox will activate the routine, and in the background Box2D determines everything else.

Joints

In the Box2D vocabulary a jumping joint is called a PrismaticJoint. It can move in a linear fashion back and forth. The spring effect is the result of moving the servo with high speed. In mechanical engineering, such a device is called a “linear actuator with a spring return mechanism”. It produces a unidirectional force and makes every robot jump. That means the jumping is not the result of wishful thinking or motivated by Artificial Intelligence; it is simply a mechanical feature. What can happen during the jump is that the robot loses its balance. That means the robot is not able to land on its legs but lands in a wrong way. This has to be adjusted by an AI controller.
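The physics that Box2D resolves can also be estimated on paper. A rough back-of-the-envelope sketch, assuming the short motor pulse delivers a vertical impulse to the torso; all numbers are invented for illustration, and Box2D does the real contact and joint solving:

```python
def jump_apex(impulse, mass, g=9.81):
  """Height reached after a vertical impulse (simple ballistic estimate).

  impulse: N*s delivered by the prismatic joints during the short pulse
  mass:    robot mass in kg
  """
  v0 = impulse / mass          # take-off velocity in m/s
  return v0 ** 2 / (2 * g)     # apex height in m

# e.g. a 2 kg robot receiving a 6 N*s kick from both leg springs
h = jump_apex(impulse=6.0, mass=2.0)
print(round(h, 3))   # 0.459
```

This kind of estimate only predicts the apex of a clean vertical jump; whether the robot lands on its legs is exactly the part the AI controller has to handle.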

What’s wrong with AIMA?

The book “Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson Education Limited, 2016.” is some sort of bible in the domain of AI. It is used in university classes worldwide which teach how to program intelligent machines. It covers all disciplines on over 1000 pages and has an extended literature list at the end for further reading. The AIMA book gives an inside view of how AI is understood by mainstream AI courses.

I want to explain in short what is written in the book, that is, to give a summary of AIMA in a single paragraph. According to AIMA, AI is about problem solving by search. This is done with classical planning and with algorithms like A*. Other important aspects of AI are vision and natural language processing, and at the end AIMA contains a chapter about modern directions in AI research, which are probabilistic robotics and agents.

That was, in short, the curriculum of the AIMA book. These are the topics covered in university courses and understood as mainstream AI. It is a very conservative understanding of AI which ignores most of the research done from the 1970s until the 2010s. But that is not the fault of AIMA itself, but of how universities work. The idea of an overview book like AIMA is to give a summary of established knowledge. And AIMA is right when it explains AI as problem solving by search. That is the core idea of AI.

To understand the AIMA book better, it is important to focus on situations in which AIMA doesn't provide answers. These are problems not handled very well by classical search-based planners. In short, this is true for any non-toy problem. That means the knowledge given by the AIMA book isn't able to control real robots or solve real games. If somebody is trying to program a chess engine which is able to beat a human, he won't find the answer in the AIMA book. In that sense, AIMA is some kind of non-AI. It leaves out how to deal with complex problems.

Again, this is not a critique. Before going into the details it is important to describe simpler problems first. And playing tic-tac-toe with algorithms from the AIMA book is possible. Also, the students are encouraged to find the shortest path between 7-10 cities. This can be handled with vanilla search quite well. What is important to know is that the AIMA book doesn't reflect state-of-the-art Artificial Intelligence, but some kind of outdated yet well-established knowledge. This is common in education in general. Schools never teach advanced knowledge; they deal with well-researched subjects which were invented in the past. That means the AIMA book isn't a collection of proceedings from the latest AI conference.
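The shortest-path exercise mentioned above is a textbook case of uniform-cost search. A minimal sketch over an invented city graph (the city names and distances are made up for illustration):

```python
import heapq

def shortest_path(graph, start, goal):
  """Uniform-cost search (Dijkstra), the vanilla search AIMA teaches."""
  frontier = [(0, start, [start])]        # (cost so far, city, path)
  visited = set()
  while frontier:
    cost, city, path = heapq.heappop(frontier)
    if city == goal:
      return cost, path
    if city in visited:
      continue
    visited.add(city)
    for neighbor, distance in graph.get(city, []):
      if neighbor not in visited:
        heapq.heappush(frontier, (cost + distance, neighbor, path + [neighbor]))
  return None

# invented distances between four cities
graph = {
  'A': [('B', 75), ('C', 140)],
  'B': [('D', 71)],
  'C': [('D', 99)],
}
print(shortest_path(graph, 'A', 'D'))   # (146, ['A', 'B', 'D'])
```

Adding a heuristic estimate to the priority turns the same loop into A*; for a handful of cities the plain version is already more than enough.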

Let us take a short look at the foreword of AIMA. The self-understanding of the book is to pay tribute to Thomas Bayes (probability theory), Alan Turing (Turing machine) and Charles Rosen (inventor of Shakey the robot, which was STRIPS-based). Bayes lived around 1750, Turing in the 1940s and Rosen in the 1960s. The mean of these dates is the year (1750+1940+1960)/3=1883, and this is exactly the self-understanding of AIMA. It is a very, very conservative book about thinking machines. Only things are explained in the book which are around 100 years old on average. The students get a wonderful overview of AI in the past and are able to identify and describe artifacts in a computer museum. This kind of conservative teaching about past inventions is typical for universities in general. In most English classes, literature is presented which is old and very old. The reason is that this kind of knowledge is valid and forms a stable curriculum. As a result, the AIMA book can be printed without major modifications for decades. For professors and students it is a guideline through the ocean of Artificial Intelligence and provides help in understanding the past. It is important to make clear the difference between a lecture at a university and a conference talk. A conference is about current topics in AI and robotics which were invented just now, while university lectures are about topics which are much older and well understood by the mainstream.

At the end I want to give a hint at what the major assumption in AIMA is. The main idea of the book is that Artificial Intelligence has to do with mathematics. That means that equations, numbers and algorithms are the correct lingua franca to discuss AI-related problems. What is used very often in the book are equations and pseudocode notations. The idea is that this is the language of choice for dealing with thinking machines. But why is this assumption made? The reason has to do with the evolution of science in general. In the 19th century, science was equal to mathematics. That means there was a difference between literature/art/music on one side and physics, mathematics and rational thinking on the other side. The AIMA book identifies itself as part of mathematics. The assumption is that mathematics is equal to science and computing is part of mathematics.

Are alternatives to this point of view available? Sure, but not in the time frame addressed by AIMA. A combination of art and science was invented much later; I would guess it was after the year 1883. If a book presents knowledge from the past and only the past, it is not allowed to take a different perspective. A conservative book has to follow a conservative understanding of mathematics.

All in all, AIMA is a great book if someone is interested in what common knowledge in Artificial Intelligence is and how education works at the university level.

Increasing automation productivity

Assembly line robots in the food industry are often ignored by Artificial Intelligence researchers. The idea is that the subject of pick & place is well understood and there is no need for further automation. The reality is different. Only the mechanical side of fast pick&place robots is well understood. If a company needs a robot, it can search for a used model on the internet and get a standard industrial robot for around 20000 US$. Such a machine has a demand for electricity and is able to pick and place objects. If something is broken on the machine, it can be repaired.

Apart from the mechanical side, nothing is clear. How the software should look is not researched, the workflow until a robot gets programmed remains open, and even the simple question of which operating system is used for robotics is not answered yet. Sure, most industrial robots were programmed somehow, but in most cases it is only luck that the robot is able to do the task. Let us investigate in detail what pick&place means. The first impression is that it is a boring task which can be automated easily. In reality, food pick&place is difficult to handle. The robot movement consists of substeps like select object, approach object, move gripper down, grasp object, move gripper up, rotate object, approach box, move gripper down, release object and so on.
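The chain of substeps above is essentially a small state machine. A minimal sketch with stubbed-out actions; the step names follow the list, while the controller class itself is invented for illustration:

```python
PICK_PLACE_STEPS = [
  'select object', 'approach object', 'move gripper down', 'grasp object',
  'move gripper up', 'rotate object', 'approach box', 'move gripper down',
  'release object',
]

class PickPlaceController:
  """Steps through the pick&place sequence; a real one would drive motors."""
  def __init__(self):
    self.index = 0
    self.done = []
  def tick(self):
    if self.index >= len(PICK_PLACE_STEPS):
      return 'finished'
    step = PICK_PLACE_STEPS[self.index]
    self.done.append(step)       # a real controller would execute motions here
    self.index += 1
    return step
  def reset(self):
    # on failure the robot must reset itself and not block the line
    self.index = 0

ctrl = PickPlaceController()
while ctrl.tick() != 'finished':
  pass
print(len(ctrl.done))   # 9
```

Even this toy version shows where the complexity hides: each stubbed step stands for a whole perception and motion problem, and the reset() path has to work under all failure conditions.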

In most cases the customer has additional demands, for example that the items should be placed in a certain orientation in the box. The 4 items at the bottom are aligned from north to south while the next 4 items are aligned in the opposite direction. And if something is wrong with the picking process, the robot must reset itself and not cause trouble on the assembly line.

What I want to explain is that a simple task like picking&placing objects from an assembly line is, from a programming perspective, a large-scale software project which isn't researched very well in the literature. That means there is a need to investigate the problem in detail and write better software than is available today.

The situation is that robot manufacturing companies are great hardware producers. Producing an industrial robot for under 30000 US$ is advanced. Compared to the productivity of such robots, the investment is small. On the other hand, the robot companies are poor at creating the software which will drive their machines. That means they simply don't know how to do it right, and additional companies on the market which create only the software don't exist. The current situation is mostly that a control panel for the robot is delivered, which is some kind of stop-motion programming interface. Such a workflow works horribly. It prevents the robot from using its full potential.

What I want to explain is that there is a software crisis for simple pick&place robots. The customers have thousands of robots installed in food factories, but they have no software to run them well.

Reasons for failure

What are the reasons for the software crisis in industrial robotics? The problem is that writing software for robotics is a hard task which needs support from academia. It is not only about writing a simple C program which is able to control a servo motor at a low level; robotics software is about Artificial Intelligence. To treat the topic the right way, it is important to understand it as an academic problem. This means that before the software can be written, longer research in the library is needed and lots of papers have to be written about the problem. Most robotics companies don't identify themselves as knowledge-driven organisations like Google but as industrial companies. They are good at providing mechanical devices but weak at cooperating with universities.

Without academic papers, universities, books and theoretical lectures it is not possible to write robotics software for pick&place tasks. It is not a simple task like putting an electric wire together; software writing for robots is comparable to what social scientists are doing. It has to do with reading a lot, citing existing literature and updating bibliographies.

Best practice workflow

The good news is that there are techniques available to create pick&place robotics software. The list is long, and most topics are advanced software development techniques:

– prototyping with Python and Matlab

– object oriented software writing in C++ and Java for a production-ready binary version

– 3D realtime simulation in a physics engine

– agile software development in a team with git version control

– interactive text commandline for typing in robot actions

– robotics API for the pick&place domain

– requirements engineering which specifies which features should be implemented

– testing and bugtracking

– documentation in an academic paper with references to prior projects

That means software development is possible, but only in the terms mentioned before. Ignoring this pipeline and trying to program the control software ad hoc, without version control, without 3D simulation, without an API, won't work. If somebody is not aware of what git is, or how to program a physics simulation with Bullet, he won't be successful in programming a pick&place robot. Even if the task looks simple, there is a need for a large-scale, academically driven software project.

Designing a household robot

Robots usually fit into one of the following categories: they are either designed for industrial usage, which means they are large and need 10 kilowatts of power, or they are designed with researchers in mind, which means they are small like the Aibo robot and run on a battery. But between both extremes a robot is possible which fits into the middle. Such a robot is called a household robot and will be described in the following blogpost.

Sometimes household robots are called cobots because they work close to the human. They are a combination of an industry-ready robot and a research project for testing out new software. They need less than 100 watts of electricity but more than 1 watt. In its standard form, such a device is a single-arm robot for the kitchen. The requirements for it are complex. It must be a tool for creativity, because the user has to learn how to write software with the robot, but it must also be ready for productive use, because the user wants to get something for the money he paid.

To make the answer short: a household robot contains a lot of joints, more than any robot before. The reason is that only a lot of joints allows the robot to be folded to a compact size. It is similar to a snake design which can be made very small or very large. Such a kitchen robot doesn't need much room, but can be made larger on demand. This is realized with the joints. Not 4 or 6, but 10 up to 20 are needed. This allows the single arm to move in any direction in the workspace while maintaining the constraint of a small footprint. It is similar to a human arm, which also provides a lot of joints. A human arm is mostly a thinking snake. It can reach a distance of up to 1 meter, but can operate in a small free space at the same time. The main task of a household robot is not to take up too much space. That means, if it isn't used, the robot arm should be as small as possible. It has to be folded down like a flexible ruler.

From a mechanical point of view, snake manipulators are low-tech tools. They are built with a wire rope hoist, and at the end a servo motor is mounted. The motor is able to pull or push the cable and this makes the snake arm move. The more difficult part is how to program such a system. A typical snake manipulator contains 10 joints, sometimes more. And they all need commands in realtime to do a useful task. The reason why snake manipulators are not very common has to do with a lack of software for controlling the device.

In the literature such robots are also called hyper-redundant manipulators. They can be built by amateurs and provide a high amount of flexibility. It is even possible to grasp large objects with such a system. What is common in all these architectures is that the number of degrees of freedom is high. Usually 9 DOF are used, sometimes more. In contrast, a typical industrial robot has around 3-4 DOF; this allows it to reach a point in 3D space, but prevents the robot from doing useful tasks with an object. A human arm can also be called a hyper-redundant manipulator. The idea is that more joints are automatically better, because they allow the robot to reach a place in space without taking up too much room.
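Why many joints help can be seen in the forward kinematics of a planar chain. A sketch for an n-joint arm; the link lengths and joint angles are invented for illustration:

```python
import math

def forward_kinematics(link_lengths, joint_angles):
  """End-effector position of a planar serial chain.

  Each joint rotates relative to the previous link; angles in radians.
  """
  x = y = 0.0
  heading = 0.0
  for length, angle in zip(link_lengths, joint_angles):
    heading += angle
    x += length * math.cos(heading)
    y += length * math.sin(heading)
  return x, y

# a 10-joint snake arm, each link 10 cm: fully stretched it reaches 1 m ...
links = [0.1] * 10
x, y = forward_kinematics(links, [0.0] * 10)
print(round(x, 2), round(y, 2))   # 1.0 0.0

# ... but curled up (36 degrees per joint) the chain closes into a decagon
x, y = forward_kinematics(links, [math.radians(36)] * 10)
print(round(abs(x), 2), round(abs(y), 2))   # 0.0 0.0
```

The same arm can span a full meter or fold back onto its own base, which is exactly the "reach far but store small" property the household robot needs.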

Cobots are not the future

So-called collaborative robots are marketed as an innovative idea which brings robotics into the mainstream. The idea is that fully autonomous robots in industrial environments are the past, while man-machine collaboration is the future. Unfortunately this vision doesn't make sense, and I will explain why.

From a technical point of view, two different kinds of automation are possible: fully autonomous systems, in which the human operator is out of the loop, and semi-autonomous systems, which amount to collaboration between man and machine. But which of them is better? To answer this question we have to focus on two different situations: first, the development of robotics software, and second, the usage of a robot in reality.

During the development of robotics software, man-machine interaction is state of the art. The developer implements a new routine, for example the pick&place task, and during the test he will interact with the robot to give additional instructions. It's not possible to develop robotics software without collaboration. If the software is working, the human operator is no longer needed. That means the version of the software which is delivered to the customer doesn't need any kind of intervention; it can work on its own. A typical deployment situation is to put the robot in a metal box so the human operator no longer has direct contact with the machine. If he needs to touch the robot arm, he has to press the stop switch first.

What I want to explain is that collaboration in robotics is a sign of an ongoing development process. If the software isn't ready or has bugs, a large amount of human-machine interaction is needed; otherwise the problem can't be fixed. It is the aim of the robot manufacturer to hide this task in its internal workflow. The normal customer shouldn't be forced to program the robot. The reason why some robotics companies are promoting the idea of co-robots is that their software isn't highly developed. The robotics API is simply not working, and the customer gets no robot but some kind of remote-controlled machine. A robot is, according to the definition, a fully autonomous system which doesn't need feedback.

The remaining question is how to program software which runs on its own. This is indeed a major concern. There are no books available which explain it. In most cases the process of software development is done interactively. That means the robot programmer starts a subroutine and observes what the robot is doing. If the robot behaves wrongly, he will stop the routine. During the development process, the human programmer and the mechanical robot work closely together. But this is not the general work mode for every user; it is only used in the programming step. It can be done by the robot company and it can be done in a simulated environment.

Let us observe a typical, well-programmed robot. It is mounted on an assembly line and its only job is to pick and place food into boxes. It does so at high speed. No human is able to interact at this speed. The robot has to work alone, otherwise its productivity would be lower. Also, from a hygiene perspective, humans are not allowed to touch the food. Only the clean robot is allowed to get in contact with it.

Sometimes autonomous industrial robots are seen as something which has to be overcome. No, they are not. Such industrial robots are the future. What has to change is the programming workflow by which these robots get programmed. This can be done in simulation, interactively and with an open source approach.

To understand modern robotics, it is important to identify what is going well and what is not. If we look at industrial robots, the working mode in which they operate is great. It is a fully autonomous, high-speed mode which doesn't need human intervention. This kind of standalone, unobserved mode at the assembly line is the best-practice method and can be improved further. On the other hand, the process of writing software for these machines is not very advanced. That means the software is proprietary, not documented, hard to install and not flexible enough. The reason why industrial robots are not used more frequently is that the customer doesn't like the software and especially the software development process. This can be improved drastically.

Suppose it were possible to program industrial robots with more features, so that they handle complicated tasks very well and are able to detect problems on their own. This results in higher capabilities. The robots could be used for more tasks not automated yet. What is important to understand is that modern robots are great from a mechanical perspective. But the software they are using, and the development process by which the software is created, is outdated. This lowers their potential.

Software workflow

Let us observe how exactly software is created for industrial robots. The robot itself should pick&place food from an assembly line. How exactly is the software written for this task? Exactly this is the bottleneck in modern industrial robots. Nobody knows. Even the company that built the robot doesn't know. They deliver either no software at all, or an outdated programming interface which is based on MS-DOS and is at least 40 years old. Nobody can create advanced robotics software on this basis, and that is the reason why industrial robots are perceived as weak.

Let us describe what was not used in software development in the past. No 3D simulation was used, no open source software libraries, and no books about robotics software development. The reason is simple: they weren't available. That means, in the past, no 3D simulation package for writing robotics software was on the market, neither for sale nor as open source. But without such an environment, it is not possible to create advanced software which can handle complicated situations. This results in poorly written software which doesn't work very well. It is some kind of luck how industrial robots got programmed in the past. The workflow was not repeatable, not documented and not researched by academia. The only part of robotics software for industrial purposes which is documented in some books is the inverse kinematics part, that is, moving the robot arm to a certain position. Most robotics software has such a subroutine preprogrammed. The sad news is that an inverse kinematics routine is the only advanced element in industrial robots from the past. As a result, the movements of the robots are primitive. They are based on moving to points in 3D space, because this is the only feature the inverse kinematics routine has to provide.
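For a planar two-link arm, that inverse kinematics subroutine is a closed-form formula. A sketch; the link lengths are invented, angles are returned in radians, and the elbow-down solution is chosen arbitrarily:

```python
import math

def two_link_ik(x, y, l1, l2):
  """Analytic inverse kinematics for a planar 2-link arm (elbow-down)."""
  d2 = x * x + y * y
  # law of cosines gives the elbow angle
  cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
  if abs(cos_elbow) > 1:
    return None                      # target is out of reach
  elbow = math.acos(cos_elbow)
  shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                           l1 + l2 * math.cos(elbow))
  return shoulder, elbow

# reach the point (1, 1) with two 1-meter links
angles = two_link_ik(1.0, 1.0, 1.0, 1.0)
print([round(a, 3) for a in angles])   # [0.0, 1.571]
```

Such a routine only answers "which joint angles reach this point"; everything beyond point-to-point motion, such as grasp planning or error recovery, is exactly what the old software left out.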

Research vs deployment

It is important to draw a line between robotics research and robotics deployment. Robotics deployment has to be done fully autonomously with high-speed robots. These kinds of systems work without human intervention and repeat the same task over and over. For the task of creating software, another form of robot is needed. These can be called research robots. Good examples of research robots are the PR2 from Willow Garage, or Lego Mindstorms EV3 kits. Both robots were created as interactive robots. That means they don't work in 24/7 mode and they don't work alone. Instead, human-machine interaction is required for creating the software.

It is important to understand that research robots can't be deployed in real-life situations and vice versa. A Mindstorms EV3 kit was never created to pick&place objects in 24/7 mode. Instead, the robot was created for testing out new software which is adjusted at runtime. A typical interaction with Lego Mindstorms is to write new software, test it on the brick, change the software, test it again and so on. From a development workflow perspective, research robots are equivalent to prototyping. They are used in the software development process.

On the other hand, industrial robots are not a good choice for testing out software. They are large, need a lot of energy and are in most cases dangerous for humans. The arm of a typical industrial robot can hurt people seriously if it runs too fast, so humans have to stay away from the machines. In the end, both types of robots are needed: small interactive robots for writing software, and large autonomous robots for executing the software fast and efficiently.

A typical industrial robot needs 10000 volt-ampere, which is equal to 10 kilowatts. Any kind of interaction with such a machine isn't recommended, and using it for software development isn't possible. Such a robot has to be used in a production environment, and there is no alternative available. If the robot needed less energy, for example only 10 watts, it would no longer be powerful enough to get things done. That means a fully equipped industrial robot is great by itself. What it asks for is a second robot, the research robot, to develop software for the first one. Using a combination of a small research robot which needs 10 watts plus a full-blown industrial robot which takes 10000 watts is the best-practice method: on robot 1 the software is tested, while on robot 2 the software gets executed efficiently.

Let us describe another anti-pattern in modern robotics. What most industrial robots provide is a programming interface. The idea is that the large 10-kilowatt robot is programmed on the fly by the operator. This won't work; such programming interfaces are not useful. At the other extreme, some research robots are deployed in production environments. That means there are scenarios in which people use the Lego Mindstorms EV3 kit in a real-life deployment to pick&place objects. This won't work either. I think each robot should be used within its original specification.

From classical AI to modern Fuzzy logic

Fuzzy logic was introduced by Lotfi Zadeh in 1965 and is sometimes called a revolution. The main idea is that grounding a robot problem, from sensor information to symbols, is fundamental for Artificial Intelligence, and fuzzy logic explains how to do this anchoring right. Instead of explaining what fuzzy logic is, I want to give a short overview of the transition from non-fuzzy AI to modern fuzzy logic.

Let us start in a domain which is well known to all AI researchers: uninformed search. This kind of theory and practice has a long tradition and is taught in universities and robotics classes all over the world. It is not very advanced, because only toy problems can be solved with this principle, but most AI researchers feel comfortable with it, because it has to do with numbers and algorithms. Uninformed search means treating a problem as a game. This framing goes back to game theory, which was founded by John von Neumann and Oskar Morgenstern (John Nash later contributed the equilibrium concept). The idea is that each player can execute an action, and this results in a game tree. The theory describes all kinds of games: for example tic-tac-toe, chess, Pong, but also economic games. AI is about playing these games automatically. This is done by building the graph of all possible actions and traversing it, and in the literature this principle is called depth-first search.

Like I mentioned before, this principle is not able to solve major problems. Apart from chess and tic-tac-toe it fails in nearly all domains; especially for robotics, depth-first search is useless. On the other hand, it is a comfortable way of treating Artificial Intelligence, because depth-first algorithms are not very complicated to explain. The Rosettacode website has a lot of example implementations which show how to traverse a tree with Python, C or any other language: https://rosettacode.org/wiki/Tree_traversal
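To make the principle concrete, here is a minimal sketch of depth-first search over a game tree, in the spirit of those Rosettacode examples. The tree, its node names and the goal are invented for illustration:

```python
# Depth-first search over a toy game tree stored as a dictionary.
# Node names and the goal are invented for illustration.
tree = {
    "start": ["left", "right"],
    "left":  ["l1", "l2"],
    "right": ["r1", "goal"],
    "l1": [], "l2": [], "r1": [], "goal": [],
}

def dfs(node, goal, path=None):
    """Return the first path found from node to goal, or None."""
    path = (path or []) + [node]
    if node == goal:
        return path
    for child in tree[node]:
        result = dfs(child, goal, path)
        if result is not None:
            return result
    return None

print(dfs("start", "goal"))  # ['start', 'right', 'goal']
```

The same skeleton works for any game tree; the problem is only that for real domains the number of nodes explodes.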

Most mainstream books about Artificial Intelligence from the past don't go further. They explain game theory and depth-first search, and this is everything they have to offer. But there are many alternatives available. The next, slightly better technique after a simple depth-first search is called informed search in the literature. The idea is that a game tree is still there, but this time it is searched in a more efficient way. A typical example of informed search is the A* algorithm. It is used for path planning and is able to traverse a graph faster because a small amount of domain knowledge is utilized. The expert knowledge is stored in the evaluation function which directs the algorithm to search in a certain direction.
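As a sketch of how the evaluation function directs the search, here is A* on a tiny grid, assuming a Manhattan-distance heuristic as the piece of domain knowledge; the grid and coordinates are invented:

```python
import heapq

# A* path planning on a small grid; '#' cells are obstacles.
# Grid layout and positions are invented for illustration.
grid = ["....",
        ".##.",
        "...."]

def astar(start, goal):
    """Return the length of the shortest path from start to goal."""
    def h(p):  # Manhattan-distance heuristic = the expert knowledge
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    openlist = [(h(start), 0, start)]   # (f = g + h, g, position)
    bestg = {start: 0}
    while openlist:
        f, g, (r, c) = heapq.heappop(openlist)
        if (r, c) == goal:
            return g
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != '#':
                if g + 1 < bestg.get((nr, nc), 10**9):
                    bestg[(nr, nc)] = g + 1
                    heapq.heappush(openlist, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
    return None

print(astar((0, 0), (2, 3)))  # 5
```

The heuristic makes the difference: with h set to zero this degrades to plain uninformed search over the same grid.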

Brute-force search and A* search both belong to mainstream AI. That means they are explained in major textbooks, for example in AIMA, in great detail. The interesting question is how to move from classical A* search to a modern fuzzy logic based system. It is a long journey, and the reason why fuzzy logic is not common in mainstream computing is that this journey isn't explained very well in the literature. Some amateurs don't even know that something better than A* is available; they believe that A* is some kind of best-practice method for controlling a robot.

Let us bring a bit of light into the scene. The next step after a working A* algorithm is called macro search, which is not common in mainstream computing. Macro search is also an example of informed search, but with a higher amount of expert knowledge. The idea is to define high-level actions, so-called motion primitives, which are executed in the game tree. Perhaps this sounds a bit unfamiliar; the reason is that macro search is not explained in major textbooks like AIMA. The bible of mainstream AI, "Russell/Norvig: Artificial Intelligence: A Modern Approach", has only a short description of macro operators; the concept is explained in the chapter about the STRIPS planning system. Macro operators and macro search are an important strategy for searching large state spaces, and they are used very often in game AI systems.

To give a general overview, the transition from classical AI to modern fuzzy logic can be described as the following path: depth-first search (mainstream), A* informed search (mainstream), macro search (esoteric), grounding (esoteric), fuzzy logic (esoteric).

Perhaps some words about the transition from macro actions to fuzzy logic. Macro actions usually have names in natural language, for example "stand-up", "go forward" and so on. If they are discussed seriously, all the macro actions are stored in a table, and this table is grounded in the domain. That means the Pong game has a certain set of macro actions which have to do with hitting the ball. The grounding problem is about anchoring a domain in natural language, and here fuzzy logic comes into play. Fuzzy logic is an answer to the grounding problem: it treats words as numbers. It makes it possible to connect natural language words with a parameter and calculate with these numbers. This is explained in detail in the literature about fuzzy logic and is outside the scope of this blog post.
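A tiny sketch of the "words as numbers" idea: each word gets a membership function that maps a raw sensor value to a degree between 0 and 1. The words and their breakpoints below are invented for illustration:

```python
def triangle(x, a, b, c):
    """Triangular membership function that peaks at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Ground the words "left", "center", "right" for a ball position 0..100.
# The breakpoints are invented for illustration.
words = {
    "left":   lambda x: triangle(x, -1, 0, 50),
    "center": lambda x: triangle(x, 20, 50, 80),
    "right":  lambda x: triangle(x, 50, 100, 101),
}

pos = 35
for word, mu in words.items():
    print(word, round(mu(pos), 2))  # left 0.3, center 0.5, right 0.0
```

The point is that a sensor value like 35 now has a graded meaning in natural language instead of a hard threshold, and the degrees can be combined arithmetically.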

The more interesting question is why depth-first search and A* informed search are part of mainstream AI, but macro actions, grounding and fuzzy logic are not. I don't know the answer. The fact is that major textbooks like AIMA have only a limited number of chapters, and if classical AI is described on the first 1000 pages, there is no room left for the advanced techniques. The transition from macro actions over grounding up to fuzzy logic is fluid. The idea is to increase the prior knowledge in the search process. In the end, a fuzzy control system contains a lot of domain-specific knowledge but only a small amount of brute-force search in the game tree. This makes fuzzy logic different from classical AI.

I don't believe it makes sense to introduce fuzzy logic into mainstream AI directly. Instead, only small steps can be taken. The first one is to promote the idea of using macro operators for solving games. Macro operators are more advanced than classical AI, but not as powerful as fuzzy logic. The reason why macro search is popular is that it is oriented towards the classical understanding of Artificial Intelligence. The assumption is that there is a game tree, and macro actions are useful to search the graph much faster; they are some kind of shortcut for finding the goal. From the perspective of fuzzy logic, macro search is a boring topic, in the sense that it is not possible to solve advanced problems with it. The advantage is that macro actions are easier to explain, because they have much in common with classical A* search. A* search uses a dedicated evaluation function for searching faster in the game tree. Macro actions additionally have the concept of a motion primitive: apart from numeric pathways in the tree, it is possible to activate predefined actions which jump back and forth in the game tree.

Let us make an example. A classical search algorithm for the inverse kinematics of a robot arm would test out different numbers. It would send the +1 signal to the joint and also the -1 signal, and if one of the resulting nodes is near the goal, it selects this node. If we use macro operators, new kinds of actions become possible. A possible action list would be: -1, +1, -10, +10, init, stretchedout. The first actions are classical numerical commands which test what happens if the joint moves by -1 or -10 in a certain direction. The last two actions are macro action names which activate a certain predefined pose of the robot arm. This allows jumping in the game tree to a fixed position without sending numerical signals to the robot first.
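A minimal sketch of such a mixed action list for a single joint, assuming a greedy search; the pose names "init" and "stretchedout" and their angles are invented:

```python
# Greedy search over a mixed action list: numeric joint deltas plus
# named macro actions that jump to a predefined pose (values invented).
macros = {"init": 0, "stretchedout": 90}
actions = [-1, +1, -10, +10, "init", "stretchedout"]

def apply(angle, action):
    """A macro action jumps to its pose; a number is added to the angle."""
    return macros[action] if action in macros else angle + action

def solve(angle, goal, steps=20):
    """Greedily pick the action that brings the joint nearest to the goal."""
    plan = []
    for _ in range(steps):
        best = min(actions, key=lambda a: abs(apply(angle, a) - goal))
        if abs(apply(angle, best) - goal) >= abs(angle - goal):
            break  # no action improves the position any further
        angle = apply(angle, best)
        plan.append(best)
    return angle, plan

print(solve(37, 90))  # (90, ['stretchedout'])
```

With only numeric actions the solver would need many small steps from 37 to 90; the macro action "stretchedout" reaches the pose in one jump, which is exactly the shortcut effect described above.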

I know the concept of macro actions sounds a bit weird. It is not explained in the standard literature, but it is a powerful concept for increasing the performance of an inverse kinematics solver. Macro actions are located outside of mainstream AI and are discussed in the literature under the umbrella term of advanced robotics.

Motivation

The reason why informed search, macro operators, grounding and fuzzy logic are used in Artificial Intelligence is not that the topics are interesting by themselves, but that they promise to improve the search in the game tree. A classical brute-force search which doesn't utilize domain knowledge will find the answer to any problem, but in doing so it needs a lot of CPU power. The reason why non-brute-force search is used is that endless computation isn't available. The researchers need answers to complicated problems in a shorter amount of time and on slow computers. That is the reason why they have invented many techniques that utilize domain knowledge.

The reason why fuzzy logic has been popular since its invention in the mid 1960s is that it utilizes the largest amount of domain knowledge. A fuzzy controller doesn't use any game tree search; instead, all the knowledge about the domain is hardcoded in the program. That means the robot controller will run very fast even on a Commodore 64, and this 8-bit home computer has only about 1 MIPS of computing speed. The disadvantage of fuzzy logic is that it's hard to define the domain knowledge explicitly. Before developers are able to do so, they first have to understand what uncertainty means, what a gradient is, which rules are needed for solving a domain and so on. The resulting fuzzy controller does the same as a brute-force solver: it finds the answer to a problem. But fuzzy logic is harder to explain and harder to implement than depth-first search.

What mainstream computing does is avoid domain-specific knowledge. The best-practice method is not to implement any expert knowledge in the system and to use an uninformed brute-force algorithm. This kind of game tree search works for any problem, and if the tree is small it can be searched in realtime. The reason why some researchers promote macro operators, grounding and sometimes fuzzy logic is that they want to tackle problems which are more complex. These are usually robotics problems, but also image retrieval and natural language processing. What these problems share is a large state space which can't be searched by simple brute-force solvers. That's the reason why domain-specific approaches are chosen.

The real definition of fuzzy logic is not grouped around uncertainty and fuzzy sets (which are sometimes compared to qubits in a quantum computer); rather, fuzzy logic has to do with a high amount of domain-specific knowledge and a low processing demand. That means a fuzzy controller isn't a classical solver which searches the game tree, but an expert system with preprogrammed rules.

Expert knowledge

Let us take a step back and look at how domain knowledge is handled in mainstream Artificial Intelligence. The best example is a chess engine which works with game tree search. The surprising point is that in reality a pure search algorithm won't work: it takes too long to search the chess game tree until the opponent is checkmated. Even the fastest computer can't do so. What is used in real chess engines is a combination of brute-force search with an evaluation function. That means some kind of solver is used, but additionally a large amount of chess knowledge is implemented in the algorithm. A common chess engine works by restricting the search horizon to, say, 10 moves into the future, and then the evaluation function determines which player has the better position. Without a knowledge-driven evaluation function, the chess engine wouldn't work; it would take too long to calculate all possible moves in advance.
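The same mechanism can be sketched on a toy game instead of chess. Below, a depth-limited negamax search for the take-1-2-3 stones game (whoever takes the last stone wins) is cut off at a fixed horizon and then asks a hand-written evaluation function; the game and the heuristic are invented stand-ins for a chess engine's machinery:

```python
def evaluate(n):
    """Domain knowledge: with n stones left, the player to move
    loses with perfect play exactly when n is a multiple of 4."""
    return -1 if n % 4 == 0 else 1

def negamax(n, depth):
    """Depth-limited game tree search for the take-1-2-3 stones game."""
    if n == 0:
        return -1          # the previous player took the last stone: we lost
    if depth == 0:
        return evaluate(n)  # search horizon reached: ask the eval function
    return max(-negamax(n - take, depth - 1) for take in (1, 2, 3) if take <= n)

# Thanks to the evaluation function, a horizon of only 2 plies is enough:
print(negamax(21, 2))  # 1  (win for the side to move)
print(negamax(20, 2))  # -1 (loss for the side to move)
```

Searching all 21 plies down to the end of the game would be the brute-force alternative; the evaluation function lets the engine stop after two plies and still judge positions correctly.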

The interesting question is how to formalize domain-specific knowledge so that it can be stored in a computer program. The best-practice method for doing so is to use natural language, because natural language is abstract by default. The principle is used very often by the PDDL community in the ICAPS challenge. A STRIPS or PDDL operator is equivalent to a natural language command which moves a player in the state space to a new position. On top of a low-level game, a second game is constructed on a higher level which is easier to solve. The second game provides subgoals, and these are used for solving the low-level game.
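The flavor of such an operator can be sketched in a few lines of Python. This is a toy STRIPS-style operator with invented fact names, not real PDDL syntax:

```python
# A toy STRIPS-style operator: a natural-language action name plus
# preconditions and effects over a set of facts (fact names invented).
op_pick = {
    "name": "pick food from belt",
    "pre":  {"hand-empty", "food-on-belt"},
    "add":  {"holding-food"},
    "del":  {"hand-empty", "food-on-belt"},
}

def applicable(state, op):
    """An operator fires only if all its preconditions hold."""
    return op["pre"] <= state

def apply_op(state, op):
    """Remove the delete list, then insert the add list."""
    return (state - op["del"]) | op["add"]

state = {"hand-empty", "food-on-belt"}
if applicable(state, op_pick):
    state = apply_op(state, op_pick)
print(sorted(state))  # ['holding-food']
```

The operator's name is the natural language side, and the precondition/effect sets are the grounding in the low-level state space; a planner chains such operators to reach a subgoal.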

Typical subgoals in computer chess are: conquer the center, bring the opponent into check, capture a piece, protect the own king and so on. If subgoals are available it is much easier to find a move in the game tree. Easier means that the amount of needed computing power is lower. What macro operators provide is equivalent to a subgoal: they define a desired state in a game, similar to a keyframe in computer animation. For example, a subgoal in computer chess is to capture the opponent's bishop. Such a goal means restricting the search in the game tree to moves which have to do with capturing the bishop. This reduces the search effort to a minimum. Instead of answering the question of how to win the overall game, the simpler question to answer is how to capture the opponent's piece.

The interesting feature of fuzzy control is that no game tree search takes place at all. Instead, every action is done reactively: so-called fuzzy rules determine which action is executed next. This reduces the search effort to zero. That means no solver is used and the system is based entirely on domain knowledge. The benefit of fuzzy control is that it is able to handle large amounts of domain knowledge. With hierarchical fuzzy logic it is possible to store every detail about a domain in natural language rules and process them with equations. Fuzzy logic can be understood as an advanced form of macro actions: the fuzzy sets form a domain-specific language which makes it easy to store and retrieve knowledge about a problem. A fuzzy logic line-following robot has, for example, detailed information about distances, obstacles and possible servo motor actions. That means the robot can solve the problem without searching the game tree. This makes fuzzy logic a powerful but hard-to-explain problem-solving technique.
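How such a reactive rule base can look in code is sketched below for the ball-on-beam setting of this post. Each rule pairs a fuzzy word about the ball's offset with a motor speed, and the controller blends all firing rules; the breakpoints and speeds are invented:

```python
def triangle(x, a, b, c):
    """Triangular membership function that peaks at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Fuzzy rules: IF ball offset is <word> THEN motor speed is <value>.
# Breakpoints and speeds are invented for illustration.
rules = [
    (lambda x: triangle(x, -2.0, -1.0, 0.0), +1.0),  # ball left  -> roll right
    (lambda x: triangle(x, -1.0,  0.0, 1.0),  0.0),  # ball centered -> hold
    (lambda x: triangle(x,  0.0,  1.0, 2.0), -1.0),  # ball right -> roll left
]

def control(offset):
    """Reactive fuzzy controller: weighted average of all firing rules."""
    fired = [(mu(offset), speed) for mu, speed in rules]
    total = sum(w for w, _ in fired)
    return sum(w * s for w, s in fired) / total if total else 0.0

print(control(-0.5))  # 0.5: roll gently to the right
print(control(0.0))   # 0.0: hold the beam
```

Note that no tree is searched: each sensor reading is mapped directly to an action, which is exactly the zero-search behavior described above.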

Creating a PID controller for a ball balancing task

The screenshot shows the running software, a fully autonomous PID controller for controlling a beam. The system is able to detect semantic events and will execute actions like rollleft and kick the ball on the beam. Because the software is a bit complicated, I want to explain it step by step, which makes it easy to reproduce from scratch.

First, we need a GUI window for showing verbose information and taking the user's input. It was created with the tkinter package in the lovely Python programming language. The main feature of the GUI window is the input box, which works similarly to a text console in UNIX: the user enters a command, for example "kick", and the robot in the game acts in realtime. If the tkinter package is not available in your Python installation, this would be the right moment to install it. And because installing new packages with pip is easy and well documented, it makes sense to install the pygame and box2d packages as well; we will need them in later steps.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
  def paintobjects(self):
    pass
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    self.tkwindow.after(1000//self.fps, self.gameloop) # call gameloop every tick; after() needs integer milliseconds
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game() 

In the next step we create a pygame window in addition to the tkinter window, because we need to display the game itself. In theory it's possible to paint lines directly in tkinter, but pygame has better drawing primitives and is a dedicated game engine.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import pygame
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))  
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.tkwindow.after(1000//self.fps, self.gameloop) # call gameloop every tick; after() needs integer milliseconds
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()  

A game without a physics engine wouldn't make much sense. For displaying a realistic "ball on a beam" situation, we need a fast underlying physics engine. It's box2d, which works great in Python together with pygame. For realizing it in source code we need a second class which is stored in a new file; this helps to keep the source code short. The new file "box2d.py" contains everything that is needed to create a box2d instance, fill it with objects and activate a single joint. The idea is that the user can move the beam left and right, and this changes the ball's position.

## file: main.py ##
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import pygame
import Box2D
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk
import physics

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.myphysics = physics.Physics()
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    self.ppm = 100.0  # pixels per meter, default 100
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    pass
  def inputhandling(self,event):
    pass
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))    
    colors = { Box2D.b2_staticBody: (16, 79, 15), Box2D.b2_dynamicBody: (165, 42, 42), }
    for body in (self.myphysics.mybox2d.myworld.bodies):  
      for fixture in body.fixtures:
        shape = fixture.shape
        if hasattr(shape, 'vertices'):  # rectangle
          vertices = [(body.transform * v) * self.ppm for v in shape.vertices]
          pygame.draw.polygon(self.pygamewindow, colors[body.type], vertices, 5)
        else:  # ball
          position = body.transform * shape.pos * self.ppm
          x, y = int(position[0]), int(position[1])
          radius = int(shape.radius * self.ppm)
          pygame.draw.circle(self.pygamewindow, colors[body.type], (x, y), radius, 2)
          x2, y2 = x + radius * -math.sin(body.angle), y + radius * math.cos(body.angle)
          pygame.draw.line(self.pygamewindow, colors[body.type], (x, y), (x2, y2), 2) #top    
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.myphysics.update()
    self.tkwindow.after(1000//self.fps, self.gameloop) # call gameloop every tick; after() needs integer milliseconds
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()    

## file: physics.py ##
import box2d
class Physics:
  def __init__(self):
    self.mybox2d = box2d.Box2d()
  def update(self): self.mybox2d.update()

## file: box2d.py ##
import Box2D, math

class Box2d:
  def __init__(self):
    self.fps=20
    self.fric = 0.2  # 1.0 friction
    self.dens = 0.1 # 1.0 density
    self.torque = 100 #100
    self.reset()
  def reset(self):
    self.myworld = Box2D.b2World(gravity=(0,10), doSleep=True)  # gravity=(0,10)
    self.body = []
    self.joint = []
    self.createbox("static",(3, 3.3),(3, 0.02)) # bottom 
    self.createbox("static",(2.5, 2.0),(0.05, 0.05)) # fixed
    self.createbox("dynamic",(2, 2),(1, 0.1)) # base
    self.createcircle((2.3, 1.2),0.2) # ball
    self.createrevolutejoint(1,2)
    self.update()
  def update(self):
    timestep = 1.0 / self.fps
    velocityIterations = 8
    positionIterations = 3
    self.myworld.Step(timestep, velocityIterations, positionIterations)  
  def createbox(self,moveable,p,size):
    if moveable=="dynamic":
      self.body.append(self.myworld.CreateDynamicBody(position=p, angle=0))
      self.body[-1].CreatePolygonFixture(box=size, density=self.dens, friction=self.fric)
    if moveable=="static":
      self.body.append(self.myworld.CreateStaticBody(
        position=p,angle=0,
        fixtures=Box2D.b2FixtureDef(
          shape=Box2D.b2PolygonShape(box=size),
          friction=self.fric, density=self.dens
        )
      ))
  def createcircle(self,p,r):
    self.body.append(self.myworld.CreateDynamicBody(
      position=p,
      fixtures=Box2D.b2FixtureDef(
        shape=Box2D.b2CircleShape(radius=r), 
        restitution=0.5, density=self.dens, friction=self.fric), 
    ))   
  def createrevolutejoint(self,id1,id2):
    """ create rotating joint """
    anchorA, anchorB = (0,0), (0,0)
    self.joint.append(self.myworld.CreateRevoluteJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=anchorA, localAnchorB=anchorB,
        enableMotor=True, enableLimit=False,
        lowerAngle=0, upperAngle=1,
        maxMotorTorque=self.torque,
        motorSpeed=0,
    )) 
  def createprismaticjoint(self,id1,id2,axisx,axisy):
    """ create linear joint """
    self.joint.append(self.myworld.CreatePrismaticJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=(0, 0), localAnchorB=(0, 0), axis=(axisx, axisy),
        lowerTranslation=1, upperTranslation=3.2,
        enableMotor=True, enableLimit=False,
        maxMotorForce=self.torque,
        motorSpeed=0.0,
    ))            
  def getpos(self, id):
    box2dpos = self.body[id].position
    return self.box2d_to_pygame_pos(box2dpos)
  def getvelocity(self, id):
    return self.body[id].linearVelocity
  def setspeed(self, id, speed):
    self.joint[id].motorSpeed = speed       
  def getangle(self, id):
    angle = self.body[id].angle
    # angle = (math.degrees(angle)-90) % 360
    angle = math.degrees(angle)
    return angle    
  def box2d_to_pygame_pos(self, box2dpos):
    """ convert coordinates """
    ppm = 100.0  # pixels per meter, default 100
    x,y = int(box2dpos[0] * ppm),int(box2dpos[1] * ppm)
    return (x, y)

AI

The tutorial so far was about a simple Python game consisting of a tkinter GUI window, a pygame window and a box2d class. The simulation can be played manually by a human operator. But such a simulation alone is boring, and many tutorials are available which describe more sophisticated games. The more demanding task is to realize an Artificial Intelligence on top of the game. First, I want to post the source code itself; it contains the full-blown game plus the integrated AI engine. The code is distributed over 5 files with around 400 lines of code. The good news is that half of the content is known from the introduction: the tkinter GUI, the pygame window and the box2d engine. But something is new: a parser, a semantic grounding class and physics motion primitives.

### file: main.py ###

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math
import pygame
import Box2D
import tkinter as Tkinter
from tkinter import scrolledtext as ScrolledText
from tkinter import ttk
import physics

class GUI:
  def __init__(self):
    self.tkwindow = Tkinter.Tk()
    self.tkwindow.geometry("300x400+600+0") # place to right
    self.tkwindow.bind('<Motion>', self.mousemotion)
    self.tkwindow.bind("<Key>", self.inputhandling)
    self.pygamewindow = pygame.display.set_mode((500, 350), pygame.HWSURFACE | pygame.DOUBLEBUF)
    self.myphysics = physics.Physics()
    self.fps=20 # 20
    self.framestep = 0
    self.mouse = (0,0)
    self.ppm = 100.0  # pixels per meter, default 100
    # info
    self.widgetinfo = Tkinter.Label(self.tkwindow, text="info")
    self.widgetinfo.place(x=10, y=0)
    # message label
    self.widgetmessagelabel = Tkinter.Label(self.tkwindow, text="messagelabel", wraplength=250, justify="left", font=("FixedSys", 9))
    self.widgetmessagelabel.place(x=10, y=30) 
    # message 
    self.widgetmessage = ScrolledText.ScrolledText(self.tkwindow, width=38, height=8, wrap="word")
    self.widgetmessage.place(x=10, y=210) 
    # input
    self.widgetinput = Tkinter.Entry(self.tkwindow)
    self.widgetinput.place(x=10, y=340)      
    self.widgetinput.bind("<Return>", self.parseinput) 
    # start loop
    self.gameloop()
  def mousemotion(self,event):
    self.mouse=(event.x, event.y) 
  def parseinput(self, event):
    name=self.widgetinput.get()
    self.myphysics.parser(name)
  def inputhandling(self,event):
    if event.keysym=="Left": self.myphysics.parser("rollleft")
    if event.keysym=="Right": self.myphysics.parser("rollright")
    if event.keysym=="Up": self.myphysics.parser("kick")
    if event.keysym=="Down": self.myphysics.parser("kick2")
  def paintlabel(self):
    result="fps "+str(self.fps)+" frame "+str(self.framestep)
    result+=" mouse "+str(self.mouse[0])+" "+str(self.mouse[1])
    self.widgetinfo.config(text=result) 
    # physics message
    temp= self.myphysics.myparser.message
    if temp !="": 
      self.printtext(temp+" ")
      self.myphysics.myparser.clearmessage()
    # wordlist and info
    temp = self.myphysics.myparser.getwords()+"\n\n"
    temp += self.myphysics.getinfo()
    self.widgetmessagelabel.config(text=temp)     
    # mousepos in pygame window
    for event in pygame.event.get():
      if event.type == pygame.MOUSEMOTION:
        self.mouse=event.pos
  def paintobjects(self):
    self.pygamewindow.fill(pygame.Color(255,255,255))    
    colors = { Box2D.b2_staticBody: (16, 79, 15), Box2D.b2_dynamicBody: (165, 42, 42), }
    for body in (self.myphysics.mybox2d.myworld.bodies):  
      for fixture in body.fixtures:
        shape = fixture.shape
        if hasattr(shape, 'vertices'):  # rectangle
          vertices = [(body.transform * v) * self.ppm for v in shape.vertices]
          pygame.draw.polygon(self.pygamewindow, colors[body.type], vertices, 5)
        else:  # ball
          position = body.transform * shape.pos * self.ppm
          x, y = int(position[0]), int(position[1])
          radius = int(shape.radius * self.ppm)
          pygame.draw.circle(self.pygamewindow, colors[body.type], (x, y), radius, 2)
          x2, y2 = x + radius * -math.sin(body.angle), y + radius * math.cos(body.angle)
          pygame.draw.line(self.pygamewindow, colors[body.type], (x, y), (x2, y2), 2) #top    
  def printtext(self,text):
    self.widgetmessage.insert(Tkinter.END, text)
    self.widgetmessage.see(Tkinter.END) # scroll to end
  def gameloop(self):
    self.framestep +=1
    self.paintlabel()
    self.paintobjects()
    pygame.display.update()
    self.myphysics.update()
    self.tkwindow.after(1000/self.fps, self.gameloop) # call gameloop every tick    
    
        
class Game:
  def __init__(self):
    self.mygui = GUI()
    self.mygui.tkwindow.mainloop()
    
mygame = Game()    

### file: grounding.py ###
class Lookuptable:
  """ constructor needs a dictionary at first run"""
  def __init__(self,category):
    self.mydict=category
  def search(self,target):
    errormin=9999
    itemmin=-1
    for i in self.mydict.items():
      error=abs(i[0]-target)
      if error<errormin:
        errormin=error
        itemmin=i
    return itemmin[1]

class Grounding:
  def __init__(self):
    self.position=Lookuptable({125:'leftoff',165:'leftcorner',205:'left',245:'middle',
      285:'right',325:'rightcorner',365:'rightoff'})    
    self.velocity=Lookuptable({-.4:'moveleftfast',-.1:'moveleft',
      .0:'standstill',.1:'moveright',.4:'moverightfast'})
    self.distance=Lookuptable({0:'contact',20:'near',40:'away',60:'faraway'})
  def get(self,name,value):
    # for example: self.get("pos",-2.1)
    if name=="pos": return self.position.search(value)
    if name=="vel": return self.velocity.search(value)
    if name=="distance": return self.distance.search(value)
    

### file: physics.py ###
import threading,time
import box2d, parser, grounding

class Physics:
  def __init__(self):
    self.mybox2d = box2d.Box2d()
    self.myparser = parser.Parser()    
    self.mygrounding = grounding.Grounding()
    self.mainrunning=False
  def parser(self,name):
    status=self.myparser.getcommand(name)
    if status!="error":
      # run every command as thread
      command="self."+status[0]
      argument=status[1]
      thread1 = threading.Thread(target=eval(command),args=argument)
      thread1.daemon = True  # stops, if mainprogram ends
      thread1.start()
  def update(self): self.mybox2d.update()
  def reset(self): self.mybox2d.reset() 
  def setmotor(self,speed): self.mybox2d.setspeed(0,speed)    
  def getangle(self): return self.mybox2d.getangle(2)
  def getballpos(self): return self.mybox2d.getpos(3)
  def getballspeed(self): return self.mybox2d.getvelocity(3) # b2vec    
  def getsemanticpos(self): return self.mygrounding.get("pos",self.getballpos()[0])
  def getsemanticvel(self): return self.mygrounding.get("vel",self.getballspeed()[0])
  def reachangle(self, target, initspeed):
    """ example execution: 20,2 (angle=20, speed=2) """
    speed=initspeed
    for i in range(100):
      angle=self.getangle()
      diff=angle-target
      #print i, diff, speed
      if diff<0: self.setmotor(speed)  
      if diff>0: self.setmotor(-speed) 
      if abs(diff)<5: speed=abs(diff)/5.0
      if abs(diff)<0.5: 
        self.setmotor(0)
        break
      time.sleep(0.05)    
  def roll(self, direction):
    # direction: -3 strongleft, -1 lowleft, 1 lowright
    angle=2*direction
    self.reachangle(angle,1)
    time.sleep(.5)
    self.reachangle(0,1)
  def kick(self, direction):
    # direction = -1 or 1
    self.setmotor(7*direction)  
    time.sleep(0.1)
    self.setmotor(0)
    time.sleep(0.1)
    self.setmotor(-7*direction)  
    time.sleep(0.1)
    self.setmotor(0) 
    self.reachangle(0,2) # back to init    
  def kick2(self):
    speed=6 #6 
    duration=0.15
    self.setmotor(speed)  
    time.sleep(duration)
    self.setmotor(0)
    time.sleep(0.1)
    self.setmotor(-speed)  
    time.sleep(duration+.1)
    self.setmotor(0) 
    self.reachangle(0,2) # back to init    

  def getinfo(self):
    result=""
    result+="semantic "
    result+=self.getsemanticpos()+" "
    result+=self.getsemanticvel()
    result+=" ballpos="+str(self.getballpos())
    x=round(self.getballspeed()[0],1)
    y=round(self.getballspeed()[1],2)
    result+=" speed="+str(x)+" "+str(y)
    result+=" angle="+str(int(self.getangle()))
    result+=" threadrunning="+str(self.mainrunning)
    return result
  def detectdanger(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    if pos=="rightcorner":
      if vel=="moveright": 
        self.myparser.addmessage("rightdanger")
        self.roll(-3)
      if vel=="moverightfast": 
        self.myparser.addmessage("rightdangerhigh")
        self.roll(-5)
    if pos=="leftcorner":
      if vel=="moveleft":
        self.myparser.addmessage("leftdanger")
        self.roll(3)
      if vel=="moveleftfast": 
        self.myparser.addmessage("leftdangerhigh")
        self.roll(5)    
  def detectkick(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    # detect kick
    if pos=="right" and (vel=="moveright" or vel=="moverightfast"):
      self.kick(-1)  
  def detectoff(self):
    pos=self.getsemanticpos()
    vel=self.getsemanticvel()
    if pos=="leftoff" or pos=="rightoff":
      self.reset()

  def mainstop(self):
    self.mainrunning=False
  def main(self):
    self.mainrunning=True
    for i in range(10000):
      time.sleep(.2)
      self.detectdanger()
      self.detectkick()
      self.detectoff()
      if self.mainrunning==False:
        break
    self.mainrunning=False


### file: parser.py ###
class Parser:
  def __init__(self):
    self.message = ""
    self.words =[ # ("name","group","command",[parameter])
      ("words","system","words",[]), # dummy
      ("info","system","getinfo",[]), # dummy
      ("reset","system","reset",[]),
      
      ("stop","motor","setmotor",[0]),
      ("left","motor","setmotor",[-0.3]),
      ("right","motor","setmotor",[0.3]),    
      ("reach-20","reach","reachangle",[-20,2]),
      ("reach+20","reach","reachangle",[20,2]),   
      
      ("rollleft","roll","roll",[-2]),
      ("rollright","roll","roll",[2]), 
      
      ("kick","kick","kick",[-1]),
      ("kick2","kick","kick2",[]),
      
      ("main","macro","main",[]),
      ("mainstop","macro","mainstop",[]),
    ]    
  def clearmessage(self):
    self.message=""
  def addmessage(self,text):
    self.message+=text
  def getwords(self):
    result=""
    for i in self.words:
      result+=i[0]+" "
    return result    
  def getcommand(self,name):
    result=[]
    self.addmessage(name)
    pos=self.findword(name)
    if pos!=-1:
      result.append(self.words[pos][2]) # command
      result.append(self.words[pos][3]) # parameter
    else: 
      result="error" 
      self.addmessage("command not found\n")     
    return result
  def findword(self,name):
    pos=-1
    for i in range(len(self.words)):
      if self.words[i][0]==name:
        pos=i
        break
    return pos    


### file: box2d.py ###
import Box2D, math

class Box2d:
  def __init__(self):
    self.fps=20
    self.fric = 0.2  # 1.0 friction
    self.dens = 0.1 # 1.0 density
    self.torque = 100 #100
    self.reset()
  def reset(self):
    self.myworld = Box2D.b2World(gravity=(0,10), doSleep=True)  # gravity=(0,10)
    self.body = []
    self.joint = []
    self.createbox("static",(3, 3.3),(3, 0.02)) # bottom 
    self.createbox("static",(2.5, 2.0),(0.05, 0.05)) # fixed
    self.createbox("dynamic",(2, 2),(1, 0.1)) # base
    self.createcircle((2.3, 1.2),0.2) # ball
    self.createrevolutejoint(1,2)
    self.update()
  def update(self):
    timestep = 1.0 / self.fps
    velocityIterations = 8
    positionIterations = 3
    self.myworld.Step(timestep, velocityIterations, positionIterations)  
  def createbox(self,moveable,p,size):
    if moveable=="dynamic":
      self.body.append(self.myworld.CreateDynamicBody(position=p, angle=0))
      self.body[-1].CreatePolygonFixture(box=size, density=self.dens, friction=self.fric)
    if moveable=="static":
      self.body.append(self.myworld.CreateStaticBody(
        position=p,angle=0,
        fixtures=Box2D.b2FixtureDef(
          shape=Box2D.b2PolygonShape(box=size),
          friction=self.fric, density=self.dens
        )
      ))
  def createcircle(self,p,r):
    self.body.append(self.myworld.CreateDynamicBody(
      position=p,
      fixtures=Box2D.b2FixtureDef(
        shape=Box2D.b2CircleShape(radius=r), 
        restitution=0.5, density=self.dens, friction=self.fric), 
    ))   
  def createrevolutejoint(self,id1,id2):
    """ create rotating joint """
    anchorA, anchorB = (0,0), (0,0)
    self.joint.append(self.myworld.CreateRevoluteJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=anchorA, localAnchorB=anchorB,
        enableMotor=True, enableLimit=False,
        lowerAngle=0, upperAngle=1,
        maxMotorTorque=self.torque,
        motorSpeed=0,
    )) 
  def createprismaticjoint(self,id1,id2,axisx,axisy):
    """ create linear joint """
    self.joint.append(self.myworld.CreatePrismaticJoint(
        bodyA=self.body[id1], bodyB=self.body[id2],
        localAnchorA=(0, 0), localAnchorB=(0, 0), axis=(axisx, axisy),
        lowerTranslation=1, upperTranslation=3.2,
        enableMotor=True, enableLimit=False,
        maxMotorForce=self.torque,
        motorSpeed=0.0,
    ))            
  def getpos(self, id):
    box2dpos = self.body[id].position
    return self.box2d_to_pygame_pos(box2dpos)
  def getvelocity(self, id):
    return self.body[id].linearVelocity
  def setspeed(self, id, speed):
    self.joint[id].motorSpeed = speed       
  def getangle(self, id):
    angle = self.body[id].angle
    # angle = (math.degrees(angle)-90) % 360
    angle = math.degrees(angle)
    return angle    
  def box2d_to_pygame_pos(self, box2dpos):
    """ convert coordinates """
    ppm = 100.0  # pixels per meter, default 100
    x,y = int(box2dpos[0] * ppm),int(box2dpos[1] * ppm)
    return (x, y)

The best place to start the description is the Physics class. In contrast to the nearly empty class from the introduction, it now contains additional motion primitives, which are used both for retrieving information and for executing actions. An example motion primitive is “rollleft”, which moves the beam to the left. Most of the motion primitives are threaded, which means they run in the background. To execute the primitives, a dedicated parser is used: it takes the user’s input and determines which Python method has to be called. The parser is built around a table of word entries.
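The dispatch step can be tried in isolation. The following sketch uses a shortened word table in the same (name, group, command, parameters) format as parser.py; the `Robot` class is a hypothetical stand-in for the Physics class that only records which calls it receives, and `getattr` is used instead of the `eval` call from the listing.

```python
# Sketch of the parser's word-to-method dispatch, with a shortened
# word table in the (name, group, command, parameters) format.
words = [
    ("stop", "motor", "setmotor", [0]),
    ("left", "motor", "setmotor", [-0.3]),
    ("kick", "kick",  "kick",     [-1]),
]

class Robot:
    """Hypothetical stand-in for the Physics class; records calls."""
    def __init__(self):
        self.log = []
    def setmotor(self, speed):
        self.log.append(("setmotor", speed))
    def kick(self, direction):
        self.log.append(("kick", direction))

def parse(robot, name):
    for word, group, command, params in words:
        if word == name:
            # resolve the method by name and call it with its parameters
            getattr(robot, command)(*params)
            return True
    return False  # unknown word

robot = Robot()
parse(robot, "left")
parse(robot, "kick")
print(robot.log)  # -> [('setmotor', -0.3), ('kick', -1)]
```

Using `getattr` on the target object keeps the same lookup idea as the `eval("self."+status[0])` line in physics.py, but restricts what an input string can reach.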

At the end, another new class is available, called Grounding. It is used for converting absolute ball positions into natural language. It also works with a dictionary; an example request would map the ball position x=245 to “middle”. This kind of mapping helps to create behaviors: instead of writing “if ballx>230 and ballx<260”, the simpler statement “if ball==middle” can be used. The AI itself is physics.main(), a short Python method that determines the next action. It checks whether the ball is on the beam and also whether a kick is possible.
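The mapping can be demonstrated on its own. The sketch below re-implements the nearest-key search of `Lookuptable` using `min` with a distance key, and feeds it the same position dictionary as in grounding.py:

```python
# Re-implementation of the nearest-key lookup from grounding.py:
# a numeric value is mapped to the label of the closest dictionary key.
class Lookuptable:
    def __init__(self, category):
        self.mydict = category
    def search(self, target):
        # pick the key with the smallest absolute distance to target
        key = min(self.mydict, key=lambda k: abs(k - target))
        return self.mydict[key]

position = Lookuptable({125: 'leftoff', 165: 'leftcorner', 205: 'left',
                        245: 'middle', 285: 'right', 325: 'rightcorner',
                        365: 'rightoff'})

print(position.search(245))  # -> middle (exact key)
print(position.search(252))  # -> middle (nearest key is 245)
print(position.search(300))  # -> right  (285 is closer than 325)
```

Because the search always returns the nearest label, every pixel position gets some symbolic name; there is no gap between the categories.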

The physics.main() method can be started from the tkinter GUI window, and it is also possible to interrupt the task on request. The overall system is controlled interactively, which means the user and the AI work together. The user has the option to control the beam manually and override the AI, or to deactivate the AI completely with the “mainstop” command.

Perhaps a few words about the idea behind the Artificial Intelligence. It is very similar to what Rodney Brooks has called behavior-based robotics. The idea is that the user creates motion primitives and uses these primitives to realize high-level actions; this is called a hierarchical control system. In the “ball on beam” domain, the motion primitives have names like “kick”, “rollleft”, and “getballposition”. In a different domain, the motion primitives have to be changed. That means it is not a general game playing architecture, but the software can be adjusted to new domains, so it is a kind of robot construction kit. The Box2D engine and the parser remain the same; only the motion primitives have to be adapted to a new domain.
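The behavior layer itself can be sketched as a pure function: given the semantic state from the grounding step, pick the highest-priority reaction. The state names below come from grounding.py, and the chosen reactions mirror what detectdanger() and detectkick() do; the action strings are only illustrative labels.

```python
# Sketch of one behavior-based decision step: safety behaviors take
# priority over opportunistic ones, mirroring detectdanger()/detectkick().
def select_action(pos, vel):
    # safety first: keep the ball away from the edges of the beam
    if pos == "rightcorner" and vel in ("moveright", "moverightfast"):
        return "roll left"
    if pos == "leftcorner" and vel in ("moveleft", "moveleftfast"):
        return "roll right"
    # opportunistic: kick the ball when it rolls by on the right
    if pos == "right" and vel in ("moveright", "moverightfast"):
        return "kick"
    return "idle"

print(select_action("rightcorner", "moveright"))  # -> roll left
print(select_action("right", "moverightfast"))    # -> kick
print(select_action("middle", "standstill"))      # -> idle
```

Because the rules test symbolic states like “rightcorner” rather than raw coordinates, adding or reordering behaviors stays readable; this is exactly what the grounding layer buys.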

Class diagram