Controlling the inverted pendulum

The inverted pendulum problem is cited very often in the robotics literature. The reason is that it is a simple underactuated kinematic chain which can nevertheless become very difficult to control. It belongs to the same category as the “ball on a beam” task. In both cases, the system has an actuator which provides a force, but it also has a freely movable joint which can’t be influenced directly. How can we solve the problem elegantly?

The first thing to do is not to focus on a certain technique like neural networks or Q-learning, but to define the difference between the AI controller and the prediction engine. The AI controller is the software which controls the pendulum: once activated, it brings the pendulum into the upward position. The more interesting part is the prediction engine, which is a subpart of the AI controller. The terms in the literature are sometimes a bit confusing. In control theory books, building the prediction engine is usually called “system identification”. In gaming-oriented communities this software module is called the physics engine. If we want to solve the inverted pendulum problem, we have to focus on this part.

The first question is why we need to predict the system at all if we only want to figure out what the correct motor control command is. Isn’t it easier to create the AI controller directly, for example in the form “If angle newangle=-10, velocity=-2

What does that mean? The equation describes the behavior of the pendulum: it predicts a future state. The left side of the equation gives the current state, the right side gives the consequence. The funny thing is that, from the perspective of an AI controller, this prediction looks useless, because it doesn’t answer the question of how to bring the pendulum upwards. But on closer inspection the equation is more powerful than expected, because it answers the question of what result a given control signal will produce. In the reinforcement learning literature, the concept is called model-based reinforcement learning. The model is the prediction model.
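To make the idea of a prediction model a bit more concrete, here is a minimal sketch in Python. It is not the model used in the Box2D setup. The function name predict, the torque applied at the pivot, the time step and the physical constants are all assumptions chosen for illustration. The function takes the current state plus a control signal and returns the predicted next state.

```python
import math

def predict(angle, velocity, torque, dt=0.05, g=9.81, length=1.0, mass=1.0):
    """Return the predicted (angle, velocity) after one time step.

    angle is measured in radians from the upright position, so angle = 0
    means the pendulum points straight up. All constants are assumed values.
    """
    # Gravity pulls the pendulum away from upright; the torque is the control signal.
    acceleration = (g / length) * math.sin(angle) + torque / (mass * length ** 2)
    new_velocity = velocity + acceleration * dt
    new_angle = angle + new_velocity * dt
    return new_angle, new_velocity

# What-if questions for the same state but different control signals.
print(predict(0.1, 0.0, torque=0.0))   # no control: the pendulum starts to fall
print(predict(0.1, 0.0, torque=-5.0))  # counter torque: it moves back towards upright
```

The function itself never says which torque is the right one. It only answers what will happen, and this is exactly what makes it a model and not a controller.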

But why do we need such a model? Isn’t the environment in which the pendulum task runs able to provide the physics engine? Yes and no. It is correct that Box2D can simulate a pendulum. This simulation is called the environment or the game. But it is not possible to create the controller directly from it.

The picture shows the principle of model predictive control. There are two physics engines available: the main one at the bottom, which simulates the game, and a second one which has to be created from scratch. The process of creating it is called “system identification”. The good news is that once the system identification is successful, the inverted pendulum problem is solved and there are no further challenges. The prediction model can be used very easily for building the AI controller, and the AI controller will bring the system into any desired state.

What exactly is system identification? In game theory, two problems are discussed. The first one is how to play an existing game, for example tic-tac-toe, and the second one is how to learn the rules of an unknown game. Learning the game rules is equal to system identification. A game is different from an AI controller. A game provides options, that is, possible action sequences. For example, the tic-tac-toe game provides the option to place a mark on one of the 9 fields. Then the game engine will mark the field as non-empty, which reduces the options for the next player. A prediction engine for the inverted pendulum works the same way.
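How a prediction model can be turned into a controller can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the model is a simple pendulum with unit mass and length, the three torque values, the horizon of 5 steps and the cost function are made up, and the same function predict is used both as the model and as the stand-in for the game, which means the system identification is assumed to be perfect.

```python
import itertools
import math

TORQUES = (-10.0, 0.0, 10.0)  # assumed discrete control signals
HORIZON = 5                   # number of future time steps the controller looks at

def predict(angle, velocity, torque, dt=0.05, g=9.81, length=1.0):
    """Prediction model: state after one time step (angle 0 = upright)."""
    acceleration = (g / length) * math.sin(angle) + torque
    new_velocity = velocity + acceleration * dt
    new_angle = angle + new_velocity * dt
    return new_angle, new_velocity

def sequence_cost(angle, velocity, torques):
    """Roll the model forward and sum up how far the pendulum is from rest."""
    cost = 0.0
    for torque in torques:
        angle, velocity = predict(angle, velocity, torque)
        cost += angle ** 2 + 0.1 * velocity ** 2
    return cost

def choose_torque(angle, velocity):
    """Try every torque sequence in the model, return the first torque of the best one."""
    best = min(itertools.product(TORQUES, repeat=HORIZON),
               key=lambda seq: sequence_cost(angle, velocity, seq))
    return best[0]

def run(angle=0.3, velocity=0.0, steps=40):
    """Closed loop: ask the controller, apply the torque, repeat."""
    for _ in range(steps):
        torque = choose_torque(angle, velocity)
        angle, velocity = predict(angle, velocity, torque)
    return angle, velocity

print(choose_torque(0.3, 0.0))  # the controller pushes back towards upright
print(run())
```

The controller contains no rule of the form “if angle is positive, push left”. All the knowledge sits in predict, and the controller only searches through the options the model provides.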

Car driving

Let us make the advantages of a prediction engine clearer with the problem of driving a car. What programmers sometimes believe is that a self-driving car has to provide a certain reaction. For example, if the traffic light is red, the car has to stop. This kind of input-output relationship can be observed in reality: the car approaches the intersection, the light is red, and as a consequence the car stops. But this kind of rule is not what drivers use for generating the action. What a good driver does is calculate alternatives. He answers what-if questions. A possible case is the question of what will happen if the car doesn’t stop at the red light. If the aim is only to generate the next action, this question makes no sense, but if the idea is to understand how driving works, then it is important.

Suppose we ignore potential future states of a system and describe only what a car should do in each situation. The result is that the world is no longer predicted accurately. The problem is that in reality each driver should stop at the red light, but from a physical perspective he has more options. He can, in theory, decide to ignore the red light. The traffic system in general won’t prevent such cases. If a driver enters the intersection even though he is not allowed to do so, the game won’t freeze because the driver has made a mistake. Instead, the traffic system keeps running. The result is an accident. To explain it more clearly: if a large group of potential actions is ignored, the system identification is not complete. It is important to state the outcome for each situation, no matter whether the car has stopped at the light or not.


Underactuated robotics without mathematics

The screenshot shows a simple example of an underactuated system. It was built in Box2D and shows a walking robot that hangs on a rope. The robot can control only its feet but not the harness in which it hangs. The joints of the rope are not actuated, they can move freely. The main idea of this setup is that the robot will never lose its balance. The mechanical structure prevents the robot from falling to the left or right, even if the foot placement is wrong.

So what can we do with the system? It makes it easier to develop a robot controller, because the task is no longer to implement all the details, for example center of gravity, fall detection and gait patterns. It is enough if the robot moves its legs with random numbers. Why is that important? Suppose there is no frame around the robot and the Box2D robot stands alone on the floor. The result is that programming the biped controller is far more complicated. Even though it is only a simulated robot, the amount of math needed to let the robot walk is huge.

The interesting thing is that many things like reduced power consumption are not important for that scene. The energy consumption in the Box2D engine is exactly zero, no matter whether a freely movable joint is used or not. The only reason why underactuation is useful here is that it reduces the difficulty of programming the controller.

I have to admit that it is unclear right now how the feet have to be placed to let the robot walk. A possible technique is to program a prediction engine first and use it for model predictive control later. So the question is: what happens if the robot moves the right foot upwards? Will the robot’s base move to the left or to the right? The good news is that after playing around manually with the system, some predictions can be made. What is sure is that if both feet have the same distance to the body, the robot is able to stand. If the distance of one foot is smaller, the robot will be out of balance to the other side. And if both legs are kicked at higher speed, the robot will make a small dancing performance.

Everything happens within the frame. That means the robot’s motions are restricted by the rope at the ceiling. The best comparison for this construction is a remote-controlled puppet. A puppet has the advantage over robots that it is not standing freely but is held in position by the artist. In this case, the rope is visible. The difference from a puppet theater is that the rope doesn’t provide actuation: it is an underactuated puppet. That means the robot has to provide the force. Perhaps we can compare this with a biped robot that is mounted on a frame, with its legs in the air so that repairs are easier.

Instead of describing the inner workings of a robot controller, let us focus on the result. Suppose the human user has a mouse and points to a certain position on the screen. What the controller has to do is find the servo commands for the feet that bring the robot into that position. This can be done by asking the motion model. It is known that the robot moves up if both feet are moving down. So it is a model predictive control problem, similar to a ball on a beam which has to be held in a goal position.
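“Asking the motion model” can be illustrated with a small Python sketch. The model below is purely hypothetical and not measured from the Box2D scene: it assumes that the body height is the average of both leg extensions and that unequal legs shift the body sideways. The names motion_model and servo_command_for, the factors and the sign of the sideways shift are assumptions.

```python
import itertools

EXTENSIONS = range(0, 11)  # assumed leg extension levels (0 = pulled up, 10 = fully down)

def motion_model(left, right):
    """Hypothetical motion model: predicted (x, y) of the robot body.

    Pushing both feet down lifts the body; unequal legs shift it sideways.
    The factor 0.5 and the direction of the sideways shift are assumptions.
    """
    x = 0.5 * (left - right)
    y = 0.5 * (left + right)
    return x, y

def servo_command_for(target):
    """Ask the model about every command and return the one that lands closest to the target."""
    tx, ty = target

    def squared_error(command):
        x, y = motion_model(*command)
        return (x - tx) ** 2 + (y - ty) ** 2

    return min(itertools.product(EXTENSIONS, repeat=2), key=squared_error)

mouse_position = (1.0, 4.0)
command = servo_command_for(mouse_position)
print(command, motion_model(*command))
```

The controller part is only a search over the answers of the model. If the real robot behaves differently, the model has to be corrected, not the search.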

The main reason why I believe this kind of frame around the robot is helpful has to do with the reduced difficulty. It is not very hard to imagine a robot controller which can bring the robot to the mouse position. I would guess that a simple PID controller can do that. For a normal biped robot the task would be more complicated, and it is not possible to control a free-standing biped walking robot in that way. Let us describe why the task is much easier here. If both feet are moved upwards and the coordination is not perfect, a free-standing robot will lose its balance, because a small mistake has consequences for the center of gravity. But here the robot is held by the rope. By design it can’t lose its balance. That means the rope will compensate for control errors elegantly. Nobody will notice that the controller is not perfect.

It’s not possible to build machine-like robots

A common misconception about robots is that they will become tools in the hand of man, similar to known technology from the past, for example an electric vacuum cleaner. The idea is that a robot is only a slightly improved classical machine which has an off switch and works inside the normal specification. Sure, from a technical point of view it is possible to build such robots. These are program-controlled machines: there is a robot arm and a piece of software, and the robot is able to move. The problem with these robots is that from an economic perspective they are useless. What a factory really needs are robots with a much higher level of complexity. The reason is that a factory doesn’t need a machine which can work, but a machine which can handle existing machines.

The only way a robot can operate in this mode is if it has the same capabilities as human workers. That means the robot needs natural language understanding, a vision system, and the ability to understand complex situations and to perform complex manipulation tasks. This kind of robot will not be like a toaster, it will be a human-like robot. That means it will have arms, legs and a head, and it can speak.

I would suggest that most people are not aware that only these highly developed robots make sense. Something which doesn’t reach that level can’t be used for automating a factory. Today, no such technology is available. The programmers don’t know how to realize such human-like machines. They can build robots, but these have much lower skills and can, for example, follow a line. As a consequence robots can’t be used in reality, but only in an educational setup. They can compete very well with other educational topics, for example biology and math, but they can’t compete with existing machines like a knitting machine or a car.

In the education system, robots are everywhere. They are used as practical tools for teaching programming, they are a subject in the literature, and they are envisioned for the future. Most research done today has to do with robotics in the loop. It is some kind of best-practice method, and it is very interesting because robots can be used for any purpose. In contrast, robots are not used very often in reality. A normal restaurant has no demand for these machines, and as coworkers at the assembly line they are not ready today. The education system is the only social system in which robots are widely available. In this kind of natural habitat they will evolve into something more advanced.

The sudden transition to industrial robotics

The good news is that today’s workers don’t have to fear robots in the factory, and the workers in the year 2029 can also be sure that they can’t be replaced by Artificial Intelligence. The reason why factories depend on humans is not that humans are super strong or can put the pizza into the box faster than a robot, but that humans can work very efficiently together with mechanical machines. This makes them a productivity powerhouse. A classical food production machine which produces bread, together with a few humans who fill in the raw materials and do smaller repairs, is the most cost-effective way of realizing mass production. The automation level is not zero percent just because a few humans are operating the machine. The degree of automation is somewhere between 70% and 80%, because it is a cooperative system of mechanical machines plus intelligent humans.

Even if new robots are developed in the future which can not only walk on two legs but are also able to do pick-and-place tasks, the factory will not be able to replace its existing human workers with robots. The requirement for a worker is not that he or she carries heavy loads or runs a hundred miles without using a truck. The requirement for humans is that they are able to use existing all-mechanical machines with maximum productivity. And they do it very well.

At the same time it is possible to define a situation in which robots can do it better than humans. The requirement for a robot in a factory is that it is able to operate mechanical machines. That means the robot can press the button to switch the packaging machine on, the robot detects a problem in the mechanics, the robot is able to tell the owner of the factory what the problem is, and the robot can do the tasks not done by a mechanical machine. Today’s Artificial Intelligence technology is not developed highly enough for that purpose. The engineers are not able to build robots which copy the full spectrum of human skills. At that level, robots are useless for industrial automation. They can’t compete with humans, who are able to do all these tasks without problems.

Replacing a human worker is only possible with human-level AI, that means with a robot which can pass the Turing test. The robot has to understand natural language, handle all sorts of manipulation problems, have an advanced vision system and be able to learn new things at the workplace, for example how to operate a new sort of mechanical machine. Nobody knows how this kind of human-level AI robot can be realized. And the assumption is that ten years from now the situation will be the same. On the other hand, it is possible that at some moment in the future such human-like robots will become available. And then they will be able to replace humans.

In the meantime, the state of automation will remain amazingly unchanged. The options for a factory to increase its productivity are very limited. It can only buy a mechanical machine which is based on a design from 100 years ago, and it can hire human workers who are familiar with this technology. The reason for this kind of standstill is that the inventions made during the industrial revolution make sense even for today’s requirements. Technology like a forklift, a knitting machine, a diesel engine or electric light was highly productive 100 years ago, and the same is true today.

Even if robots are not introduced in the world’s factories, they will become an issue in the laboratory. Famous universities are teaching robotics to beginner students, and newly founded robotics challenges are pushing existing technology into new dimensions. From a robotics point of view, today’s robots are the most advanced ever and the development won’t stop. What we will see in the next 10 years is a paradox. The factories are using unchanged, all-mechanical technology, while at the same time very advanced robotics systems are developed in an educational context which feel emotions, can solve complicated tasks, walk on two legs and are able to communicate with humans.

The problem is not to introduce robotics into the factory, the task will be to introduce robotics into the universities. Most students are not familiar with Artificial Intelligence, while at the same time the subject will become more important in the future. One of the open questions is why students should learn something about robots if no robots are available in reality. That is a very interesting question. Robots are only available for teaching purposes, the best examples are Nao, Mindstorms EV3 or Arduino robots. Robots are not available for doing work in a kitchen, in a restaurant or in a factory. That means the education system is not behind schedule, it is ahead of the plan. The chance is high that students in a normal school will get at least a programming course for the Lego EV3 robot. This kind of technology is not used in today’s factories, and they have no demand for programmable machines. That means the students learn something in school which they can use in reality not in the next 10 years, but perhaps in 20 years.

Somebody may argue that a boring EV3 Mindstorms robot in a school is not that complicated, and that in reality much more advanced technology is available. No, it isn’t. The EV3 kit and technology like an Arduino line follower robot are the most advanced technology available today. No real company outside of the education system can provide something more advanced.

Productivity Paradox

In some older literature it was called a paradox that robotics doesn’t make sense in the factory. It is not that hard to figure out the reason why. It is because the engineers of the past did their homework. Since the beginning of the industrial revolution, mechanical engineers have thought about mechanical automation. They have invented thousands of machines, one for each purpose. All these machines work great and have reduced the costs to a minimum. Today it is not possible to go back to a time before automation. The consequence of the highly successful automation of the past is that it is very hard to invent new machines which are better than the old ones.

A robot is, from a technical point of view, a new technology. It runs on software, while a normal mechanical machine in the factory doesn’t need a program. So in theory the robot is superior. But proving this in reality is not possible for today’s software developers. They have not figured out how to program the robots properly, and a robot without software is useless. Somebody may argue that only factories were highly automated during the industrial revolution. Very funny, let us take a look at different areas of the economy. All households are equipped with electricity and with machines like a vacuum cleaner, a washing machine and so on. All businesses located in the city, for example restaurants, are supplied by cars and trucks. And things like public transportation, hairdressing and even beauty salons use highly productive machines. That means we are not living in a pre-industrial world in which nobody knows how to build and use machines. The technology available today operates in a post-industrial world, and there is a long tradition of building these machines.

Sure, the industrial revolution was not so advanced that the automation level is 100%. For driving a truck a human is needed, only a specialist can repair a knitting machine, and the human in the kitchen needs to do some manual work until the cake is ready. The industrial revolution has left a lot of manual work unautomated.

Assembly line

The productivity paradox is the result of a misunderstanding of what automation is and what its limits are. In most books, the assembly line is described only as something which forces humans to work much faster. In reality, the assembly line in the factory was only the beginning. Many other machines were invented too, and the overall system is a fully automated process. The assembly line is only the connection between different kinds of machines. The first machine takes the flour and produces dough. The second machine forms a round shape, the next machine bakes the bread, and the last machine puts the bread into a bag. This kind of technology is nothing which has to be invented in the future, it is working today all over the world. And it is based on the assembly line which transports the bread from one station to the next.

Sure, even this fully automated process needs human work, but what the humans are doing is different from repetitive work. They are responsible for keeping all these machines running. No human in a factory would argue against machines and prefer to make bread manually, because then it would take years to do all this work. Highly productive machines are an integral part of every industry.

System identification vs. optimal control

The term motion controller or AI controller is used for software which is able to provide the control signals. It determines whether a line-following robot should activate the left or the right servo motor at a given time. The surprising thing is that, apart from this problem, a subproblem is discussed quite frequently in the literature, which is called system identification. System identification can be described colloquially as “programming a physics engine”. A physics engine is able to simulate a system, but it is not able to control it. The best-known example, used for many games, is Box2D. Box2D is not AI software which can control robots, it is a prediction engine which provides the future state of a robot.

System identification and programming a Box2D clone are nearly the same. But if well-working physics engines are available, why does the literature discuss the problem of system identification in depth? Why don’t the AI developers simply use an existing engine if they want to predict future states? The reason is that existing physics engines are more accurate than needed and slow. It is not possible to use them for predicting a large number of actions. Perhaps we can illustrate the situation with the line follower example mentioned above.

The robot moves straight ahead. What will happen in the next second? Right, it will move further. The future position of the robot is determined by its current angle and its speed. But what will happen if the robot turns to the left? This will change its angle, and as a result the future position will be different. In the case of a line follower robot the predictions are not very complicated to implement, in the case of an inverted pendulum or a biped robot the situation can become quite complex.
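For the line follower, such a prediction fits into a few lines of Python. The sketch assumes a very simple kinematic model in which the robot first turns and then drives straight for the rest of the time step. The function name predict_pose and the parameters speed and turn_rate are made up for this example.

```python
import math

def predict_pose(x, y, heading, speed, turn_rate, dt=1.0):
    """Predict the pose of the robot after dt seconds.

    heading and turn_rate are in radians, a positive turn_rate means a left turn.
    Assumed simplification: the robot turns first, then drives straight.
    """
    new_heading = heading + turn_rate * dt
    new_x = x + speed * math.cos(new_heading) * dt
    new_y = y + speed * math.sin(new_heading) * dt
    return new_x, new_y, new_heading

# Straight ahead: the position is determined by the current angle and the speed.
print(predict_pose(0.0, 0.0, 0.0, speed=1.0, turn_rate=0.0))

# Turning left changes the angle and therefore the future position.
print(predict_pose(0.0, 0.0, 0.0, speed=1.0, turn_rate=math.pi / 2))
```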

A so-called model describes the prediction in a handy way. A prediction model is the same as a physics engine. What the AI developers are trying to do is create models for a large variety of tasks. Once they have put all the information into a physics engine, the system is formalized. The model can be used as the blueprint to create the AI controller on top.

Perhaps the idea of discussing two separate problems is not the best one. Can we ignore the system identification and create the AI controller directly? Let us go back to the line-following robot. If we look only one time step into the future, perhaps it is possible to avoid the separate prediction step. It is enough if the source code says: “if obstacle ahead, then moveleft”. This kind of reactive controller detects the obstacle and creates a control signal to avoid it. In reality, the robot can execute more than one action. It can execute a sequence, for example “forward, forward, left, forward”. If we can predict what the robot’s position is after this longer sequence, this is equal to giving the robot a horizon. And this horizon allows the AI controller to plan more detailed paths. Predicting a longer action sequence is only possible with system identification.
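Predicting a whole action sequence can be sketched like this in Python. The grid world, the meaning of “left” as a 90 degree turn on the spot and the function names step and rollout are assumptions made for the example, they are not taken from a real robot.

```python
def step(state, action):
    """Prediction model for one action on a grid.

    state is ((x, y), (dx, dy)): the position and the direction the robot faces.
    """
    (x, y), (dx, dy) = state
    if action == "forward":
        return (x + dx, y + dy), (dx, dy)
    if action == "left":
        return (x, y), (-dy, dx)  # rotate the heading 90 degrees counterclockwise
    if action == "right":
        return (x, y), (dy, -dx)  # rotate the heading 90 degrees clockwise
    raise ValueError(f"unknown action: {action}")

def rollout(state, actions):
    """Apply the model once per action and return the predicted final state."""
    for action in actions:
        state = step(state, action)
    return state

start = ((0, 0), (0, 1))  # at the origin, facing north
print(rollout(start, ["forward", "forward", "left", "forward"]))
# prints ((-1, 2), (-1, 0))
```

The length of the action list is the horizon. A reactive rule can only answer what to do now, while the rollout tells where the robot will be after four actions.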

What robotics engineers are trying to do is solve robotics challenges with the same principle used in chess, which is search in the game tree. They can only do so if they have a model of the game which is played. Creating such a game-like model is called “system identification”. It predicts the future state of a control task. Playing games with robots is called model predictive control. It basically means first creating the chess game itself and then searching the game tree for a winning node.
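The search in the game tree can be sketched with a breadth-first search in Python. The “game” is an assumed grid world for a line-follower-like robot, the winning node is simply a goal position, and the names step and plan as well as the depth limit are made up for illustration.

```python
from collections import deque

ACTIONS = ("forward", "left", "right")

def step(state, action):
    """The game model: predicted state after one action on a grid."""
    (x, y), (dx, dy) = state
    if action == "forward":
        return (x + dx, y + dy), (dx, dy)
    if action == "left":
        return (x, y), (-dy, dx)
    return (x, y), (dy, -dx)  # "right"

def plan(start, goal, max_depth=8):
    """Breadth-first search in the game tree for the shortest action sequence."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state[0] == goal:          # winning node: the robot stands on the goal
            return actions
        if len(actions) >= max_depth:
            continue                  # horizon reached, do not expand further
        for action in ACTIONS:
            successor = step(state, action)
            if successor not in visited:
                visited.add(successor)
                queue.append((successor, actions + [action]))
    return None                       # no plan inside the horizon

start = ((0, 0), (0, 1))  # at the origin, facing north
print(plan(start, goal=(-1, 2)))
# prints ['forward', 'forward', 'left', 'forward']
```

Without the function step there would be nothing to search in. This is the reason why the model has to exist before the controller can be built.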

Industrial robots vs educational robots

Industrial robots were introduced in some factories in the 1980s. The idea was that they are advanced machines which can be used for increasing productivity. This assumption was wrong. The problem for industrial robots is that a company doesn’t need them in reality. What a factory needs are mechanical machines and human workers, for example a human in a forklift, or a human who repairs a fully autonomous knitting machine. This kind of task makes a company smile, because it increases its profit.

There is another type of robot which was introduced not long ago. So-called educational robots are not designed to work somewhere, they are used in the university to teach humans the art of artificial intelligence. The interesting feature of educational robotics is that this technology has become a great success. It is used in universities, and many students are buying EV3 Mindstorms kits for use at home to teach themselves how to program them. The difference between industrial robots and educational robots is that the second category isn’t used for increasing productivity in reality, but for teaching the students something about the future. The idea is to make them familiar with current and future developments.

Let us focus again on industrial robots. Is it possible to design them for practical usage? Oh yes, it is possible. Such an industrial robot will replace human workers. It can do everything a human can do. Its shape is similar to a human, that means the robot has legs, arms and a head. The model was shown in the movie “I, Robot”. The interesting aspect is that, apart from humanoid robots, it is not possible to design a different kind of industrial robot. The reason is that a company will buy them only if they can increase the automation level, and this is only possible if they replace humans. And replacing humans in a factory is only possible if the robot copies them.

What humans are doing in a factory is not working directly. Since the industrial revolution started in the year 1800, nearly all kinds of work are supported by mechanical machines. This is true in farming, food production, electricity generation and construction. What humans do today is the work which cannot be automated by mechanical machines. This kind of work can’t be automated with the industrial robots invented in the 1980s either. A typical example is a robot arm with 4 DOF. This machine is not useful for a real factory, because the factory has no task which fits the robot. What a company needs instead is a robot which can drive a forklift, a robot which can repair a diesel engine, or a robot which can do a pick-and-place task in an unstructured, complex environment.

What the industrial robots invented in the 1980s are doing is automating work which was automated already. Repetitive tasks are a typical scenario for a mechanical machine, for example the task of putting food into a box. Such a task can be handled by an industrial robot invented in the 1980s. The problem is that such tasks are no longer available in a factory. They are automated by machines invented 100 years ago. That means the task “putting food into a box” can be handled fully autonomously by an all-mechanical machine. If we compare industrial robots with these all-mechanical machines, robots are a bad choice. They are overpriced, and their productivity is low. If a factory decides to replace its existing mechanical machines with industrial robots, this is equal to going bankrupt, because the factory is no longer competitive in the market. That is the reason why the introduction of industrial robots has failed.

The reason why the sales of industrial robots are not zero, and why the number of robots has risen especially in the last 5 years, is that industrial robots are used outside of their purpose. If a company buys the latest model from Kuka or Fanuc, it is not using the robot for work. Instead the robot is used for educational purposes only. This kind of reason is accepted in some companies, and the robot project is treated as a success. The problem is that Kuka does not define its robots as an educational tool similar to the Nao robot, but believes that the robot is used in real production. This is not true. What the factory has for real work applications are mechanical machines, mainly cranes, forklifts, knitting machines and packaging machines.

This is the situation today. If we look 20 years into the future, things can change. Robots will become an issue for a factory once they have reached human-level AI. That means the robot can climb into the forklift, understand natural language and handle a given task. This kind of advanced Artificial Intelligence is not available today. It is something which has to be invented in the future. It will revolutionize industrial production.

Can robots replace humans in a factory?

In the short term this is not possible. The reason is that the work humans are doing in a factory is highly complicated. It is work which can’t be done by mechanical machines, for example filling large amounts of sugar into a food production line. Often this work needs a lot of knowledge about the production process. The reason why human work in a factory is more complicated than in the past is that the companies have automated large parts of the production with machines. This was done during the industrial revolution. Today, mass production at low cost is used everywhere.

It is not possible to increase the automation level further. To do so, it would be necessary to copy the work of the humans, that means to do non-repetitive tasks. Only humans can replace humans. Even the latest generation of industrial robots from Kuka or Rethink Robotics is not able to replace humans. The robot not only has to do work in the factory, it has to do work which supports the existing mechanical machines. A Baxter robot can’t repair a machine, and it is not able to communicate with other workers about potential problems in the assembly line. That is the reason why robots are not used in most companies.

In the long run it is possible to replace human workers with robots. This can be done if the robots are on the level of humans. That means robots are available which can do everything a human can do: walking around on legs, reading the manual of a production line, filling sugar into a machine, and doing other helping tasks for existing machines. What we can say for sure is that today such robots are not available. Perhaps they will be ready for the market in 10 years, or in 15 years. If such human-level robots are available, they can and they will replace human workers. There is no need to use humans in a factory if robots can do the same for lower costs. The problem is that this kind of transition approaches slowly but arrives without warning. That means at some point in the future a company will invent a human-level robot, and within a short amount of time all companies will buy this robot as a replacement for their human workers. The introduction will be similar to the first iPhone: overnight the new technology is there, and then everybody buys it.

The good news is that in the remaining time, robots won’t replace humans. The robotics available today is some kind of joke. The normal industrial robots can’t be used for anything, especially not for doing useful work. What is used in today’s factories is a base level of automation which is provided by machines built 100 years ago during the industrial revolution. These machines don’t have to be programmed and they have no artificial intelligence. They are all-mechanical machines for supporting mass production. The best example is the assembly line itself, but large-scale cooking machines, packaging machines, clothing production machines, cranes, forklifts and cars are also good examples of this technology. Such mechanical systems were invented 150 years ago and have been built since then without modification. Their productivity is very high and their costs are at a minimum. The only bottleneck these machines have is that they need some humans around them. That means the automation level is not 100%, but somewhere around 70-80%. The remaining manual work is hard to automate. With mechanical machines it is not possible. The problem is called the control problem, better known as cybernetics. For example, a highly productive crane can transport a large amount of load, but the crane needs an operator who presses the buttons. And the operator can’t be replaced easily by software, only a highly developed Artificial Intelligence can do the same task.

That means even if somebody has programmed a robot which can press the buttons, the system can’t replace the human operator on the crane. To do so, the software has to reach the same level as a human. In other words, nothing short of human-level AI is able to increase the productivity of a company.