How many active authors are writing the AI section of Wikipedia?

The category Artificial Intelligence in Wikipedia contains articles about intelligent machinery, neural networks, and robotics. A closer look at the version history of the articles shows how many users have contributed to this section, either by scrolling manually through the log or by using the advanced “revision history statistics”, which is part of the page information section. Here is the result for some randomly selected AI articles:

– Golog, written by 7 users
– Artificial intelligence, written by 4400 users
– Artificial neural network, by 2800 users
– Markov chain, by 1000 users
– General Problem Solver, 9 users
– OpenCog, 65 users
– Recurrent neural network, 274 users
– Hierarchical task network, 48 users
– Stanford Research Institute Problem Solver, 71 users

Larger articles are created and maintained not by a single author but by the Wikipedia crowd, which consists of hundreds of different authors from around the world. This encourages an objective perspective on each topic and helps prevent biased information.

Some authors have not edited only a single article but have written in more than one. This reduces the overall number of active authors, but the number remains high. I would guess that the entire AI section of Wikipedia has at least 1000 different authors, maybe as many as 2000. A conservative guess lies somewhere in the middle, so we can assume that the AI section is written by about 1500 different authors.

Now let us put this number in relation to the overall Wikipedia project.

– the Wikipedia AI section contains 9000 articles, written by 1500 authors
– the overall Wikipedia contains 6 million articles
– so the estimated number of authors in Wikipedia is 6,000,000 / 9,000 × 1,500 = 1 million
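The extrapolation works like this in code; all input numbers are the rough estimates from the text, not measured values:

```python
# Back-of-the-envelope extrapolation from the AI section to all of Wikipedia.
ai_articles = 9_000         # articles in the AI section (estimate)
ai_authors = 1_500          # distinct authors in the AI section (guess)
total_articles = 6_000_000  # articles in Wikipedia (approximate)

# Assume the author density of the AI section holds for all of Wikipedia.
authors_per_article = ai_authors / ai_articles
total_authors = total_articles * authors_per_article
print(round(total_authors))  # prints 1000000
```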

This suggests that the project is in a healthy condition and contains reliable information.

Controlling the cartpole in OpenAI gym

In a previous blog post the OpenAI Gym environment for the cartpole experiment was introduced. The problem for the programmer is to convert an input array of 4 values into a control signal for the cart. The space in between is a black box which needs to be formalized in computer software.

The first thing to mention is that the four input variables (cart position, cart velocity, angle, and angular velocity) are not enough to control the cart. The trick is to increase the number of values. These extra input values describe not the current situation but the future state of the system. This additional information makes the control problem much easier.

It is not very complicated to predict the future state of the inverted pendulum. After analyzing the feature vector graphically, there is a strong relationship between the cart velocity and the action of the user. If the user moves the cart to the left, the cart velocity will slowly increase toward the left side. Somebody may ask why the cart velocity in the future frame step is important. The interesting point is that the cart velocity has an influence on the angular velocity, and the angular velocity is connected to the angle of the pendulum.

But let us take a step back. What the user wants to do is hold the pendulum upright all the time. He can’t control the angle directly; he can only move the cart left and right. This is a typical situation for all underactuated systems. To determine which action signal is needed, some precalculations have to be made.

The first step is to convert the observation of the current system and the last action signal into a prediction of the future. Then the prediction is used to determine the control signal, which is fed into the system. In the literature the principle is known as model-based sliding mode control. It means that before the user is able to control the system, he has to identify the model. This allows him to create predictions of future states.
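The predict-then-control loop described above can be sketched in a few lines. The model below is a made-up linear toy, not an identified cartpole model; only the structure of the loop matters here: predict the future for each candidate action, then pick the control signal accordingly.

```python
# Sketch of the predict-then-control principle.
# `predict` is a stand-in for an identified model of the system;
# its dynamics here are purely illustrative, not real cartpole physics.

def predict(state, action):
    """Toy one-step model: returns the predicted next state."""
    cart_pos, cart_vel, angle, angle_vel = state
    force = 1.0 if action == 1 else -1.0
    dt = 0.05
    # Illustrative coupling: the action pushes the cart, and the cart
    # motion couples into the pole's angular velocity.
    cart_vel += force * dt
    cart_pos += cart_vel * dt
    angle_vel += (-force * 0.5 + angle * 9.8) * dt
    angle += angle_vel * dt
    return [cart_pos, cart_vel, angle, angle_vel]

def control(state):
    """Pick the action whose predicted future keeps the pole most upright."""
    candidates = {a: predict(state, a) for a in (0, 1)}
    return min(candidates, key=lambda a: abs(candidates[a][2]))

state = [0.0, 0.0, 0.05, 0.0]  # pole leaning slightly to the right
action = control(state)        # pushes right to catch the falling pole
```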

According to a survey of the literature, the topic isn’t described very well. It seems that model predictive control is the only option for controlling a nonlinear underactuated system. Plain control techniques like finite state machines, neural networks, or reinforcement learning don’t help in this case. Instead, there is a need for system identification first, no matter how exactly this is realized.

Or let me answer it the other way around. It is not possible to control the pendulum by using only the four normal observation variables provided by OpenAI Gym. The observation measures only the current state; it doesn’t predict the future situation.

Why is the future of a cartpole system so important for the control process? Let us assume the cart is driving fast from right to left and the user sends action=1 to the cart, which stands for right. The result is nothing. What the cart will do is reduce the leftward movement a bit, but it won’t change direction immediately. That means, even if the user has pressed the right button, what happens next depends on the current state of the system.

In another example the cart isn’t moving, and the user presses action=1 (right). In this case the cart will start moving to the right side. These examples show that the future position of the system depends on the current state plus the action. Only if all the variables are known (current state, action, future state) is it possible to create a control policy.

Differential equations

I’m not the first author who investigates the inverted pendulum problem. There are at least 100k papers available on the Internet which discuss the control of the pendulum. In most cases a combination of differential equations and Simulink is used to solve the task. The idea in 99% of the papers is called model predictive control. Instead of only measuring the current status of the system, the future state is predicted. In mathematics, such a prediction model is expressed as differential equations. It is a physics engine which formalizes which angle the pendulum will have in the next step, in two steps, and so on. This information helps to identify the correct servo control signal in the now.

The term modeling in mathematics means the same thing programmers call a physics engine. It allows one to simulate a system. In the cartpole example of OpenAI Gym, the system consists of the observation variables cart position, cart velocity, angle, and pole velocity. This information is measured 20 times per second (20 fps). A naive attempt to control the system with these four variables will fail. Instead, a model of the system is created first. The model consists of differential equations which describe how the angle will change if the cart moves to the left. The automatic controller is then not only aware of the current system state but can see the future of the system as well. This makes it easy to determine the correct movement signal.
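The differential equations in question can be sketched in a few lines of Python. The constants and equations below follow the classic Barto-style cartpole model that Gym’s implementation is based on; treat this as an approximation of the simulator, not its exact source.

```python
import math

# Approximate cartpole dynamics, close to the classic Barto/Sutton model.
# Constants follow the usual benchmark values (half pole length 0.5 m).
GRAVITY, M_CART, M_POLE, POLE_LEN, FORCE, DT = 9.8, 1.0, 0.1, 0.5, 10.0, 0.02

def step_model(state, action):
    """One Euler step of the differential equations: state + action -> next state."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    total_mass = M_CART + M_POLE
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + M_POLE * POLE_LEN * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_LEN * (4.0 / 3.0 - M_POLE * cos_t ** 2 / total_mass))
    x_acc = temp - M_POLE * POLE_LEN * theta_acc * cos_t / total_mass
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

# Predict two steps into the future from an upright pole while pushing right:
# the cart accelerates to the right and the pole starts leaning to the left.
s = (0.0, 0.0, 0.0, 0.0)
for _ in range(2):
    s = step_model(s, 1)
```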

The interesting situation is that most existing papers agree with this philosophy. It is the best-practice method for controlling the cartpole problem. The only thing discussed controversially is how to explain the idea to the user. In one paper, the differential equations are written in a certain syntax. In the next, only the Simulink model is provided. In another, the user is asked to create the equations on his own, and in more recent publications, the differential equations are learned by a neural network at runtime.

It is important to recognize that all these ideas have something in common. The input state of the system (4 variables) is fed into the model. The model is able to predict the future, and this information controls the pendulum autonomously.

This kind of abstract description helps in understanding how to control other systems, for example a biped robot, a ball on a wheel, a ball on a beam, and so on. All these systems produce a current system state which is equal to a feature set. For example, the speed is measured, along with the velocity of a joint. This information is used to predict future states of the system, realized with differential equations or other techniques like neural networks. Then the prediction is used in a controller which produces a signal to the system. Basically speaking, all these control mechanisms are based on model-based control. It is not possible to control an inverted pendulum without creating a mathematical prediction model first.

Predictive models

Let us take a look at why predictive models are created. In the standard setup of OpenAI Gym, the observation variable contains 4 values which represent the current state of the system. What is missing are the “what if” cases. It is unclear what will happen if the cart moves to the left. The only option is to try it out, and then there is a surprise if this makes the pole fall down.

So the question is what will happen in the future, before the action is taken. Differential equations are able to answer this question: they can tell what the future angle will be if the cart is moved to the left. If the equations fit the original OpenAI environment, they are the tool of choice for creating a prediction.

The question for mathematicians and programmers is which kind of differential equations describes a certain system. That means: what formula is needed for the cartpole example? And once the equations are found, how can the knowledge be used to induce the control action? Creating the differential equations, solving them, and generating the control signal is done in the black box. The black box gets the current state as input, does some calculations, and produces the output.
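The black box just described can be sketched as a “what if” rollout: the current state goes in, a model is queried for each candidate action, and the internal calculations stay hidden from the caller. The `toy_model` below is a hypothetical placeholder, not real cartpole physics; any one-step predictor (differential equations, a neural network) could be plugged in instead.

```python
# "What if" rollout: before committing to an action, simulate it for a few
# steps with a model and inspect the predicted pole angle.

def what_if(model, state, action, horizon=5):
    """Return the predicted trajectory if `action` is held for `horizon` steps."""
    trajectory = [state]
    for _ in range(horizon):
        state = model(state, action)
        trajectory.append(state)
    return trajectory

# Placeholder model: the action just nudges the angle (not real physics).
def toy_model(state, action):
    x, x_dot, theta, theta_dot = state
    nudge = -0.01 if action == 1 else 0.01
    return (x, x_dot, theta + nudge, theta_dot)

# Ask the black box: what happens to the angle if we push right for 5 steps?
traj = what_if(toy_model, (0.0, 0.0, 0.05, 0.0), action=1)
final_angle = traj[-1][2]  # close to zero: the toy pole gets straightened
```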

Introduction into the OpenAI Gym cartpole problem

Most AI enthusiasts have heard about the OpenAI Gym library because it is mentioned in many books. The cartpole environment is an interesting testbed for trying out newly developed motion planning algorithms. The main advantage of OpenAI Gym is that the programming requirement is ultra-low. The basic setup consists of 20 lines of code, which can be copied and pasted from the documentation:

import gym, time

class Game:
  def __init__(self):
    env = gym.make('CartPole-v0')
    rewardsum = 0
    observation = env.reset()  # initial observation
    for framestep in range(100):
      # action = env.action_space.sample()  # random action instead
      if observation[2] > 0:  # pole leans to the right -> push right
        action = 1
      else:                   # pole leans to the left -> push left
        action = 0
      observation, reward, done, info = env.step(action)
      rewardsum += reward
      print(framestep, observation, action, rewardsum)
      time.sleep(0.05)
      env.render()
    env.close()


mygame = Game()

I have modified the code slightly to provide a simple strategy plus a reward sum. If the user has installed the OpenAI environment with “sudo pip3 install gym”, the code should run well under a Linux operating system. If the time.sleep command and the env.render() command are commented out, the system runs at maximum speed.

On the terminal window the overall reward is shown. The simulation runs for 100 steps and the goal is to keep the pole in the up position in every step (+1 reward). The given simple strategy is able to reach 50 to 70 points, but never the full 100 points. So the basic question for an AI expert is: which kind of strategy will reach the maximum score of 100 points?

Answering this question is surprisingly hard. Not because something in the Python code has to be fixed, but because it is unclear how to write an AI agent which can play cartpole.

Such AI problems are not completely new. In the past there was the famous Micromouse challenge, the RoboCup challenge, and the Mario AI challenge. The cartpole example in OpenAI Gym has the advantage that it is easier to replicate. The user doesn’t need to download advanced software, nor does he need to program a robot. He can experiment with the simulation much more easily.

This sort of standardized environment is static. The existing documentation doesn’t explain many additional options. Even a simple keyboard interrupt is hard to realize with OpenAI Gym. But this is no disadvantage, because it allows the user to focus on the core problem. The question is not how to program a fancy 3D engine which alone takes 10 MB, but how to program the AI controller for the existing, poorly programmed game.

The problem is that the answer is not known. Posting the question in a forum won’t produce clear advice. It seems that the problem is easy to understand but hard to solve. What we can say is that all the existing AI programming techniques can be applied to the cartpole problem, and it makes sense to explain which of them is better suited.

Let us describe the cartpole problem in detail. In contrast to a line-following robot problem, it is much harder to write an algorithm which can solve the game autonomously. The reason is that the system is underactuated. The user can only indirectly influence the position of the pole. This makes it a great testbed for advanced robotics algorithms.

The good news is that the user doesn’t have to write a computer vision library first; he gets the numerical value of the game state by printing out the observation variable:

[ 0.07892425 0.148639 -0.15568975 -0.29123444]

It contains: [cart position, cart velocity, pole angle, pole rotation rate]

If the user has access to the matplotlib library, he can plot the 4 features in a time diagram and will recognize that the system behaves chaotically. That means there is a correlation, but it is very hard to formulate an equation for it.

If the time axis is removed, the phase chart remains visible.

This chart has been used for decades to explain why the inverted pendulum is hard to solve. The reason is that the correlation between the values is very hard to formalize in equations. Possible actions of a human or robot have to be taken inside the phase diagram. Last but not least, it is important to know that the cartpole problem can be called a toy problem: in contrast to the real problems of biped walking robots, it has been researched heavily in the literature, and it was demonstrated in the past that a mathematical formalization is possible. There are many examples available in which a control algorithm was able to stabilize the pole.

Here is another self-generated image from the cartpole problem. It shows the relationship between observation variables 2 and 3, which is the angle of the pole vs. the rotation rate of the pole. Both are connected to each other. That means the system isn’t random but has a structure.
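The claim that the two columns are structurally linked rather than random can be illustrated numerically: the rotation rate is the time derivative of the angle, so a finite difference of the logged angles must reproduce the logged rates. The snippet simulates a simple pendulum as a stand-in for the Gym log (assumed data, not an actual recording):

```python
import math

# Simulate a simple (non-inverted) pendulum and record angle and angular
# velocity, imitating observation columns 2 and 3 of the cartpole log.
dt, g, length = 0.001, 9.8, 1.0
theta, omega = 0.3, 0.0
angles, rates = [], []
for _ in range(2000):
    angles.append(theta)
    rates.append(omega)
    omega += -(g / length) * math.sin(theta) * dt  # semi-implicit Euler
    theta += omega * dt

# Structure check: the finite difference of the angle column reproduces the
# rate column, so the two are deterministically linked, not random noise.
diffs = [(angles[i + 1] - angles[i]) / dt for i in range(len(angles) - 1)]
max_err = max(abs(d - r) for d, r in zip(diffs, rates[1:]))
```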

Perhaps it makes sense to describe the problem from an abstract perspective. The input to the controller is a Python list which contains 4 values. The output of the controller is action=left/right. And the magic is the black box between the input and the output. It is magic because it is not known how this black box works. This makes the cartpole problem so interesting. It is not solved yet, and it is waiting for smart AI students of the future who have read the manual.

The industrial revolution was driven by typesetting

It has sometimes been called a miracle why the industrial revolution started in the United Kingdom and changed the entire world. The driving force behind the rise of steam engines, electric light, and assembly lines for car manufacturing was innovation in the typesetting of books. Without the ability to print books and newspapers about engineering topics, it is not possible to build advanced factories and educate the employees and managers.

The timeline of the printing press can be divided into larger periods. In the year 1814 the first steam-driven printing press was available, which reduced the costs. Around 1910 the Linotype and Monotype typesetting machines decreased the costs further. The causality between improved book printing technology and increased knowledge transfer between people is obvious. Before somebody can make a new innovation, he has to read the books of the innovators of the past. And before a new sort of technology can be introduced into reality, it has to be announced in journal articles read by managers.

The printing press is a meta-innovation which has led the last 500 years. It was responsible for the industrial revolution and for the computer revolution since the 1950s as well. In the 1950s the first phototypesetting machines became available, which had a clear advantage over the earlier Monotype machines. Similar to previous innovations in the typesetting domain, they helped to distribute information to a larger audience. If a university library in the 1960s was created from scratch, the chance was high that all the books were created with phototypesetting machines.

The same situation can be observed with the advent of the Internet. All the latest academic information is distributed via the Google Scholar search engine. In contrast to the Monotype machine, Google Scholar isn’t a mechanical machine but a webserver on the Internet. But the function is the same. Google Scholar was invented to distribute knowledge to a larger audience by minimizing the costs.

How new things are invented

What all the technical universities in the world have in common is that they provide access to a library. The library is the central place of a university. The interesting fact is that it is very easy to become innovative if somebody has access to the knowledge of a library. Suppose the steam engine hadn’t been invented yet, but a motivated student has access to some books from technical domains. How long would the student need to invent the steam engine from scratch?

Not very long. He has to read through all the books, and after a while he has understood the principle of how to convert steam into mechanical energy. He does some experiments, writes down the results, and his master’s thesis about the steam engine is ready. Or let us observe the situation from the opposite perspective: without access to printed books it is not possible to invent anything.

The interesting situation is that books, and how they are made, have changed over time. The easiest form of creating a book is a handwritten manuscript. This was used before the first printing press existed. With the invention of the printing press, the first machine-created books became available. These books formed the basis for inventing more advanced printing press devices, and so on.

Group working in book production

The surprising fact is that book production is slow for the individual and can be quite fast in aggregate at the same time. Most authors have written only 2 books in their entire lifetime. They need 10 years until all the pages are written, and it takes an additional year until the printing press is able to create a larger number of copies. In most cases, a book isn’t sold for only a week; the publishing house sells the same book for decades. It takes a long time until potential readers and libraries take notice of new books and decide to take a look.

The surprising fact is that the slow production speed of books was never a problem during the industrial revolution, because many authors and printing presses work in parallel. A knowledge-oriented culture motivates not only a single author to write a book, but thousands of them at the same time. The result is that the publishing houses produce new books every day.

Let us make a simple calculation. A single author is able to write 2 books over a lifespan of 40 years. So 10,000 authors are able to produce 20k books during those 40 years. The result is that on average 1.37 new books are released to the public every day. It is equal to a constant flow of content which is distributed worldwide.
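The calculation above can be checked quickly:

```python
# Back-of-the-envelope estimate of books released per day.
authors = 10_000
books_per_author = 2
years = 40

books_total = authors * books_per_author   # 20,000 books over 40 years
books_per_day = books_total / (years * 365)
print(round(books_per_day, 2))  # prints 1.37
```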

Debian, a lookback after 2 months

Two months ago, I switched from Fedora Linux to Debian. It is time to give some feedback about the pros and cons. The most obvious difference is that a Fedora operating system needs to be shut down and rebooted once a day, while Debian systems have a longer average uptime. According to the “last reboot” stats, my own system runs on average for 3 days without a reboot. The reason is that the underlying Linux kernel in Debian stays the same. It is very rare that a new Linux kernel is installed by the update manager.

In contrast, Linux distributions like Fedora or Debian Testing very often install a new Linux kernel, which requires a complete shutdown and reboot of the system. From a user perspective, Debian has much in common with established operating systems like Windows 10 and Mac OS X, which also provide a longer uptime than most Linux distributions. The main reason why the number of Linux users is small is that most of the existing distributions have a problem with the update frequency.

Some Linux distributions, like Slackware, are updated only rarely. Other Linux distributions like Arch Linux and Fedora ship new kernels to the end users on a daily basis. The problem is that installing a new kernel is a risky decision, because after the reboot the system is very different from the previous version. This makes sense in the long run, but not for day-to-day use on a production machine.

Debian has some disadvantages too. In contrast to a common myth, the system has the same poor performance as any other Linux distribution. The GNOME desktop needs 2.5 GB RAM in idle mode, and the XFCE desktop needs around 2.3 GB RAM too. Additionally, the user is confronted with high CPU utilization even if he has closed all applications. It is unclear whether malware or normal background daemons are occupying the CPU. An additional Debian-specific problem is that the system is poorly documented. The existing Debian wiki can be called a joke, and printed books are not available. The only manual available today is an outdated HTML book from the year 2006 with tips and tricks for Debian maintainers, which provides only boring content.

So we can say that, in comparison with established desktop operating systems like Windows 10, Debian is a weak choice. Only if the user identifies with the Free Software idea and is able to put up with technical problems will he benefit from the project. The advantage of Debian over Windows 10 is that if a piece of software is available in the repository, it is very easy to install. For example, if the user needs a drawing program he can run “apt install gimp”; if he needs the LyX software he runs “apt install lyx”, and so on. Another advantage of Debian Linux is that the software runs on most available PCs. No matter whether the user has access to an old notebook or a new desktop PC, Debian 10 can be installed everywhere.

From a subjective perspective, the assumption is that Debian is a highly insecure operating system. The subjective feeling is that the system has more security problems than Fedora, because the CVE tracker is maintained not by professionals but by the Debian community: unpaid users who have installed Debian only as a second operating system after Windows 10 and are not familiar with the C programming language at all. So it is a bit hard to trust the software and its ability to resist newly discovered security bugs.

Measuring the status of industrial robots

From the perspective of practical applications, robots can be divided into two groups: high operational hours and low operational hours. The first category is used in reality and includes welding machines and palletizing robots. If a factory has bought such a model, the chance is high that the robot is running 24/7. That means it does a certain task in the company, and it was a good decision to buy the machine.

In contrast, robots from the second category can’t be utilized for concrete tasks. They work only in theory, and their real operational hours are low. This applies to more complicated robots which solve assembly tasks or do biped walking. The reasons why the operational hours for these machines are low are:

– the customers are not familiar with the devices because the technology is new
– the software of the robot doesn’t work under real conditions
– robotics experts are needed to use the device in a concrete situation
– the price of these machines is too high

There is a robot category located in the middle between these extremes: pick&place robots. In theory, a pick&place robot can be used in reality. The machine is able to sort smaller objects from the conveyor; a typical model is a delta robot which includes a vision system. The problem in reality is that, compared to well-known robot types like a welding machine, a pick&place sorting robot is advanced technology. It contains many modules which can break and which have to be explained in detail. This prevents such machines from being heavily used for practical applications. They are available in some companies, but they are seldom used.

An objective measurement of which category a robot type fits into is to search for second-hand robots. Welding machines and palletizing robots are sold used from one owner to another. In contrast, biped robots and dexterous robotic grippers are not sold used, because nobody is using them today. The prediction is that in the future the chart will change a bit. If advanced robots become more visible, the operational hours of these types will increase.

Robotics as a cargo cult science

According to the World Robotics report, 400k industrial robots were sold in the last year, https://www.roboticsbusinessreview.com/research/world-robotics-report-global-sales-of-robots-hit-16-5b-in-2018/ The number has increased over the years. Countries which have installed these robots are China, South Korea, the US, and European countries. So basically the whole world has bought robots for factory automation.

But is this news the reality? Let us take a closer look at how robotics is used in reality. The interesting situation is that most companies have two assembly lines. On the first one the real production is done, mostly by human workers; the second assembly line is equipped with robots, and it is located in the research area of the factory. All the 400k robots (if the number is correct) are sold to the research location of the factory, and not a single one is used for factory automation.

To understand this sad situation, we first have to describe why robots are successful. Robots are presented at trade shows, are used in YouTube videos, and are researched by academics. In all these domains there is progress over the years, and robots have helped us understand what intelligence is. On the other hand, robots are never used in reality for practical applications. In the 1970s Joseph Engelberger struggled with robotics at the workplace, and in the 2010s ABB Robotics has failed too.

The gap between a robot in the laboratory and a robot in reality has increased over the years. That means it is absolutely forbidden for everybody to move an industrial robot from the research facility into the production facility of a company. This kind of unwritten law was created to prevent robots from increasing productivity, stealing the jobs of human workers, and helping the factory reduce costs.

What factories are allowed to do is waste a lot of money on newly developed robots. The latest innovations are cobots, which can work together with humans. These systems are able to fold paper airplanes and can solve pick&place tasks. Similar to previous attempts at automating production, a cobot is not allowed to be activated outside the research facility. It is okay if an employee uses the robot to learn something new, and the next employee is encouraged to write a doctoral thesis about the machine, but it is forbidden to use the robot for automating the production.

This kind of law sounds a bit illogical, because by definition a robot is sold to increase productivity. Or to be more specific, this is the reason why robots are marketed to the public. In reality, a robot isn’t able to automate anything and can’t increase productivity. In the literature the law is known as the productivity paradox. It basically says that robots are an important tool for teaching Artificial Intelligence, but a bad idea if somebody would like to build cars or produce clothing.

To understand why the productivity paradox is not only a hypothesis but a law, we have to describe the opposite. Suppose the paradox weren’t there. Then a factory somewhere in the world would buy an industrial robot, install the device at the assembly line, and one week later the system would be up and running. It would help the factory save a lot of money and make life easier for the employees. The funny thing is that not a single case is available anywhere in the world in which such a scenario can be observed. If somebody found such an example, he would have falsified the productivity paradox. It would be the first time that robots had been used for a practical application and not a research project.

Let us investigate the law in detail. The consequence of the productivity law is that the level of automation is fixed and can’t be improved in the future. It is frozen at a level which was reached during the industrial revolution, and no matter which new robots are invented in the future, they are not allowed to enter the production facility. There are some efforts which ensure that the law remains active. One of them is the marketing effort to sell industrial robots to the public.

Suppose a human worker at the assembly line asks his boss for a robot. The human worker would like to automate a pick&place task, and he explains to the boss that a new robot can do this job better than a human. The result of this request is that a larger robotics company will invent a robot dedicated to this purpose. And the robotics company will present such a robot at the next trade show. That means the marketing effort for the concrete pick&place robot gets started. This marketing effort is a counter-strategy which prevents the robot from being built and used in reality.

None of the robots presented at trade shows can be used in reality anymore. If a robot is able to pick&place objects from the assembly line, this task will be done by humans forever. And if a robot has shown in a presentation that a burger can be flipped, this ensures that burger flipping is reserved for human workers forever.

Let me explain the situation in detail. Suppose a human worker has the job of taking 6 apples from the conveyor and putting them into a box. There are two options. Either a robot was invented which can do the task, or the robot wasn’t invented yet. If the robot wasn’t built yet, the task can’t be automated. Instead it’s up to the robotics company to start a project for a pick&place robot. In the second case (the pick&place robot is available), the robot isn’t able to do the job, because this would be equal to using a research robot in the production facility.

That means the single worker at the assembly line knows that a robot can do the same job with less effort, but the company isn’t buying the machine because of the productivity paradox.

Productivity paradox

The productivity paradox explains the missing robots at the workplace with a single reason: the relationship between costs and benefits is too poor for robots. A robot produces high costs but doesn’t improve the situation. Productivity is not an objective criterion; it is interpreted by humans. Before a robot is used in reality, a human has to buy and activate the machine. How the human makes the decision is unknown; what we can say for sure is that humans decide against robots.

Humans use robots very often in the research lab. They are fascinated by the possibility of researching Artificial Intelligence, and they hope that in the future robots will become widespread. At the same time, humans reject robots in the real world. They decide not to buy industrial robots, and if they have bought a model, it is never used on the assembly line but in a separate location.

It is well known from other technologies like electric light and the computer that in the beginning a warm-up period is needed in which humans get familiar with the new technology. Most humans have lost their fear of the personal computer and use the device every day. In the case of robots the situation is different. No human in the world has lost their fear of a robot, and the fear has grown over the years. The media play an important role in indoctrinating humans that robots are evil devices. Most YouTube users have seen the biped walking robots and are convinced that this technology is something which should be rejected.

The interesting situation is, that even robotics experts who have constructed the robots are not using them for practical applications. The reason is, that every human is in fear of a robot upraising and doesn’t see an advantage of using robotics in the reality. The situation can be monitored in detail on the example of vacuum cleaner robots. Some households have bought such a device, but not of them is using the robot as a vacuum cleaner. Instead the machine is used by the cat as a moving vehicle, or it is used to explain a friend who technology will become better in the future.

In theory, the owner of a vacuum cleaner robot is free to decide how to use the machine in reality. The interesting situation is that, from a sociological point of view, the unwritten law is very clear. It is not allowed to use a vacuum cleaner robot for practical applications. It is a research project which allows the human to learn something about AI, but it is not within reach for humans to use a robot for doing a repetitive task.

In the end it makes sense to analyze under which constraints the productivity paradox can be overcome. The only way to do so is if humans lose their fear of modern technology. If humans were able to judge robotics rationally, they would recognize that robots can be used for practical applications. This won't happen in the future, because robots will become more human-like, and this will increase humans' fear of the technology.

Nobody is afraid of a robot from the 1980s, because the machine can't even walk. But everybody is afraid of a robot from the year 2030, because the machine walks on two legs and is able to understand English. What robotics experts are doing is increasing the fear of robots. They develop more advanced technology which impresses the audience more, and this increased fear will prevent robots from being introduced into reality.

Fear of robotics

In the past there were some examples in which researchers tried to make robots appear more rational. One idea was to build cute robots in the shape of a teddy bear. The idea was that humans like animals very much, so they would like social robots as well. Ironically the result was the opposite. Cute robots increased the fear of robots drastically. Especially if the robot behaves naturally, the human won't trust the machine. There is a famous video on YouTube in which the Furby robot was presented as the devil in person. That means the robot made the humans very angry, and this prevents robots from being perceived rationally.

Demonstration mode

The interesting situation is that the relationship of humans to robots depends on the social context. If a robot is put into a demonstration situation, every human feels comfortable with the device. The inventor is proud that the robot is able to do the pick-and-place task. He gets applause from the audience. The audience feels comfortable too, because they see something new which allows them to learn about technology.

If the same robot is put into working mode, in a real production situation, the situation changes dramatically. The programmer of the device isn't sure whether the machine is working correctly. And the employees who have to do the work with or without the robot are feeling unhappy too, because they don't understand it and they don't trust the programmer. All the humans are in a lose-lose situation, and this forces them to stop the experiment as soon as they can.

In demonstration mode, the robot works great and the humans are happy with the device. In working mode, the robot makes many mistakes and the humans are afraid of the machine. This gap can't be overcome with better robots, nor with better training of the robot. The assumption is that the gap is the result of social conditioning. That means the humans have learned over the years to feel uncomfortable near a working robot.

The consequence of this paradoxical situation is easy to predict. Everything remains the same. Humans are needed at the assembly line forever, and it is not possible to replace them with machines. The productivity paradox is confirmed, society remains stable, and the AI revolution never takes place.

Data-driven maze navigation

A robot can navigate a maze in two ways: algorithm-driven, which means that a computer program on the CPU performs a search in the state space, or data-driven, which means that an existing trajectory database is queried. In the following blog post only the second idea is explored.

Printing a maze on the screen isn't very hard. The open question is how the robot in the upper left will find a path to the goal in the bottom right position. The precondition for data-driven control is the existence of a trajectory database. In the easiest case it is an external CSV file:

id,trajectory,skillname
0,([100,100],[200,100],[230,140]), skill1
1,([100,100],[180,120],[180,200]), skill1
2,([100,100],[200,100],[230,140]), skill1
3,([100,100],[200,100],[230,140]), skill2
4,([100,100],[200,100],[230,140]), skill2

Each entry in the database is equal to a waypoint list. It describes a path over time. The skill name at the end is a label to group the trajectories into clusters. If the planner needs a path from A to B, it has to search the database for a trajectory which has the A position as its first waypoint and the B position as its last waypoint.
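Assuming the CSV file above has already been parsed into a list of waypoint tuples, such a lookup could be sketched like this (the names `trajectories` and `find_trajectory` are illustrative, not part of any existing planner):

```python
# Hypothetical in-memory form of the trajectory database from the CSV above.
trajectories = [
    (0, [(100, 100), (200, 100), (230, 140)], "skill1"),
    (1, [(100, 100), (180, 120), (180, 200)], "skill1"),
]

def find_trajectory(db, start, goal):
    """Return the first entry that begins at start and ends at goal."""
    for entry_id, waypoints, skill in db:
        if waypoints[0] == start and waypoints[-1] == goal:
            return entry_id, waypoints, skill
    return None  # no recorded path connects start and goal

match = find_trajectory(trajectories, (100, 100), (180, 200))
# match is entry 1, the only recorded path ending at (180, 200)
```

If no entry matches, the data-driven planner simply has no answer; the human operator would have to record or enter a new trajectory.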

The idea is that the user creates a CSV file, enters the road network of the maze, and then the planner is able to steer the robot towards the goal position. Perhaps it makes sense to explain the concept more abstractly. The created CSV file grounds the maze problem in reality. Reality is everything that is not available in the computer program. If the human operator takes a look at the maze, he has a certain understanding of how the maze should be traveled. It consists of untold constraints, experiences from the past and other highly complex knowledge which is not written in source code but is only available in the human operator's imagination.

To insert this expert knowledge into the path planner, the software has to be grounded. Grounding means connecting the written robot program with the needs of reality. The CSV file is used for this purpose. The CSV file makes sense to both stakeholders: it can be parsed by computer software, and it can be modified by a human. For example, if the human operator doesn't like the trajectory with id 4 very much, he can modify its trajectory by hand. He overwrites the existing values with new waypoints, and this will force the robot onto a different path.

In the domain of machine learning the CSV file format has become a standard for input data. In libraries like Keras the CSV file is equal to the dataset for training a model. Training means that the CSV file is converted into a neural network. The advantage is that a neural network can interpolate between the data points. But for better understanding it makes sense to ignore machine learning entirely and focus on the CSV file itself.

Data-driven animation

Computer animation was perhaps the first discipline to introduce a data-driven philosophy. The reason is that an animated character has to look natural. That means the movements of the figure should be grounded in reality. The consequence was that no simple algorithm could be used; only a hand-animated character will convince the audience. Hand-generated control signals are equal to storing the movements in a CSV file in absolute coordinates. The position of the legs and the body is given by the human operator, and the computer does nothing but render the information into a video. Most of today's animation software works by this principle. The human user provides a spline and the object moves along this spline. If the object should move in a different direction, the data of the spline has to be modified. In contrast, the rendering algorithm which moves the object along the spline remains the same.

This provides great flexibility for the human artist. He can move the object in any direction he likes. The computer software isn't used to generate meaning; it is only a tool, similar to a paintbrush. For reasons of simplification we can say that data-driven animation is equal to grounding. By providing the information in a text file, the domain was grounded by the user. Grounding means that the animation makes sense for the human user and the computer as well.
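A minimal sketch of this renderer idea: the human provides the waypoint data, and a fixed routine only samples a position along it. Here the "spline" is simplified to a polyline, and `point_on_path` is an illustrative name, not any animation package's API:

```python
import math

def point_on_path(waypoints, t):
    """Linearly interpolate along a list of (x, y) waypoints, t in [0, 1]."""
    lengths = [math.dist(a, b) for a, b in zip(waypoints, waypoints[1:])]
    total = sum(lengths)
    target = max(0.0, min(1.0, t)) * total  # arc length to travel
    for (ax, ay), (bx, by), seg in zip(waypoints, waypoints[1:], lengths):
        if target <= seg:
            f = target / seg if seg else 0.0
            return (ax + f * (bx - ax), ay + f * (by - ay))
        target -= seg
    return waypoints[-1]

path = [(0, 0), (100, 0), (100, 100)]
point_on_path(path, 0.5)  # halfway along the path: (100.0, 0.0)
```

Changing the motion means editing the `path` data; the sampling routine itself never changes, which is exactly the data-driven division of labor described above.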

A typical file format for data-driven animation is CSV (comma-separated values). In contrast to JSON and SQL databases, the format is understood by virtually all existing programs. In the easiest case, the CSV file contains points in 2D space which animate the character. In a more complex situation, the CSV file holds a movement database which consists of skills, preconditions and segments.

Algorithms vs precomputation

The reason why most robotics projects have failed is missing grounding. The algorithm in the software is doing something, but the input data for the algorithm are wrong or missing entirely. A typical example is a path planner which has no constraints and no map of the environment. Technically the path planner runs great; the only problem is that all the generated paths are useless. They are not grounded in reality; they have solved a problem which isn't there.

The typical workflow of a computer program is that the input data are fed into the algorithm, and then the algorithm shows the result. In the domain of robotics control it is hard to get machine-readable input data. For most domains the problem description isn't available, which means that any algorithm will fail. The better idea is to ignore algorithms entirely and focus only on the input data. In the case of data-driven animation, the input data is equal to the CSV file. This is the bottleneck of the system. So the question is how to create a realistic-looking input file.

The number of options is surprisingly high. One idea is to edit the file in a text editor. The next choice is to draw the spline with a mouse on the screen. Another idea is to use a motion capture device to record human movements. Yet another is to use an existing file from the internet. What all these techniques have in common is that they produce sense: a certain CSV file was created for a reason.

Making programming by demonstration more flexible

There is an advanced robot programming technique called programming by demonstration (PbD). The idea is to record a motion trajectory and then play back the motion later. PbD is used heavily for programming real robots but lacks a theoretical description. It is seldom explained in the academic literature because the assumption is that the concept can't be adapted to new domains.

First we have to describe the limits of PbD. Suppose a trajectory was recorded to pick-and-place an object from the table. In replay mode, the object is located at a different position, but the original trajectory is replayed anyway. The result is that the system no longer works. To overcome the situation it is important to understand that this failure pattern doesn't speak against PbD in general; it is only a detail problem which can be fixed.

To make PbD more flexible, the number of trajectories has to be increased. The robot needs one trajectory for grasping the object from position A, and a second trajectory for grasping the object from position B. To store all these trajectories a database is needed. In the easiest case the database is a CSV file in which the trajectories are grouped into clusters. Some trajectories are available for picking up the object from a starting position, while other trajectories are available for placing it at the goal position. In replay mode, a parameter is used to select a certain trajectory. For example:

replay(2,"pickup"), replay(10,"place"), replay(1,"pickup"), replay(10,"place") and so on

With this simple parameter improvement the overall pipeline becomes adaptable to different situations. The robot has 10 different trajectories for picking up an object and an additional 10 trajectories for placing it at the goal position. Creating a larger number of trajectories is surprisingly easy. The human operator has to activate the record mode and move the robot gripper a bit. In only 30 minutes, it is possible to create a large number of trajectory demonstrations.

In the example command replay(2,"pickup") a database request is formulated. It is translated into "search the database for the trajectory with id=2 and execute it on the robot". The overall intelligence of the robot isn't located in its software; everything is stored in the CSV file database. If 10 different trajectories are provided for picking up an object, the robot knows very well what to do in each situation. The object can be placed at different positions and the robot is able to grasp it.
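A sketch of this replay request, under the assumption that the database has been loaded into a dictionary keyed by (id, skill); the actual streaming of waypoints to a robot controller is stubbed out:

```python
# Hypothetical trajectory database, keyed by (entry id, skill name).
database = {
    (2, "pickup"): [(100, 100), (150, 120), (200, 140)],
    (10, "place"): [(200, 140), (180, 200), (120, 220)],
}

def replay(entry_id, skillname):
    """Look up a recorded trajectory and 'execute' it (here: return it)."""
    waypoints = database[(entry_id, skillname)]
    for wp in waypoints:
        pass  # a real system would send each waypoint to the robot controller
    return waypoints

# a pick-and-place sequence as in the example above
sequence = replay(2, "pickup") + replay(10, "place")
```

Note that `replay` contains no planning logic at all; every bit of intelligence sits in the `database` dictionary, which is exactly the point of the argument above.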

The difference to naive programming by demonstration is that not a single demonstration is recorded but a cluster of trajectories. The database doesn't contain only one entry, but 20 and more. As the database grows, the replay mode becomes more flexible. Instead of only pressing the replay button, the human operator can specify in detail which trajectories he would like to replay.

The GUI for realizing such software looks like a road network. It is a plan of potential movements in the action space. Selecting one of the roads will force the robot to move on a certain path.

Grounding

Perhaps it makes sense to explain why programming by demonstration works in reality: because it solves the grounding problem. After pressing the record button, the human operator provides a data source. The trajectory isn't calculated by the software itself; it is given as a data source from the outside. And this data source contains expert knowledge. The trajectory has an internal meaning which wasn't available before in the robot control system. For example, the spline may move around an obstacle even though no such thing as an obstacle exists in the robot software.

Let us investigate what happens to the robot control software if the database with the recorded motions gets deleted. The new situation is that the program forms a closed system in which actions take place. If the program contained an environment map, the path planner would be able to generate a trajectory. Unfortunately, there is no map, because it wasn't programmed yet. The program isn't grounded in reality, because there is a difference between what makes sense for the software and what is needed in reality.

It is pretty easy to compare different programming by demonstration systems. All that is needed is the CSV file which stores the trajectories. This is the heart of the system. What matters is how many entries are given in the file and which rows are available. In contrast, the replay software which converts the CSV file into a concrete trajectory can be ignored.

PbD vs. planning in state space

To understand why programming by demonstration works for practical applications, we have to answer the question why normal algorithm-based strategies have failed. Planning in the state space works by defining the current world state in the computer program and then testing out different actions for the robot. This produces the optimal action sequence. The problem with this planning technique is that for most practical tasks the current state is not known. Only simple toy problems like path planning in a maze can be converted into a machine-readable problem.

Let us construct an example to show the limitation of search-based planning. According to the software, the robot holds a position in a maze and has to find a path to the exit. But in reality, it should navigate the maze under certain constraints, and the obstacles can change at runtime. The problem is that the maze modelled in the software and the real maze are different sorts of games. The planner in the software isn't able to find the optimal sequence for the real maze. This mismatch can't be overcome with a better algorithm; it is a fundamental mismatch.

The reason why the mismatch exists has to do with the computer-oriented problem-solving technique. The idea is that the computer is able to run an algorithm which is coded in software, and the programmer has to provide the software and the algorithm as well. Then the software will solve the problem. This workflow results in failed robotics projects because it produces closed systems. A closed system is a computer program which works perfectly by its internal needs, but the produced output can't be adapted to the real problem.

The idea of data-driven problem solving is not completely new. In the domain of computer chess there are two problem-solving techniques available. The first one is based on an opening book which stores moves in a database, and the second one works with state-space search for the main game. Robotics can't be solved with state-space search but only with the opening book strategy. The reason is that the domain is defined in the input data.

Datastructure for programming by demonstration

Robotics programming by demonstration has been described in the literature since the 1980s. Since the year 2010 the concept has been formalized under the term "learning from demonstration". To understand the concept we have to determine on which part of the workflow the user has to direct his focus. The concrete robotics hardware can be ignored. All the models from Kuka, Motoman or Fanuc work the same: they are mechanical machines, driven by servo motors, running different firmware. Surprisingly, the teach pendant of the robot can be ignored too. Sure, the human operator can start a recording with this device, but it doesn't help to understand the reason why.

The more important aspect of programming by demonstration is the database which holds the motion recordings. If a new trajectory was demonstrated by the human operator, the spline curve is stored in a CSV file somewhere in the computer system. This CSV file holds the interesting information. Programming by demonstration and creating a trajectory database are the same.

A convenient form of storing the robot’s trajectories is clustering the information into skills. For example:

id, skillname, waypoints, comment
0, grasp, p1 p2 p3, hello world
1, grasp, p1 p5 p3,
2, move, p1 p2 p3,
3, move, p2 p4 p2,
4, move, p4 p2 p3,

This CSV file defines two different skills: grasp and move. The trajectories within a skill cluster have something in common: they produce a similar movement. This allows structuring the demonstrations into groups.
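Reading such a table and grouping the rows by skill takes only a few lines with the standard `csv` module. This is a sketch over the table above (with unique ids); the symbolic waypoint names like "p1" are kept as strings, as in the text:

```python
import csv
import io
from collections import defaultdict

# the skill table from above, embedded as a string for a self-contained example
raw = """id,skillname,waypoints,comment
0,grasp,p1 p2 p3,hello world
1,grasp,p1 p5 p3,
2,move,p1 p2 p3,
3,move,p2 p4 p2,
4,move,p4 p2 p3,
"""

# group the demonstrated trajectories into skill clusters
skills = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    skills[row["skillname"]].append(row["waypoints"].split())

# skills["move"] now holds three waypoint lists, skills["grasp"] two
```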

Task distribution between computer and humans

The dominant reason why robotics projects failed in the past is that the software has to solve too many problems. A typical misconception in robotics is to ask for an algorithm which is able to plan the path for a robot. The human operator would like to press the run button, and the software has to figure out how to move the gripper to the goal position. This kind of requirement can't be realized by modern software. To determine a path in a complex environment, the software would need a full description which includes the obstacles, the allowed distance to each obstacle, and what to do in case of unexpected events. Formalizing the environment into a computer program goes beyond today's technology. The consequence is that automated path planning works only in simplified computer games but not for robotics applications.

The better idea is to assume that the human operator provides the path in advance. This reduces the task for the robot to the problem of trajectory tracking. Trajectory tracking means that a path is already available and the robot has to follow it. A typical beginner example is to program a line-following robot. Line following assumes that somebody else (not the computer) has drawn the line on the floor, and the robot has a clear objective of how to behave.

Programming by demonstration and creating a line-following robot have much in common. In both cases the trajectory is given in advance; it is not calculated by the machine. It is up to the human operator to draw the line on the floor or to work out how the industrial robot should reach the goal. The interesting point is that for a human this task is pretty easy, especially if the line-drawing process is supported by a GUI and paths from the past are stored in an external file for later reuse.

Let us describe how the situation looks for the human operator. He starts the software, loads the predefined CSV file which has 200 entries into memory, and now the robot has access to all the sense-making trajectories. To execute one of them, a simple click on a button is enough. More complex tasks can be realized by combining trajectories into longer sequences.
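Combining stored trajectories into a longer sequence can be sketched as simple concatenation. The library contents and the `build_sequence` helper below are invented for illustration; a real system would validate that consecutive segments actually connect:

```python
# Hypothetical trajectory library, as loaded from the operator's CSV file.
library = {
    "approach": [(0, 0), (50, 0)],
    "grasp":    [(50, 0), (50, 30)],
    "retreat":  [(50, 30), (0, 30)],
}

def build_sequence(names):
    """Chain several named trajectories into one continuous path."""
    full = []
    for name in names:
        segment = library[name]
        # drop the first waypoint if it repeats the previous endpoint
        if full and full[-1] == segment[0]:
            segment = segment[1:]
        full.extend(segment)
    return full

build_sequence(["approach", "grasp", "retreat"])
# one continuous path: [(0, 0), (50, 0), (50, 30), (0, 30)]
```

Clicking buttons in the GUI corresponds to appending names to the list passed to `build_sequence`; the robot then only has to track the resulting path.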