Building an AI Blog community is harder than expected

The first impression might be that Artificial Intelligence is a hot topic on the internet and that it’s easy to identify relevant blogs. The problem is that most so-called AI blogs discuss a slightly different topic or form their own community. A typical example is the blogs of PhD students, many of which are available on the internet. The typical blog consists of a large publication section in which the PhD student lists the papers he has published on arXiv over the last 5 years. The problem is that the PhD community forms its own group, which works not with blogs but with academic reputation.

The second example is the blogs of large newspapers, in which journalists write articles about the social role of AI. They announce that Google is using a new sort of neural network or that robots will become important in industry. I would call this sort of information mainstream AI, because the aim is to explain to the public what the world looks like. The intention of such blog posts is not to discuss the details.

The third form of AI-related blog is located not in computer science but in the humanities. The idea is to describe AI from a philosophical point of view. Typical questions are whether machines can think, whether language is separate from consciousness, or what the Singularity means. This is also not real AI, because the aim is not to program software.

After these negative examples it makes sense to define what the perfect AI blog looks like. The best example I’ve found on the internet was a Python tutorial website in which a programmer explained how to implement the game AI for a Snake game. In this post everything was done right: first, AI was seen as a programming topic; second, the intention was not to publish a paper on arXiv but to write a blog post; and third, the subject was explained to newbies who are not yet familiar with the technology.

Before a blog can write about AI, the previous step is to write about Python game programming. This amateur community tries to write short programs on a desktop PC that look like Snake, Pong or Tetris. The community doing so is large, and its members have published lots of tutorials about what they are doing.

Taking this idea one step further, the next question is how to implement the AI for such a minigame. This topic is not discussed very often on the internet, and only a small number of people are talking about it. If we can answer how the problem is solved, we can identify the relevant blogs and see which of them do it well and which not so well. These blogs are the AI community. Their aim is not to discuss the Singularity, not to write arXiv papers and not to talk about philosophy; the focus is much simpler: how to write the AI on top of a Python game?

Sure, there are many so-called papers available from universities which have answered this question on a theoretical level. But the authors of these papers are not the AI community. What they have done is create content in an academic environment, which means they are in their own world outside the blogosphere. The same problem applies to dedicated programming blogs. They explain what Python is and how to program a game in C++. This also has nothing to do with the AI community, because programming is different from AI programming.

Definition of the AI Community

As a working hypothesis, the AI community is defined as weblogs that publish tutorials about implementing game AI in amateur games written in Python. This definition is based on the following elements: weblog, tutorial, game AI, Python. Everything which doesn’t fit the definition is not part of the AI community.

Let us make some thought experiments. A new arXiv paper in which a neural network is explained doesn’t fit the definition, because it’s not a weblog but an academic preprint. A blog in which amateur philosophers discuss Artificial Intelligence in general doesn’t fit either, because it’s about neither Python nor game AI. The barrier to entering the AI community is high, and the number of blogs which match this definition is small.
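The thought experiments can be written down as a small filter. This is only a sketch: the Source class, the tag names and the three example entries are made up for illustration, and the tags would have to be assigned by hand after reading each site.

```python
from dataclasses import dataclass

# The four elements named in the working definition.
REQUIRED_ELEMENTS = {"weblog", "tutorial", "game ai", "python"}


@dataclass
class Source:
    """A hypothetical, hand-labelled description of one piece of content."""
    name: str
    tags: frozenset


def fits_definition(source: Source) -> bool:
    """Return True only if every required element is among the tags."""
    return REQUIRED_ELEMENTS <= source.tags


# Invented example entries that mirror the thought experiments.
candidates = [
    Source("Snake AI tutorial blog",
           frozenset({"weblog", "tutorial", "game ai", "python"})),
    Source("Neural network paper on arXiv",
           frozenset({"paper", "neural network"})),
    Source("Philosophy blog about AI",
           frozenset({"weblog", "philosophy"})),
]

community = [s.name for s in candidates if fits_definition(s)]
print(community)
```

Because all four elements are required at once, a single missing tag is enough to exclude a source, which is why the resulting list stays short.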

The good news is that with a very strict definition it is much easier to get an overview of all the blogs in the world that fit it. It is realistic to identify them all and make a small list. I would guess that the number of blogs on the list is smaller than 100, perhaps between 10 and 20. I don’t know.

But why is the definition so strict? Wouldn’t it make sense to include more content which has to do with AI in a more general way? Let us analyze what happens if we see the arXiv papers as part of the AI community. The problem is that the papers submitted to the academic community have a different aim than a blog post about game AI. In most cases the papers don’t describe Narrow AI but are built around mathematical formalism. If a paper contains lots of university-level equations, the reputation of the author gets much higher. Unfortunately this reduces the number of people who understand the paper. This contradicts the idea of a community that teaches a topic to newbies who were not familiar with AI before. The entry barrier of academic journals is too high for them to become part of the AI community.

Let’s connect the AI Community!

In a previous blog post I explained how to search for blogs from the AI domain. The idea was that the most authoritative structure on the internet which is already aware of all these blogs is Google, and what the user has to do is find out how to ask Google the right way. Let us take an example of how not to find relevant blogs on the internet. If we type in “Artificial Intelligence”, lots of websites are shown. The problem is that they talk about the mainstream term “AI”, which stands for anything and nothing. The result list is long and the knowledge on these sites is low. A second problem is that most of the content was created not by amateurs but by large newspapers, with the aim of filling the empty space in a journal.

The better idea is to search for slightly different keywords and to be flexible in the search request to Google. How to search correctly for AI blogs? This question is the right one, but it is hard to answer. The trick is to find a combination of detailed keywords and the correct domain names. What we want are not results about AI in general but about “Python game AI”. And we are interested not in articles from the Guardian but in amateur blogs. A possible search request is:

["Fuzzy logic" OR "Model predictive control" OR "Forward model" OR "AI planning" OR "hierarchical task network" OR "blackboard architecture" OR "model-based reinforcement learning" OR "Learning from demonstration"] [site:hypotheses.org OR site:wordpress.com OR site:blogspot.com OR site:github.io]

It uses different keywords which are combined with the OR operator, and at the same time different domain names are searched, also connected with the OR operator. The improvement over the previously posted code snippet is the “site:github.io” addition. github.io refers to so-called GitHub Pages. This is a feature of the GitHub social network which allows the user to publish static HTML pages on the internet at no cost. Many hobby programmers use this feature because it gives them more control over the content. In theory they can create old-school HTML pages which do not depend on WordPress-like bloatware but are slim and without any pictures. And it seems that with this feature GitHub has fulfilled the needs of computer science students very well. High-quality content is available under this domain.

The other site: filters, which restrict the search results to WordPress and Blogspot, are well-known search techniques for making smaller blogs hosted on one of the major blogging websites visible.
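Since the keywords and domain names will probably be adjusted many times, the search request can be assembled by a short script instead of being typed by hand. The following is a minimal sketch: the function name build_query is made up, the bracket grouping simply copies the request above, and the URL uses the ordinary q parameter of the Google search page.

```python
from urllib.parse import urlencode

# Keywords and domains copied from the search request above.
keywords = [
    "Fuzzy logic",
    "Model predictive control",
    "Forward model",
    "AI planning",
    "hierarchical task network",
    "blackboard architecture",
    "model-based reinforcement learning",
    "Learning from demonstration",
]
sites = ["hypotheses.org", "wordpress.com", "blogspot.com", "github.io"]


def build_query(keywords, sites):
    """Join quoted keywords and site: filters with OR, one group each."""
    keyword_part = " OR ".join(f'"{k}"' for k in keywords)
    site_part = " OR ".join(f"site:{s}" for s in sites)
    return f"[{keyword_part}] [{site_part}]"


query = build_query(keywords, sites)
# urlencode escapes quotes, brackets and spaces for use in a URL.
url = "https://www.google.com/search?" + urlencode({"q": query})

print(query)
print(url)
```

Adding a new keyword or another blog host is then a one-line change in the lists.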

I would guess that the code snippet for asking Google the right way is not perfect. Perhaps it’s possible to adjust the keywords and the domain names a bit to get better results. What can happen in the worst case is that not a single blog is shown in the result list because the date range was restricted to the last week. It is then unclear whether Google hasn’t indexed the blogs yet, whether no content was posted, or whether something is wrong with the code snippet. But in general this is the right way to identify relevant AI blogs on the internet.

Remote comment: Game AI in Python

Snake AI in Pygame, https://pythonspot.com/snake-ai-in-pygame/

The game AI is hidden in the “def target()” function. In the AI literature the principle is called a static decision tree. The AI player perceives the environment with “if self.x[0] > dx:” and executes an action in response: “self.moveLeft()”.

What the project shows is not a sophisticated AI player which is able to solve the game, but the game AI concept itself. First the Snake game was programmed in Python, and on top of the game the AI player was implemented in a subfunction. How this AI subfunction is implemented depends on the knowledge the programmer has. The described decision tree is the simplest form of implementing an AI. It contains neither a search algorithm nor a game-tree graph, but works on a simpler level in under 10 lines of code.
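To make the principle concrete, here is a minimal sketch of such a static decision tree without any Pygame code. It is not the code from the linked tutorial: only the names target, self.x[0], dx and moveLeft are taken from the fragments quoted above, while the class SnakeAI, the other move methods, the step method and the grid coordinates are assumptions for illustration.

```python
class SnakeAI:
    """Hypothetical stand-in for the computer-controlled snake."""

    # Screen coordinates: x grows to the right, y grows downward.
    MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

    def __init__(self, x, y):
        self.x = [x]  # x[0] is the column of the head
        self.y = [y]  # y[0] is the row of the head
        self.direction = None

    def moveLeft(self):
        self.direction = "left"

    def moveRight(self):
        self.direction = "right"

    def moveUp(self):
        self.direction = "up"

    def moveDown(self):
        self.direction = "down"

    def target(self, dx, dy):
        """Static decision tree: perceive the target, choose one action."""
        if self.x[0] > dx:
            self.moveLeft()
        elif self.x[0] < dx:
            self.moveRight()
        elif self.y[0] > dy:
            self.moveUp()
        elif self.y[0] < dy:
            self.moveDown()

    def step(self):
        """Move the head one cell in the chosen direction."""
        step_x, step_y = self.MOVES[self.direction]
        self.x[0] += step_x
        self.y[0] += step_y


snake = SnakeAI(5, 5)
food = (2, 7)  # invented target position
steps = 0
while (snake.x[0], snake.y[0]) != food:
    snake.target(*food)
    snake.step()
    steps += 1
print(steps, snake.x[0], snake.y[0])
```

The tree never looks ahead. It reduces the horizontal distance first and the vertical distance afterwards, so it ignores walls and the snake’s own body, which is exactly why it is not able to solve the game.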

Book Review: Fuzzy logic with MATLAB

Sivanandam, S. N., Sai Sumathi, and S. N. Deepa. Introduction to fuzzy logic using MATLAB. Vol. 1. Berlin: Springer, 2007. The Amazon Kindle price is US$ 100.

Let us describe what is wrong with the fuzzy book. The first thing to criticize is that the license isn’t a Creative Commons one. That means the content of the book can’t be distributed freely over the internet, which makes it the wrong choice as educational material. But suppose the book is to be used in a closed circle together with proprietary software. The next problem is that only an introduction to fuzzy logic is given; it is not explained how to create “fuzzy models”. Without a fuzzy model it’s not possible to implement a concept called indirect control. This concept is needed for nearly all practical projects. As a result, students will struggle to use the knowledge in the book for any useful application.

What irritated me most about the fuzzy book was that it was written to support fuzzy logic theory. Fuzzy sets are introduced as something which is here to stay. This kind of concept doesn’t make much sense. Fuzzy logic isn’t a scientific discipline but an esoteric corpus with a non-mathematical background. The more interesting question is why fuzzy thinking is not helpful.

How to connect a blog-community

Writing one’s own blog is the first step every author has to take. He creates some postings, and perhaps some screenshots to make the content more interesting. But after a while it is important to take the next step. The idea is to search for blogs written by other authors and write a comment under their postings. This helps authors connect with each other. But how exactly can blogs on the same topic be found on the internet?

The best way of doing so is to feed the right keywords into Google. The following example searches for blogs about Artificial Intelligence.

["AI Winter" OR "Fuzzy logic" OR "Model predictive control" OR "Forward model" OR "AI planning" OR "hierarchical task network" OR "decision tree"] [site:hypotheses.org OR site:wordpress.com OR site:blogspot.com]

Additionally, the date range has to be set to show only results from the last month. Then the list has to be browsed manually, and if one of the postings sounds interesting, a comment can be written under the post. Somebody may ask what the advantage of a comment is over a post on Reddit. The answer is that all the social media websites like Facebook, Twitter, Reddit and Instagram are evil. They try to divide the bloggers, and they would like to build front pages and blog aggregator websites. In the end the authors are not talking to each other on a peer-to-peer basis but are talking with the higher instance. And the rules within Facebook and Reddit are determined by these companies.

The better idea is to keep things as simple as possible. A blog can be installed by a single person without the need for a large company. And a comment can be posted below an article without asking the moderators at Reddit whether this makes sense or not. I do not think that there is a need for aggregated, syndicated and planet-moonmoon-something websites; blogging has to do with the content itself and with manual feedback. The only thing that should be automated is the full-text search engine. And indeed, without Google it’s not possible to identify thematically relevant blogs.

Reddit

Let us describe what Reddit is. According to Reddit, each piece of content needs a higher instance in which the URL to the content is shared. Nobody knows what sharing is, except Reddit. Sharing is a feature of the proprietary software which runs the website in the background. But what exactly is the purpose of keeping the content separated from the URLs? A naive understanding of the internet is to assume that links and social networks are a fundamental part of the technology and that it is not possible to create content without them. In reality nobody needs to connect websites or promote content. The only thing that is needed is the text itself. Let us compare the internet with a library. Which part of the library is more important, the catalog or the books? The catalog can be ignored. It is possible to identify the books without the Dewey classification system, simply by knowing which color a certain book has or where it is physically located. The same is true of the internet. If we would like to read a website on a server in the United States, the URL will often end in .com. And even the domain name system is not as important as the beginner would guess. It’s possible to enter the numeric IP address in the browser window.

If somebody is not interested in the complete internet but likes to read only his preferred 10 blogs, he can note down the IP addresses on a normal sheet of paper. So he won’t have a need for Reddit, a DNS server or anything like this. And if an IP address is unknown, he can write a simple letter to the other party and ask for the new address. This makes it possible to make new friends.

Step by step tutorial for creating a Reddit post

Creating a Reddit post is very easy. It’s important to adapt not to the Reddit community but to the blogosphere located outside the website. In the following example the task is to create a snippet in tribute to the Wikinews project. The idea is to filter, aggregate and evaluate content which is already there. The first thing to do is to search for the original information. We found the piece of information in a subsection of the Wikinews discussion page. It’s about a new Wikipedia app for the Android operating system. We read the information carefully and decide that this is a headline with a global impact.

So we have to write a short note plus a title plus the URL. At the end a small note about the Creative Commons license is needed, because we want our Reddit post to be distributable on the internet. Before the post is submitted we should take a look at the design on the Reddit preview website, which is located at https://redditpreview.com/

Let us summarize the workflow. We have selected a topic (Wikinews), we have identified a link which is fresh, and we have written a Reddit post. Now it is time to upload the content to the original Reddit site. The upload is not very exciting, and it’s done millions of times each day.

Cloning Reddit the right way

Reddit is perceived as the hidden champion of the internet because it drives a lot of traffic and is read by many millions of people. All major newspapers are aware of the website, and the website has a gatekeeper function. So let us analyze the concept a bit. The basic idea of Reddit is called a blog aggregator. The concept was introduced with the Planet software in the year 2004 for the GNOME project. Planet is a Python script which combines RSS feeds from individual blogs into a larger news site.

The Planet software itself isn’t used for creating the Reddit website, but the concept is the same. The idea is that different content sources are combined under the same surface. The typical Reddit post consists of a title, a URL and a description. So it is basically a handcrafted blog aggregator.
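The aggregator idea itself fits into a few lines of Python. The following sketch is not the Planet code. It only illustrates the principle with the standard library, and the two feeds, their blog names and their example.com links are invented. A real aggregator would download the feeds from the blogs instead of reading them from strings.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two invented RSS 2.0 feeds standing in for individual blogs.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Snake AI in Pygame</title>
<link>https://blog-a.example.com/snake</link>
<pubDate>Mon, 01 Apr 2019 10:00:00 +0000</pubDate></item>
<item><title>Pong AI with a decision tree</title>
<link>https://blog-a.example.com/pong</link>
<pubDate>Wed, 03 Apr 2019 09:30:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>Tetris bot in Python</title>
<link>https://blog-b.example.com/tetris</link>
<pubDate>Tue, 02 Apr 2019 18:00:00 +0000</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Yield one dict per <item> with source, title, link and date."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title")
    for item in channel.findall("item"):
        yield {
            "source": source,
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "date": parsedate_to_datetime(item.findtext("pubDate")),
        }


def aggregate(feeds):
    """Merge all items of all feeds into one list, newest first."""
    entries = [entry for feed in feeds for entry in parse_feed(feed)]
    return sorted(entries, key=lambda entry: entry["date"], reverse=True)


merged = aggregate([FEED_A, FEED_B])
for entry in merged:
    print(entry["date"].date(), entry["source"], entry["title"], entry["link"])
```

Each merged entry carries a title, a URL and its source, which is nearly the same triple a hand-written Reddit post consists of.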

To replicate the result we first have to answer which underlying technology is the best one. The Reddit website runs, like most websites, on a web server on the internet, while the Planet software provides only the RSS functionality. An easy-to-install alternative to both is a MediaWiki installation in which different users post links. MediaWiki is known for its ability to let a community create content, and it is possible to post not normal text but only URL+description snippets.

A more powerful and state-of-the-art blog aggregator is a wiki system in which the users are asked to post weblinks to external content which has to be:

– new

– on topic

If users post spam links, the wiki moderator can roll back the edit and ban the user. The advantage over the Planet software and the current website is that MediaWiki is available as open source and makes collaborative editing easier. Such a wiki could play the same role that other blog aggregators like planet.gnome play today: it monitors the existing blog community around a topic.
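The rules of such a link wiki can be modelled in plain Python before any wiki is installed. The sketch below does not use MediaWiki or its API. The class LinkWiki, its methods and the keyword check for “on topic” are assumptions made for illustration, and in a real wiki the topic check would be done by human moderators.

```python
from dataclasses import dataclass, field


@dataclass
class LinkWiki:
    """Hypothetical model of a wiki that stores URL+description snippets."""
    topic_keywords: set
    links: list = field(default_factory=list)   # (user, url, description)
    banned: set = field(default_factory=set)

    def submit(self, user, url, description):
        """Accept a link only if the user is allowed, the link is new
        and the description is on topic."""
        if user in self.banned:
            return False
        is_new = all(url != known for _, known, _ in self.links)
        text = description.lower()
        on_topic = any(keyword in text for keyword in self.topic_keywords)
        if is_new and on_topic:
            self.links.append((user, url, description))
            return True
        return False

    def rollback_and_ban(self, user):
        """Moderator action: remove all links of a user and ban the user."""
        self.links = [link for link in self.links if link[0] != user]
        self.banned.add(user)


wiki = LinkWiki(topic_keywords={"game ai", "decision tree"})
first = wiki.submit("alice", "https://pythonspot.com/snake-ai-in-pygame/",
                    "Snake game AI as a static decision tree")
duplicate = wiki.submit("bob", "https://pythonspot.com/snake-ai-in-pygame/",
                        "Game AI for Snake")
off_topic = wiki.submit("bob", "https://shop.example.com/",
                        "Cheap watches")
wiki.rollback_and_ban("alice")
print(first, duplicate, off_topic, wiki.links)
```

The two acceptance checks correspond to the two list items above, and rollback_and_ban corresponds to what the wiki moderator does by hand.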