Limits of Wikipedia

Dear reader,

I want to write a new article for the Trollheaven blog, but I’m unsure which topic is the right one. Might you help?

If the result of the poll is unclear, because two favorites have the same number of votes or because nobody has pressed the button at all, a peaceful random generator will be started.

Update 2018-10-16
The poll is over. “Limits of Wikipedia” has won. Here is the article.

Wikipedia’s next door

Sometimes Wikipedia is described as the standard on the internet and the best encyclopedia in the world. This description is right if Wikipedia is compared with other mainstream encyclopedias like Brockhaus or Encyclopedia Britannica. But if we focus on scientific knowledge, Wikipedia is not very advanced. To describe the problem in detail, I’ve found a simple but effective way to demonstrate the limits of today’s Wikipedia.

The search engine Google provides a tab called Shopping for searching commercially available products. If we enter the keyword “Encyclopedia” into the box and sort the list with the most expensive products first, we will find a huge list of commercially available encyclopedias which are not Wikipedia. All of them are created by academic publishing companies like Elsevier, Springer and Wiley. They are not general-purpose encyclopedias but specialized in a scientific topic. Here are some examples:

– Encyclopedia of Language and Linguistics, Elsevier, 9000 pages, 10000 US$
– Encyclopedia of Complexity and Systems Science, 10000 pages, 9000 US$
– Encyclopedia of Evolutionary Psychological Science, 7000 pages, 5000 US$
– Encyclopedia of Nanotechnology, 2900 pages, 1500 US$

The list is much larger than only these 4 books; I would guess that at least 200 different encyclopedias are available. Each of them is large and expensive. The reason why they are sold to libraries is that they are better than Wikipedia: they contain more keywords and the descriptions are more accurate. It is simply not true that Wikipedia is the best encyclopedia in the world. It is only the cheapest one, and its quality is not very high.

Explaining the difference between Wikipedia and the Elsevier/Springer encyclopedias is easy: the traffic of the keywords is different. Some of these keywords are also part of Wikipedia, but they have very low usage statistics. That means they are not mainstream topics which generate 1000 hits per day; it is possible that such a keyword gets only 2 visits per day. Wikipedia has only a few of these low-traffic keywords available. If somebody needs such specialized information, he has to buy a Springer encyclopedia.

Let us estimate how expensive this would be. 200 encyclopedias at 8000 US$ each is 1.6 million US$. A huge price, but it makes sense to invest this amount. Most institute libraries in the world have done so, because they need the knowledge. They cannot switch to Wikipedia because Wikipedia doesn’t provide this specialized knowledge.

Bringing Wikipedia to the researchers

The Wikipedia encyclopedia is recognized as a mainstream encyclopedia which provides knowledge about the latest Star Wars movie, Harry Potter books and reggae musicians. It is accepted by a non-scientific audience as a reference for getting information quickly without consulting websites which contain a lot of advertisement. The internal quality control of Wikipedia works great and prevents spam and biased information from being injected into the encyclopedia.

A researcher in a biology lab has two options: either he can ask Wikipedia for help, or he can search in a Springer encyclopedia. The Springer version provides much higher quality. I’m pointing to this fact because right now, Wikipedia has only replaced general-purpose encyclopedias like Encyclopedia Britannica, but not the specialized versions which are written by scientists. If we compare, on an objective basis, the quality of Wikipedia with a Springer encyclopedia on a certain topic, we will notice that Wikipedia is weaker. That means, in most cases the lemma has no entry, and if it is available in Wikipedia, the article is too short. So Springer is able to sell its own encyclopedias for thousands of US$ because Wikipedia is not able to provide the needed information.

I do not know how to solve this issue, but I can give a measurement of whether it is solved or not. If Wikipedia has better content than a Springer encyclopedia, the issue is solved. And to determine the progress, it is necessary to compare both sources: on the left we open the article in Wikipedia, and on the right we open the article in a Springer handbook. The difference is that a specialized version explains every detail of a subject. The audience is not the whole world, but a researcher who is interested in a concrete subject and has a lot of background knowledge. This kind of audience is not happy with today’s Wikipedia. The problem with Wikipedia is that it only provides general knowledge but has many missing topics in scientific subdisciplines.

To overcome the problem it is necessary to create articles in Wikipedia with a low number of visitors. These are specialized entries which are relevant for no more than 100 people worldwide and which will generate only 1-2 visits per day. Such subjects are not very attractive for Wikipedia authors, because an article which is not read by the public feels useless.

The good news is that the overall structure of Wikipedia doesn’t have to change. Specialized articles can be handled like any other article: the workflow of creating and evaluating the content stays the same. The only new thing is that this kind of article will generate an ultra-low amount of traffic. That means it seems too specialized for a general-purpose encyclopedia. But in the end it will help to increase the acceptance of Wikipedia in the research community.

Let us examine some examples from the Springer “Encyclopedia of Algorithms”. None of the following lemmas is available in Wikipedia:

– Analyzing cache misses
– Approximate Dictionaries
– Approximate Regular Expression Matching
– Approximation schemes for bin packing

The reason is that these entries are very specialized. Apart from computer scientists, nobody will use these terms. But all of them are available in the Springer encyclopedia, and this is the reason why the Springer version is used in an institute library but Wikipedia is not.

What do these lemmas have in common? They are three-word lemmas. That means the question is not what “approximation” means (this is explained in Wikipedia very well); the question is what a certain short phrase means. Wikipedia has only a handful of two-word and three-word lemmas in its database. For example, “Approximation error”, “Newton’s method” and “Tolerance relation” are all explained very well in Wikipedia. But there are many more lemmas which are more specialized and don’t have an article right now.

What Wikipedia can learn from Springer

Springer has a unique position with researchers. The company is perceived as close to their problems; a Springer book fulfills the needs of a researcher. What is the secret? The secret behind every Springer book is that it is focused on one detail problem. A handbook about nanotechnology is specialized in only this topic but describes it in detail. And the Springer encyclopedias are domain-specific encyclopedias too. They are not written for a broad audience but for experts in the field.

Is it possible to transfer this concept into the Wikipedia ecosystem? Yes, it is possible, but it is hard. The main problem is that today’s Wikipedia authors are not experts in their field but have general knowledge. They have much in common with generalist librarians in a public library who know a bit about every subject, but nothing in detail. In contrast, the Springer encyclopedias were written by experts who bring in strong background knowledge. This makes the content so relevant for the readers.

Wikipedia has tried to become more important to researchers in the past but failed in doing so. It was not possible to motivate existing researchers to contribute content. Instead, Wikipedia has its strength in topics of general interest, for example movies, sports and political information. Nearly every aspect of everyday life is covered in Wikipedia, but that is not enough for a scientific encyclopedia. The future vision is to enrich Wikipedia with more specialized information which goes very deep into a subject.

I think Wikipedia can’t learn anything from classical encyclopedias like Encyclopedia Britannica or Brockhaus. Both are dead today. But Wikipedia can learn a lot from Springer. The people there know more about creating an encyclopedia than the authors and admins at Wikipedia. And they are experts in the specialized knowledge which is taught at universities.

On the other hand, Springer can learn something from Wikipedia, and this is how to reach a huge audience. Wikipedia has the top rank in Google and is read by millions of people. Springer doesn’t have that kind of traffic. A normal Wikipedia article gets around 100 visits a day; in one year that is 36,500 visits. Wikipedia is a mass medium, while Springer is a specialized medium. If Springer wants to sell more books it needs Wikipedia, and if Wikipedia wants to get high-quality content it will need Springer.

Springer Link

Let us take a look at what the commercial publisher Springer has to offer. In the section “reference works”, encyclopedias and handbooks are listed. An encyclopedia is, similar to Wikipedia, an alphabetically ordered list of articles, while a handbook contains overview articles which are much longer. Each subject like mathematics, engineering and physics has a huge number of Springer reference works. It is possible to view sample chapters, but full-text access is restricted to users who pay. This principle is well known under the term paywall.

What is unique in the Springer encyclopedias? They usually contain very complicated and specialized subjects, for example these:

– Adaptive Control for Linear Time-Invariant Systems
– Boundary Control of 1-D Hyperbolic Systems
– Dynamic Noncooperative Games
– Information-Based Multi-Agent Systems

None of these keywords is available inside Wikipedia. If a researcher needs them, he has to buy the Springer book. What they have in common is that they sound complicated and that they consist of more than a single word. Instead they are three-word and even four-word lemma titles. That means each one is a specialized entry for a specialized audience.

And this is the main difference between a mainstream encyclopedia like Wikipedia, which is read by the mainstream, and an academic encyclopedia from Springer Link, which is read by researchers.

What the researchers have done in the last 10 years is build their own Wikipedia version which is protected behind a paywall. That means the researchers within universities read and contribute to the Springer encyclopedias but not to Wikipedia. In contrast, Wikipedia is written by journalists, bloggers and amateurs. The Springer encyclopedias are written by real researchers with deep knowledge of their field.

Springer Link statistics

The Springer Link website contains 24 categories like Biomedicine, Chemistry and Computer Science. Each category has around 50 different encyclopedias on offer, which are listed in the reference-works section. The total number of scientific encyclopedias from Springer Link is 24×50=1200. Each encyclopedia costs around 4000 US$ and provides around 4000 pages. The total number of printed pages is 1200×4000=4.8 million. Elsevier, a Springer competitor, also has many encyclopedias on offer. They are listed at the ScienceDirect website. The price tag is similar: a book with 2000 pages will cost around 2000 US$.

A size comparison with Wikipedia is possible. The printed Wikipedia has 7473 volumes with 700 pages each, https://en.wikipedia.org/wiki/Print_Wikipedia That amounts to 5.2 million pages, while the Springer encyclopedias contain in total the above-mentioned 4.8 million pages.
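
These back-of-the-envelope figures are easy to recompute. Here is a minimal sketch in Python; all inputs are the rough estimates quoted above, not measured values:

springer_categories = 24        # categories on Springer Link
works_per_category = 50         # reference works per category (estimate)
pages_per_work = 4000           # pages per encyclopedia (estimate)
price_per_work_usd = 4000       # list price per encyclopedia (estimate)

springer_works = springer_categories * works_per_category   # 1200 works
springer_pages = springer_works * pages_per_work            # 4.8 million pages

wikipedia_volumes = 7473        # Print Wikipedia art project
pages_per_volume = 700
wikipedia_pages = wikipedia_volumes * pages_per_volume      # 5.2 million pages

print(f"Springer: {springer_works} works, {springer_pages / 1e6:.1f} million pages, "
      f"{springer_works * price_per_work_usd / 1e6:.1f} million US$ list price")
print(f"Print Wikipedia: {wikipedia_pages / 1e6:.1f} million pages")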

Wikipedia vs. academic encyclopedias

Wikipedia’s strength is that the encyclopedia is cheap and covers mainstream topics. Its weakness is that specialized lemmas from scientific fields are missing. The commercial encyclopedias from Elsevier and Springer have the opposite profile: they are expensive, but provide specialized academic topics. The content is created by experts.

Having fun with Wikipedia

In the early days of the famous encyclopedia, it was easy to vandalize the project. Vandalizing means to destroy something, to rant against the admins and to make clear who the boss is. The best-practice method for doing so is to search for a high-traffic lemma, for example “Artificial neural network”, delete all the content and press the save button. Now Wikipedia is shut down and the world sees nothing if they need information about the topic.

After 30 minutes or so, some admin is alarmed because we have deleted his work, and he is completely irritated. That means the admin doesn’t know what has happened to his encyclopedia and he must first consult the manual to roll back the information to a previous state. During this time, Wikipedia is offline and we have won.

Unfortunately, times have changed. Modern admins are prepared for this kind of vandalism. They are better informed about how to use the MediaWiki system, and in the worst case they will block the attacker completely, which is a bad situation if we want to vandalize Wikipedia a bit more. What can we do, if the aim is to have a bit of fun with the admins?

What a good vandal does is upgrade his tools. Instead of simply clearing an article, the better idea is to produce a nonsense article. A nonsense article has the advantage that automatic spam protection is not able to recognize it, and sometimes it takes weeks until an admin detects the problem manually. The best-known way to create a nonsense article for Wikipedia is the SCIgen generator, https://en.wikipedia.org/wiki/SCIgen It was invented with the aim of fooling an academic journal, but it works for Wikipedia too.

The first step is to visit the SCIgen website and press “generate new paper”. Then the document has to be converted into the wiki syntax. If everything looks fine, it can be uploaded to Wikipedia. The advantage over normal vandalism is that at first glance the result looks similar to a real Wikipedia article. The automatic incoming filter of Wikipedia which checks all the content will not be alarmed, because it is normal text, contains no plagiarism and provides references to other academic papers. To recognize the problem, somebody must read it in detail, but this is never done. Most admins are in a hurry because each day around 700 articles are created from scratch. So our nonsense article can stay in the encyclopedia, and we have had a lot of fun during the break.

The confusing unification of 16 bit architectures and Internet QoS is an extensive riddle. In fact, few cryptographers would disagree with the analysis of thin clients. We disprove that the little-known authenticated algorithm for the evaluation of thin clients by J. Smith is maximally efficient.<ref name="arun2003" />

==Introduction==
In recent years, much research has been devoted to the structured unification of I/O automata and Smalltalk; nevertheless, few have constructed the deployment of consistent hashing. The notion that scholars agree with journaling file systems is generally adamantly opposed.<ref name="arun2003" /> The notion that analysts interfere with interposable symmetries is always bad. This is essential to the success of our work. Therefore, Moore's Law and reinforcement learning agree in order to realize the development of IPv7.<ref name="brown2004" />

We next turn to all four experiments, shown in Figure 4. Of course, all sensitive data was anonymized during our bioware emulation. Note how emulating Markov models rather than emulating them in hardware produce more jagged, more reproducible results. Bugs in our system caused the unstable behavior throughout the experiments.<ref name="white2000" />

==References==
<references>
<ref name="white2000">
{{cite journal
| title = ArgivePlexus: Multimodal, introspective communication
| author = N. White and J. Hennessy
| journal = Journal of Flexible, Stable Random Methodologies
| volume = 9
| pages = 1--11
| year = 2000
}}
</ref>

<ref name="brown2004">
{{cite journal
| title = A case for Voice-over-IP
| author = K. Brown, C. Miller, S. Cook, and R. Stearns
| journal = Journal of Semantic, Authenticated, Modular Configurations
| volume = 4
| pages = 20--24
| year = 2004
}}
</ref>

<ref name="arun2003">
{{cite journal
| title = A case for context-free grammar
| author = M. Arun and C. Maruyama
| journal = Journal of Classical Algorithms
| volume = 737
| pages = 70--90
| year = 2003
}}
</ref>

</references>

Sometimes a Wikipedia article with a high amount of traffic is blocked by default. But that is no problem, because many others can be edited freely. Here is the list of the most visited lemmas: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Computer_science/Popular_pages For example, the topic “Support vector machine” has over 2000 views per day, but it is not protected. So it is the ideal starting point to drop some nonsense. If the aim is that the SCIgen content stays longer in Wikipedia, it is a good idea to search for a low-traffic lemma. That is not observed carefully, and we can make edits without being interrupted by the admins.

Wikipedia is different from OpenAccess

For an outsider it may be surprising to learn that Wikipedia and OpenAccess do not have much in common. The first impression is that a typical Wikipedia article has footnotes and discusses science, while the ordinary paper from Elsevier does the same. But if we investigate the history of Wikipedia and of normal academia, there is a huge contrast. The main difference is that Wikipedia works according to the hacker mentality. It is some kind of GNU project in which everybody is welcome as long as his contribution is right, while academia works slightly differently. Academia works with academic societies. These are scholarly groups which have something like a journal, and it is very important to differentiate between a member of the society and the rest of the world. For example, the IEEE society works on that principle, but there are many other such groups. The main aspect of academic societies is that not the knowledge stands in the middle but a person.

Let us investigate a short example. An ordinary person wants to be a member of IEEE. How can he do so? Answer: it is not possible, or only over a long period of time. The normal way of becoming a member is biographically driven. That means he must learn for 18 years in a normal school and get a degree. Then he attends the university and gets another degree. Then he attends the PhD program (which costs a lot of money), and after this procedure is over (which takes around 30 years or more) he can become a member of the IEEE society. That implies he has the right to read the journal and to write articles of his own.

In contrast, becoming a member of Wikipedia is much easier. One can sign up for an account, and 5 minutes later one can vandalize the first article. One must invest neither 30 years nor any money. That is the main difference between Wikipedia and serious academia.

Between both poles there is an interesting in-between service available, called Academia.edu. Academia.edu works on the Wikipedia principle. That means the general idea is that everybody is welcome, and after 5 minutes he can upload his first paper. That is perhaps the main reason why Academia.edu stands outside of real science: this workflow contradicts the model of learned societies. Academia.edu, Wikipedia and Unix are all together some kind of hacking. That means reducing the entry barrier down to zero and providing a space in which an ordinary stranger can become successful. The problem for established academia is that this working habit could become normal. This contradicts the usual story which is told inside of science.

The interesting aspect of Academia.edu, Unix and Wikipedia is that the principle works. That means it produces something useful. So the question is: why is serious science not open? Why is publishing only possible if someone is a member of IEEE? And indeed, answering the question is the problem. Or, seen the other way around, Wikipedia is something different from science. At first glance it has much in common with it, but Wikipedia works differently. That is the reason why so few professors are part of Wikipedia. They ignore the website and they are afraid of it. It may look surprising, but the mathematics section in Wikipedia was not written by mathematicians at the university. The reason has to do with the different biographical background. A mathematician according to the AMS society is somebody who has a PhD degree and has worked for many years in a university context. That is, by definition, not everybody, while in Wikipedia a mathematician can be anybody who uploads an interesting article.

Create a Wikipedia article from scratch

The first step in creating a Wikipedia article is to search for information already there. The topic I have chosen is “Learning from demonstration”, and former Wikipedia authors have uploaded some content on the subject. With the internal search box, the pages are identified and read carefully, because we do not want to post the same information twice.

After reading the given information, it is obvious that the existing article https://en.wikipedia.org/wiki/Apprenticeship_learning fits the “Learning from demonstration” topic best. It is focused not on program synthesis but on robot control, and it mentions vocabulary like “inverse reinforcement learning”, which is the term used in the literature too. Adding the new information to our notes makes sense.

A view of the pageview statistics shows that the number of daily visits is very small (35 visits per day). Such a niche topic is a good starting point for the first edit in Wikipedia, because the chance of conflicts is small. That means it is likely that the Wikipedia admins will let us play around with the text, because this article is not under fire.
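
Such numbers can be checked against the public Wikimedia Pageviews REST API. Here is a minimal sketch in Python; the article title and the date range are only examples, and the third-party requests library is assumed to be installed:

import requests

# Wikimedia Pageviews REST API (public, no API key required).
# Dates use the YYYYMMDD format; the agent "user" filters out bots and spiders.
URL = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "en.wikipedia/all-access/user/{title}/daily/{start}/{end}")

def average_daily_views(title, start, end):
    resp = requests.get(URL.format(title=title, start=start, end=end),
                        headers={"User-Agent": "pageview-check (example script)"})
    resp.raise_for_status()
    items = resp.json()["items"]  # one entry per day
    return sum(item["views"] for item in items) / len(items)

# Example: the niche article discussed above, for October 2018
print(average_daily_views("Apprenticeship_learning", "20181001", "20181031"))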

The next question is how to bring our own content into the article. Deleting all the existing information and writing our own text from scratch is not a good idea, because previous authors perhaps have the article on their watchlist and will protest if we do so. What is always possible is to add new information in a soft mode; that means we create a new section with the wiki syntax “== Section ==” and write down our needs.

Before writing our text, some literature may be helpful. The existing references in the Wikipedia article are not enough, so we must add some new papers. In the best case, we already have a literature list in BibTeX format which we found useful. Here is the list:

In theory, the above-cited literature must be formatted in a special way to match the Wikipedia syntax. In reality, this is one of the minor problems. It is explained somewhere in the help section how a literature template is used correctly. So I can leave out this step and focus more on the content level. We have the above-cited 5 sources, which are useful for describing the topic. Reading the papers again is useful and helps us to make some notes about them, which we can later extend into a text for Wikipedia.
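
As a rough illustration of this formatting step, here is a minimal sketch which maps a flat BibTeX entry onto the {{cite journal}} template used by Wikipedia. The sample entry is a well-known survey on the topic (Argall et al. 2009); a real converter would need a proper BibTeX parser instead of a regular expression:

import re

# Minimal sketch: map a flat BibTeX entry onto Wikipedia's {{cite journal}}
# template. Assumes one "field = {value}" pair per line and no nested braces.
SAMPLE = """@article{argall2009,
  author  = {B. D. Argall and S. Chernova and M. Veloso and B. Browning},
  title   = {A survey of robot learning from demonstration},
  journal = {Robotics and Autonomous Systems},
  volume  = {57},
  pages   = {469--483},
  year    = {2009},
}"""

def bibtex_to_cite_journal(entry):
    fields = dict(re.findall(r"(\w+)\s*=\s*\{([^}]*)\}", entry))
    order = ("title", "author", "journal", "volume", "pages", "year")
    body = "\n".join(f"| {key} = {fields[key]}" for key in order if key in fields)
    return "{{cite journal\n" + body + "\n}}"

print(bibtex_to_cite_journal(SAMPLE))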

After reading the 5 papers again, we find that two of them are similar and one is boring. The list can be reduced to only 3 papers which are good, and our readers have less information overload. A second advantage of reading the information again is that it becomes very clear what “Learning from demonstration” means, and we can write down a prototype text. But where to start, and how to lower the entry barrier? The best advice is to imagine that not Wikipedia is the target, but that we want to write a comment for a YouTube video. Everybody knows that the comments there are not very serious and that it is more important for them to be colloquial. That is exactly the writing style a good academic paper should have.

Writing a first draft version was surprisingly easy, because it is formulated as prose and not as a scientific paper. Making a complex academic article out of it is only a formal question. This is realized by adding references to external literature. That is the main difference between prose and science; in the text itself, there is no difference. According to the file size, the short text in the screenshot is around 2400 bytes long, which is very long compared to the average edit in Wikipedia. Most edits there are not longer than 500 characters.

Until now, it is unclear whether many small edits or one big edit is the better choice for interacting with the encyclopedia. But one piece of advice is very clear: making an edit too fast is wrong. Our above prototype text is not ready for uploading to the internet. It has some spelling mistakes, the cited literature is in the wrong format and some aspects are missing. In theory it is possible to upload even draft content and edit it on the fly. But testing the tolerance of the Wikipedia admins is not the best idea. Instead, the novice text creator should write a nearly perfect text on his local hard drive and upload only the final version. So we must postpone the interaction with Wikipedia a bit, and improve our text first.

The good news is that the article does not change very often. The last edit was 3 weeks ago, and last year there was a period of 6 months in which no edit took place. So it does not matter whether we upload our text today or in one week. Wikipedia can wait.

The next step is to reformat our prototype text into the wiki syntax. This has to be done with the text itself, which can be enhanced with so-called wiki links for referencing articles already there, and with a citation template for getting the reference list right. The result can be checked in the Wikipedia sandbox.

Even though our text is short, the formatting is surprisingly complex. Until all the keywords are referenced and a newly created table is in place, it takes some time. A nice side effect of the formatting is that our prose text, which was originally targeted at a YouTube audience, now looks more professional. It consists of the same words, but this time they have literature references and clickable links.

After some improvements to the literature list, the version in the sandbox looks like the following screenshot.

The article itself is ready. Until now, only the sandbox was aware of it. Now it is time to inform the real Wikipedia. That means we copy & paste the 4 kB source file into the article which is already there and wait to see what will happen. The opponent known as the “Wikipedia admin collective” has many options. It can delete our text entirely, it can delete minor parts of it, or it can accept the text, which means nothing will happen. It is hard to guess what the Wikipedia admins will do. In my opinion, they will perhaps edit the English language a bit, because I’m not a native speaker and it is very likely that some grammar mistakes are there. But it is also possible that Wikipedia thinks that my text is not scientific, and that the referenced literature makes no sense.

That is exactly the feeling which I called “fear” in the beginning. The author of newly uploaded content doesn’t know what the other side will do. He has a lot to lose. On the other hand, it is payback time, and the text will go online under any conditions. Not playing the game is not an option!

After saving the changes in the real Wikipedia, the Revision history is updated and our edit is placed on top.

If we try to identify the edit in the global list of Recent changes, we will notice that our edit is only one among many hundreds. The updates come very fast and we must scroll a bit to find our posting.

What’s the difference in the English Wikipedia?

Spelling

Writing an article for the German Wikipedia is easy. The author must create a scientific-looking text which cites lots of external literature, format the text with the wiki syntax and upload it to Wikipedia. In most cases the Wikipedia admins will like the new information, and requests for deletion are seldom. But adapting this best-practice method to the English Wikipedia seems a bit complicated. From a formal point of view, Wikipedia is only yet another website; but if we look at a certain topic which we want to edit, for example “episodic memory”, which needs additional information, we will see in the traffic counter that the page generates 400 visits per day. That is a lot more than in the German Wikipedia. 400 visits per day means that if the author uploads his text to that place, his text will get read by thousands of people. And that is a problem.

Sure, I know what a scientific-looking text is, and citing external literature is easy. But it makes a difference whether one uploads a paper to Academia.edu, where the total count of reads will not be greater than 10, or posts a paragraph to Wikipedia, where nearly the whole world reads the information. From the goal of vandalizing Wikipedia heavily and early it makes sense; the problem is that the internal structure there is not very friendly to spamming, so if the text passes the quality control it will really be read by the world.

I wouldn’t call the feeling outright anxiety, but a healthy form of respect for the potential of the medium. Perhaps I should, for the beginning, search for an article which is not read so frequently, in the hope that spelling mistakes and content-based inaccuracies will not result in an edit war?

Deleting early and too much

That potential authors of Wikipedia are nervous is normal, because they foresee what will happen with their text. A change of +1000 bytes will be read carefully by an admin, and deleted because it adds no additional value to the encyclopaedia. As a result, the author will ask himself whether the invested time was wasted and whether he is competent enough to write scientific content at all. Not uploading a text to Wikipedia is the right decision for any scientist who has something to lose. On the other hand, there are strong reasons for doing exactly this. The main reason for investing the time, even if the content will be deleted, is the value Wikipedia delivers today. The individual reader profits from it, and it is time to pay back.

Wikipedia vs. academic publishing

http://www.sciencemag.org/news/2014/07/1-scientific-publishing

A comparison between the Wikipedia project and the scientific community is not an easy task, but it is possible. The similarity is that both are driven by a surprisingly small number of people. A study in PLOS One http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0101698 has observed in detail how many scientific authors really publish papers. The most important insight is that only 150,608 scientists have published at least one paper every year. Only about 40k scientists worldwide can be counted as active authors with at least 3 papers per year.

Perhaps the study is not very exact, the tables and figures are a little confusing, and the study covers a very long time period. But it seems comprehensible that the number of active science authors is smaller than 100k worldwide. In comparison, the English Wikipedia has nearly 20k active volunteers. And the trend is constant: in the future too, and also under open-science conditions, the number of active researchers who publish something will not change significantly.

The interesting fact is that the 100k active worldwide scientific authors are definitely not independents writing amateur science articles at home; they are well-funded persons who work for big universities. Instead of describing the facts neutrally, the myth of citizen science and academic social networks is spread. And some advanced open-access advocates claim that something has to change. That can be seen as a joke: nothing has to change in the academic publishing system. It works perfectly, and it is absolutely right that Nature rejects 90% of all incoming manuscripts.

Open Science

The mistake made by Wikipedia and academic social networks is to ask what should be done differently to improve the situation and to motivate young people towards engagement. The consequence is projects which aim to increase the number of authors, to lower the barrier to publishing something, and to improve the visibility of written papers. Instead of following a goal, the first step would be to analyse the current situation neutrally. The question is not what the future of science should look like or how we can destroy Elsevier; the more useful question would be: where is the science community today? Is the system broken or not?

My impression is that the number of active Wikipedia authors and the number of active academic authors are some kind of natural law which is constant and not changeable by a marketing campaign. Instead of bringing science forward, it is time to say that science works. The workflow of paper-based, highly funded journals is the right one. The quality control is necessary and there is no need for improvements. The supervisor of Ijad Madisch who said “Drop this Firlefanz!” was right. He understood the publishing system better, and ResearchGate will never be a success.

Author attrition at Wikipedia

https://de.wikipedia.org/wiki/Wikipedia:Kurier That the number of Wikipedians with more than 5 edits per month has dropped to currently 4618 is not a problem. According to the table, the peak was 8700, and these are just normal fluctuations. Relatively few people work on the Linux kernel too: according to the latest estimate, no more than 3000 developers around Linus Torvalds are busy integrating new drivers. And Linux thrives splendidly. The measure for the success of Wikipedia is the number of red links. If Wikipedia could no longer keep up with converting the red links into readable articles, that would be bad. But that is not the case. Only relatively seldom are there undeveloped areas, at least in the humanities. In the fields of economics and computer science it does not look quite as good; there, indeed, an acute shortage of authors prevails. As far as I know, the complete computer science portal in the German Wikipedia currently consists of only 10 people, so a little new blood would certainly not be wrong there.

But I do not believe that anything can be done to generate this new blood, because Wikipedia by definition is based on voluntary participation, and whoever has no desire to write an article simply has no desire. In general, I believe that the bottleneck is not so much Wikipedia as such; rather, one must look at the sphere of academic publishing in its entirety. The bottleneck lies more in the area of OpenAccess, that is, where scientific research is carried out and published on Google Scholar. Wikipedia cannot accomplish more than mapping this sphere. In the help pages the principle is made quite clear: Wikipedia cites scientific sources. And where these do not exist, no articles can be written. If more sources were online, the encyclopedia built on top of them could be more comprehensive, and that means more Wikipedians would be needed to write the texts.

Aggressiveness in Wikipedia is increasing

Anyone who has not yet made closer acquaintance with Wikipedia, or assumed that there is a “healthy community” there which supports one another, will be appalled to discover that a very rough tone prevails inside the encyclopedia. One can experience it live on the vandalism noticeboard https://de.wikipedia.org/wiki/Wikipedia:Vandalismusmeldung But are these perhaps only the dark sides, and not representative? Rather the contrary: other areas are just as aggressive as the vandalism page, so the question arises: how do the admins endure facing this aggressiveness every day, or dishing some out themselves? If one browses a bit through the vandalism pages, one finds that, contrary to first expectation, the admins apparently even enjoy dealing with these matters. There is even a subpage called “Trollübersicht” where long-time destroyers of Wikipedia are presented in more detail, partly with psychographic profiles. Put differently: for a page which supposedly represents the dark side of the encyclopedia, things are astonishingly lively and funny there. I believe this has something to do with the fact that the roles are clearly distributed. That means all participants know the rules of the game and have turned the whole thing into a competition. This applies to the trolls on the one side, who of course know that they are writing nonsense into Wikipedia and thereby violating official or unofficial guidelines. They do it because it feels incredibly good to rail against the world-renowned Wikipedia for once. And on the other side there are the Wikipedia admins, who of course likewise know that they are under fire, but have a broad palette of experience and tools at hand for handing out 6-hour or 24-hour blocks, and who likewise feel satisfaction in really letting rip. In short, two sides have sought and found each other here that fit together. The whole thing works like a cat-and-mouse game which has existed since forever, and will fascinate the participants for all eternity.

The text contributions on the vandalism page follow a very peculiar style of discourse; it reads almost like a poem:
– Little Timmy on troll tour
– blocked for 6 hours, reason given: nonsensical edits
– edit war with several participants
– no improvement discernible

It also has a socially relieving effect that a dialogue between the two sides is not desired and appears completely improbable. One gets a bit of the impression that two sides are facing each other in the trenches, and the only thing they do, extensively, is lay down continuous fire. This is not a goal; it is simply the observation. No matter whether one opens the vandalism page on Christmas Eve, on a Monday or again in 3 years, it is always the same picture. The only recognizable currency in this game is lulz, i.e. the malicious joy of having really given it to the other side once again. And both sides (hare and hedgehog) probably get their lulz as fun, otherwise they would not stay at it so persistently. Apparently it is, on the one hand, incredibly fun to vent one’s destructiveness on the venerable Wikipedia, and it is just as much fun to discipline users and IPs for it (preferably from on high).

I would only too gladly take sides personally with one of the two parties, but the spectacle is far too precious for that. Rather, it is the antagonism and the irreconcilable harshness that fascinates; it is comparable to the futures exchange in Chicago where the traders shout at each other. As in the novel by Frank Norris, it is not only about the matter itself; personal motives always play an important role too. Similar to a card game, it can become an addiction, where the interaction as such is the point, where trolling or counter-trolling becomes the purpose of one’s life. In Wikipedia speak, one talks of notorious repeat trolls, but the other side is also characterized by constancy and vehemence. Much like with a one-armed bandit, one can use the machine to escape everyday life, fleeing for a few hours into the parallel world of vandalism.