How to choose a computer journal?

Since the 1980s, many magazines for computer users have been published. The famous Byte magazine is an early example, but Dr. Dobb's Journal, IEEE Spectrum and "Happy Computer" also reached a large audience. But which of these journals is the best? The answer is that this can only be answered for the past. The idea that a reader identifies with one of these magazines was typical for the 1980s and early 1990s, before the advent of the internet. It was the result of an economic situation. A user carefully selected one of the journals, for example Byte magazine, became a member of that journal's club of subscribers, and every month the new issue arrived in his mailbox via snail mail. If the user was interested in professional applications of computing, he wouldn't select "Happy Computer" but the "Communications of the ACM", a journal with a serious background, meaning its articles were written for a professional, expert audience and not for beginners. The economic principle was the same: the user subscribed to the journal, paid the annual fee and received each issue by mail.

The internet has changed the relationship between reader and journal. Today it is no longer possible to decide for journal A and argue against magazine B; instead, the user has to read them all. This is possible with a fulltext search of the content. That means the user is no longer forced to read the current issue from a certain publisher; the content is selected ad hoc. The day does not start with the news from Byte magazine, but with entering a keyword into the Google search engine and asking the entire internet what kind of information is available. If the result page has information for beginners, that is fine. If the results are focused on professional programmers, that works just as well. The new situation is that the reader is no longer part of one club, which used to mean not being a member of the thousands of other clubs. Instead the reader comes with a concrete question, and all the journals and article writers have to deliver their content, not directly to the reader, because this would overwhelm him, but to an intermediary called the Google search engine.

The reading culture of the 1980s and 90s was dominated by a direct relationship between a reader and his publisher. A reader was a fan of a journal; he had a relationship with its content and didn't like magazines with a different style. This direct relationship is gone in the internet age. The distance between readers and journals is bigger than ever. Instead, the newly founded fulltext search engine defines the relationship: the computer journal has a relationship with Google, because Google provides it with new readers, and the end user has a relationship with Google, because Google provides him with information.

How exactly is the relationship between a user, Google and a magazine defined? Usually, the user enters a keyword into a search box. This acts as a filter. For example, the user wants to know something about the Commodore 64, but not about the Amiga 500. What has changed since the lovely 1980s is that today's users have learned how to use the search box properly. They are able to formulate a search request precisely. They would perhaps enter "Commodore 64 geos" to get some information about the graphical operating system for that 8-bit home computer. What the ordinary user is not interested in is getting information from a certain source. Instead, Google decides which magazine article is ranked in place 1, 2 and so on. The advantage is that even if the user doesn't know the magazine's name, he will find the information. This is sometimes called anonymization, because a long-established journal with a fancy name no longer has an advantage over an online forum or a blog. What Google does is ignore all the reputation somebody has collected and search the raw text files for the words. If somebody is interested in information about a certain microprocessor and knows only the exact type number, Google is able to find all the information about it. Such a service wasn't provided by the magazines of the 1980s; they were focused on a subject. That means IEEE Spectrum carried only high-quality content, but no information about computer games. If a user had such interests, the IEEE magazine wasn't helpful: it couldn't and didn't want to fulfill the user's need.

The reason why Google is loved by the end user is that Google works always and for everybody. No matter whether somebody is interested in beginner topics, in advanced articles or in a certain subject, Google provides the information. Not because Google is an exciting new content producer, but because Google has indexed all the information on the internet.

Byte magazine and IEEE Spectrum can't be called obsolete, because their content is read by many thousands of users every day. What has changed is the relationship between the journals and the reader. The formerly direct relationship is no longer available; it was killed by the internet. That means the journals have lost their readers and instead have gained Google as a kind of anti-reader, one that is not really interested in their content but crawls all the content available. Getting Google as the reader of a magazine or a news website is the nightmare of every author. It's not possible to argue with a search engine or explain something to an index bot.


Some projects which have revolutionized academic publishing

If we search for how academic publishing works, most of the available information is from the year 2000 or before and contains outdated best-practice methods. In some recent talks about library modernization the debate is grouped around the magic word "digitization", but this doesn't describe very well what state-of-the-art technology is. That is the reason why a short overview is necessary of the technology that is available and has already changed the workflow in academia.

The first one is Google Scholar. This search engine has been mostly ignored: no books have been published about it yet, and in the public debate the engine is invisible. That means it is the big elephant in the room, which is used by 100% of researchers while nobody talks about it or is courageous enough to describe its advances. What Google Scholar is, is very simple: instead of searching the metadata, as with Worldcat.org, it is possible to search the fulltext of all existing papers. Such technology was not available before the year 2008, and especially not for free.

A second major breakthrough is the founding of Academia.edu. The advantage is that the platform is free, open to everybody and allows uploading PDF documents. All three features combined result in a very powerful distribution platform which can bypass existing publishers and existing libraries. As in the Google Scholar case, nobody talks about it, but most are aware of it. What we see in reality is a kind of professional ignorance. That means, if we ask 100 scholars, nobody will say that he has heard of Academia.edu, but what he really wants to express is that the website doesn't fit the stories told about academic publishing.

The third important milestone in academic publishing is the invention of the LaTeX document system. It was invented a long time ago, in the era of the UNIX operating system. LaTeX is more powerful than MS Word and Adobe InDesign combined: it allows creating an academic document, complete with a bibliography, in PDF format.
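To make this concrete, here is a minimal sketch of such a document. The author, the title and the single reference are invented for the illustration; the point is that one plain text file is enough to produce a finished PDF with a formatted bibliography.

    % refs.bib is written out by the filecontents environment,
    % so this single .tex file carries its own bibliography data.
    \begin{filecontents}{refs.bib}
    @article{doe2018,
      author  = {Jane Doe},
      title   = {A Hypothetical Paper on Self-Publishing},
      journal = {Journal of Examples},
      year    = {2018}
    }
    \end{filecontents}
    \documentclass{article}
    \title{My First Self-Published Paper}
    \author{John Doe}
    \begin{document}
    \maketitle
    % \cite inserts a reference that BibTeX resolves against refs.bib
    As argued by \cite{doe2018}, the entire publishing
    workflow can be done at home.
    \bibliographystyle{plain}
    \bibliography{refs}
    \end{document}

Running pdflatex on this file, then bibtex, then pdflatex twice more resolves the citation and produces the PDF; no external layout company is involved at any step.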

What will happen if we combine all three inventions? Nobody knows exactly, but it will change everything. A combination of Google Scholar, Academia.edu and LaTeX is able to replace the existing publishing infrastructure, will reduce the costs and will make science open to the world. Let us describe an example workflow. The easiest interaction mode with the scientific community is to passively read existing information. This is possible with the Google Scholar website. It allows finding high-quality documents at no further cost. Google Scholar works outside the university library; a standard internet connection is enough. In some countries access is blocked by the government, but this is also true for Wikipedia. Suppose the user has read many documents; then he can write his own paper. With LaTeX this is very easy. He doesn't need an external company to do the layout or the proofreading; everything can be done alone within the LaTeX software, and as output the user gets a PDF file, as in the sketch above. And now comes the magic step. Thanks to Academia.edu it's possible to upload the self-created document to the internet, so that everybody can read it. That means the entire workflow of academic publishing can be done outside the traditional university system and without any costs. The software is available for free on the internet, and the mentioned websites are free to use.

Right now, the number of people worldwide who are doing so is small. Most home users are not aware of what Google Scholar is, or what the advantage of the BibTeX format is. But the technology is there and it works; it is only a question of how long it takes until millions of people recognize the advantage for themselves.
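For readers who have never seen it, the BibTeX format mentioned here is nothing exotic: every reference is a short plain-text record in a .bib file. The entry below describes a real, well-known book and could be cited from any LaTeX document exactly like the invented reference in the earlier sketch.

    % A single BibTeX record: entry type, citation key, then field = value pairs.
    @book{knuth1984,
      author    = {Donald E. Knuth},
      title     = {The {\TeX}book},
      publisher = {Addison-Wesley},
      year      = {1984}
    }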

Effects

Why should somebody care about Google Scholar, Academia.edu and LaTeX? Because the combination of all three will make classical academic institutions obsolete. If the author has his own profile on the internet at Academia.edu, he no longer needs a publisher. And if all documents are digital only, no library is needed to archive the existing information. And if the library is no longer needed, there is no need to go to the university to get the latest scientific research. That means everything that is known about how academia worked in the past will become outdated. Universities, publishers, libraries and authors are coming under pressure. That means they have to reinvent themselves, and they will fail. That is the main reason why nobody is talking about the revolution: everybody is aware of the danger and tries to play the existing game as long as possible. But the major pressure is not against the institutions; it has to do with money. The new all-digital publication system will become cheaper than ever before. A single paper will not be created entirely without costs, but it will be a fraction of the costs of the past.

Open Access Gold is preventing progress

Open Access advocates promote the model as the future publishing system. Papers under that model are publicly accessible and the financing is secured by libraries. At first glance, it is a promising model of how education works. But the truth is that Open Access slows down progress and is not the right way to go.

Let us focus on some preconditions under which Open Access Gold takes place. The first constraint is that today's publishing companies remain untouched. That means the old major players like Wiley, Springer and Elsevier will become the dominant stakeholders in the Open Access Gold world too. And the second untouched assumption is that the old-school, government-funded libraries will remain the same. That means the taxpayer has to pay the invoice.

Instead of promoting Open Access Gold, the better idea would be to realize a competitive market in which Elsevier and Springer come under pressure and are replaced by newly founded publishing companies, for example PLOS, while at the same time the government-sponsored libraries are replaced by privately financed houses which have customers and are more open to new technology.

Open Access Gold is nothing but the admission that nothing has to change in the publishing system. That means Open Access is the combined effort of outdated publishers and outdated libraries to defend their weak position against technological progress. Open Access Gold will fail; it will be replaced by academic publishing with lower costs than today.

The third assumption of the Open Access Gold model is that only in that mode will it become possible to make papers accessible for free to the world. That means, only if outdated publishers and obsolete libraries are in charge is it possible to provide free information for all. This assumption is wrong, because it describes not a free market but a monopolized situation financed by the government. The taxpayer finances the libraries, and the libraries finance Elsevier; that means Elsevier is financed by the taxpayer. The better approach is to cut down government-sponsored publication and let the market decide. The market is able to provide higher quality at lower costs.

Anti-market advocates usually argue that a market-driven academic publishing system is equal to paywall-protected content, which means the public loses access to important information, especially in subjects like medicine. The opposite is true: there are many examples out there in which a for-profit attitude goes together with free access for everybody. The best example is the Apple iTunes store, which hosts thousands of podcasts and so-called iTunes U lectures from privately owned universities, streamed for free to the world. The financing of such free content can be done with hardware sales (as Apple does) or with advertisement. Another example of a "free to everybody" but commercially oriented website is GitHub, which provides a large amount of free content but is not paid for with taxpayers' money.

Publishing a paper the right way

Most people are aware of the SCIgen document generator, https://pdos.csail.mit.edu/archive/scigen/, which is able to generate a paper with random text. What is not known is how to publish such a paper in a journal. The usual assumption is that the context-free grammar in SCIgen has a low quality and that the output is below the standard of an academic journal. But not so fast: who has defined that the content of a paper is important?

My hypothesis is that something else is more important: the author field. In the SCIgen GUI it is possible to enter up to 5 authors. The assumption is that academia is group work, and if the name fields are filled with values, the paper counts as better.

Let us make an experiment. In author field 1, my own name has to be entered, because I'm the group leader. But I need 4 more names. If all the names are filled in, the group is able to publish a paper, and this paper is much better than a paper from a single researcher. Sure, the context-free grammar from SCIgen remains the same, but now the paper is the result of group work, and this is what academic publishers want.

A detailed look into the paper records

In a former blog post, https://trollheaven.wordpress.com/2018/09/17/paper-statistics-for-artificial-intelligence/ I briefly introduced the paper statistics from https://www.scimagojr.com/countryrank.php The first chart was helpful for understanding the general idea, but now I want to go into the details. The statistics go back to the year 1996, and of all the subjects I will focus only on "Artificial Intelligence".

First of all, the worldwide paper production is interesting. From under 10000 papers about Artificial Intelligence in the mid 1990s, the number grew to around 60000 in 2017.

What is also interesting is the share of the individual countries. In the mid 1990s, academic papers were written mainly in the United States and Japan. Japan was strong because of the so-called Fifth Generation Computer Systems project, a state-sponsored effort in Japan to bring robotics and Artificial Intelligence forward. China was nearly invisible in that period. In the year 2005, China produced more papers on Artificial Intelligence than Japan, and in 2008 it even surpassed the U.S. Since then, China and the U.S. have produced nearly the same number of papers, with smaller changes at the top; each country produces around 10000 papers per year.

What is not visible from the statistics is the language used. Presumably the U.S. researchers publish in English. What the preferred language of Chinese publishers is remains unclear. On Google Scholar it is rare to find a paper from a Chinese university; perhaps they were written in Chinese, which makes it harder for an international audience to find such papers. According to other statistics, most academic research in China is written in Mandarin for a Chinese-only audience, but that is only a guess. In some papers, Chinese researchers appear in the author list and the paper is written in English.

And the same language barrier occurs with Japanese papers. That means, even though Japan researches a lot, most papers are not visible to the world, because international readers can only read English. If only English-language papers are counted in the statistics, the situation looks a bit different: the U.S. has the top position, and then comes a huge gap to the other countries.

Why is academic publishing so expensive?

Last week, a new documentary was released, called "Paywall: The Business of Scholarship (2018)". It is available as Open Access at https://paywallthemovie.com/paywall In the movie an important question is asked: why is publishing so expensive, and why do the publishers earn so much money? Answering the question is easy: because most links in the chain of the workflow are organized not as market-driven companies but as government-owned organizations. The first example are the public libraries in the United States: 100% of them have the same legal status as a court, but it would make more sense to see a library as something similar to a McDonald's. The second example are the schools and public universities. They are also not market-driven but are in the hands of the government, without the need to become profitable. Such an environment is not interested in reducing costs, and that is the reason why publicly funded libraries pay so much money for Elsevier journals.

Overcoming the problem is simple. It is called library privatization and means transforming public libraries and public universities into for-profit corporations which compete on a free market and need to reduce their costs. This would improve efficiency and would allow newly founded Open Science companies like Academia.edu or Figshare to work profitably.

Predatory publishing is the answer to a problem

Sometimes predatory publishing is described as a flaw in a wonderfully working scientific ecosystem with lots of high-quality journals. But it is the other way around: predatory publishing is not something which can be prevented or avoided. It is the future. A general definition of predatory publishing is easy to give:

– it consists of all-electronic publishing, which means the authors upload their papers to a website

– predatory publishing is commercially oriented, which means there is money in the game

– the papers can be downloaded as Open Access, free for the reader, which means anybody can read a paper and it is possible to search the fulltext

It sounds a bit incomprehensible to call these ingredients problematic. If an all-digital publishing workflow is not the goal, what else is? Should future authors print out their manuscripts or write them down by hand? And if publishing has nothing to do with economic aspects, how should science be managed instead? Right: there is no alternative to predatory publishers, and the critics who argue against predatory publishing know it. They are not really against predatory publishers; they see the debate as a kind of hoax. That means calling a predatory publisher unscientific is a kind of inside joke. It is equal to trolling the debate.

But I don't want to judge too harshly. Even a hoax debate about publishing in academia is progress. In the past, there was no such discussion. That means the scientific community discussed everything but left out the most important part: writing a paper, sending a paper to a journal and retrieving existing papers. Journals were available, that is true, but there was nobody out there who could give advice about their inner workings. With the new topic of predatory publishing the situation has changed. It is the first time that some explanation is given of what a paper is and how publishing works.