by Yvonne Tunnat
The study “Open is not forever: a study of vanished open access journals” (PDF) by Mikael Laakso, Lisa Matthias and Najko Jahn sounds alarming at first, because it touches a sore point in the Open Science community: Open Access (OA) journals sometimes simply disappear from the internet. Specifically, it concerns 176 OA journals from a wide range of disciplines that disappeared from the internet between 2000 and 2019. In this article, Yvonne Tunnat examines whether we need to worry, which journals and countries are affected, and how institutions like the ZBW are working to prevent such disappearances.
Vanished Open Access journals: Really a problem?
For both authors and readers, the sustainability of sources on the internet is indispensable. After all, we want to be able to rely on the fact that something we read on the internet today, and possibly cite or build on in our own work, will still be there tomorrow. Authors, of course, also want to publish their papers in journals that will preserve them for a long time.
But before we look at the vanished journals in detail, it makes sense to first look at the definition of “vanished”. To be considered “vanished” in the sense of the study by Laakso, Matthias and Jahn, an OA journal must fulfil two conditions: (1) at least one issue of the OA journal must have been published and (2) the journal has disappeared from the internet, with less than 50% of its articles still available through other channels such as libraries.
According to the Science article “Dozens of scientific journals have vanished from the internet, and no one preserved them” by Jeffrey Brainard, the examples from the study are generally not journals published by well-known publishers or repositories. None of the institutions whose OA journals have disappeared from the internet has played a major role in research. Some of the “vanished” OA journals were even so-called predatory journals. These would not be re-indexed by the Directory of Open Access Journals (DOAJ) and would therefore be counted as disappeared. A number of other articles have taken up the study and drawn attention to it in abridged form. But how big is the impact on the daily work of scientists really? A look at the figures helps to answer this question:
The tables show that most of the OA journals, 112 out of 176, disappeared between 2010 and 2014. The majority (110 journals) originated from North America and Europe, while only seven journals vanished in the Middle East and Africa.
These figures are particularly meaningful in view of the total number of existing Open Access journals. 14,068 journals were listed in the Directory of Open Access Journals in 2019. The 176 journals thus represent only 1.25% of the total available OA journals. Overall then, the phenomenon of vanishing OA journals is clearly the exception.
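The proportions quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the constant and function names are chosen here for illustration, and the numbers are the ones given in this article. Note that the denominator is a 2019 snapshot of the DOAJ, while the 176 journals vanished over the whole period from 2000 to 2019, so the first percentage is a rough indicator rather than an exact rate.

```python
# Figures as quoted in this article (study by Laakso, Matthias and Jahn; DOAJ 2019).
VANISHED_TOTAL = 176        # OA journals that vanished between 2000 and 2019
VANISHED_2010_2014 = 112    # of these, vanished between 2010 and 2014
VANISHED_NA_EUROPE = 110    # of these, originating from North America and Europe
DOAJ_LISTED_2019 = 14_068   # OA journals listed in the DOAJ in 2019


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(share(VANISHED_TOTAL, DOAJ_LISTED_2019))    # 1.25
print(share(VANISHED_2010_2014, VANISHED_TOTAL))  # 63.64
print(share(VANISHED_NA_EUROPE, VANISHED_TOTAL))  # 62.5
```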
Digital long-term archiving often not secured
However, the study by Laakso, Matthias and Jahn points to another problem: Only 4,057, or almost 29%, of the 14,068 OA journals have secure digital long-term archiving. For the remaining journals, this means that although their content is currently hosted on servers and possibly stored redundantly in the background, there is no control over whether the file formats are still up to date and readable. Since most articles are in unproblematic file formats such as PDF, the more important issue is that there is no second place where the content is stored if the journals disappear. Such second locations could be libraries or long-term archiving solutions such as LOCKSS, CLOCKSS (= Controlled LOCKSS) and Portico, all of which archive both Open Access and closed access journal titles.
Of course, this does not necessarily mean that content that is not backed up in a second location is in immediate danger of disappearing from the internet tomorrow. Moreover, the risk of disappearance concerns not only OA journals but also access-restricted content; the latter is simply outside the scope of the study.
ZBW example: Long-term archiving of economic literature
An essential element in preventing the disappearance of relevant Open Access journals is comprehensive digital long-term archiving. I will show how this can work using the example of my work area in the ZBW. Its collection mandate for economics literature is explicitly supra-regional, which means it covers many languages and countries. However, the ZBW cannot collect everything, so a focus is placed on the most important literature for our users. For economics, we can assume that those journals that have actually disappeared from the internet within the last 20 years stayed below our radar because either their language or their specific subject matter was too exotic and therefore not sufficiently relevant for our users.
The ZBW clearly formulates the goal that what was once found in the ZBW holdings will also be found there in the future. We guarantee the backup under the persistent address and ensure the integrity and readability of the content. What we have taken responsibility for, we back up with a claim to completeness for an indefinite period. The most important Open Access source of the ZBW is EconStor. As a subject-based repository for economic sciences, it is now used by more than 600 scientific institutions and over 1,000 individual authors to distribute their publications in Open Access. In addition, over 100 journals use EconStor to secure their content. The Digital Archive at the ZBW is responsible for literature that is not author-based (such as statistics or company reports) or that can only be made available with restricted access. Both EconStor and the Digital Archive are long-term archived by the ZBW. This means that the content cannot simply disappear from the internet.
The ZBW also offers its users access to restricted journals. However, since it does not have hosting rights for these, it cannot guarantee digital long-term archiving here. In 2017, we examined to what extent long-term archiving is already assured for these journals and came to the conclusion that sustainable availability is already ensured for the majority via CLOCKSS, Global LOCKSS, Portico, the Library of Congress or the Scholars Portal.
How digital long-term archiving works in practice
For digital long-term archiving, we transfer the materials hosted by the ZBW into our long-term archiving system. This is built on the Rosetta software solution from Ex Libris. Rosetta offers mechanisms that regularly check content and file formats for integrity and technological relevance. As a rule, the content is saved in PDF format. We have, therefore, already established extensive automated quality controls for this format. PDF is a sustainable format, provided it meets certain quality requirements. If a file does not meet these requirements, we rework it accordingly. We also closely monitor content in other file formats and process it if necessary.
Another important field of work is metadata. We record and maintain not only content metadata, such as information on the title, authors and year of publication, but also technical metadata (for instance on the format and display software), which we determine and update if necessary. Published and widely used file formats are archived in preference. The ZBW maintains a comprehensive policy for long-term archiving and also a file format policy.
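To make the idea of a regular integrity check more concrete, here is a minimal sketch in Python. It is not how Rosetta implements its checks. The function names `sha256_of` and `verify_fixity`, the choice of SHA-256 and the sample file are all assumptions made for illustration. The principle is simply that a checksum recorded when a file enters the archive is recomputed later and compared.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded_digest: str) -> bool:
    """Return True if the file still matches the digest recorded at ingest."""
    return sha256_of(path) == recorded_digest.lower()


if __name__ == "__main__":
    # Hypothetical demo with a throwaway file instead of real archive content.
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "article.pdf"
        sample.write_bytes(b"%PDF-1.4 example content")
        recorded = sha256_of(sample)            # would be stored at ingest
        print(verify_fixity(sample, recorded))  # True: file unchanged
        sample.write_bytes(b"%PDF-1.4 altered content")
        print(verify_fixity(sample, recorded))  # False: content has changed
```

A check like this only detects that the bits have changed. Whether the file format itself is still readable is a separate question that requires format identification and validation.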
Tips for long-term archiving
Finally, a few practical recommendations for newcomers to long-term archiving: From our perspective, we recommend the use of Open Source tools that are used and maintained by the community. This includes, for example, software for file format recognition, which can automatically determine the exact format of the archived content – a basic requirement for ensuring the long-term usability of the file.
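The basic principle behind file format recognition can be sketched in a few lines of Python: many formats begin with a characteristic byte sequence, and a tool compares the start of a file against a list of such signatures. The following is a deliberately simplified illustration, not the method of any particular tool. The names `SIGNATURES`, `identify_bytes` and `identify_file` are hypothetical, and the signature list is a small hand-picked sample. Community-maintained tools work with far larger signature collections and can also distinguish versions of a format, which this sketch cannot.

```python
from pathlib import Path
from typing import Optional

# A small, hand-picked sample of leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"PK\x03\x04", "ZIP container (e.g. EPUB, DOCX, ODT)"),
]


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None  # unknown: needs a closer look with a full identification tool


def identify_file(path: Path) -> Optional[str]:
    """Read the first bytes of a file and try to identify its format."""
    with path.open("rb") as handle:
        return identify_bytes(handle.read(16))


print(identify_bytes(b"%PDF-1.7\n"))       # PDF
print(identify_bytes(b"plain text file"))  # None
```

Note that the file extension plays no role here. An extension can be missing or wrong, which is why identification based on the content itself is the more reliable starting point.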
Sometimes it is necessary to convert the content to a new, more modern and secure file format. Some of the migration software for this is also Open Source and maintained by the community. An important point of contact for the maintenance and provision of such tools is the Open Preservation Foundation. It is also important to share experiences in long-term archiving with the community. The most important contact in the German-speaking countries is the nestor network. The ZBW is currently active in a total of three nestor working groups. The results of their work are, in principle, open to everyone. nestor also organises workshops and conferences, which are likewise open to everyone free of charge. Here knowledge is shared and best practices are jointly developed.
nestor is connected to international networks such as the Digital Preservation Coalition. Since their work results and jointly maintained tools are open to all, getting started with digital long-term archiving is also made easier. Newcomers can draw on the experience of others and thus avoid mistakes and reach their goals faster.
The national and international community works together in many places and supports newcomers. The ZBW has been an integral part of this community for ten years. The more knowledge and better tools are publicly available, the easier it is for smaller institutions and publishers to tackle digital preservation. This will hopefully lead to a situation where in the next study, the number of vanished OA journals will be much smaller and less significant than in the study by Laakso, Matthias and Jahn.
This text has been translated from German.
Yvonne Tunnat is a library scientist and responsible for digital long-term archiving at the ZBW – Leibniz Information Centre for Economics. Within the community, she has strong national and international links, mainly in her field of work of file format recognition and file validation.
Source of the images:
Diagrams: “Open is not forever: a study of vanished open access journals” (PDF) by Mikael Laakso, Lisa Matthias and Najko Jahn (own presentation).