A living document: reincarnating the research article
© Shanahan; licensee BioMed Central. 2015
Received: 22 January 2015
Accepted: 20 March 2015
Published: 11 April 2015
The limitations of the traditional research paper are well known and widely discussed; however, rather than seeking solutions to the problems created by this model of publication, it is time to do away with a print era anachronism and design a new model of publication, with modern technology embedded at its heart. Instead of the current system with multiple publications, across multiple journals, publication could move towards a single, evolving document that begins with trial registration and then extends to include the full protocol and results as they become available, underpinned by the raw clinical data and all code used to obtain the result. This model would lead to research being evaluated prospectively, based on its hypothesis and methodology as stated in the study protocol, and move away from considering serendipitous results to be synonymous with quality, while also presenting readers with the opportunity to reliably evaluate bias or selective reporting in the published literature.
When the Royal Society first advocated the transparent and open exchange of ideas backed by experimental evidence, the Society was widely ridiculed. At the time, the concept of openly sharing your work in a research article was highly controversial. It was not uncommon for new discoveries to be announced in papers coded in anagrams or cyphers, reserving priority for the discoverer while remaining largely indecipherable to anyone not already in on the secret. Both Newton and Leibniz used this device.
As you might imagine, this led to a number of disputes over priority, and it seems rather absurd to us today. However, since the advent of the research article over 300 years ago, academic publishing has been viewed as a way of minuting what was done and sharing the results.
Three hundred years is a long time; technology has seen huge advancements over the last 20 years alone. The Internet has seismically disrupted the way we both communicate and find data, displacing traditional information delivery and becoming an integral part of life for millions. The increased availability of information has led to calls for greater transparency in research - for a clear, detailed record of exactly what was done, and how, to allow the work to be reliably reproduced. Despite this, many journals perpetuate the view of research articles as ‘minutes’. Print era anachronisms persist through the continuation of page and word limits and the release of discrete issues, as if all articles remain subject to print-only production constraints. Indeed, it was only recently that certain top journals elected to remove the word limits on their methods sections. It brings to mind Fermat’s aside to his infamous last theorem, written in his copy of Arithmetica in 1637, claiming that the proof of what he stated was ‘too large to fit in the margin’.
Where is the value in the research article?
Research only has value if the methods used are appropriate and the work is reproducible. However, in modern biomedical research, the majority of published research claims may in fact be impossible to reproduce [6-8]. Many reported results are later refuted, and controversy is seen across the entire range of research designs, from randomised controlled trials (RCTs) to traditional epidemiological studies [9-11]. Even for studies following ‘gold standard’ reporting and open data policies, researchers face difficulties in replicating them.
One possible explanation for this, as hypothesized by Ioannidis et al., is that controversial data are attractive to investigators and editors, making contradictory results more likely to be published than confirmatory ones [7,13]. However, reviews of published trials consistently show that, even for those articles that are published, key information is frequently missing. There is also growing evidence that space pressures influence the way that researchers choose to write up their studies, with a bias in favour of selecting those outcomes and analyses that are statistically significant [15,16].
It is concerns like these that led to widespread calls for registering trials [17,18] and pre-specifying the research outcomes and methods. Similarly, reporting guidelines were created to outline the minimum information required for a full and complete report, and there is evidence that the adoption of reporting guidelines, such as the CONSORT Statement, has led to improved reporting. Journals like Trials also encourage prospective publication of study protocols, which had rarely been possible in paper-based journals [20,21]; publication of study protocols allows for more detailed discussion of methodological issues, which can be referenced when reporting the main trial results.
However, researchers need access to all of the relevant information to reliably evaluate bias or selective reporting in clinical trials. As any systematic reviewer can tell you, identifying all publications related to a single clinical trial can be a Sisyphean task. Indeed, there are initiatives in the works to assist with this effort, but regardless of their success, they simply serve to highlight the absurdity of having separate ‘protocol’ papers and ‘results’ papers. These are all solutions to a problem that we ourselves have created.
Indeed, in 1963, Peter Medawar asked whether the scientific paper itself was a fraud. He maintained that the research article was a ‘travesty […] which editors themselves often insist upon’, insisting that research articles give ‘a totally misleading narrative of the processes of thought that go into the making of scientific discoveries’. A paper’s fraud, Medawar argued, lay mainly in its form.
A ‘living’ document
It is time to ask ourselves whether the research article itself has now become an anachronism. In contrast to an article of the print era, an article that has been published online is not a sealed black box. It can be updated, amended, extended and indeed directly linked to other articles and data.
So why do we persist with this paradigm whereby each new ‘stage’ in the research cycle results in a separate publication? It is time for the research article to move beyond the now-obsolete print model and truly embrace the freedom that online publication gives us, moving towards living documents, with a single article for a single piece of research.
While the article would evolve over time, substantive additions to the article that were judged to impact the scientific validity of the literature would require peer review, as shown in Figure 1. In these cases, the article could be frozen into a discrete version, with the reviewer reports associated with it. This model is already used by journals that operate on a post-publication peer review process, such as F1000Research and ScienceOpen [27,28]. Citations to the document would then be required to include the access date, which would uniquely identify the version of the article referred to.
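The idea that an access date uniquely identifies a frozen version can be made concrete with a small sketch. The class and field names below (FrozenVersion, LivingDocument, version_cited) are hypothetical and chosen for illustration only; they do not describe how any journal implements versioning. The sketch assumes that the version cited is the latest one frozen on or before the access date.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class FrozenVersion:
    """A peer-reviewed snapshot of the living document (hypothetical model)."""
    frozen_on: date
    stage: str  # e.g. "registration", "protocol", "results"
    reviewer_reports: tuple[str, ...] = ()


@dataclass
class LivingDocument:
    """A single evolving article holding its frozen versions in date order."""
    title: str
    versions: list[FrozenVersion] = field(default_factory=list)

    def freeze(self, version: FrozenVersion) -> None:
        """Add a snapshot, keeping the versions sorted by freeze date."""
        self.versions.append(version)
        self.versions.sort(key=lambda v: v.frozen_on)

    def version_cited(self, accessed: date) -> FrozenVersion:
        """Return the latest version frozen on or before the access date."""
        freeze_dates = [v.frozen_on for v in self.versions]
        index = bisect_right(freeze_dates, accessed)
        if index == 0:
            raise LookupError("no version existed on the access date")
        return self.versions[index - 1]


# Illustrative dates only; they are not taken from any real trial.
doc = LivingDocument("Example trial")
doc.freeze(FrozenVersion(date(2015, 1, 22), "protocol"))
doc.freeze(FrozenVersion(date(2017, 6, 1), "results"))
print(doc.version_cited(date(2016, 3, 19)).stage)  # protocol
```

A citation accessed between the two freeze dates resolves to the protocol version, so a reader can always recover exactly what the citing author saw.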
Creating a living document that could be updated as required would allow researchers to capture information in real time, allowing for simpler concurrent research projects and facilitating reporting: the authors would only need to focus on a specific section at any one time, rather than attempt to identify and follow all the relevant reporting guidelines for the study, from over two hundred, when finally writing it up.
This concept of an evolving document is already demonstrated for systematic reviews by the Living Reviews series of open access journals, which allow the authors to regularly update their articles to incorporate the latest developments in the field; however, it has not been applied to primary research. Extending this concept to primary research could cause the article to become unwieldy under the traditional IMRAD headings, particularly for large clinical trials with an associated large number of analyses; however, this is already the case for traditional results papers. These concerns have led to journals requiring core statistical methods to be included in the figure captions of presented results, as well as innovative navigation tools to allow readers to view the research methods and analyses simultaneously, for example, eLife Lens.
Reproducibility also requires the ability to manipulate and re-analyse data; therefore, as stated by Claerbout, in addition to any summary results included to support the written interpretation, the document should link to the raw clinical data and all code used to obtain the result. An immense amount of work has gone into the creation of reproducible research platforms and the concept of ‘literate programming’. This has led to the development of a whole file format, Sweave, which allows the creation of dynamic reports with R code integrated into LaTeX documents, which can be updated automatically if data or analyses change. Similarly, Kauppinen et al. established and defined Linked Open Science, an approach to interconnect scientific assets to enable transparent, reproducible and transdisciplinary research.
The dramatic decrease in data storage costs and emergence of virtual environments, such as Arvados, make it possible to enable reproducibility of data analysis with versioned scripts and tools. Trialists can deposit the data, tools and scripts they used to analyse the data, allowing readers to see how robust the visualisations and statistics embedded in the paper are.
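One simple way to tie a frozen version of an article to the exact data and scripts behind it is a checksum manifest. The following is a minimal sketch using only the Python standard library; it is independent of any particular platform, and the function names (sha256_of, build_manifest, changed_files) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths: list[Path]) -> dict[str, str]:
    """Map each deposited file name to its checksum."""
    return {path.name: sha256_of(path) for path in sorted(paths)}


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files added, removed or modified between two manifests."""
    names = old.keys() | new.keys()
    return sorted(name for name in names if old.get(name) != new.get(name))
```

Storing one manifest per frozen version would let a reader confirm that the deposited data and scripts are the ones the published statistics were computed from, and see which files changed between versions.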
Underpinning the results and interpretations with the original data and analysis tools has obvious benefits for conducting meta-analyses and systematic reviews, as well as for reproducibility of research. Similarly, creation of an evolving document for a single research project would make evaluation of selective reporting of both analyses and outcomes straightforward, as all the necessary information and methods would be reported in the same place. However, there are limitations compared with the existing publication paradigm. As the article is able to continuously evolve, there is no permanent ‘version of record’; therefore, the articles would need ongoing curation, which could cause issues in the event of a journal closure. As stated by Barnes in the Science Code Manifesto, ‘Source code must remain available, linked to related materials, for the useful lifetime of the publication’. While a discrete version could be created in such instances, it would prevent further updating of the article, which could lead to the literature being incomplete.
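With the protocol and results held in one document, the check for selective reporting of outcomes mentioned above reduces to a comparison of two lists. The sketch below is illustrative only: the function name and the outcome labels are hypothetical, and a real comparison would also need to account for changes in outcome definitions, time points and analyses.

```python
def compare_outcomes(prespecified: set[str], reported: set[str]) -> dict[str, list[str]]:
    """Classify outcomes by whether they appear in the protocol, the results, or both."""
    return {
        "reported_as_planned": sorted(prespecified & reported),
        "omitted": sorted(prespecified - reported),
        "added_post_hoc": sorted(reported - prespecified),
    }


# Hypothetical outcome labels, for illustration only.
protocol_outcomes = {"mortality at 6 months", "quality of life", "adverse events"}
results_outcomes = {"mortality at 6 months", "adverse events", "length of stay"}
print(compare_outcomes(protocol_outcomes, results_outcomes))
```

Any outcome listed as omitted or added post hoc would be a prompt for the reader to look for an explanation in the document, not proof of bias in itself.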
Furthermore, encouraging and facilitating reproduction raises the issue of how to combine original research articles with follow-up replications or analyses by a different group of authors. Including these follow-up studies in the original living document could cause issues with accreditation; however, it could also help to emphasise that reproduction is a fundamental part of research, leading to large research consortia, as currently seen in physics and genetics. An alternative would be to adapt the existing ‘update’ article types, creating a separate citation that is accessed in tandem with the original article.
A continuously evolving document would also undermine existing methods of evaluating the impact of a piece of work, particularly metrics like the Impact Factor or any article- or journal-level metric that relies on the date of publication. As study protocols are seldom cited, a living document is unlikely to be cited regularly until the article has been expanded to include the results and interpretation; this means that citations to the article could come a number of years after original publication and, therefore, would not be included in the Impact Factor calculations. Nevertheless, this could also prove an advantage, as implementation of living documents, as described above, would require a journal to commit to publishing the results of a piece of research based on the methodological quality of the protocol, regardless of outcome, significance of findings or perceived level of interest. This could help to move away from a focus on results towards consideration of the question asked and the processes used when evaluating scientific validity.
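The effect on the Impact Factor can be shown with a short calculation. The sketch assumes the standard two-year Impact Factor, in which citations made in a given year count only if the cited item was published in one of the two preceding years, and it assumes that a living document keeps its original publication date. The function names and the years used are illustrative.

```python
def counts_towards_impact_factor(published: int, cited: int) -> bool:
    """True if a citation made in year `cited` falls inside the two-year window."""
    return 1 <= cited - published <= 2


def counted_citations(published: int, citation_years: list[int]) -> list[int]:
    """Keep only the citation years that a two-year Impact Factor would count."""
    return [year for year in citation_years
            if counts_towards_impact_factor(published, year)]


# Hypothetical living document: protocol published in 2015, results added
# later, so most citations arrive well after the original publication date.
protocol_published = 2015
citation_years = [2016, 2019, 2020, 2021]
print(counted_citations(protocol_published, citation_years))  # [2016]
```

In this example only the single early citation is counted; the citations that follow the addition of the results fall outside the window.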
Current technology means that this form of publication is theoretically possible already. However, contemporary cultural attitudes and workflows, within both publishing and academia, along with research conduct and evaluation, present barriers to its implementation. Evaluating research prospectively, based on its hypothesis and methodology as stated in the study protocol, and then continuously updating the article as results and data become available, moves us past considering serendipitous results as being synonymous with quality, while also giving us the opportunity to reliably evaluate bias or selective reporting in the published literature.
The current incarnation of the research article has persisted for over 300 years; however, evolving technology makes it not simply anachronistic but effectively fraudulent. While cultural attitudes and establishments remain a large hurdle, within both the publishing and academic communities, the ongoing drive towards transparency and reproducibility makes it no longer acceptable to perpetuate a centuries-old absurdity.
Abbreviations
CONSORT: Consolidated Standards of Reporting Trials
IMRAD: Introduction, Methods, Results and Discussion
References
- Nielsen M. Reinventing Discovery: The New Era of Networked Science. Princeton: Princeton University Press; 2011.
- Velterop JJM. Keeping the minutes of science. In: Electronic library and visual information research (ELVIRA 2). In: Collier M, Arnold K, editors. Proceedings of the second ELVIRA conference at De Montfort University. Milton Keynes: Aslib; 1995.
- Joining the reproducibility initiative. Nat Nanotechnol. 2014; 9:949.
- Singh S. Fermat’s Last Theorem. London: Fourth Estate Ltd.; 2002.
- Lang T, Altman D. Basic statistical reporting for articles published in clinical medical journals: the Statistical Analyses and Methods in the Published Literature, or SAMPL Guidelines. In: Science Editors’ Handbook. Edited by Smart P, Maisonneuve H, Polderman A: European Association of Science Editors; 2013. 175–179.
- Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov. 2011;10:712.
- Mobley A, Linder SK, Braeuer R, Ellis LM, Zwelling L. A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PLoS One. 2013;8:e63221.
- Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012;483:531–3.
- Ioannidis JP, Haidich AB, Lau J. Any casualties in the clash of randomised and observational evidence? BMJ. 2001;322:879–80.
- Lawlor DA, Davey Smith G, Kundu D, Bruckdorfer KR, Ebrahim S. Those confounded vitamins: what can we learn from the differences between observational versus randomised trial evidence? Lancet. 2004;363:1724–7.
- Vandenbroucke JP. When are observational studies as credible as randomised trials? Lancet. 2004;363:1728–31.
- Morrison SJ. Reproducibility project: cancer biology: time to do something about reproducibility. eLife. 2014;3:e03981.
- Ioannidis JPA. Early extreme contradictory estimates may appear in published research: the Proteus phenomenon in molecular genetics research and randomized trials. J Clin Epidemiol. 2008;58:543–8.
- Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340:c869.
- Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA. 2004;291(20):2457–65.
- Dwan K, Altman DG, Clarke M, Gamble C, Higgins JP, Sterne JA, et al. Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials. PLoS Med. 2014;11:e1001666.
- Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol. 1986;4:1529–41.
- International Committee of Medical Journal Editors (ICMJE). Clinical Trial Registration. http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-trial-registration.html. Accessed 19 March 2015.
- Turner L, Shamseer L, Altman DG, Schulz KF, Moher D. Does use of the CONSORT Statement impact the completeness of reporting of randomised controlled trials published in medical journals? A Cochrane review. Syst Rev. 2012;1:60.
- Altman DG, Furberg CD, Grimshaw JM, Rothwell PM. Lead editorial: Trials - using the opportunities of electronic publishing to improve the reporting of randomised trials. Trials. 2006;7:6.
- Chalmers I, Altman DG. How can medical journals help prevent poor medical research? Some opportunities presented by electronic publishing. Lancet. 1999;353:490–3.
- Chan AW, Tetzlaff JM, Gøtzsche PC, Altman DG, Mann H, Berlin J, et al. SPIRIT 2013 explanation and elaboration: guidance for protocols of clinical trials. BMJ. 2013;346:e7586.
- Altman DG, Furberg CD, Grimshaw JM, Shanahan DR. Linked publications from a single trial: a thread of evidence. Trials. 2014;15:369.
- Horton R. The scientific paper: fraudulent or formative? Abstract presented at Third International Congress on Biomedical Peer Review and Global Communications, Prague, September 1997. www.peerreviewcongress.org/abstracts_1997.html#thfo. Accessed 19 March 2015.
- Cochrane J. The Third International Stroke Trial (IST-3) - an exemplary threaded publication? http://blogs.biomedcentral.com/bmcblog/2012/05/25/the-third-international-stroke-trial-ist-3-an-exemplary-threaded-publication/ Accessed 19 March 2015.
- WHO Data Set http://www.who.int/ictrp/network/trds/en/. Accessed 19 March 2015.
- Amsen E. What is post-publication peer review? http://blog.f1000research.com/2014/07/08/what-is-post-publication-peer-review/. Accessed 19 March 2015.
- Grossman A. Science Open: the next wave of Open Access? http://www.euroscientist.com/scienceopen-next-wave-open-access/. Accessed 19 March 2015.
- EQUATOR Network http://www.equator-network.org/. Accessed 19 March 2015.
- Wheary J, Wild L, Schutz B, Weyher C. Living reviews in relativity: thinking and developing electronically. J Electronic Publishing. 1998; 4(2).
- elife Science introduces eLife Lens. http://elifesciences.org/elife-news/elife-sciences-introduces-elife-lens-a-cutting-edge-open-source-tool-for-reading-using-content-online. Accessed 19 March 2015.
- Fomel S, Claerbout J. Guest editors’ introduction: reproducible research. Comput Sci Engineering. 2009;11:5–7.
- Leisch F. Sweave, Part I: Mixing R and LaTeX: a short introduction to the Sweave file format and corresponding R functions. R News. 2002;2:28–31.
- Kauppinen T, Espindola GMD. Linked open science-communicating, sharing and evaluating data methods and results for executable papers. Procedia Comput Sci. 2011;4:726.
- A history of storage costs (update) http://www.mkomo.com/cost-per-gigabyte-update/. Accessed 19 March 2015.
- Nature encode http://www.nature.com/encode/#/threads/. Accessed 19 March 2015.
- Science Code Manifesto homepage. http://sciencecodemanifesto.org/. Accessed 19 March 2015.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.