By Guido España
IntroductionThere are several difficulties for researchers who try to replicate findings from other researchers. For instance, the instructions could be incomplete, and the code and data might not be available. ELife’s first computational reproducible article aims to solve this issue by allowing readers to interact with code that generates every figure in the paper. This open-source project (Stencila) is a step towards reproducible research. However, there are other obstacles that authors need to address in their work habits to be able to produce research findings that can be reproduced by others, but more critically, by themselves. Five obstacles towards reproducible researchPublishing a manuscript is an iterative process with several steps that go from a preliminary idea, an initial draft, and to, hopefully, publication. This long process involves many changes in the manuscript, code, and figures. These changes are due to numerous revisions from collaborators and journal reviewers, or from the researchers themselves. Many things can get on the way of reproducibility. Personally, I have experienced problems with keeping results up to date because often there are misconnections or manual steps in the process of gathering and analyzing data, and creating figures or tables. It is particularly difficult to remember all the steps to create a figure several months after the first draft of the manuscript. In my opinion, there are five main obstacles in the path of reproducible computational research: 1. Keeping results up to date, 2. Remembering previous versions of the manuscript, 3. Collaborating with co-authors, 4. Responding to reviewers, 5. Sharing reproducible research with other researchers. File managementThe first step towards reproducibility is file management. It is very important to name files in such a way that they are easy to understand for humans and computers. Files also need to be easy to recall. For humans, this means that names should be meaningful, for computers, this means that files should not contain strange characters (spaces, commas, periods, or others like #$%^). Directory structure is important to keep research files organized. For instance, to differenciate the manuscript files from the scripts and data, one could have a main directory ( Make filesGNU make is a tool to compile files (often executable files) that depend on many other files. GNU make is often used For emacs enthusiasts, org-mode is a powerful alternative. See this tutorial for an introduction to manuscript writing with org-mode. PandocUsing LaTeX (or knitr, or org-mode) would be ideal for anyone collaborating with a team of LaTeX users. However, this isn’t always the case. In many fields, MS Word is the prevalent word processor. If you want to use the capabilities of git and LaTeX, but you still want to share word documents with others, pandoc is probably the best alternative to convert between .tex and .docx documents. A brief introduction for this purpose is Alexander Branham’s guide to pandoc. Using pandoc, LaTeX documents can be converted to MS Word to share with collaborators, then their changes can be incorporated into the main LaTeX document. GitHub/GitLabGitHub or GitLab can be used to create a remote repository and share it with collaborators. If the workflow to create the manuscript has been automated, then sharing with others should be straightforward with GitHub or GitLab, where other researchers could replicate the manuscript results and use them in their own research. To share with the scientific community, a manuscript published in a scientific journal could include the URL for the GitHub repository with all the necessary steps to create the manuscript. Other resourcesThis post is a brief summary of tools useful for reproducible computational research. A lot of the content on this post comes from Gandrud’s book on reproducible research. I also created some slides and some demos of reproducible manuscripts using the tools mentioned here: LaTeX, knitr, and org-mode. You can download the slides here. Furthermore, there are many more materials online that are helpful:
|