Multi-university collaboration developing platform to improve research reproducibility

Author: Brandi Klingerman

In a report by Nature, 52 percent of surveyed researchers said they believe there is a significant crisis in reproducing research. From engineering to biology, there is at least some concern about whether a given study’s results can be reproduced and therefore used in another study. To overcome this challenge, computational scientists from five research universities, including the University of Notre Dame, are developing cyberinfrastructure and supporting tools that allow researchers to conduct and track their work – including data and methodologies – in a reproducible way.

At the end of February, Notre Dame hosted about twenty computer scientists, social scientists, and software developers at McCourtney Hall to work on the multi-university collaboration, called Whole Tale. The meeting gave the principal investigators an opportunity to discuss the project’s latest developments, and gave software developers from each of the research institutions a chance to discuss the project’s larger programming challenges. One of these challenges is balancing the goal of the Whole Tale system – making research more easily reproducible – with keeping the platform user-friendly enough for a broad range of researchers.

“Although the vast majority of research published today is computationally based, a gap has developed in how the scientific community shares and therefore reproduces research,” said Jarek Nabrzyski, director of the Center for Research Computing (CRC) at the University of Notre Dame and co-investigator on the project. “Whole Tale is designed as a platform to fill this gap, by providing an outlet where researchers can carry out and disseminate their research.”

The goal of the National Science Foundation-funded project is to enable the creation of “living publications” that integrate and link data, computations, and scholarly articles. This way, once research is completed, another scientist or engineer could more effectively recreate the study using the same data sets, methods, and anything else that is necessary for reproducing it. To create these living publications, researchers will work in the Whole Tale platform, an online environment that allows users to register, store data, and run computations on that data. More importantly, scientists will also be able to upload their research as a “tale.”

Bertram Ludäscher, professor at the School of Information Sciences at the University of Illinois at Urbana-Champaign and lead principal investigator on the project, explained, “A tale captures the whole story of a user’s research. This means that when someone else goes to view that research tale, they will be able to see the source data used in analyses and models as well as the resulting data and variations all within the working environment the research was conducted in.”

Once a tale is created, it could potentially be linked from a corresponding published journal article. This would ensure that other researchers who want to build on or reproduce a study’s results have transparent access to essential information. Currently, scientists and researchers are unlikely to gain access to the data sets used in a study, let alone the methodologies and original working environment.

“Whole Tale could impact different facets of the research ecosystem, from what it means to be a good citizen scientist – responsibly tracking your work for the research community so that it can be reproduced – to how professional journals publish research, and universities set standards for faculty hiring,” said Victoria Stodden, associate professor of information sciences at the University of Illinois at Urbana-Champaign and co-investigator on the project.

The Whole Tale online platform is currently in alpha, meaning it is still being developed. Since the platform is intended to address real scientific cases, the project is currently allowing a number of science and cyberinfrastructure working groups to work in the platform. The dashboard will open to the wider research community at a later date. Until then, those interested can learn more about the project by visiting http://wholetale.org.

The Whole Tale team consists of researchers and programmers, who engage with the community through a number of working groups. Other co-investigators include Kyle Chard, fellow at the Computation Institute at the University of Chicago; Niall Gaffney, director of Data Intensive Computing at the Texas Advanced Computing Center at the University of Texas at Austin; Matthew Jones, director of Informatics Research and Development at the National Center for Ecological Analysis & Synthesis at the University of California, Santa Barbara; and Matthew Turk, assistant professor at the University of Illinois at Urbana-Champaign.

Additional University of Notre Dame contributors include Ian Taylor, distributed computing and data science research professor, and two software developers at the CRC: Sebastian Wyngaard and Adam Brinckman.

Contact

Brandi R. Klingerman / Research Communications Specialist

Notre Dame Research / University of Notre Dame

bklinger@nd.edu / 574.631.8183

research.nd.edu / @UNDResearch

About Notre Dame Research

The University of Notre Dame, located in South Bend, Indiana, is a private research and teaching university inspired by its Catholic mission. Its researchers are advancing human understanding through research, scholarship, education, and creative endeavor in order to be a repository for knowledge and a powerful means for doing good in the world. For more information, please see research.nd.edu or @UNDResearch.