FAIR biodiversity data in Pensoft journals thanks to a routine data auditing workflow

Data audit workflow provided for data papers submitted to Pensoft journals.

To avoid publication of openly accessible, yet unusable datasets, fated to result in irreproducible and inoperable biological diversity research at some point down the road, Pensoft takes care for auditing data described in data paper manuscripts upon their submission to applicable journals in the publisher’s portfolio, including Biodiversity Data JournalZooKeysPhytoKeysMycoKeys and many others.

Once the dataset is clean and the paper is published, biodiversity data, such as taxa, occurrence records, observations, specimens and related information, become FAIR (findable, accessible, interoperable and reusable), so that they can be merged, reformatted and incorporated into novel and visionary projects, regardless of whether they are accessed by a human researcher or a data-mining computation.

As part of the pre-review technical evaluation of a data paper submitted to a Pensoft journal, the associated datasets are subjected to data audit meant to identify any issues that could make the data inoperable. This check is conducted regardless of whether the dataset are provided as supplementary material within the data paper manuscript or linked from the Global Biodiversity Information Facility (GBIF) or another external repository. The features that undergo the audit can be found in a data quality checklist made available from the website of each journal alongside key recommendations for submitting authors.

Once the check is complete, the submitting author receives an audit report providing improvement recommendations, similarly to the commentaries he/she would receive following the peer review stage of the data paper. In case there are major issues with the dataset, the data paper can be rejected prior to assignment to a subject editor, but resubmitted after the necessary corrections are applied. At this step, authors who have already published their data via an external repository are also reminded to correct those accordingly.

“It all started back in 2010, when we joined forces with GBIF on a quite advanced idea in the domain of biodiversity: a data paper workflow as a means to recognise both the scientific value of rich metadata and the efforts of the the data collectors and curators. Together we figured that those data could be published most efficiently as citable academic papers,” says Pensoft’s founder and Managing director Prof. Lyubomir Penev.
“From there, with the kind help and support of Dr Robert Mesibov, the concept evolved into a data audit workflow, meant to ‘proofread’ the data in those data papers the way a copy editor would go through the text,” he adds.
“The data auditing we do is not a check on whether a scientific name is properly spelled, or a bibliographic reference is correct, or a locality has the correct latitude and longitude”, explains Dr Mesibov. “Instead, we aim to ensure that there are no broken or duplicated records, disagreements between fields, misuses of the Darwin Core recommendations, or any of the many technical issues, such as character encoding errors, that can be an obstacle to data processing.”

At Pensoft, the publication of openly accessible, easy to access, find, re-use and archive data is seen as a crucial responsibility of researchers aiming to deliver high-quality and viable scientific output intended to stand the test of time and serve the public good.

CASE STUDY: Data audit for the “Vascular plants dataset of the COFC herbarium (University of Cordoba, Spain)”, a data paper in PhytoKeys

To explain how and why biodiversity data should be published in full compliance with the best (open) science practices, the team behind Pensoft and long-year collaborators published a guidelines paper, titled “Strategies and guidelines for scholarly publishing of biodiversity data” in the open science journal Research Ideas and Outcomes (RIO Journal).

Recipe for Reusability: Biodiversity Data Journal integrated with Profeza’s CREDIT Suite

Through their new collaboration, the partners encourage publication of dynamic additional research outcomes to support reusability and reproducibility in science

In a new partnership between open-access Biodiversity Data Journal (BDJ) and workflow software development platform Profeza, authors submitting their research to the scholarly journal will be invited to prepare a Reuse Recipe Document via CREDIT Suite to encourage reusability and reproducibility in science. Once published, their articles will feature a special widget linking to additional research output, such as raw, experimental repetitions, null or negative results, protocols and datasets.

A Reuse Recipe Document is a collection of additional research outputs, which could serve as a guidelines to another researcher trying to reproduce or build on the previously published work. In contrast to a research article, it is a dynamic ‘evolving’ research item, which can be later updated and also tracked back in time, thanks to a revision history feature.

Both the Recipe Document and the Reproducible Links, which connect subsequent outputs to the original publication, are assigned with their own DOIs, so that reuse instances can be easily captured, recognised, tracked and rewarded with increased citability.

With these events appearing on both the original author’s and any reuser’s ORCID, the former can easily gain further credibility for his/her work because of his/her work’s enhanced reproducibility, while the latter increases his/her own by showcasing how he/she has put what he/she has cited into use.

Furthermore, the transparency and interconnectivity between the separate works allow for promoting intra- and inter-disciplinary collaboration between researchers.

“At BDJ, we strongly encourage our authors to use CREDIT Suite to submit any additional research outputs that could help fellow scientists speed up progress in biodiversity knowledge through reproducibility and reusability,” says Prof. Lyubomir Penev, founder of the journal and its scholarly publisher – Pensoft. “Our new partnership with Profeza is in itself a sign that collaboration and integrity in academia is the way to good open science practices.”

“Our partnership with Pensoft is a great step towards gathering crucial feedback and insight concerning reproducibility and continuity in research. This is now possible with Reuse Recipe Documents, which allow for authors and reusers to engage and team up with each other,” says Sheevendra, Co-Founder of Profeza.