Digitising UK Natural History Collections is vital to understand life on Earth, reports the Natural History Museum

In a paper published in the journal Research Ideas and Outcomes, authors estimate £18 million has been saved in efficiencies by researchers accessing digital specimens rather than physical collections.

· Scientists from the Natural History Museum (NHM) deep-dive into the uses and users of natural history collections held in the UK

· Modest estimates report a saving of £18 million in efficiencies by researchers accessing digital data rather than physical collections

· Today, software can complete in a week what it would take a human two years to achieve

· Call for investment to secure the UK’s position as a world superpower in science and tech, and for a future in which both people and planet thrive

A new report has evaluated the use and impact of digitised natural science collections held in the UK and how they contribute to scientific, commercial and societal benefits.

UK natural science collections hold more than 137 million items spanning an incredible 4.56-billion-year history of life on Earth. These collections have emerged as a pivotal data resource for understanding the Earth in its past and current state – and will continue to inform the investors and policy-makers of the future.

UK natural science data in demand

GBIF—the Global Biodiversity Information Facility—is an international database providing open access data on all types of life on Earth. In this paper led by the NHM, scientists report that there are 7.6 million specimens, less than 6% of total UK natural science collections sampled, freely accessible on GBIF.

They found that 12% of the total peer-reviewed journal articles citing GBIF data specifically cite UK natural science collections. These data currently make up just 0.3% of total occurrences on GBIF, meaning they punch an incredible 40 times above their weight.
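The "40 times" figure follows from dividing the share of citing articles by the share of occurrences. A minimal sketch of that arithmetic, using only the percentages and counts quoted above; the variable names are illustrative, and treating the 137 million total as the denominator for the digitised share is an assumption:

```python
# Figures quoted in the text above.
share_of_citing_articles = 12.0   # % of GBIF-citing articles that cite UK collections
share_of_gbif_occurrences = 0.3   # % of GBIF occurrences that come from UK collections
specimens_on_gbif = 7.6e6         # UK specimens freely accessible on GBIF
total_uk_items = 137e6            # assumed denominator: total items in UK collections

# How far the citation share exceeds the data share.
impact_ratio = share_of_citing_articles / share_of_gbif_occurrences

# Share of UK items that are already available on GBIF.
digitised_share = 100 * specimens_on_gbif / total_uk_items

print(f"Citation share relative to data share: {impact_ratio:.0f}x")
print(f"Share of UK items on GBIF: {digitised_share:.1f}%")
```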

When asked previously, over 90% of GBIF users linked their use of these data to advancing the UN Sustainable Development Goals which look to reduce hunger, poverty and inequality, and spur economic growth while tackling climate change and protecting the oceans and forests.

The case for digitising UK natural science collections

The introduction of these collections onto a digital platform has revolutionised scientific research. In this paper published in the journal Research Ideas and Outcomes, the authors estimate £18 million has been saved in efficiencies by researchers accessing digital specimens rather than physical collections, assuming a minimal single physical visit replaced per citation. Of this, £1.4 million has been attributed to UK researchers, money which can be reinvested back into UK science institutions – those at the forefront of finding solutions to real world problems.

Lead author and Deputy Head of Digital, Data and Informatics, Helen Hardy says, ‘The advancement of digitisation has been truly transformational to the scientific community. Today it’s possible to use software that takes a week to achieve the type of information gathering it would take a human over 3,000 hours, or two years, to complete – individuals realising an entire life’s work in just a few months! Anticipation is high for further innovations such as the further integration of artificial intelligence into taxonomic work.’

The UK government wants the UK to be a science and technology superpower, and natural science collections provide a unique opportunity to achieve this. To unlock the true potential of collections data, UK natural science collections are joining forces through the Distributed System of Scientific Collections UK (DiSSCo UK) to make the case for investment of £155 million in a research infrastructure which is expected to unlock at least a seven- to ten-fold economic return on investment. They are working alongside the Arts & Humanities Research Council (AHRC) and UK Research and Innovation (UKRI) to digitise the critical mass of collections; the data will be available through a robust technological infrastructure and continually developed in line with recent innovations.

Ken Norris, Deputy Director of Science at the NHM says, ‘In the midst of a planetary emergency, and what some experts believe to be the Earth’s sixth mass extinction event, estimates say that over 50% of the world’s GDP, which equates to approx. 44 trillion dollars, is dependent on the natural world. By understanding what is in collections now, both on a national and international scale, we can identify trends, necessary actions, and what we need to collect to underpin policy and investment decisions for a future where people and planet thrive.’

Hardy H, Livermore L, Kersey P, Norris K, Smith V, Pullar J (2023) Users and uses of UK Natural History Collections – a Summary, https://doi.org/10.5281/zenodo.8403318

A longer paper on this study including further detail on the methodology and findings is also available:

Hardy H, Livermore L, Kersey P, Norris K, Smith V (2023) Understanding the users and uses of UK Natural History Collections. Research Ideas and Outcomes 9: e113378. https://doi.org/10.3897/rio.9.e113378

Photo credit: Trustees of the Natural History Museum


New way to browse interlinked biodiversity data: Biodiversity Knowledge Hub NOW ONLINE!

The Biodiversity Knowledge Hub is a one-stop portal that allows users to access FAIR and interlinked biodiversity data and services in a few clicks.

The Horizon 2020 BiCIKL Project is proud to announce that the Biodiversity Knowledge Hub (BKH) is now online.

BKH was designed to support a new, emerging community of users over time and across the entire biodiversity research cycle, providing its services to anybody, anywhere and anytime.

“The Knowledge Hub is the main product from our BiCIKL consortium, and we are delighted with the result!

“BKH can easily be seen as the beginning of the major shift in the way we search interlinked biodiversity information.

“Biodiversity researchers, research infrastructures and publishers interested in fields ranging from taxonomy to ecology and bioinformatics can now freely use BKH as a compass to navigate the oceans of biodiversity data. BKH will do the linkages,”

says Prof. Lyubomir Penev, BiCIKL’s Project coordinator and Founder of Pensoft Publishers.

“We have invested our best energies and resources in the development of BKH and the FAIR Data Place (FDP), which is the beating heart of the portal.

“Its purpose goes beyond the BiCIKL project itself: we are thrilled to say that BKH is meant to stay, aiming to reshape the way biodiversity knowledge is accessed and used,”

says Dr Christos Arvanitidis, CEO of LifeWatch ERIC.

“The BKH outlines how users can navigate and access the linked data, tools and services of the infrastructures cooperating in BiCIKL.

“By revealing how they harvest, liberate and reuse data, these increasingly integrated sources enable researchers in the natural sciences to move more seamlessly between specimens and material samples, genomic and metagenomic data, scientific literature, and taxonomic names and units,”

said Dr Joe Miller, Executive Secretary of GBIF—the Global Biodiversity Information Facility.

A training programme on how to best utilise the platform is currently being developed by the Consortium of European Taxonomic Facilities (CETAF), Pensoft Publishers, Plazi, Meise Botanic Garden, EMBL’s European Bioinformatics Institute (EMBL-EBI), ELIXIR Hub, GBIF – the Global Biodiversity Information Facility, and LifeWatch ERIC, and will be finalised in the coming months.

***

A detailed description of the BKH tools and services provided by its contributing organisations is available at: https://biodiversityknowledgehub.eu.

***

Find more information about the BiCIKL consortium partners on the project’s website.

***


Museum of New Zealand’s journal Tūhinga moves to Pensoft’s ARPHA Publishing Platform


Having decided to turn Tūhinga “into a 21st-century”, digital-native diamond open-access journal, the Museum of New Zealand Te Papa Tongarewa signed with scholarly publisher and technology provider Pensoft and its publishing platform ARPHA. Under the agreement, the journal is to make not only its future content easy for readers and computer algorithms to read and discover, but also its legacy publications, previously available solely in print.

Tūhinga: Records of the Museum of New Zealand Te Papa Tongarewa is the successor of the Museum of New Zealand Records, the National Museum of New Zealand Records, and the Dominion Museum Records in Ethnology. Together, the outlets have accumulated nearly two centuries’ worth of scientific knowledge provided by the museum’s curators, collection managers, and research associates across disciplines, from archaeology to zoology.

The renovated Tūhinga is to utilise the whole package of signature services provided by the platform, including ARPHA’s fast-track, end-to-end publishing system, which benefits readers, authors, reviewers and editors alike. 

This means that each submitted manuscript is to be carried through the review, editing, publication, dissemination and archiving stages without leaving the platform’s collaboration-centred online environment. The articles themselves are to be openly available in PDF, machine-readable JATS XML and semantically enriched HTML formats for a better reader experience. Thus, the journal’s articles will be as easy to discover, access, reuse and cite as possible. Once published, the content is to be indexed and archived instantaneously and its underlying data exported to relevant specialised databases. Simultaneously, a suite of metrics is to be enabled to facilitate tracking the usage of articles and sub-article elements – like figures and tables – in real time.

The journal’s legacy content is also to become machine-discoverable and more user-friendly. Each of these papers will be assigned a DOI and registered with Crossref, while their metadata will be indexed in relevant databases. On the new journal website, they will be displayed as embedded PDF documents, and readers will be able to run a full-text search of each article’s content.

Tūhinga welcomes original collections-based research in the natural sciences and humanities, including museological research, where its multidisciplinarity reflects the breadth and range of museum-based scholarship. The journal focuses primarily on New Zealand and the Pacific, but all contributions are considered. Having opted for a Diamond Open Access policy, the journal is to charge neither its readers nor its authors.

“It’s a great honour to sign with the Museum of New Zealand Te Papa Tongarewa and provide our publishing services to Tūhinga. Particularly, we take pride in letting the whole wide world straight into the holdings of Te Papa and the knowledge they have prompted in the distant past: something that would not typically be possible had they remained only on paper,”

says Prof. Dr Lyubomir Penev, founder and CEO at ARPHA and Pensoft.

48 years of Australian collecting trips in one data package

From 1973 to 2020, Australian zoologist Dr Robert Mesibov kept careful records of the “where” and “when” of his plant and invertebrate collecting trips. Now, he has made those valuable biodiversity data freely and easily accessible via the Zenodo open-data repository, so that future researchers can rely on this “authority file” when using museum specimens collected from those events in their own studies. The new dataset is described in the open-access, peer-reviewed Biodiversity Data Journal.

While checking museum records, Dr Robert Mesibov found there were occasional errors in the dates and places for specimens he had collected many years before. He was not surprised.

“It’s easy to make mistakes when entering data on a computer from paper specimen labels”, said Mesibov. “I also found specimen records that said I was the collector, but I know I wasn’t!”

One solution to this problem was what librarians and others have long called an “authority file”.

“It’s an authoritative reference, in this case with the correct details of where I collected and when”, he explained.

“I kept records of almost all my collecting trips from 1973 until I retired from field work in 2020. The earliest records were on paper, but I began storing the key details in digital form in the 1990s.”

The 48-year record has now been made publicly available via the Zenodo open-data repository after conversion to the Darwin Core data format, which is widely used for sharing biodiversity information. With this “authority file”, described in detail in the open-access, peer-reviewed Biodiversity Data Journal, future researchers will be able to rely on sound, interoperable and easy-to-access data when using those museum specimens in their own studies, instead of repeating and further spreading unintentional errors.

“There are 3829 collecting events in the authority file”, said Mesibov, “from six Australian states and territories. For each collecting event there are geospatial and date details, plus notes on the collection.”

Mesibov hopes the authority file will be used by museums to correct errors in their catalogues.

“It should also save museums a fair bit of work in future”, he explained. “No need to transcribe details on specimen labels into digital form in a database, because the details are already in digital form in the authority file.”

Mesibov points out that in the 19th and 20th centuries, lists of collecting events were often included in the reports of major scientific expeditions.

“Those lists were authority files, but in the pre-digital days it was probably just as easy to copy collection data from specimen labels.”

“In the 21st century there’s a big push to digitise museum specimen collections”, he said. “Museum databases often have lookup tables with scientific names and the names of collectors. These lookup tables save data entry time and help to avoid errors in digitising.”

“Authority files for collecting events are the next logical step,” said Mesibov. “They can be used as lookup tables for all the important details of individual collections: where, when, by whom and how.”
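The lookup-table idea can be illustrated with a minimal sketch. It assumes each collecting event carries an identifier that the museum record also stores, and it uses a handful of Darwin Core term names (`eventID`, `eventDate`, `decimalLatitude`, `decimalLongitude`, `recordedBy`) as field names. The event, the collector name, the coordinates and both function names are invented for illustration and are not taken from the published dataset.

```python
# Hypothetical authority file: one entry per collecting event, keyed by eventID.
authority = {
    "EV-0001": {
        "eventDate": "1987-03-14",
        "decimalLatitude": -41.25,
        "decimalLongitude": 146.5,
        "recordedBy": "A. Collector",
    },
}


def find_mismatches(record, authority):
    """Return the fields in a museum record that disagree with the authority file."""
    event = authority.get(record["eventID"])
    if event is None:
        # The event is unknown to the authority file, so the ID itself is suspect.
        return ["eventID"]
    return [term for term, value in event.items()
            if term in record and record[term] != value]


def correct_record(record, authority):
    """Return a copy of the record with event details taken from the authority file."""
    event = authority.get(record["eventID"], {})
    return {**record, **event}


# Hypothetical museum record with a transcription error in the date.
museum_record = {
    "eventID": "EV-0001",
    "eventDate": "1987-03-15",
    "decimalLatitude": -41.25,
    "decimalLongitude": 146.5,
    "recordedBy": "A. Collector",
}

mismatches = find_mismatches(museum_record, authority)
corrected = correct_record(museum_record, authority)
print(mismatches)
print(corrected["eventDate"])
```

Keeping the check and the correction separate lets a curator review the mismatches before any catalogue entry is overwritten.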

###

Research paper:

Mesibov RE (2021) An Australian collector’s authority file, 1973–2020. Biodiversity Data Journal 9: e70463. https://doi.org/10.3897/BDJ.9.e70463

###

Robert Mesibov’s webpage: https://www.datafix.com.au/mesibov.html

Robert Mesibov’s ORCID page: https://orcid.org/0000-0003-3466-5038