Digitising UK Natural History Collections is vital to understand life on Earth, reports the Natural History Museum

In a paper published in the journal Research Ideas and Outcomes, authors estimate £18 million has been saved in efficiencies by researchers accessing digital specimens rather than physical collections.

· Scientists from the Natural History Museum (NHM) deep-dive into the uses and users of natural history collections held in the UK

· Modest estimates report a saving of £18 million in efficiencies by researchers accessing digital data rather than physical collections

· Today, software can complete in a week what it would take a human two years to achieve

· Call for investment to secure the UK’s stance as a world superpower in science and tech, and for a future in which both people and planet thrive

A new report has evaluated the use and impact of digitised natural science collections held in the UK and how they contribute to scientific, commercial and societal benefits.

UK natural science collections hold more than 137 million items spanning an incredible 4.56-billion-year history of life on Earth. These collections have emerged as a pivotal data resource for understanding the Earth in its past and current state – and will continue to inform the investors and policy-makers of the future.

UK natural science data in demand

GBIF—the Global Biodiversity Information Facility—is an international database providing open access data on all types of life on Earth. In this paper led by the NHM, scientists report that there are 7.6 million specimens, less than 6% of total UK natural science collections sampled, freely accessible on GBIF.

They found that 12% of the total peer-reviewed journal articles citing GBIF data specifically cite UK natural science collections. These data currently make up just 0.3% of total occurrences on GBIF, meaning they punch an incredible 40 times above their weight.
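The "40 times" figure follows from the two percentages quoted above. A minimal sketch of that arithmetic, with variable names that are illustrative rather than taken from the paper:

```python
# Share of peer-reviewed articles citing GBIF data that cite UK collections (from the text).
citation_share = 0.12
# Share of all GBIF occurrence records that come from UK collections (from the text).
occurrence_share = 0.003

# Ratio of citation share to data share: how far the data "punch above their weight".
impact_ratio = citation_share / occurrence_share
print(f"Citation share is about {impact_ratio:.0f} times the occurrence share")
```

Both inputs are rounded percentages, so the ratio should be read as approximate.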

When asked previously, over 90% of GBIF users linked their use of these data to advancing the UN Sustainable Development Goals which look to reduce hunger, poverty and inequality, and spur economic growth while tackling climate change and protecting the oceans and forests.

The case for digitising UK natural science collections

The introduction of these collections onto a digital platform has revolutionised scientific research. In this paper published in the journal Research Ideas and Outcomes, the authors estimate £18 million has been saved in efficiencies by researchers accessing digital specimens rather than physical collections, on the conservative assumption that each citation replaced just one physical visit. Of this, £1.4 million has been attributed to UK researchers, money which can be reinvested in UK science institutions – those at the forefront of finding solutions to real world problems.

Lead author and Deputy Head of Digital, Data and Informatics, Helen Hardy says, ‘The advancement of digitisation has been truly transformational to the scientific community. Today it’s possible to use software that takes a week to achieve the type of information gathering it would take a human over 3,000 hours, or two years, to complete – individuals realising an entire life’s work in just a few months! Anticipation is high for further innovations such as the further integration of artificial intelligence into taxonomic work.’

The UK government wants the UK to be a science and technology superpower, and natural science collections provide a unique opportunity to achieve this. To unlock the true potential of collections data, UK natural science collections are joining forces through the Distributed System of Scientific Collections UK (DiSSCo) to make the case for a £155 million investment in a research infrastructure that is expected to unlock at least a seven- to ten-fold economic return on investment. Working alongside the Arts & Humanities Research Council (AHRC) and UK Research and Innovation (UKRI), the collections aim to digitise a critical mass of specimens and make the data available through a robust technological infrastructure that is continually developed in line with recent innovations.

Ken Norris, Deputy Director of Science at the NHM says, ‘In the midst of a planetary emergency, and what some experts believe to be the Earth’s sixth mass extinction event, estimates say that over 50% of the world’s GDP, which equates to approx. 44 trillion dollars, is dependent on the natural world. By understanding what is in collections now, both on a national and international scale, we can identify trends, necessary actions, and what we need to collect to underpin policy and investment decisions for a future where people and planet thrive.’

Hardy H, Livermore L, Kersey P, Norris K, Smith V, Pullar J (2023) Users and uses of UK Natural History Collections – a Summary, https://doi.org/10.5281/zenodo.8403318

A longer paper on this study including further detail on the methodology and findings is also available:

Hardy H, Livermore L, Kersey P, Norris K, Smith V (2023) Understanding the users and uses of UK Natural History Collections. Research Ideas and Outcomes 9: e113378. https://doi.org/10.3897/rio.9.e113378

Photo credit: Trustees of the Natural History Museum


Digitising beans to feed the world

In 2018, NHM London’s digitisation team started a project to digitise non-type herbarium material from the legume family. A recent data paper in the Biodiversity Data Journal reports on the outcomes.

You can find the original blog post by the Natural History Museum of London, reposted here with minor edits.

Legumes are a group of plants that include soybeans, peas, chickpeas, peanuts and lentils. They are a significant source of protein, fibre, carbohydrates, and minerals in our diet and some, like the cowpea, are resistant to droughts.

In 2018, the Natural History Museum of London’s (NHM London) digitisation team started a project in collaboration with project leader Royal Botanic Gardens Kew and the Royal Botanic Garden Edinburgh.

The project’s outcomes were published in a data paper in the Biodiversity Data Journal. Within the project, the digitisation team aimed to collectively digitise non-type herbarium material from the legume family. This includes rosewood trees (Dalbergia), padauk trees (Pterocarpus) and the Phaseolinae subtribe that contains many of the beans cultivated for human and animal food.

This project was made possible through the Department for Environment Food & Rural Affairs (DEFRA)-allocated Official Development Assistance (ODA) funding, distributed by the UK government in its “global efforts to defeat poverty, tackle instability and create prosperity in developing countries”.

ODA-listed countries:
African: Guinea, Ethiopia, Sudan, Kenya, Uganda, Tanzania, Mozambique, Malawi and Madagascar
Asian: Bangladesh, Myanmar, Nepal, New Guinea and India
Southern and Central American: Guatemala, Honduras, El Salvador, Nicaragua, Bolivia, Argentina and Brazil

The legume groups Dalbergia, Pterocarpus and Phaseolinae were chosen for digitisation to support the development of dry beans as a sustainable and resilient crop, and to aid the conservation and sustainable use of rosewood and padauk trees. Some of these beans, especially cowpea and pigeon pea, are sustainable and resilient crops because they can be grown in poor-quality soils and are resistant to drought stress. This makes them particularly suitable for agricultural production where growing other crops would be difficult.

Digitally discoverable herbarium specimens can provide important information about the distribution of individual species, as well as highlighting which species occur naturally together.

While there have been collaborative efforts between herbaria in the past, these have tended to prioritise digitisation of type specimens: the example specimens for which a species is named.

Types are important to identification, but being individual specimens, they don’t offer insights into species distribution over time. By focusing on the non-types across the world and over the last 200 years, we have released a brand-new resource to the global scientific community.

Searching for beans

This collection was digitised by creating an inventory record for each specimen, attaching images of each herbarium sheet, and then transcribing more data and georeferencing the specimens, providing an accurate locality in space and time for their collection. 

We originally had four months and three members of staff to digitise over 11,000 specimens. The Covid-19 lockdown was ironically rather lucky for this project as it enabled us to have more time to transcribe and georeference all of the records. 

say the researchers behind the digitisation project.

“We were able to assign country-level data to 10,857 out of the total number of 11,222 records. We were also able to transcribe the collectors’ names from the majority of our specimen labels (10,879 out of 11,222). Only 770 out of the 2,226 individuals identified during this project collected their specimens in ODA-listed countries. The highest contributors were: Richard Beddome (130 specimens), Charles Clarke (110), Hans Schlieben (98) and Nathaniel Wallich (79). The breakdown of records by ODA country can be seen in the chart below.”

Map showing breakdown of records by country and pie chart showing distribution by ODA listed countries.
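The counts quoted by the researchers can be turned into percentages for easier comparison. This is a small illustrative sketch: the helper `share` and the variable names are hypothetical, and only the counts come from the text.

```python
# Counts quoted in the text.
total_records = 11_222
records_with_country = 10_857
records_with_collector = 10_879
collectors_identified = 2_226
collectors_in_oda_countries = 770


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


print(f"Records with country-level data: {share(records_with_country, total_records):.1f}%")
print(f"Records with a transcribed collector: {share(records_with_collector, total_records):.1f}%")
print(f"Collectors active in ODA-listed countries: {share(collectors_in_oda_countries, collectors_identified):.1f}%")
```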

From our data, we can see the peak decade of collection was the 1930s, with almost half (4,583 specimens or 49.43%) collected between 1900 and 1950 (Fig. 10).

This peak can be attributed to three of our most prolific collectors: Arthur Kerr, John Gossweiler and Georges Le Testu, all of whom were most active in the 1930s. The oldest specimen (BM013713473) was collected by Mark Catesby (1683-1749) in the Bahamas in 1726.

they explain.

An interesting, but perhaps unsurprising, finding is that our collection is strongly male-dominated.

There are only two women (Caroline Whitefoord and Ynes Mexia) in the list of our top 50 plant collectors and they are not close to the most prolific collectors.

We identified more women in the rest of our records, but their contribution is on average less than 25 specimens per person in the dataset consisting of more than 10,000 specimens. In contrast, the top five male collectors contributed 10% of our collection. 

they continue.

Releasing Rosewoods

Both the Pterocarpus and Dalbergia genera include species that yield expensive, good-quality timber and are therefore prone to illegal logging. Many species, such as Pterocarpus tinctorius, are also listed on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. By releasing this new resource of information on all these plants from three of the biggest herbaria in the world, we can share these data with the people who are taking care of biodiversity in these countries. The data can be used to identify hotspots where the trees grow naturally, so that these areas can be protected. These data would also allow much closer attention to be paid to areas that could be targets for illegal logging activity.

Pterocarpus tinctorius is a species of padauk tree that is listed as endangered on the IUCN Red List.
Cowpea (Vigna unguiculata) is a food and animal feed crop grown in the semi-arid tropics.

The ODA-listed countries are economically impoverished and disproportionately vulnerable to the changing climate, whether through flood, drought or rising temperatures.

Using data to identify good, nutritious plant species that can be grown in such conditions can therefore benefit local communities, potentially reducing dependence on imports, aid and on less resilient crops. 

the team adds in conclusion.

***

This dataset is now openly available on the Museum’s Data Portal and a data paper about this work has been released in the Biodiversity Data Journal.

***


Digitising the Natural History Museum London’s entire collection could contribute over £2 billion to the global economy

In a world first, the Natural History Museum, London, has collaborated with economic consultants, Frontier Economics Ltd, to explore the economic and societal value of digitising natural history collections and concluded that digitisation has the potential to see a seven to tenfold return on investment. Whilst significant progress is already being made at the Museum, additional investment is needed in order to unlock the full potential of the Museum’s vast collections – more than 80 million objects. The project’s report is published in the open science scientific journal Research Ideas and Outcomes (RIO Journal).

One of the Museum’s digitisers imaging a butterfly to join the 4.93 million specimens already available online. 
© The Trustees of the Natural History Museum, London

The societal benefits of digitising natural history collections extend to global advancements in food security, biodiversity conservation, medicine discovery, minerals exploration, and beyond. A brand-new, rigorous economic report predicts that investing in digitising natural history museum collections could also result in a tenfold return. The Natural History Museum, London, has so far made over 4.9 million digitised specimens freely available online; over 28 billion records have been downloaded in more than 429,000 download events over the past six years.

Digitisation at the Natural History Museum, London 

Digitisation is the process of creating and sharing the data associated with Museum specimens. To digitise a specimen, all its related information is added to an online database. This typically includes where and when it was collected and who found it, and can include photographs, scans and other molecular data if available. Natural history collections are a unique record of biodiversity dating back hundreds of years, and geodiversity dating back millennia. Creating and sharing data this way enables science that would have otherwise been impossible, and we accelerate the rate at which important discoveries are made from our collections.  

The Natural History Museum’s collection of 80 million items is one of the largest and most historically and geographically diverse in the world. By unlocking the collection online, the Museum provides free and open access for global researchers, scientists, artists and more. Since 2015, the Museum has made 4.9 million specimens available on the Museum’s Data Portal, which have seen more than 28 billion downloads over 427,000 download events. 

This means the Museum has digitised about 6% of its collections to date. Because digitisation is expensive, costing tens of millions of pounds, it is difficult to make a case for further investment without better understanding the value of this digitisation and its benefits.
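The "about 6%" figure can be checked against the two rounded numbers quoted above. A minimal sketch, with illustrative variable names:

```python
# Rounded figures quoted in the text, so the result is approximate.
digitised_specimens = 4_900_000
total_collection = 80_000_000

# Digitised share of the whole collection, as a percentage.
digitised_pct = 100 * digitised_specimens / total_collection
print(f"Digitised so far: about {digitised_pct:.0f}% of the collection")
```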

In 2021, the Museum decided to explore the economic impacts of collections data in more depth, and commissioned Frontier Economics to undertake modelling, resulting in this project report, now made publicly available in the open-science journal Research Ideas and Outcomes (RIO Journal), and confirming benefits in excess of £2 billion over 30 years. While the methods in this report are relevant to collections globally, this modelling focuses on benefits to the UK, and is intended to support the Museum’s own digitisation work, as well as a current scoping study funded by the Arts & Humanities Research Council about the case for digitising all UK natural science collections as a research infrastructure.

“Sharing data from our collections can transform scientific research and help find solutions for nature and from nature. Our digitised collections have helped establish the baseline plant biodiversity in the Amazon, find wheat crops that are more resilient to climate change and support research into potential zoonotic origins of Covid-19. The research that comes from sharing our specimens has immense potential to transform our world and help both people and the planet thrive,”

says Helen Hardy, Science Digital Programme Manager at the Natural History Museum.

How does digitisation impact scientific research?

The data from museum collections accelerates scientific research, which in turn creates benefits for society and the economy across a wide range of sectors. Frontier Economics Ltd have looked at the impact of collections data in five of these sectors: biodiversity conservation, invasive species, medicines discovery, agricultural research and development and mineral exploration. 

“The Natural History Museum’s collection is a real treasure trove which, if made easily accessible to scientists all over the world through digitisation, has the potential to unlock ground-breaking research in any number of areas. Predicting exactly how the data will be used in future is clearly very uncertain. We have looked at the potential value that new research could create in just five areas focussing on a relatively narrow set of outcomes. We find that the value at stake is extremely large, running into billions,”

says Dan Popov, Economist at Frontier Economics Ltd.

The new analyses attempt to estimate the economic value of these benefits using a range of approaches, with the results in broad agreement that the benefits of digitisation are at least ten times greater than the costs. This represents a compelling case for investment in museum digital infrastructure without which the many benefits will not be realised.

“This new analysis shows that the data locked up in our collections has significant societal and economic value, but we need investment to help us release it,”

adds Professor Ken Norris, Head of the Life Sciences Department at the Natural History Museum.

Other benefits could include improvements to the resilience of agricultural crops by better understanding their wild relatives, research into invasive species which can cause significant damage to ecosystems and crops, and improving the accuracy of mining.  

Finally, there are other impacts that such work could have on how science is conducted itself. The very act of digitising specimens means that researchers anywhere on the planet can access these collections, saving time and money that may have been spent as scientists travelled to see specific objects.

The value of research enabled by digitisation of natural history collections can be estimated by looking at specific areas where the Museum’s collections contribute towards scientific research and subsequently impact the wider economy. 
© Frontier Economics Ltd.

Original source: 

Popov D, Roychoudhury P, Hardy H, Livermore L, Norris K (2021) The Value of Digitising Natural History Collections. Research Ideas and Outcomes 7: e78844. https://doi.org/10.3897/rio.7.e78844

DNA study in the Pacific reveals 2000% increase in our knowledge of mollusc biodiversity

Lead author Dr Helena Wiklund examining specimens on the RV Melville in October 2013

Scientists working in the new frontier for deep-sea mining have revealed a remarkable 2000% increase in our knowledge of the biodiversity of seafloor molluscs.

The 21 mollusc species newly described thanks to the latest DNA-taxonomy methodology

Twenty-one species, where only one was previously known, are reported as a result of the research, which applied the latest DNA-taxonomy methodology to mollusc specimens collected from the central Pacific Clarion-Clipperton Zone (CCZ) in 2013. They are all described in the open access journal ZooKeys.
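The headline "2000% increase" follows from going from one known species to twenty-one. A minimal sketch of the percentage-increase calculation; the function name is illustrative:

```python
def percent_increase(before: int, after: int) -> float:
    """Return the relative increase from before to after, as a percentage."""
    return 100 * (after - before) / before


species_known_before = 1   # previously known from the area (from the text)
species_known_after = 21   # reported in the study (from the text)

increase = percent_increase(species_known_before, species_known_after)
print(f"Increase in known species: {increase:.0f}%")
```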

Among the discoveries is a monoplacophoran mollusc species regarded as a ‘living fossil’, since it is one of the ancestors of all molluscs. This is the first DNA to be collected from this species and the first record of it from the CCZ mining exploration zone – a vast 5-million-km² region of the central Pacific that is regulated for seabed mining by the International Seabed Authority.

“Despite over 100 survey expeditions to the region over 40 years of mineral prospecting, there has been almost no taxonomy done on the molluscs from this area,” says lead author Dr Helena Wiklund of the Natural History Museum in London (NHM).

Dr Wiklund undertook a comprehensive DNA-based study of the molluscs to confirm species identities and make data available for future taxonomic study. This was coupled with the expertise of the NHM’s Dr John Taylor, who led the morphological work.

The molluscs were found in samples taken on and in the mud surrounding the potato-sized polymetallic nodules that are present in high abundance across the CCZ. These nodules are the target for potential deep-sea mining, being rich in cobalt, copper, nickel, manganese and other valuable minerals.

The data are vital for the future environmental regulation of deep-sea mining, but have also revealed surprising patterns.

“I was amazed to discover that specimens collected during the 19th century by HMS Challenger were probably the same as ours over a range of 7000 km, but that data lodged on genetic databases from closer but shallower depths is likely to be from a different species,” comments Dr Thomas Dahlgren, population geneticist at Uni Research, Norway and University of Gothenburg, Sweden, who studied in detail a species called Nucula profundorum.

“Our efforts are now focussing on studying the DNA from many more samples of this species to examine connectivity and potential resilience to deep-sea mining,” he added.

Dr Thomas Dahlgren sieving sediments to find new clam and snail species

“It is a simple truth that we cannot move forward on regulatory approval for deep-sea mining without fundamental baseline data on what animals actually live in these regions,” says Principal Investigator of the NHM Deep-sea Systematics and Ecology Research Group, Dr Adrian Glover.

“Our work has highlighted obvious gaps in our knowledge, but also shown that with even relatively modest effort, we can greatly increase our understanding of baseline biodiversity using DNA-taxonomy.”

Creating a library of archived DNA-sequenced samples from known species allows for the future possibility of using the latest environmental DNA (eDNA) methods to ‘search’ for these species using just tiny samples of mud or seawater.

“It’s akin to forensic science,” says Dr Glover. “You can’t use eDNA to find the criminals or species unless you have a library of information to compare them to.”

All data and specimens from the study have been lodged at the NHM and online repositories to make them accessible for future study. Of particular importance are the frozen tissue collections, which are housed in the state-of-the-art Molecular Collections Facility at the NHM and available for loan or further DNA work.

 

Original source:

Wiklund H, Taylor JD, Dahlgren TG, Todt C, Ikebe C, Rabone M, Glover AG (2017) Abyssal fauna of the UK-1 polymetallic nodule exploration area, Clarion-Clipperton Zone, central Pacific Ocean: Mollusca. ZooKeys 707: 1–46. https://doi.org/10.3897/zookeys.707.13042