Image recognition to the rescue of natural history museums by enabling curators to identify specimens on the fly

In a Research Idea published in Research Ideas and Outcomes (RIO Journal), a Swiss-Dutch research team presents a promising machine-learning ecosystem to unite experts around the world and make up for the lack of taxonomic expertise and expert staff.

Guest blog post by Luc Willemse, Senior collection manager at Naturalis Biodiversity Centre (Leiden, Netherlands)

Imagine the workday of a curator in a national natural history museum. Having spent several decades learning about a specific subgroup of grasshoppers, that person is now busy working on the identification and organisation of the holdings of the institution. To do this, the curator needs to study in detail a huge number of undescribed grasshoppers collected from all sorts of habitats around the world. 

The problem here, however, is that a curator at a smaller natural history institution is usually responsible for all insects kept at the museum, ranging from butterflies to beetles, flies and so on. In total, we know of around 1 million described insect species worldwide. Meanwhile, another 3,000 are being added each year, while many more are redescribed as a result of further study and new discoveries. Becoming a specialist in grasshoppers was already a laborious activity that took decades. How about knowing all insects of the world? That’s simply impossible.

How, then, could we expect one person to sort and update all collections at a museum, an activity that is the cornerstone of biodiversity research? Part of the solution, hiring and training additional staff, is costly and time-consuming, especially when we know that experts on certain species groups are already scarce on a global scale.

We believe that automated image recognition holds the key to reliable and sustainable practices at natural history institutions.

Today, image recognition tools integrated in mobile apps are already being used, even by citizen scientists, to identify plants and animals in the field. Based on an image taken with a smartphone, those tools identify specimens on the fly and estimate the accuracy of their results. What’s more, those identifications have proven to be almost as accurate as those done by humans. This gives us hope that we could help curators at museums worldwide take better and more timely care of the collections they are responsible for.

However, specimen identification in natural history institutions is still much more complex than identification in the field. After all, the information these institutions store and should be able to provide is meant to serve as a knowledge hub for educational and reference purposes for present and future generations of researchers around the globe.

This is why we propose a sustainable system where images, knowledge, trained recognition models and tools are exchanged between institutes, and where international collaboration between museums of all sizes is crucial. The aim is to have a system that will benefit the entire community of natural history collections by providing further access to their invaluable collections.

We propose four elements to this system: 

  1. A central library of already trained image recognition models (algorithms) needs to be created. It will be openly accessible, so that any institute can profit from models trained by others.
Mock-up of a Central Library of Algorithms.
  2. A central library of datasets giving access to images of collection specimens that have recently been identified by experts. This will provide an indispensable source of images for training new algorithms.
Mock-up of a Central Library of Datasets.
  3. A digital workbench that provides an easy-to-use interface for inexperienced users to customise the algorithms and datasets to the particular needs of their own collections.
  4. As the entire system depends on international collaboration as well as the sharing of algorithms and datasets, a user forum is essential to discuss issues, coordinate, evaluate, test or implement novel technologies.

How would this work on a daily basis for curators? We provide two examples of use cases.

First, let’s zoom in on a case where a curator needs to identify a box of insects, for example bush crickets, to a lower taxonomic level. Here, they would take an image of the box and split it into segments of individual specimens. Then, image recognition will identify the bush crickets to a lower taxonomic level. The result, presented in the table below, will be used to update object-level registration or to physically rearrange specimens into more accurate boxes. This entire step can also be done by non-specialist staff.

Mock-up of box with grasshoppers mentioned in the above table

Results of automated image recognition identify specimens to a lower taxonomic level.
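
The first use case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Prediction` record, the `identify_box` helper, the 0.8 confidence threshold and the stand-in classifier are all hypothetical, and the segmentation step (splitting the box image into individual specimens) is assumed to have already produced one crop per specimen. The specimen data are invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Prediction:
    specimen_id: str   # position or label of the specimen within the box
    taxon: str         # name proposed by the recognition model
    confidence: float  # the model's own score, between 0 and 1


def identify_box(
    crops: Dict[str, object],
    classify: Callable[[object], Tuple[str, float]],
    threshold: float = 0.8,  # hypothetical cut-off, to be tuned per collection
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into accepted ones and ones needing expert review."""
    accepted, needs_review = [], []
    for specimen_id, crop in crops.items():
        taxon, confidence = classify(crop)
        prediction = Prediction(specimen_id, taxon, confidence)
        if confidence >= threshold:
            accepted.append(prediction)
        else:
            needs_review.append(prediction)
    return accepted, needs_review


# Stand-in for a trained model: returns a canned answer for each crop.
def fake_classifier(crop):
    canned = {
        "crop-A": ("Tettigonia viridissima", 0.95),
        "crop-B": ("Tettigonia cantans", 0.55),
    }
    return canned[crop]


accepted, needs_review = identify_box(
    {"box12-01": "crop-A", "box12-02": "crop-B"}, fake_classifier
)
for p in accepted:
    print(f"{p.specimen_id}: {p.taxon} ({p.confidence:.2f})")
for p in needs_review:
    print(f"{p.specimen_id}: review suggested, best guess {p.taxon}")
```

Keeping low-confidence predictions in a separate list is one possible design choice: non-specialist staff could act on the accepted names, while the uncertain ones are passed on to an expert.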

Another example is to incorporate image recognition tools into digitisation processes that include imaging specimens. In this case, image recognition tools can be used on the fly to check or confirm the identifications and thus improve data quality.

Mock-up of an interface for automated taxon identification. 
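
As a rough sketch of this second use case, the check could compare the name already on the specimen label with the name proposed by the model. The function name `check_identification`, the three outcome labels and the 0.8 threshold are assumptions made for illustration, and the records are invented.

```python
def check_identification(label_taxon, predicted_taxon, confidence, threshold=0.8):
    """Return 'confirmed', 'conflict' or 'uncertain' for one imaged specimen."""
    if confidence < threshold:
        # The model is not sure enough to confirm or contradict the label.
        return "uncertain"
    same_name = label_taxon.strip().lower() == predicted_taxon.strip().lower()
    return "confirmed" if same_name else "conflict"


# Invented records: (catalogue number, name on label, model prediction, score)
records = [
    ("EXAMPLE-0001", "Tettigonia viridissima", "Tettigonia viridissima", 0.97),
    ("EXAMPLE-0002", "Tettigonia viridissima", "Tettigonia cantans", 0.91),
    ("EXAMPLE-0003", "Tettigonia cantans", "Tettigonia cantans", 0.42),
]

report = {
    number: check_identification(label, predicted, score)
    for number, label, predicted, score in records
}
print(report)
```

Only the records marked as a conflict would need a second look, so the check adds little work to the imaging step itself.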

Using image recognition tools to identify specimens in museum collections is likely to become common practice in the future. It is a technical tool that will enable the community to share available taxonomic expertise. 

Using image recognition tools makes it possible to identify species groups for which there is very limited or no in-house expertise. Such practices would substantially reduce costs and time spent per treated item.

Image recognition applications carry metadata such as version numbers and the datasets used for training. Such an approach would therefore make identification more transparent than identification carried out by humans, whose expertise is, by its nature, in no way standardised or transparent.
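
To illustrate what such metadata might look like when stored alongside an identification, here is a minimal sketch. The `AutomatedIdentification` record and all of its field names and values are hypothetical and do not describe an existing standard or tool.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AutomatedIdentification:
    catalogue_number: str  # specimen the identification belongs to
    taxon: str             # name proposed by the model
    confidence: float      # model score, between 0 and 1
    model_name: str        # which recognition model was used
    model_version: str     # exact version, so the result can be traced
    training_dataset: str  # identifier of the image set used for training


# Invented example values.
record = AutomatedIdentification(
    catalogue_number="EXAMPLE-0001",
    taxon="Tettigonia viridissima",
    confidence=0.97,
    model_name="bush-cricket-classifier",
    model_version="1.2.0",
    training_dataset="example-dataset-2022-01",
)

# Serialise the record so it can be stored with the specimen data.
as_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(as_json)
```

With the model version and training dataset recorded, a later reader can see exactly which model produced a name and re-run the identification once a better model becomes available.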

*


Research publication:

Greeff M, Caspers M, Kalkman V, Willemse L, Sunderland BD, Bánki O, Hogeweg L (2022) Sharing taxonomic expertise between natural history collections using image recognition. Research Ideas and Outcomes 8: e79187. https://doi.org/10.3897/rio.8.e79187

A Red List of insect experts in Europe

New EC-funded project will assess trends in taxonomic expertise across Europe to identify gaps in expert knowledge

Europe’s largest bumblebee, Bombus fragrans, is currently assessed as an Endangered species.
Illustration by Denitza Peneva.

Insects are the largest taxonomic group in the animal kingdom. Three out of four described animal species belong to the class Insecta. They are widely distributed in terrestrial and aquatic environments. Indispensable to ecosystems, insects drive key processes such as pollination, decomposition and soil formation, and form an essential part of food webs.

Yet, insect populations have been plummeting catastrophically. For example, recent studies have shown a 75% decrease in insect biomass in German nature reserves in less than 30 years, and the situation is probably no less dramatic elsewhere in Europe. According to the European Red List of threatened species, one in ten bee species and a quarter of all grasshopper species are at risk of extinction. As it becomes clear how dependent our ecosystems and our economy are on insects, people are gradually realising the dramatic consequences of insect decline.

One lesser-known aspect of this global crisis is on the agenda today: the shrinking number of insect taxonomists, the scientists on whose highly specialised skills we depend to obtain knowledge of the diversity of organisms. Without taxonomists, no study of species or ecosystems would be possible, as we would not be able to recognise what biodiversity we are losing.

This is why the European Commission has funded a new project to embark on the pioneering task of assessing the status of taxonomic expertise on insects in Europe. A “Red List” of taxonomists will be compiled for the first time for any group of organisms. The effort is being undertaken by a diverse and interdisciplinary team of experts, including the organisation uniting the most important and largest European natural science collections (CETAF) and the world’s authority on assessing the risk of extinction of organisms: IUCN (the International Union for Conservation of Nature).

As with typical European Red List (ERL) assessments, normally applied at species level, the project involves the collection and evaluation of the available information about the number, location, qualification and field of specialisation of insect taxonomists, and the application of systematic criteria to assess the risk of their “extinction”. This concept has never been applied to scientists before, but by using the ERL analogy, the project aims to identify those groups of insects and those countries that bear the highest risk of losing the associated taxonomic expertise, as well as potential gaps.

Bringing together individual scientists, research institutions and learned societies from across Europe, the project will compare the trends and draw up recommendations for overcoming the risks, and for preserving and further developing the expert capacity of this scientific community. Unlike species extinctions, the loss of taxonomic knowledge is reversible, especially when the needs are clear and the necessary resources are invested in education, training, career development and recognition.

###

Additional information:

CETAF is the European organisation of natural history museums, botanic gardens and research centres with their associated natural science collections. It comprises 71 of the largest taxonomic institutions from 22 European countries (18 EU, 1 EEA and 3 non-EU) and gathers the expertise of more than 5,000 researchers. Their collections contain a wide range of specimens including animals, plants, fungi, rocks and genetic resources, which are used for scientific research and exhibitions. CETAF aims to promote training, research collaborations and understanding in taxonomy and systematic biology, as well as to facilitate access to our natural heritage by sharing the information derived from the collections.

IUCN (the International Union for Conservation of Nature) is a membership Union composed of both government and civil society organisations. It harnesses the experience, resources and reach of its more than 1,400 Member organisations and the input of more than 17,000 experts. This diversity and vast expertise makes IUCN the global authority on the status of the natural world and the measures needed to safeguard it.

Pensoft is an independent academic publishing company and technology provider, well known worldwide for its novel cutting-edge publishing tools, workflows and methods for text and data publishing of journals, books and conference materials. Through its Research and Technical Development department, the company is involved in various research and technology projects. Founded in 1992 “by scientists, for scientists” and initially focusing on book publishing, Pensoft is now a leading publisher of innovative open access journals in taxonomy and biodiversity science.