Everything (in museums) Everywhere, All at Once

Guest blog post by Julia Sigwart.

Imagine walking into a museum and realising that every specimen—a rare deep-sea snail, a giant fossil bone, a pressed plant, the DNA bank, endless drawers of perfectly pinned insects, even the notebooks and dusty photographs in the archive—is part of a vast, interconnected web of knowledge. Now, imagine if all of these specimens—across every museum in the world—were seamlessly linked, their data unified and accessible to scientists, historians, educators, and conservationists everywhere. This vision is at the heart of collectomics—a groundbreaking new term introduced in a recent paper published in Natural History Collections and Museomics.

Examples of natural history objects.
Top: for more than 200 years, relevant object information was most often recorded in the form of hand-written labels and inventories. Bottom: Natural history museums directly intersect with social sciences, although the connections often go unrecognised. Top left: Jan-Peter Kasper/Universität Jena, Top right: Sigrid Hof / Senckenberg Research Institute and Museum Frankfurt, Bottom left: image of Dr Fritz Haas (seated) and unnamed companions (men and women), in the act of collecting a new species Unio valentinus, Bottom right: natural history objects also appear in the context of art objects, photo: Emőke Dénes.

What is Collectomics, and Why Does It Matter?

At its core, the collectomics concept represents a holistic, modern view of museum collections. This is not only about digitising collections or preserving species; rather, this new approach shifts the perspective to treating collections as a single global dataset. Museums represent a dynamic and growing resource that can help answer some of the most pressing challenges in science and conservation. With the integration of digital tools, standardised data practices, and a commitment to open accessibility, collectomics offers a way to transform fragmented collections into a powerful, collective resource that also integrates the cultural and historical aspects of museum collections.

Collectomics envisions museums as interconnected nodes in a worldwide network, rather than isolated repositories of knowledge, and holds this ambition as the primary goal of collections digitisation. This framework allows researchers to trace the movement of species, monitor environmental changes over time, and predict future ecological shifts with greater accuracy. More importantly, it connects beyond the realm of natural sciences to other disciplines. Natural history specimens are objects that were collected by people—including often-uncredited local knowledge holders.

Accessory information about the lives and work of those people informs how we use museum objects. For example, if we can identify the handwriting on an original collection label, the lifetime of that person constrains the collecting date even if it was never written down, and this adds to the biological knowledge about the specimen. Conversely, the types of objects and observations recorded by a person inform our understanding of the historical context.
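The handwriting example comes down to intersecting date ranges. Here is a minimal Python sketch of that reasoning. The names `YearRange` and `constrain_by_collector`, the `min_age` parameter, and the years used are hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class YearRange:
    """Closed interval of possible collecting years."""
    earliest: int
    latest: int


def constrain_by_collector(known: Optional[YearRange], born: int, died: int,
                           min_age: int = 0) -> Optional[YearRange]:
    """Intersect what is already known with the collector's lifetime.

    Returns None when the two intervals do not overlap, which flags a
    conflict worth checking by hand (for example a misattributed label).
    """
    lifetime = YearRange(born + min_age, died)
    if known is None:
        # Nothing else is known, so the lifetime is the only constraint.
        return lifetime
    earliest = max(known.earliest, lifetime.earliest)
    latest = min(known.latest, lifetime.latest)
    if earliest > latest:
        return None
    return YearRange(earliest, latest)


# Hypothetical label: the drawer suggests 1900-1950, and the handwriting
# belongs to someone who lived from 1850 to 1920.
print(constrain_by_collector(YearRange(1900, 1950), born=1850, died=1920))
```

The same intersection works in the other direction: a securely dated specimen narrows down when its collector was active.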

For an increasingly diverse range of scientists, museum data contribute to their work without physical examination of the original objects. They can analyse high-resolution images, genetic data, and historical records without leaving their own labs. Collectomics puts the original objects at the centre of gravity, acknowledging that preserved specimens underpin the scientific replicability of this rapidly growing suite of applications.

Natural history museum collection.
Natural history collections are iconic in biodiversity research and yet much of their potential impact remains untapped. Photograph by Sven Tränkner, Senckenberg Museum Frankfurt, Germany.

Looking Ahead: How Collectomics Can Shape the Future

Beyond envisioned technical advancements, collectomics is fundamentally about people. It is about the researchers who dedicate their lives to studying biodiversity, the curators who meticulously preserve specimens, and the students who might one day make groundbreaking discoveries using these collections.

A database is more than just a digital version of a collection: it is structured, searchable, and interconnected, allowing new patterns and insights to emerge. Physical collections, like a library, must follow a particular a priori organisation. Books on a shelf might be arranged by subject, author, the colour of the dustjacket, or just the order they were unpacked. Zoological and botanical collections are typically arranged taxonomically, while geological collections are organised stratigraphically. And just like running your eyes across a bookshelf, physically browsing a collection often turns up serendipitous inspiration and discovery. Once specimens and their associated data are digitised, different kinds of unexpected relationships and trends can be uncovered. In a collection organised by systematics, it is almost impossible to answer simple geographical questions like “How many specimens do you have from Malaysia?” because the relevant material is scattered across countless taxonomic groups. The power of digitisation lies in enabling cross-cutting queries on geography, time, and the activities of human contributors. This does not replace the need for well organised, well maintained physical collections, but instead unlocks their full potential.
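To make the Malaysia question concrete, here is a small Python sketch of a cross-cutting query. The records, field names, and helper functions are all invented for illustration; a real collection database would use its own schema.

```python
from collections import Counter

# Hypothetical records, listed the way drawers would order them:
# by taxonomic group, not by place.
specimens = [
    {"id": "A-001", "family": "Unionidae", "country": "Spain", "year": 1913},
    {"id": "A-002", "family": "Unionidae", "country": "Malaysia", "year": 1932},
    {"id": "B-001", "family": "Papilionidae", "country": "Malaysia", "year": 1987},
    {"id": "C-001", "family": "Lycopodiaceae", "country": "Germany", "year": 1890},
    {"id": "C-002", "family": "Lycopodiaceae", "country": "Malaysia", "year": 2004},
]


def from_country(records, country):
    """Select records by country, regardless of taxonomic group."""
    return [r for r in records if r["country"] == country]


def count_by(records, field):
    """Tally records by any field: geography, time, or collector."""
    return Counter(r[field] for r in records)


malaysia = from_country(specimens, "Malaysia")
print(len(malaysia))                     # 3
print(count_by(specimens, "country"))
```

In the physical collection, those three specimens sit in three different cabinets. In the digital one, a single filter finds them.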

Graph showing the great extent to which records are undigitised.
Digital records are only a small fraction of global museum records. The black line represents a linear increase of the number of collection objects in time from the late 1700s. The dashed lines show three model projections for digitisation: in the best-case model prediction, museums might achieve complete digitisation at the earliest around the year 2071, but if there is no acceleration (red line) the global digitisation gap will continue to increase.
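The widening gap in the figure can be reasoned about with a toy projection. The Python sketch below is not the model used in the paper: it simply assumes linear collection growth and a digitisation rate that may or may not accelerate. The function name, parameters, and numbers are hypothetical.

```python
def year_fully_digitised(objects_now, growth_per_year, digitised_now,
                         rate_per_year, rate_growth=0.0,
                         start_year=2025, horizon=500):
    """Return the first year in which digitised records catch up with
    the collection, or None if that does not happen within `horizon` years.

    rate_growth is the yearly fractional increase in digitisation rate
    (0.0 means no acceleration).
    """
    objects = float(objects_now)
    digitised = float(digitised_now)
    rate = float(rate_per_year)
    for year in range(start_year, start_year + horizon + 1):
        if digitised >= objects:
            return year
        objects += growth_per_year      # collections keep growing linearly
        digitised += rate               # records digitised this year
        rate *= 1.0 + rate_growth       # optional acceleration
    return None


# Toy numbers only: without acceleration the gap never closes,
# with steady acceleration it eventually does.
print(year_fully_digitised(1000, 10, 100, 5))
print(year_fully_digitised(1000, 10, 100, 5, rate_growth=0.1))
```

Whenever the digitisation rate stays below the growth rate of the collection, the function returns `None`, which corresponds to the no-acceleration scenario described in the caption.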

The importance of collections digitisation has long been recognised. However, this has progressed in a patchwork of small projects, often funded for specific research interests. As collections are continuously growing, the rate of growth may be outpacing even modern digitisation efforts. Collectomics offers an outlook that depends on, and also motivates, a total-collections approach. The power of collectomics emerges only when it is applied to everything, everywhere, in interconnecting museum collections including natural history and beyond.

By making collections more accessible, collectomics also contributes to democratising and diversifying science. Historically, access to rare specimens was limited to those with the resources to travel or with institutional connections. But with a collectomics approach, a high school student in a small town can study the same butterfly as a leading entomologist at a major university. A researcher in the Global South can contribute just as meaningfully to biodiversity studies as someone in the Global North. By embracing this new framework, museums are not only preserving history—we are unlocking its full potential.

Original source

Sigwart JD, Schleuning M, Brandt A, Pfenninger M, Saeedi H, Borsch T, Häffner E, Lücking R, Güntsch A, Trischler H, Töpfer T, Wesche K, Consortium C (2025) Collectomics – towards a new framework to integrate museum collections to address global challenges. Natural History Collections and Museomics 2: 1-20. https://doi.org/10.3897/nhcm.2.148855

Follow Natural History Collections and Museomics on X, Facebook and Bluesky.

Artificial neural networks could power up curation of natural history collections

Deep learning techniques manage to differentiate between similar plant families with up to 99 percent accuracy, Smithsonian researchers reveal

Millions, if not billions, of specimens reside in the world’s natural history collections, but most of these have not been carefully studied, or even looked at, in decades. While containing critical data for many scientific endeavors, most objects are quietly sitting in their own little cabinets of curiosity.

Thus, mass digitization of natural history collections has become a major goal at museums around the world. Having brought together numerous biologists, curators, volunteers and citizen scientists, such initiatives have already generated large datasets from these collections and provided unprecedented insight.

Now, a study, recently published in the open access Biodiversity Data Journal, suggests that the latest advances in both digitization and machine learning might together be able to assist museum curators in their efforts to care for and learn from this incredible global resource.

A team of researchers from the Smithsonian Department of Botany, Data Science Lab, and Digitization Program Office recently collaborated with NVIDIA to carry out a pilot project using deep learning approaches to dig into digitized herbarium specimens.

Smithsonian researchers classifying digitized herbarium sheets.

Their study is among the first to describe the use of deep learning methods to enhance our understanding of digitized collection samples. It is also the first to demonstrate that a deep convolutional neural network (a computing system loosely modelled on the neuron activity in animal brains, which learns from examples rather than hand-written rules) can effectively differentiate between similar plants with up to 99% accuracy.

In the paper, the scientists describe two different neural networks that they trained to perform tasks on the digitized portion (currently 1.2 million specimens) of the United States National Herbarium.

The team first trained a net to automatically recognize herbarium sheets that had been stained with mercury crystals, since mercury was commonly used by some early collectors to protect the plant collections from insect damage. The second net was trained to discriminate between two families of plants that share a strikingly similar superficial appearance.

Sample herbarium specimen image of stained clubmoss

The trained neural nets performed with 90% and 96% accuracy respectively (or 94% and 99% if the most challenging specimens were discarded), confirming that deep learning is a useful and important technology for the future analysis of digitized museum collections.
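Accuracy can rise when the hardest specimens are set aside, and a short Python sketch shows the mechanism. The post does not say how "most challenging" was defined, so this assumes it means the predictions the network was least confident about. The sample data, threshold, and function names are hypothetical.

```python
def accuracy(predictions):
    """Fraction of (predicted, actual, confidence) triples that are correct."""
    if not predictions:
        raise ValueError("no predictions to score")
    correct = sum(1 for predicted, actual, _ in predictions
                  if predicted == actual)
    return correct / len(predictions)


def keep_confident(predictions, min_confidence):
    """Drop predictions whose confidence is below the threshold."""
    return [p for p in predictions if p[2] >= min_confidence]


# Invented predictions: (predicted label, true label, model confidence).
sample = [
    ("stained", "stained", 0.99),
    ("unstained", "unstained", 0.95),
    ("stained", "unstained", 0.55),   # a low-confidence mistake
    ("unstained", "unstained", 0.90),
    ("stained", "stained", 0.60),     # correct, but low confidence
]

print(accuracy(sample))                        # 0.8
print(accuracy(keep_confident(sample, 0.7)))   # 1.0
```

The trade-off is coverage: the discarded sheets still need a human curator, so a higher threshold buys accuracy at the cost of more manual review.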

“The results can be leveraged both to improve curation and unlock new avenues of research,” conclude the scientists.

“This research paper is a wonderful proof of concept. We now know that we can apply machine learning to digitized natural history specimens to solve curatorial and identification problems. The future will be using these tools combined with large shared data sets to test fundamental hypotheses about the evolution and distribution of plants and animals,” says Dr. Laurence J. Dorr, Chair of the Smithsonian Department of Botany.

 


Original source:

Schuettpelz E, Frandsen P, Dikow R, Brown A, Orli S, Peters M, Metallo A, Funk V, Dorr L (2017) Applications of deep convolutional neural networks to digitized natural history collections. Biodiversity Data Journal 5: e21139. https://doi.org/10.3897/BDJ.5.e21139