‘Nature’s Envelope’ – a simple device that reveals the scope and scale of all biological processes

All processes fit into a broad S-shaped envelope extending from the briefest to the most enduring biological events. For the first time, we have a simple model that depicts the scope and scale of biology.

Arctic tern by Mark Stock, Schleswig-Holstein Wadden Sea National Park. License: CC BY-SA.

As biology progresses into a digital age, new opportunities for discovery are emerging.

Increasingly, information from investigations into aspects of biology from ecology to molecular biology is available in a digital form. Older ‘legacy’ information is being digitized. Together, the digital information is accumulated in databases from which it can be harvested and examined with an increasing array of algorithmic and visualization tools.

From this trend has emerged a vision that, one day, we should be able to analyze any and all aspects of biology in this digital world. 

However, before this can happen, there will need to be an infrastructure that gathers information from ALL sources, reshapes it as standardized data using universal metadata and ontologies, and makes it freely available for analysis.

That information must also make its way to trustworthy repositories that guarantee permanent access to the data in a polished state that is fully suited for re-use.

The first layer in the infrastructure is the one that gathers all old and new information, whether it be about the migrations of ocean mammals, the sequence of bases in ribosomal RNA, or the known locations of particular species of ciliated protozoa.

How many of these subdomains will there be?

To answer this, we need to have a sense of the scope and scale of biology.

With the Nature’s Envelope we have, for the first time, a simple model that depicts the scope and scale of biology. Presented as a rhetorical device by its author Dr David J. Patterson (University of Sydney, Australia), the Nature’s Envelope is described in a Forum Paper, published in the open-science journal Research Ideas and Outcomes (RIO).

This is achieved by compiling information about the processes conducted by all living organisms. The processes occur at all levels of organization, from sub-molecular transactions, such as those that underpin nervous impulses, to those within and among plants, animals, fungi, protists and prokaryotes. They also include the actions and reactions of individuals and communities, the sum of the interactions that make up an ecosystem and, finally, the consequences of the biosphere acting as a whole system.

Nature’s Envelope, in green, includes all processes carried out by, involving, or the result of the activities of any and all organisms. The axes depict the duration of events and the sizes of participants using a log10 scale. Image by David J. Patterson. License: CC BY.

In Nature’s Envelope, information on the sizes of participants and the durations of processes from all levels of organization is plotted on a grid. The grid uses a logarithmic (base 10) scale and spans about 21 orders of magnitude of size and 35 orders of magnitude of time. Information on processes ranging from the subatomic, through molecular, cellular, tissue, organismic, species and community levels, to ecosystems is assigned to the appropriate decadal blocks.

Examples range from the stepping motion of molecules like kinesin, which move forward 8 nanometres in about 10 milliseconds, to the migrations of Arctic terns, which follow routes of 30,000 km or more from Europe to Antarctica over 3 to 4 months.
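As a rough illustration of how an observation might be assigned to a decadal block, the sketch below takes the floor of the base-10 logarithm of a size and a duration. The function name `decadal_block`, the choice of metres and seconds as units, the use of the quoted distances as a stand-in for participant size, and the 3.5-month figure for the tern migration are all assumptions made for this example; they are not taken from Patterson's paper.

```python
import math


def decadal_block(size_m, duration_s):
    """Return the (size, time) powers of ten whose block contains the point.

    Hypothetical helper: a block is identified by the floor of log10 of
    each coordinate, with size in metres and duration in seconds.
    """
    return (math.floor(math.log10(size_m)), math.floor(math.log10(duration_s)))


# Kinesin step: 8 nanometres in about 10 milliseconds (figures from the text).
kinesin = decadal_block(8e-9, 10e-3)

# Arctic tern migration: 30,000 km over roughly 3.5 months
# (assumption: midpoint of "3 to 4 months", with 30-day months).
tern = decadal_block(30_000 * 1_000.0, 3.5 * 30 * 24 * 3600)

print("kinesin block (size, time):", kinesin)
print("tern block (size, time):", tern)
```

On these assumptions the two examples land about 16 orders of magnitude apart in size and 8 apart in time, which gives a feel for how much of the grid a single envelope has to cover.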

The extremes of life processes are determined by the smallest and largest entities to participate, and the briefest and most enduring processes.

The briefest event to be included is the transfer of energy from a photon to a photosynthetic pigment as the photon passes through a chlorophyll molecule several nanometres in width at a speed of 300,000 km per second. That transaction is conducted in about 10⁻¹⁷ seconds. As it involves the smallest subatomic particles, it defines the lower left corner of the grid.

The most enduring is the process of evolution, which has been progressing for almost 4 billion years. Its influence has created the biosphere (the largest living object) and affects the gas content of the atmosphere. This process establishes the upper right extreme of the grid.
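The duration quoted for the photon transit can be checked with a one-line calculation. The sketch below assumes a pigment width of 3 nanometres as a stand-in for "several nanometres"; that value is chosen purely for illustration.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 3.0e8  # 300,000 km per second, as quoted in the text
PIGMENT_WIDTH_M = 3.0e-9        # assumption: "several nanometres" taken as 3 nm

# Time for the photon to cross the molecule: distance divided by speed.
transit_time_s = PIGMENT_WIDTH_M / SPEED_OF_LIGHT_M_PER_S

print(f"transit time: {transit_time_s:.0e} s")
print("order of magnitude:", round(math.log10(transit_time_s)))
```

With these inputs the result is on the order of 10⁻¹⁷ seconds, matching the figure given for the lower left corner of the grid.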

All biological processes fit into a broad S-shaped envelope that includes about half of the decadal blocks in the grid. The envelope drawn round the initial examples is Nature’s Envelope.

Nature’s Envelope will be a useful addition to many discussions, whether they deal with the infrastructure that will manage the digital age of biology, or provide the context for education on the diversity and range of processes that living systems engage in.

“The version of Nature’s Envelope published in the RIO journal is seen as a first version, to be refined and enhanced through community participation,”

comments Patterson.

***

Original source:

Patterson DJ (2022) The scope and scale of the life sciences (‘Nature’s envelope’). Research Ideas and Outcomes 8: e96132. https://doi.org/10.3897/rio.8.e96132

***


Digitising beans to feed the world

In 2018, NHM London’s digitisation team started a project to digitise non-type herbarium material from the legume family. A recent data paper in the Biodiversity Data Journal reports on the outcomes.

You can find the original blog post by the Natural History Museum of London, reposted here with minor edits.

Legumes are a group of plants that include soybeans, peas, chickpeas, peanuts and lentils. They are a significant source of protein, fibre, carbohydrates, and minerals in our diet and some, like the cowpea, are resistant to droughts.

In 2018, the Natural History Museum of London’s (NHM London) digitisation team started a project in collaboration with project leader Royal Botanic Gardens Kew and the Royal Botanic Garden Edinburgh.

The project’s outcomes were published in a data paper in the Biodiversity Data Journal. Within the project, the digitisation team aimed to collectively digitise non-type herbarium material from the legume family. This includes rosewood trees (Dalbergia), padauk trees (Pterocarpus) and the Phaseolinae subtribe that contains many of the beans cultivated for human and animal food.

This project was made possible through the Department for Environment Food & Rural Affairs (DEFRA)-allocated Official Development Assistance (ODA) funding, distributed by the UK government in its “global efforts to defeat poverty, tackle instability and create prosperity in developing countries”.

ODA-listed countries

African: Guinea, Ethiopia, Sudan, Kenya, Uganda, Tanzania, Mozambique, Malawi and Madagascar
Asian: Bangladesh, Myanmar, Nepal, New Guinea and India
Southern and Central American: Guatemala, Honduras, El Salvador, Nicaragua, Bolivia, Argentina and Brazil

The legume groups Dalbergia, Pterocarpus and Phaseolinae were chosen for digitisation to support the development of dry beans as a sustainable and resilient crop, and to aid conservation and sustainable use of rosewood and padauk trees. Some of these beans, especially cowpea and pigeon pea, are sustainable and resilient crops, as they can be grown in poor-quality soils and are resistant to drought stress. This makes them particularly suitable for agricultural production where the growing of other crops would be difficult.

Digitally discoverable herbarium specimens can provide important information about the distribution of individual species, as well as highlighting which species occur naturally together.

While there have been collaborative efforts between herbaria in the past, these have tended to prioritise digitisation of type specimens: the example specimens for which a species is named.

Types are important to identification, but being individual specimens, they don’t offer insights into species distribution over time. By focusing on the non-types across the world and over the last 200 years, we have released a brand-new resource to the global scientific community.

Searching for beans

This collection was digitised by creating an inventory record for each specimen, attaching images of each herbarium sheet, and then transcribing more data and georeferencing the specimens, providing an accurate locality in space and time for their collection. 

“We originally had four months and three members of staff to digitise over 11,000 specimens. The Covid-19 lockdown was ironically rather lucky for this project as it enabled us to have more time to transcribe and georeference all of the records,”

say the researchers behind the digitisation project.
Map showing breakdown of records by country.

“We were able to assign country-level data to 10,857 out of the total number of 11,222 records. We were also able to transcribe the collectors’ names from the majority of our specimen labels (10,879 out of 11,222). Only 770 out of the 2,226 individuals identified during this project collected their specimens in ODA-listed countries. The highest contributors were: Richard Beddome (130 specimens), Charles Clarke (110), Hans Schlieben (98) and Nathaniel Wallich (79). The breakdown of records by ODA country can be seen in the chart below.”

Map showing breakdown of records by country and pie chart showing distribution by ODA listed countries.

“From our data, we can see the peak decade of collection was the 1930s, with almost half (4,583 specimens or 49.43%) collected between 1900 and 1950 (Fig. 10).

“This peak can be attributed to three of our most prolific collectors: Arthur Kerr, John Gossweiler and Georges Le Testu, all of whom were most active in the 1930s. The oldest specimen (BM013713473) was collected by Mark Catesby (1683-1749) in the Bahamas in 1726,”

they explain.

“An interesting, but perhaps unsurprising, finding is that our collection is strongly male-dominated.

“There are only two women (Caroline Whitefoord and Ynes Mexia) in the list of our top 50 plant collectors and they are not close to the most prolific collectors.

“We identified more women in the rest of our records, but their contribution is on average fewer than 25 specimens per person in a dataset of more than 10,000 specimens. In contrast, the top five male collectors contributed 10% of our collection,”

they continue.

Releasing Rosewoods

Both the Pterocarpus and Dalbergia genera include species that yield expensive, good-quality timber and are therefore prone to illegal logging. Many species, such as Pterocarpus tinctorius, are also listed on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. By releasing this new resource of information on all these plants from three of the biggest herbaria in the world, we can share this data with the people who are taking care of biodiversity in these countries. The data can be used to identify hotspots where the trees grow naturally and to protect these areas. These data would also allow much closer attention to be paid to areas that could be targets for illegal logging activity.

Pterocarpus tinctorius is a species of padauk tree that is listed as endangered on the IUCN Red List.
Cowpea (Vigna unguiculata) is a food and animal feed crop grown in the semi-arid tropics.

“The ODA-listed countries are economically impoverished and disproportionately likely to be disadvantaged by the changing climate, whether through flood, drought or rising temperatures.

“Using data to identify good, nutritious plant species that can be grown in such conditions can therefore benefit local communities, potentially reducing dependence on imports, aid and on less resilient crops,”

the team adds in conclusion.

***

This dataset is now openly available on the Museum’s Data Portal and a data paper about this work has been released in the Biodiversity Data Journal.

***


Digitising the Natural History Museum London’s entire collection could contribute over £2 billion to the global economy

In a world first, the Natural History Museum, London, has collaborated with economic consultants, Frontier Economics Ltd, to explore the economic and societal value of digitising natural history collections and concluded that digitisation has the potential to see a seven to tenfold return on investment. Whilst significant progress is already being made at the Museum, additional investment is needed in order to unlock the full potential of the Museum’s vast collections – more than 80 million objects. The project’s report is published in the open science scientific journal Research Ideas and Outcomes (RIO Journal).

One of the Museum’s digitisers imaging a butterfly to join the 4.93 million specimens already available online. 
© The Trustees of the Natural History Museum, London

The societal benefits of digitising natural history collections extend to global advancements in food security, biodiversity conservation, medicine discovery, minerals exploration, and beyond. A brand new, rigorous economic report predicts that investing in digitising natural history museum collections could also result in a tenfold return. The Natural History Museum, London, has so far made over 4.9 million digitised specimens freely available online; over 28 billion records have been downloaded in over 429,000 download events over the past six years.

Digitisation at the Natural History Museum, London 

Digitisation is the process of creating and sharing the data associated with Museum specimens. To digitise a specimen, all its related information is added to an online database. This typically includes where and when it was collected and who found it, and can include photographs, scans and molecular data if available. Natural history collections are a unique record of biodiversity dating back hundreds of years, and geodiversity dating back millennia. Creating and sharing data this way enables science that would otherwise have been impossible, and accelerates the rate at which important discoveries are made from our collections.

The Natural History Museum’s collection of 80 million items is one of the largest and most historically and geographically diverse in the world. By unlocking the collection online, the Museum provides free and open access for global researchers, scientists, artists and more. Since 2015, the Museum has made 4.9 million specimens available on the Museum’s Data Portal, which have seen more than 28 billion downloads over 427,000 download events. 

This means the Museum has digitised about 6% of its collections to date. Because digitisation is expensive, costing tens of millions of pounds, it is difficult to make a case for further investment without better understanding the value of this digitisation and its benefits.
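The 6% figure follows directly from the two collection counts quoted above. A minimal sketch of the arithmetic, using the rounded numbers from the text (the variable names are illustrative only):

```python
DIGITISED_SPECIMENS = 4_900_000  # "4.9 million specimens" on the Data Portal
TOTAL_COLLECTION = 80_000_000    # "80 million items" in the collection

# Share of the collection already digitised, and what is still left to do.
share_digitised = DIGITISED_SPECIMENS / TOTAL_COLLECTION
remaining = TOTAL_COLLECTION - DIGITISED_SPECIMENS

print(f"digitised so far: {share_digitised:.0%}")
print(f"still to digitise: {remaining:,} items")
```

Because both inputs are rounded, the result is only indicative, but it shows that roughly 75 million items would still need to be digitised.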

In 2021, the Museum decided to explore the economic impacts of collections data in more depth and commissioned Frontier Economics to undertake modelling. The resulting project report, now publicly available in the open-science journal Research Ideas and Outcomes (RIO Journal), confirms benefits in excess of £2 billion over 30 years. While the methods in this report are relevant to collections globally, this modelling focuses on benefits to the UK, and is intended to support the Museum’s own digitisation work, as well as a current scoping study funded by the Arts & Humanities Research Council about the case for digitising all UK natural science collections as a research infrastructure.

“Sharing data from our collections can transform scientific research and help find solutions for nature and from nature. Our digitised collections have helped establish the baseline plant biodiversity in the Amazon, find wheat crops that are more resilient to climate change and support research into potential zoonotic origins of Covid-19. The research that comes from sharing our specimens has immense potential to transform our world and help both people and the planet thrive,”

says Helen Hardy, Science Digital Programme Manager at the Natural History Museum.

How does digitisation impact scientific research?

The data from museum collections accelerates scientific research, which in turn creates benefits for society and the economy across a wide range of sectors. Frontier Economics Ltd have looked at the impact of collections data in five of these sectors: biodiversity conservation, invasive species, medicines discovery, agricultural research and development and mineral exploration. 

“The Natural History Museum’s collection is a real treasure trove which, if made easily accessible to scientists all over the world through digitisation, has the potential to unlock ground-breaking research in any number of areas. Predicting exactly how the data will be used in future is clearly very uncertain. We have looked at the potential value that new research could create in just five areas focussing on a relatively narrow set of outcomes. We find that the value at stake is extremely large, running into billions,”

says Dan Popov, Economist at Frontier Economics Ltd.

The new analyses attempt to estimate the economic value of these benefits using a range of approaches, with the results in broad agreement that the benefits of digitisation are at least ten times greater than the costs. This represents a compelling case for investment in museum digital infrastructure without which the many benefits will not be realised.

“This new analysis shows that the data locked up in our collections has significant societal and economic value, but we need investment to help us release it,”

adds Professor Ken Norris, Head of the Life Sciences Department at the Natural History Museum.

Other benefits could include improvements to the resilience of agricultural crops by better understanding their wild relatives, research into invasive species which can cause significant damage to ecosystems and crops, and improving the accuracy of mining.  

Finally, such work could also affect how science itself is conducted. The very act of digitising specimens means that researchers anywhere on the planet can access these collections, saving the time and money that scientists might otherwise have spent travelling to see specific objects.

The value of research enabled by digitisation of natural history collections can be estimated by looking at specific areas where the Museum’s collections contribute towards scientific research and subsequently impact the wider economy. 
© Frontier Economics Ltd.

Original source: 

Popov D, Roychoudhury P, Hardy H, Livermore L, Norris K (2021) The Value of Digitising Natural History Collections. Research Ideas and Outcomes 7: e78844. https://doi.org/10.3897/rio.7.e78844

Artificial neural networks could power up curation of natural history collections

Deep learning techniques manage to differentiate between similar plant families with up to 99 percent accuracy, Smithsonian researchers reveal

Millions, if not billions, of specimens reside in the world’s natural history collections, but most of these have not been carefully studied, or even looked at, in decades. While containing critical data for many scientific endeavors, most objects are quietly sitting in their own little cabinets of curiosity.

Thus, mass digitization of natural history collections has become a major goal at museums around the world. Having brought together numerous biologists, curators, volunteers and citizen scientists, such initiatives have already generated large datasets from these collections and provided unprecedented insight.

Now, a study, recently published in the open access Biodiversity Data Journal, suggests that the latest advances in both digitization and machine learning might together be able to assist museum curators in their efforts to care for and learn from this incredible global resource.

A team of researchers from the Smithsonian Department of Botany, Data Science Lab, and Digitization Program Office recently collaborated with NVIDIA to carry out a pilot project using deep learning approaches to dig into digitized herbarium specimens.

Smithsonian researchers classifying digitized herbarium sheets.

Their study is among the first to describe the use of deep learning methods to enhance our understanding of digitized collection samples. It is also the first to demonstrate that a deep convolutional neural network (a computing system modelled after the neuron activity in animal brains that can basically learn on its own) can effectively differentiate between similar plants with an amazing accuracy of nearly 100%.

In the paper, the scientists describe two different neural networks that they trained to perform tasks on the digitized portion (currently 1.2 million specimens) of the United States National Herbarium.

The team first trained a net to automatically recognize herbarium sheets that had been stained with mercury crystals, since mercury was commonly used by some early collectors to protect the plant collections from insect damage. The second net was trained to discriminate between two families of plants that share a strikingly similar superficial appearance.

Sample herbarium specimen image of stained clubmoss

The trained neural nets performed with 90% and 96% accuracy respectively (or 94% and 99% if the most challenging specimens were discarded), confirming that deep learning is a useful and important technology for the future analysis of digitized museum collections.

“The results can be leveraged both to improve curation and unlock new avenues of research,” conclude the scientists.

“This research paper is a wonderful proof of concept. We now know that we can apply machine learning to digitized natural history specimens to solve curatorial and identification problems. The future will be using these tools combined with large shared data sets to test fundamental hypotheses about the evolution and distribution of plants and animals,” says Dr. Laurence J. Dorr, Chair of the Smithsonian Department of Botany.

###

Original source:

Schuettpelz E, Frandsen P, Dikow R, Brown A, Orli S, Peters M, Metallo A, Funk V, Dorr L (2017) Applications of deep convolutional neural networks to digitized natural history collections. Biodiversity Data Journal 5: e21139. https://doi.org/10.3897/BDJ.5.e21139