Data mining applied to scholarly publications to finally reveal Earth’s biodiversity

At a time when a million species are at risk of extinction, according to a recent UN report, ironically, we don’t know how many species there are on Earth, nor have we noted down all those that we have come to know on a single list. In fact, we don’t even know how many species we would have put on such a list.

The combined research of over 2,000 natural history institutions worldwide has produced an estimated 500 million pages of scholarly publications and tens of millions of illustrations and species descriptions, comprising all we currently know about the diversity of life. However, most of it isn’t digitally accessible. Even if it were digital, our current publishing systems wouldn’t be able to keep up: about 50 species are described as new to science every day, and all of these are published in plain text and PDF format, where the data cannot be mined by machines and must instead be extracted by a human. Furthermore, those publications often appear in subscription (closed access) journals.

The Biodiversity Literature Repository (BLR), a joint project of Plazi, Pensoft and Zenodo at CERN, takes on the challenge of opening up access to the data trapped in scientific publications and finding out how many species we know so far, what their most important characteristics are (also referred to as descriptions or taxonomic treatments), and how they look in various images. To do so, BLR uses the highly standardised formats and terminology typical of scientific publications to discover and extract data from text written primarily for human consumption.

By relying on state-of-the-art data mining algorithms, BLR allows for the detection, extraction and enrichment of data, including DNA sequences, specimen collecting data or related descriptions, as well as providing implicit links to their sources: collections, repositories etc. As a result, BLR is the world’s largest public domain database of taxonomic treatments, images and associated original publications.

Once the data are available, they are immediately distributed to global biodiversity platforms, such as GBIF, the Global Biodiversity Information Facility. As of now, there are about 42,000 species whose original scientific descriptions are only accessible because of BLR.

The very basic principle in science to cite previous information allows us to trace back the history of a particular species, to understand how the knowledge about it grew over time, and even whether and how its name has changed through the years. As a result, this service is one avenue to uncover the catalogue of life by means of simple lookups.

So far, the lessons learned have led to the development of TaxPub, an extension of the United States National Library of Medicine Journal Tag Suite and its application in a new class of 26 scientific journals. As a result, the data associated with articles in these journals are machine-accessible from the beginning of the publishing process. Thus, as soon as the paper comes out, the data are automatically added to GBIF.
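To illustrate why markup of this kind matters, here is a minimal sketch in Python of how a program might pull species descriptions out of an article once they are tagged rather than buried in plain text. The element names (`taxon-treatment`, `taxon-name`, `status`, `description`) and the species names are hypothetical placeholders chosen for illustration; they are not taken from the actual TaxPub schema, and this is not a description of BLR's or Pensoft's real workflow.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The element names below are
# illustrative assumptions, NOT the real TaxPub tag set, and the
# species names are made-up placeholders.
SAMPLE_ARTICLE = """
<article>
  <body>
    <p>Plain narrative text that a human would normally have to read.</p>
    <taxon-treatment>
      <taxon-name>Examplea prima</taxon-name>
      <status>sp. nov.</status>
      <description>Body small, brown.</description>
    </taxon-treatment>
    <taxon-treatment>
      <taxon-name>Examplea secunda</taxon-name>
      <status>sp. nov.</status>
      <description>Body large, black.</description>
    </taxon-treatment>
  </body>
</article>
"""


def extract_treatments(xml_text):
    """Return one dict per tagged treatment found in the article."""
    root = ET.fromstring(xml_text)
    treatments = []
    # iter() walks the whole tree, so treatments are found at any depth.
    for node in root.iter("taxon-treatment"):
        treatments.append({
            "name": node.findtext("taxon-name", default="").strip(),
            "status": node.findtext("status", default="").strip(),
            "description": node.findtext("description", default="").strip(),
        })
    return treatments


if __name__ == "__main__":
    for treatment in extract_treatments(SAMPLE_ARTICLE):
        print(treatment["name"], "-", treatment["status"])
```

Because each treatment sits in its own tagged element, no text mining is needed at all: the same few lines would work for any article that follows the same structure, which is the practical benefit of making data machine-accessible at the time of publication.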

While BLR is expected to open up millions of scientific illustrations and descriptions, the system is unique in that it makes all the extracted data findable, accessible, interoperable and reusable (FAIR), as well as open to anybody, anywhere, at any time. Most of all, its purpose is to create a novel way to access scientific literature.

To date, BLR has extracted ~350,000 taxonomic treatments and ~200,000 figures from over 38,000 publications. This includes the descriptions of 55,800 new species, 3,744 new genera, and 28 new families. BLR has contributed to the discovery of over 30% of the ~17,000 species described annually.

Prof. Lyubomir Penev, founder and CEO of Pensoft, says:

“It is such a great satisfaction to see how the development process of the TaxPub standard, started by Plazi some 15 years ago and implemented as a routine publishing workflow at Pensoft’s journals in 2010, has now resulted in an entire infrastructure that allows automated extraction and distribution of biodiversity data from various journals across the globe. With the recent announcement from the Consortium of European Taxonomic Facilities (CETAF) that their European Journal of Taxonomy is joining the TaxPub club, we are even more confident that we are paving the right way to fully grasping the dimensions of the world’s biodiversity.”

Dr Donat Agosti, co-founder and president of Plazi, adds:

“Finally, information technology allows us to create a comprehensive, extended catalogue of life and bring to light this huge corpus of cultural and scientific heritage – the description of life on Earth – for everybody. The nature of taxonomic treatments as a network of citations and syntheses of what scientists have discovered about a species allows us to link distinct fields such as genomics and taxonomy to specimens in natural history museums.”

Dr Tim Smith, Head of Collaboration, Devices and Applications Group at CERN, comments:

“Moving the focus away from the papers, where concepts are communicated, to the concepts themselves is a hugely significant step. It enables BLR to offer a unique new interconnected view of the species of our world, where the taxonomic treatments, their provenance, histories and their illustrations are all linked, accessible and findable. This is inspirational for the digital liberation of other fields of study!”

###

Additional information:

BLR is a joint project led by Plazi in partnership with Pensoft and Zenodo at CERN.

Currently, BLR is supported by a grant from Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin.

‘Insectageddon’ is ‘alarmist by bad design’: Scientists point out the study’s major flaws

Many insect species require pristine environments, including old-growth forests. Photo by Atte Komonen.

Earlier this year, a research article triggered a media frenzy by predicting that as a result of an ongoing rapid decline, nearly half of the world’s insects will be no more pretty soon

Amidst worldwide publicity and talk of ‘Insectageddon’, the extinction of 40% of the world’s insects as estimated in a recent scientific review, a critical response was published in the open-access journal Rethinking Ecology.

Query- and geographically-biased summaries; mismatch between objectives and cited literature; and misuse of existing conservation data have all been identified in the alarming study, according to Drs Atte Komonen, Panu Halme and Janne Kotiaho of the University of Jyväskylä (Finland). Despite the claims of the review paper’s authors that their work serves as a wake-up call for the wider community, the Finnish team explain that it could rather compromise the credibility of conservation science.

The first problem about the paper, titled “Worldwide decline of the entomofauna: A review of its drivers” and published in the journal Biological Conservation, is that its authors have queried the Web of Science database specifically using the keywords “insect”, “decline” and “survey”.

“If you search for declines, you will find declines. We are not questioning the conclusion that insects are declining,” Komonen and his team point out, “but we do question the rate and extent of declines.”

Many butterflies have declined globally. Scolitantides orion, for example, is an endangered species in Finland. Photo by Atte Komonen.

The Finnish research team also note that there are mismatches between methods and literature, and misuse of IUCN Red List categories. The review is criticised for grouping together species whose conservation status according to the International Union for Conservation of Nature (IUCN) is Data Deficient with those deemed Vulnerable. By definition, there are no data for Data Deficient species to assess their declines.

In addition, the review paper is seen to use “unusually forceful terms for a peer-reviewed scientific paper,” as the Finnish researchers quote a recent news story published in The Guardian. Having given the words dramatic, compelling, extensive, shocking, drastic, dreadful, devastating as examples, they add that such strong intensifiers “should not be acceptable” in research articles.

“As actively popularising conservation scientists, we are concerned that such development is eroding the importance of the biodiversity crisis, making the work of conservationists harder, and undermining the credibility of conservation science,” the researchers explain the motivation behind their response.

###

Original source:

Komonen A, Halme P, Kotiaho JS (2019) Alarmist by bad design: Strongly popularized unsubstantiated claims undermine credibility of conservation science. Rethinking Ecology 4: 17-19. https://doi.org/10.3897/rethinkingecology.4.34440

First-ever fern checklist for Togo to help decision makers in the face of threats to biodiversity

Maidenhair fern (Adiantum schweinfurthii) occurring in dense forests.

Ferns and their allied species, which together comprise the pteridophytes, are vascular non-flowering plants that reproduce via spores. Many of their species are admired for their aesthetics.

However, despite being excellent bioindicators that allow scientists and decision-makers to monitor the state of ecosystems in the face of climate change and the global biodiversity crisis, these species are too often overlooked due to their relatively small size and lack of vivid colours.

Spike moss (Selaginella versicolor) with a preference for very humid and shaded forests.

To bridge the existing gaps in the knowledge about the diversity of ferns and their allied species, while also seeking to identify the ways these plants select their habitats and react to the changes occurring there later on, a research team from Togo and France launched an ambitious biodiversity project in 2013. As for the setting of their long-term study, they chose Togo – an amazingly species-rich country in Western Africa, whose flora expectedly turned out to be hugely understudied.

Having concluded their fern project in 2017, scientists Komla Elikplim Abotsi and Kouami Kokou from the Laboratory of Forestry Research, University of Lomé, Togo, who teamed up with Jean-Yves Dubuisson and Germinal Rouhan, both affiliated with the Institute of Systematics Evolution and Biodiversity (UMR 7205), France, have their first findings published in a taxonomic paper in the open access Biodiversity Data Journal.

In this first-of-a-kind checklist of Togolese ferns, the researchers record as many as 73 species previously not known to inhabit the country, including 12 species introduced for horticultural purposes. As a result of their 4-year study, the pteridophyte diversity of Togo – a country barely taking up 56,600 km² – now counts a total of 134 species.

Still, the authors believe that there are even more species waiting to be discovered at both the national and global level.

“Additional investigations in the difficult to access areas of the far north of the country, and Togo Mountains are still needed to fill possible biodiversity data gaps and enable decision-makers to make the right decisions,” say the researchers.

The triangular staghorn species Platycerium stemaria living on a coffee tree branch.

In addition to their taxonomic paper, the authors are also set to publish an illustrated guide to the pteridophytes of Togo, in order to familiarise amateur botanists with this fascinating biodiversity.

 

Original source:
Abotsi KE, Kokou K, Dubuisson J-Y, Rouhan G (2018) A first checklist of the Pteridophytes of Togo (West Africa). Biodiversity Data Journal 6: e24137. https://doi.org/10.3897/BDJ.6.e24137

Special Nature Conservation issue: Monitoring protected insects in the European Union

A collection of thirteen research papers has been published to address the conservation of saproxylic beetles and other insects listed in the Habitats Directive

With biodiversity loss well underway, conservation measures are urgent on a global scale and the European Union is no exception. However, for efficient strategies and actions to be put in place, plenty of information, acquired primarily through monitoring, is needed to identify priorities for the conservation of threatened species. This also applies to the elusive saproxylic insects, an ecological group of species that depends on dead wood.

Monitoring and conservation of elusive invertebrates is a particularly complex task, as shown in the papers comprising the special issue “Monitoring of saproxylic beetles and other insects protected in the European Union,” supported by the EU’s LIFE Programme and published in the open access journal Nature Conservation. This special issue was produced in the framework of the Life Project “Monitoring of insects with public participation” (LIFE11 NAT/IT/000252 MIPP) and is a direct result of a European Workshop held in Mantova in May, 2017.

Colonel Franco Mason, project manager of the MIPP project, notes that the European Workshop was aimed primarily at monitoring of saproxylic beetles. The project MIPP resulted in two special issues: “Monitoring of saproxylic beetles and other insects protected in the European Union” and “Guidelines for the Monitoring of saproxylic beetles and other insects protected in the European Union”. The first one is now available in the open access journal Nature Conservation.

This is a female European stag beetle equipped with a radio transmitter in order to detect oviposition sites.

“No knowledge exists of the success rate of monitoring elusive invertebrates,” writes Dr. Arno Thomaes, Research Institute for Nature and Forest, Belgium, and his team in their paper assessing the feasibility of monitoring the European stag beetle. Having conducted their analysis, though, the scientists conclude that, “monitoring of stag beetles is feasible and the effort is not greater than that which has been found for other invertebrates.”

Alessandro Campanaro, a researcher at the “Bosco Fontana” National Center of Carabinieri, highlights the fundamental role of Citizen Science as an essential tool for acquiring data on species, while simultaneously increasing the public awareness about Natura 2000 and the role of saproxylic species in forests.

###

 

Additional information:

About the Life project MIPP

The main objective of the project MIPP is to develop and test methods for the monitoring of five beetle species listed in the Annexes II and IV of the Habitats Directive (Osmoderma eremita, Lucanus cervus, Cerambyx cerdo, Rosalia alpina, Morimus funereus).

Cost-benefit analysis of strategies against severely harmful giant hogweed in Germany

While invasive species are considered to be a primary driver of biodiversity loss across the globe, species such as the giant hogweed, which is alien to Germany, pose even greater risks, including health hazards to humans, limited accessibility to sites, trails and amenity areas, as well as ecological damage.

Since 1st January 2015, EU member states have been obligated to develop concrete action plans against the (further) spread of invasive alien species. In order to do so, however, policymakers need adequate data on the current spread, as well as information about the costs and benefits of control measures. Therefore, German researchers analyse the present situation and control measures, as well as the cost-effectiveness of the possible eradication strategies. Their analysis is published in the open access journal NeoBiota.

Largely spread across Germany, the giant hogweed (H. mantegazzianum) grows in a wide range of habitats, including roadsides, grasslands, riparian habitats and woodland margins. The highest invasion percentage (18.5%) was found for abandoned grasslands, field and grassland margins, and tall-forb stands.

While the species poses a serious threat to native biodiversity through competitive displacement of native plants, it is particularly dangerous to human health. Its watery sap contains several chemical agents. In contact with the skin, this sap can cause severe blistering if the person is simultaneously exposed to sunlight. Furthermore, the hypersensitivity of the skin towards sunlight may persist for a number of years. Additionally, the giant hogweed can limit public accessibility to sites, trails and amenity areas, as well as cause ecological damage, such as erosion at riverbanks.

In order to provide policymakers with the information needed for adequate control measures, Dr. Sandra Rajmis from the Julius Kühn-Institute, Dr. Jan Thiele from the University of Münster, and Prof. Dr. Rainer Marggraf from Georg-August-Universität Göttingen examine costs and benefits of controlling giant hogweed in Germany.

To address these challenges, the scientists firstly study the present state and costs of control measures, based on survey data received from German nature authorities. Then, they analyse the identified control options in terms of cost effectiveness with regard to the invaded area types and sizes in the infested German districts. To estimate the benefits of the eradication strategies, they turn to a choice experiment survey conducted in German households.

“Only in light of these findings, policymakers can properly understand about the societal costs and benefits of alternatives and decide about societal favored control options in Germany,” point out the researchers.

The team also notes that cost-effectiveness of eradication strategies depends on the length of the period over which they are implemented and observed.

“As this is the first cost-benefit analysis estimating welfare effects and societal importance of giant hogweed invasion control, it could serve as guideline for assessments of eradication control in other European countries and support the implementation of the EU directive 1143/2014,” they conclude.

###

Original source: Rajmis S, Thiele J, Marggraf R (2016) A cost-benefit analysis of controlling giant hogweed (Heracleum mantegazzianum) in Germany using a choice experiment approach. NeoBiota 31: 19-41. doi: 10.3897/neobiota.31.8103