Audit finds biodiversity data aggregators ‘lose and confuse’ data

In an effort to improve the quality of biodiversity records, the Atlas of Living Australia (ALA) and the Global Biodiversity Information Facility (GBIF) use automated data processing to check individual data items. The records are provided to the ALA and GBIF by museums, herbaria and other biodiversity data sources.

However, an independent analysis of such records reports that ALA and GBIF data processing also leads to data loss and unjustified changes in scientific names.

The study was carried out by Dr Robert Mesibov, an Australian millipede specialist who also works as a data auditor. Dr Mesibov checked around 800,000 records retrieved from the Australian Museum, Museums Victoria and the New Zealand Arthropod Collection. His results are published in the open access journal ZooKeys, and also archived in a public data repository.

“I was mainly interested in changes made by the aggregators to the genus and species names in the records,” said Dr Mesibov.

“I found that names in up to 1 in 5 records were changed, often because the aggregator couldn’t find the name in the look-up table it used.”

Another worrying result concerned type specimens – the reference specimens upon which scientific names are based. On a number of occasions, the aggregators were found to have replaced the name of a type specimen with a name tied to an entirely different type specimen.

The biggest surprise, according to Dr Mesibov, was the major disagreement on names between aggregators.

“There was very little agreement,” he explained. “One aggregator would change a name and the other wouldn’t, or would change it in a different way.”

Furthermore, dates, names and locality information were sometimes lost from records, mainly due to programming errors in the software used by aggregators to check data items. In some data fields the loss reached 100%, with no original data items surviving the processing.

“The lesson from this audit is that biodiversity data aggregation isn’t harmless,” said Dr Mesibov. “It can lose and confuse perfectly good data.”

“Users of aggregated data should always download both original and processed data items, and should check for data loss or modification, and for replacement of names,” he concluded.
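The check Dr Mesibov recommends can be sketched in a few lines. A minimal Python sketch, assuming the original and processed records have already been parsed into lists of dictionaries keyed by Darwin Core fields (the field values and record IDs below are illustrative, not from the audit itself):

```python
def changed_names(original_rows, processed_rows,
                  key="occurrenceID", field="scientificName"):
    """Return (id, original, processed) for records whose name was changed."""
    processed = {r[key]: r for r in processed_rows}
    changes = []
    for r in original_rows:
        p = processed.get(r[key])
        if p is None:
            # Record vanished during processing: a form of data loss.
            changes.append((r[key], r[field], None))
        elif p[field] != r[field]:
            changes.append((r[key], r[field], p[field]))
    return changes

original = [{"occurrenceID": "1", "scientificName": "Ba humbugi"}]
processed = [{"occurrenceID": "1", "scientificName": "Ba humbug"}]
changed_names(original, processed)  # → [("1", "Ba humbugi", "Ba humbug")]
```

Running the same comparison over every field, not just names, would also surface the kinds of data loss the audit reported.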

###

Original source:

Mesibov R (2018) An audit of some filtering effects in aggregated occurrence records. ZooKeys 751: 129-146. https://doi.org/10.3897/zookeys.751.24791

Dispatch from the field II: Students describe an elusive spider while stationed in Borneo

A mystery has long shrouded the orb-weaving spider genus Opadometa, where males and females belonging to one and the same species look nothing alike. Furthermore, the males appear to be so elusive that scientists still doubt whether both sexes are correctly linked to each other even in the best-known species.

Such is the case for Opadometa sarawakensis – a species known only from female specimens. While remarkable for their striking red and blue colors and large size, the females give no hint of the likely appearance of the male Opadometa sarawakensis.

The red and blue female Opadometa sarawakensis

Nevertheless, students taking part in a recent two-week tropical ecology field course organized by the Naturalis Biodiversity Center and Leiden University, and hosted by the Danau Girang Field Centre (DGFC) on the island of Borneo, Malaysia, found a mature male spider hanging on the web of a red and blue female, later identified as Opadometa sarawakensis. Still quite striking, the male was colored in a blend of orange, gray, black, and silver.

On the brink of a long-awaited discovery and eager to describe the male, the students, their lecturers and the field station's scientific staff faced a problem: with difficult species like this orb weaver, they needed strong evidence to prove that the male matched the female from the web. Moreover, molecular DNA-based analysis was not an option at the time, since the necessary equipment was not available at DGFC.

On the other hand, being at the center of the action turned out to have advantages no less persuasive than DNA evidence. Having conducted thorough field surveys in the area, the team concluded that the male's presence on that particular female's web, combined with the fact that no other Opadometa species were found in the area, was enough to establish that the two were indeed representatives of the same species.

Adapting to the quite basic conditions at the DGFC laboratory, the students and their mentors put to use various items they had on hand, including smartphones paired with headlights mounted on gooseneck clips in place of sophisticated cameras.

In the end, they gathered all the necessary data to prepare the formal description of the newly identified male.

Once they had the observations and the data, there was only one question left to answer: how could they submit a manuscript to a scholarly journal, so that their finding would be formally announced and recognised?


Thanks to the elaborate, highly automated workflow of the peer-reviewed open access Biodiversity Data Journal and its underlying ARPHA Writing Tool, the researchers managed to compile their manuscript, including all underlying data such as geolocations, and submit it from the field station. All in all, the authoring, peer review and publication – each step taking place within the ARPHA Platform's single environment – took less than a month to complete. In fact, the paper was published within a few days of submission.

This is the second publication in the series “Dispatch from the field”, resulting from an initiative led by spider taxonomist Dr Jeremy Miller. In 2014, another team of students and their mentors described a curious new species of one-millimetre-long spider from the Danau Girang Field Centre. Both papers showcase the feasibility of publishing and sharing biodiversity data that are easy to find, access and re-use.

“This has been a unique educational experience for the students,” says Jeremy. “They got to experience how tropical field biologists work, which is often from remote locations and without sophisticated equipment. This means that creativity and persistence are necessary to solve problems and complete a research objective. The fact that the students got to participate in advancing knowledge about this remarkable spider species by contributing to a manuscript was really exciting.”

###

Original source:

Miller J, Freund C, Rambonnet L, Koets L, Barth N, van der Linden C, Geml J, Schilthuizen M, Burger R, Goossens B (2018) Dispatch from the field II: the mystery of the red and blue Opadometa male (Araneae, Tetragnathidae, Opadometa sarawakensis). Biodiversity Data Journal 6: e24777. https://doi.org/10.3897/BDJ.6.e24777

Data Quality Checklist and Recommendations at Pensoft

As much as research data sharing and re-usability are staples of open science practice, shared data lose much of their value if their quality is compromised.

At a time when machine readability and the software that relies on it are becoming ever more crucial in science, and data are piling up by the minute, it is essential that researchers format, structure and deposit their data efficiently, so that they remain accessible and re-usable for their successors.

Errors, meaning data items that computer programs fail to read correctly, can easily creep into any dataset. They are as diverse as invalid characters, missing brackets, blank fields and incomplete geolocations.
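Checks of this kind are straightforward to automate. A minimal Python sketch, assuming occurrence data in a CSV with Darwin Core column names (the specific checks and the sample row are illustrative, not Pensoft's actual audit script):

```python
import csv
import io

def audit(csv_text):
    """Flag blank fields, control characters and half-filled coordinates."""
    problems = []
    # start=2 so reported line numbers count the header as line 1.
    for lineno, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        for field, value in row.items():
            if value is None or not value.strip():
                problems.append((lineno, field, "blank field"))
            elif any(ord(c) < 32 for c in value):
                problems.append((lineno, field, "invalid control character"))
        lat = (row.get("decimalLatitude") or "").strip()
        lon = (row.get("decimalLongitude") or "").strip()
        if bool(lat) != bool(lon):
            problems.append((lineno, "decimalLatitude/decimalLongitude",
                             "incomplete geolocation"))
    return problems

sample = "scientificName,decimalLatitude,decimalLongitude\nHomo sapiens,-41.4,\n"
audit(sample)  # flags the blank longitude and the incomplete geolocation
```

A real audit adds many more checks (date formats, truncated values, mismatched names), but the shape is the same: iterate over every field and report anything a downstream program would choke on.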

To summarise the lessons learnt from our extensive experience in biodiversity data audit at Pensoft, we have now included a Data Quality Checklist and Recommendations page in the About section of each of our data-publishing journals.

We are hopeful that these guidelines will help authors prepare and publish datasets of higher quality, so that their work can be fully utilised in subsequent research.

At the end of the day, proofreading your data is no different than running through your text looking for typos.

 

We would like to use the occasion to express our gratitude to Dr Robert Mesibov, who prepared the checklist and whose expertise in biodiversity data audit has contributed greatly to Pensoft through the years.

How the names of organisms help to turn ‘small data’ into ‘Big Data’

Innovation in ‘Big Data’ helps address problems that were previously overwhelming. What we know about organisms is spread across hundreds of millions of pages published over 250 years. New software tools from the Global Names project find scientific names, index digital documents quickly, and correct and update the names they contain. These advances help ‘make small data big’ by linking together the content of many research efforts. The study was published in the open access journal Biodiversity Data Journal.

The ‘Big Data’ vision of science relies on computing resources to capture, manage and interrogate the deluge of information coming from new technologies, from infrastructural projects to digitise physical resources (such as our literature, via the Biodiversity Heritage Library), and from digital versions of specimens and records about specimens held by museums.

Increased bandwidth has made dialogue among distributed data centres feasible and this is how new insights into biology are arising. In the case of biodiversity sciences, data centres range in size from the large GenBank for molecular records and the Global Biodiversity Information Facility for records of occurrences of species, to a long tail of tens of thousands of smaller datasets and web-sites which carry information compiled by individuals, research projects, funding agencies, local, state, national and international governmental agencies.

The large biological repositories do not yet approach the scale of astronomy and nuclear physics, but the very large number of sources in the long tail of useful resources do present biodiversity informaticians with a major challenge – how to discover, index, organize and interconnect the information contained in a very large number of locations.

In this regard, biology is fortunate that, from the middle of the 18th century, the community has accepted the use of latinised binomials such as Homo sapiens or Ba humbugi for species, and that taxonomists compile lists of all such names. Name recognition tools can call on large expert compilations of names (Catalogue of Life, ZooBank, Index Fungorum, Global Names Index) to find matches in sources of digital information. This allows for the rapid indexing of content.

Even when we do not know a name, we can ‘discover’ it, because scientific names have certain distinctive characteristics: they are written in italics and most often consist of two successive words in a latinised form, the first of them capitalised. These properties allow names not yet present in compilations to be discovered in digital data sources.
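As an illustration, a crude version of such a name-discovery heuristic can be written as a regular expression. This Python sketch is ours, not the Global Names implementation, and it deliberately over-matches to show why real tools also rely on dictionaries and context:

```python
import re

# A capitalised word followed by a lower-case latinised word, e.g.
# "Homo sapiens". Note that ordinary sentence openings like "We compared"
# also match, which is why this heuristic needs dictionary filtering.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{2,})\b")

def candidate_names(text):
    """List every candidate binomial found in a plain-text passage."""
    return [genus + " " + epithet for genus, epithet in BINOMIAL.findall(text)]

candidate_names("We compared Homo sapiens with the snail Ba humbugi.")
# → ['We compared', 'Homo sapiens', 'Ba humbugi']
```

The false positive ('We compared') is the point: pattern matching finds candidates, and compilations of known names and surrounding context then decide which candidates are real.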

The idea of a names-based cyberinfrastructure is to use the names themselves to interconnect large and small sites of expert knowledge distributed across the Internet. This is the concept behind the Global Names project, which carried out the work described in this paper.

The effectiveness of such an infrastructure is compromised by the changes to names over time because of taxonomic and phylogenetic research. Names are often misspelled, or there might be errors in the way names are presented. Meanwhile, increasing numbers of species have no names, but are distinguished by their molecular characteristics.

In order to assess the challenge that these problems may present to the realization of a names-based cyberinfrastructure, we compared names from GenBank and DRYAD (a digital data repository) with names from Catalogue of Life to assess how well matched they are.

We found that fewer than 15% of the names in pair-wise comparisons of these data sources could be matched. However, using a names parser to break the scientific names into their component parts, the parts that cause the greatest number of problems could be removed to produce a simplified, canonical version of each name. With such tools, name-matching improved to almost 85%, and in some cases to 100%.
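The canonicalisation step can be sketched as follows. This Python example is a simplified illustration of the idea, not the parser used in the study:

```python
import re

def canonical(name):
    """Reduce a name string to its canonical genus + epithet form."""
    # Drop parenthesised parts (subgenus, authorship in brackets).
    name = re.sub(r"\([^)]*\)", " ", name)
    words = name.replace(",", " ").split()
    # Genus: first capitalised alphabetic word; epithet: first all-lower-case
    # alphabetic word. Trailing authorship ("Linnaeus") and years are ignored.
    genus = next((w for w in words if w[0].isupper() and w.isalpha()), "")
    epithet = next((w for w in words if w.islower() and w.isalpha()), "")
    return (genus + " " + epithet).strip()

canonical("Homo sapiens Linnaeus, 1758")         # → "Homo sapiens"
canonical("Pardosa (Pardosa) amentata (Clerck)") # → "Pardosa amentata"
```

Both strings above would fail an exact-match comparison against a bare binomial, yet both reduce to the same canonical form, which is exactly how matching rates rose from under 15% to almost 85%.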

The study confirms the potential for the use of names to link distributed data and to make small data big. Nonetheless, it is clear that we need to continue to invest in more and better names-management software specially designed to address the problems of the biodiversity sciences.

###

Original source:

Patterson D, Mozzherin D, Shorthouse D, Thessen A (2016) Challenges with using names to link digital biodiversity information. Biodiversity Data Journal 4: e8080. https://doi.org/10.3897/BDJ.4.e8080

Additional information:

The study was supported by the National Science Foundation.

How to import occurrence records into manuscripts from GBIF, BOLD, iDigBio and PlutoF

On October 20, 2015, we published a blog post about the novel functionality in ARPHA that allows streamlined import of specimen or occurrence records into taxonomic manuscripts.

Recently, this process was documented in the “Tips and Tricks” section of the ARPHA authoring tool, where the individual workflows are listed.

Based on our earlier post, we will now go through our latest updates and highlight the new features that have been added since then.

Repositories and data indexing platforms, such as GBIF, BOLD systems, iDigBio, or PlutoF, hold, among other types of data, specimen or occurrence records. It is now possible to directly import specimen or occurrence records into ARPHA taxonomic manuscripts from these platforms [see Fig. 1]. We’ll refer to specimen or occurrence records as simply occurrence records for the rest of this post.

[Fig. 1] Workflow for directly importing occurrence records into a taxonomic manuscript.
Until now, when users of the ARPHA writing tool wanted to include occurrence records as materials in a manuscript, they would have had to format the occurrences as an Excel sheet that is uploaded to the Biodiversity Data Journal, or enter the data manually. While the “upload from Excel” approach significantly simplifies the process of importing materials, it still requires a transposition step – the data which is stored in a database needs to be reformatted to the specific Excel format. With the introduction of the new import feature, occurrence data that is stored at GBIF, BOLD systems, iDigBio, or PlutoF, can be directly inserted into the manuscript by simply entering a relevant record identifier.

The functionality shows up when one creates a new “Taxon treatment” in a taxonomic manuscript in the ARPHA Writing Tool. To import records, the author needs to:

  1. Locate an occurrence record or records in one of the supported data portals;
  2. Note the ID(s) of the records that ought to be imported into the manuscript (see Tips and Tricks for screenshots);
  3. Enter the ID(s) of the occurrence record(s) in a form that is to be seen in the “Materials” section of the species treatment;
  4. Select the particular database from the list, and simply click ‘Add’ to import the occurrence directly into the manuscript.

In the case of BOLD Systems, the author may also enter a Barcode Identification Number (BIN; more on BINs below), which then pulls in all occurrences in the corresponding BIN.

We will illustrate this workflow by creating a fictitious treatment of the red moss, Sphagnum capillifolium, in a test manuscript. We have started a taxonomic manuscript in ARPHA and know that occurrence records belonging to S. capillifolium can be found on iDigBio. What we need is the ID of the occurrence record from the iDigBio webpage; in the case of iDigBio, ARPHA supports import via a Universally Unique Identifier (UUID). We have already created a treatment for S. capillifolium and clicked on the pencil to edit materials [Fig. 2].

[Fig. 2] Edit materials
In this example, type or paste the UUID (b9ff7774-4a5d-47af-a2ea-bdf3ecc78885), select the iDigBio source and click ‘Add’. This will pull the occurrence record for S. capillifolium from iDigBio and insert it as a material in the current paper [Fig. 3].
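Behind the scenes, a record like this can also be retrieved directly from iDigBio's public search API. A minimal Python sketch, with the endpoint shape taken from iDigBio's API documentation (treat it as an assumption if the API has since changed; the UUID is the one from the example above):

```python
import json
import urllib.request

# Base URL of iDigBio's record "view" endpoint (assumed from its API docs).
IDIGBIO_VIEW = "https://search.idigbio.org/v2/view/records/"

def record_url(uuid):
    """Build the view URL for a single iDigBio occurrence record."""
    return IDIGBIO_VIEW + uuid

def fetch_record(uuid):
    """Download and parse one record as JSON (requires network access)."""
    with urllib.request.urlopen(record_url(uuid)) as resp:
        return json.load(resp)

record_url("b9ff7774-4a5d-47af-a2ea-bdf3ecc78885")
```

ARPHA presumably performs an equivalent lookup when you click ‘Add’: resolve the identifier against the selected portal, then map the returned fields into the Materials section.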

[Fig. 3] Materials after they have been imported
This workflow can be used for a number of purposes. An interesting future application is the rapid re-description of species, but even more exciting is the description of new species from BINs. BINs (Barcode Identification Numbers) delimit Operational Taxonomic Units (OTUs), created algorithmically at BOLD Systems. If a taxonomist decides that an OTU is indeed a new species, they can import all the type information associated with that OTU for the purposes of describing it as a new species.

By not having to retype or copy/paste species occurrence records, authors save a great deal of effort. Moreover, the records are imported in a structured Darwin Core format, which can easily be extracted from the article text as structured data by anyone who needs it for reuse.

Another important aspect of the workflow is that it will serve as a platform for the peer review, publication and curation of raw data, that is, of unpublished individual data records coming from collections or observations stored at GBIF, BOLD, iDigBio and PlutoF. Taxonomists are used to publishing only records of specimens they or their co-authors have personally studied. In a sense, the workflow will serve as a “cleaning filter” for the portions of data that pass through the publishing process. Thereafter, the published records can be used to curate the raw data in collections, e.g. by correcting identifications, assigning newly described species names to specimens belonging to the respective BINs, and so on.

 

Additional Information:

The work has been partially supported by the EC-FP7 EU BON project (ENV 308454, Building the European Biodiversity Observation Network) and the Horizon 2020 ITN project BIG4 (Biosystematics, informatics and genomics of the big 4 insect groups: training tomorrow’s researchers and entrepreneurs), under Marie Skłodowska-Curie grant agreement No. 642241.