A data paper at the click of a button: Streamlining metadata conversion into scholarly manuscripts for GBIF and DataONE data

At the time of writing, the Biodiversity Information Standards conference, TDWG 2015, is under way in Kenya and anyone around the world can listen to the live audio stream. Data sharing, data re-use, and data discovery are being brought up in almost every talk. We might have entered the age of big data twenty years ago, but it is only now that scientists face the real challenge: storing and searching through the deluge of data to find what they need.

The rate at which we generate data is growing exponentially and exceeds the rate at which data storage technologies improve, which poses a serious challenge for the field of data management. Worse, it means that the more new data are generated, the more of the older data will be lost. To decide what to keep and what to delete, we need to describe the data as thoroughly as possible and judge the importance of each dataset. This post is about a novel way to automatically generate scientific papers that describe a dataset, referred to here as data papers.

The common characteristics of the records (descriptions of the object of study, the measurement apparatus, the statistical summaries used to quantify the records, the personal notes of the researcher, and so on) are called metadata. Major web portals such as DataONE or the Global Biodiversity Information Facility store the metadata for a given dataset as one or more text files, usually structured in special formats that allow the metadata to be parsed by algorithms.
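To give a feel for what "parsed by algorithms" means in practice, here is a minimal sketch in Python that reads the title, creators and abstract out of a small metadata record written in the Ecological Metadata Language (EML), the XML-based standard discussed later in this post. The record below is hand-written and hypothetical, the function name summarize_eml is our own, and this is only an illustration, not how ARPHA itself reads metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written EML 2.1.1 record; not taken from any real dataset.
SAMPLE_EML = """\
<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
         packageId="example.1.1" system="example">
  <dataset>
    <title>Hypothetical butterfly survey</title>
    <creator>
      <individualName><givenName>Ada</givenName><surName>Example</surName></individualName>
    </creator>
    <creator>
      <individualName><givenName>Ben</givenName><surName>Sample</surName></individualName>
    </creator>
    <abstract><para>Counts of butterflies along forest edges.</para></abstract>
  </dataset>
</eml:eml>
"""


def summarize_eml(xml_text):
    """Return the title, creator names and abstract found in an EML string."""
    root = ET.fromstring(xml_text)
    # In EML 2.1.x only the root element is namespaced; <dataset> is not.
    dataset = root.find("dataset")
    if dataset is None:
        raise ValueError("no <dataset> element found")
    authors = []
    for creator in dataset.findall("creator"):
        given = creator.findtext("individualName/givenName", default="")
        sur = creator.findtext("individualName/surName", default="")
        name = f"{given} {sur}".strip()
        if name:
            authors.append(name)
    return {
        "title": dataset.findtext("title", default="").strip(),
        "authors": authors,
        "abstract": dataset.findtext("abstract/para", default="").strip(),
    }


summary = summarize_eml(SAMPLE_EML)
print(summary["title"])
print(", ".join(summary["authors"]))
```

Real metadata files are usually much richer (geographic coverage, methods, contacts, and so on), but the principle is the same: because the structure is predictable, each piece can be mapped to a section of a manuscript.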

To make the metadata and the corresponding datasets discoverable and citable, the concept of the data paper was introduced in the early 2000s by the Ecological Society of America. Chavan and Penev (2011) brought it to the attention of the biodiversity community with a new data paper concept based on a metadata standard, such as the Ecological Metadata Language (EML), and derived from metadata content stored at large data platforms, in this case the Global Biodiversity Information Facility (GBIF). You can read this article for an in-depth discussion of the topic.

Pensoft’s Biodiversity Data Journal (BDJ) is to the best of our knowledge the first academic journal to have implemented a one-hundred-percent online authoring system for data papers, called ARPHA. Moreover, BDJ and the other Pensoft journals, such as ZooKeys, have already published more than seventy data papers. Therefore, in the remainder of this post we will explain how to use an automated approach to publish a data paper describing an online dataset in Biodiversity Data Journal. The ARPHA system will convert the metadata describing your dataset into a manuscript for you after reading in the metadata! We will illustrate the workflow on the previously mentioned DataONE and GBIF.

The Data Observation Network for Earth (DataONE) is a distributed cyberinfrastructure funded by the U.S. National Science Foundation. It links together over twenty-five nodes, primarily in the U.S., hosting biodiversity and biodiversity-related data, and provides an interface to search for data in all of them.

Since butterflies are neat, let’s search for datasets about butterflies at DataONE! Type “Lepidoptera” in the search field and scroll down to the dataset describing “The Effects of Edge Proximity on Butterfly Biodiversity.” You should see something like this:

ONEMercury

As you can see, this resource has two objects associated with it: metadata, which has been highlighted, and the dataset itself. Let’s download the metadata from the cloud! The resulting text file, “Blandy.235.1.xml”, or whatever you want to call it, can be read by humans, but is somewhat cryptic because of all the XML tags. Now you can import this file into the ARPHA writing platform, and the information stored in it will be used to create a data paper! Go to the ARPHA website, pwt.pensoft.net, and click on “Start a manuscript,” then scroll all the way down and click on “Import manuscript.”

ARPHA Import

Upload the “blandy” file and you will see an “Authors’ page,” where you can select which of the authors mentioned in the metadata should be included as authors of the data paper itself. Note that the ARPHA user uploading the metadata is added to the list of authors if they are not already included in the metadata. After the selection is done, the system creates a scholarly article with the information from the metadata already placed in the respective sections of the article:

ARPHA Manuscript

Now the authors can add some description, edit out errors, tell a story, cite someone; in short, do whatever it takes to produce a high-quality scholarly text, all without leaving ARPHA. After they are done, they can submit their article for peer review and it could be published in a matter of hours. Voilà!

Let’s look at GBIF. Go to “Data -> Explore by country” and select “Saint Vincent and the Grenadines,” an English-speaking Caribbean island country. There are, at the time of writing, 166 occurrence datasets containing data about the islands. Select the dataset from the Museum of Comparative Zoology at Harvard. If you scroll down, you will see the GBIF annotated EML. Download this as a separate text file (if you are in Chrome you can view the source and then copy and paste it). Follow exactly the same steps as before: go to “Import manuscript” in ARPHA and upload the EML file. The result should be something like this, ready to finalize:
ARPHA Manuscript 2

Now, allow us to give a disclaimer here: the authors of this blog post have nothing to do with the two datasets. They have not contributed to any of them, nor do they know the authors. The datasets have been chosen more or less randomly since the authors wanted to demonstrate the functionality with a real-world example. You should only publish data papers if you know the authors or you are the author of the dataset itself. During the actual review process of the paper, the authors that have been included will get an email from the journal!

Having said that, we want to leave you with some caveats and topics for further discussion. To date, useful and descriptive metadata have not always been available. There are two challenges: metadata completeness and metadata standards. The invention of the EML standard was one of the first efforts to standardize how metadata should be stored in the field of ecology and biodiversity science. Currently, our import system supports the two latest versions of the EML standard, 2.1.1 and 2.1.0, but we hope to expand this functionality. In an upcoming version of its search interface, DataONE will provide infographics on the prevalence of metadata standards at its site (see figure). There is still work to be done, but if there is positive feedback from the community, we will definitely keep expanding this feature.

DataONE

Credit: DataONE
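If you want to check in advance whether a metadata file uses one of the two supported EML versions (2.1.0 and 2.1.1), the following Python sketch may help. It assumes that the version can be read from the namespace of the root element, which has the form eml://ecoinformatics.org/eml-2.1.1 in EML 2.x documents. The function names are our own, and the check that ARPHA performs internally may well differ.

```python
import xml.etree.ElementTree as ET

# Versions named in this post as supported by the import system.
SUPPORTED_EML_VERSIONS = {"2.1.0", "2.1.1"}

# Assumption: EML 2.x encodes its version in the root namespace URI.
EML_NAMESPACE_PREFIX = "eml://ecoinformatics.org/eml-"


def eml_version(xml_text):
    """Return the EML version string from the root namespace, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace}localname".
    if not root.tag.startswith("{"):
        return None
    namespace = root.tag[1:].split("}", 1)[0]
    if not namespace.startswith(EML_NAMESPACE_PREFIX):
        return None
    return namespace[len(EML_NAMESPACE_PREFIX):]


def is_supported(xml_text):
    """True if the document declares one of the supported EML versions."""
    return eml_version(xml_text) in SUPPORTED_EML_VERSIONS


# Two tiny hypothetical documents that differ only in the declared version.
NEW_DOC = '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"><dataset/></eml:eml>'
OLD_DOC = '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"><dataset/></eml:eml>'

print(eml_version(NEW_DOC), is_supported(NEW_DOC))
print(eml_version(OLD_DOC), is_supported(OLD_DOC))
```

A file that fails this check is not necessarily unusable, but it is a hint that the import may not pick up all of its content.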

Regarding metadata completeness, our hope is that by enabling scientists to create scholarly papers from their metadata with a single-step process, they will be incentivized to produce high-quality metadata.

This project has received funding from the European Union’s FP7 project EU BON (Building the European Biodiversity Observation Network), grant agreement No 308454, and Horizon 2020 research and innovation project BIG4 (Biosystematics, informatics and genomics of the big 4 insect groups: training tomorrow’s researchers and entrepreneurs) under the Marie Skłodowska-Curie grant agreement No. 542241 for a PhD project titled Technological Implications of the Open Biodiversity Knowledge Management System.

How small is the smallest? New record of the tiniest free-living insect provides precision

The long-lasting search and debate over the size and identity of the World’s smallest free-living insect seems to have been settled with the precise measurement and second record of a featherwing beetle species.

Described back in 1999 from only several specimens found in Nicaragua, the minute beetle species has recently been retrieved from Colombia, where as many as 85 individuals were collected and thoroughly examined. The smallest of them measured an astounding 0.325 mm. The finding, made by Dr. Alexey Polilov, Lomonosov Moscow State University, Moscow, is available in the open access journal ZooKeys.

The World’s smallest beetle and tiniest non-parasitoid insect, called Scydosella musawasensis, is morphologically characterised by its elongated oval body, yellowish-brown colouration and antennae split into 10 segments. It is also the only representative of this featherwing beetle genus.

Unable to precisely measure the preserved specimens because they were embedded in preparations for microscopy studies, Dr. Polilov used new individuals collected in Chicaque National Park, Colombia, in early 2015. To establish the length of the smallest one as 325 µm (0.325 mm), the scientist used specialised software and digital micrographs.

The recent survey is the second record of the tiny beetle species and shows that its distribution range is actually much wider than previously known. Consequently, so is the range of the fungi that the insect feeds on.

###

Original source:

Polilov AA (2015) How small is the smallest? New record and remeasuring of Scydosella musawasensis Hall, 1999 (Coleoptera, Ptiliidae), the smallest known free-living insect. ZooKeys 526: 61-64. doi: 10.3897/zookeys.526.6531 ZooBank: urn:lsid:zoobank.org:pub:E38CA5AE-0C65-45D0-9116-E74A1E889BDE

Novel cybercatalog of flower-loving flies suggests the digital future of taxonomy

Charting Earth’s biodiversity is the goal of taxonomy and to do so the scientists need to create an extensive citation network based on several hundred million pages of scientific literature. By providing a novel taxonomic ‘cybercatalog’ of southern African flower-loving (apiocerid) flies, Drs. Torsten Dikow and Donat Agosti demonstrate how the network of taxonomic knowledge can be made available through links provided to online data providers. Their work is available in the open-access Biodiversity Data Journal.

The present research shows that the information can be made available not only to the reader who follows the links, but also to machines that use the growing number of digital, online resources linked through persistent identifiers.

Primary data providers for taxonomic information such as species names (ZooBank), specimen images (Morphbank), species descriptions (Plazi), and digitized literature (BHL, Biodiversity Heritage Library; BioStor; and BLR, Biodiversity Literature Repository) play an important role in making data on species available in electronic form. Aggregators such as the Global Biodiversity Information Facility (GBIF) and the Encyclopedia of Life (EoL) gather this information automatically to distribute it even further to audiences beyond the reach of the life sciences.

In contrast to previous species catalogs, in cybercatalogs access to information is provided through links to open-access, online data repositories such as the ones listed above. Taxonomists and other users can now access this literature, species descriptions, and specimen records immediately without a search in a natural history library or collection. The cybercatalog takes advantage of a new publishing platform within the Biodiversity Data Journal that makes it easy to upload species information and links to data about these species through a CheckList template. Furthermore, the Biodiversity Data Journal now allows future updates and re-publications of the cybercatalog with the new unique persistent identifier (DOI, Digital Object Identifier) whenever a new species is described or other taxonomic changes take place.

The authors argue that cybercatalogs are indeed the future of taxonomic catalogs since the online data in them are easily accessible to anyone.

“It is a taxonomist’s dream to have online access to all previously published information on a species and through this step the discipline of taxonomy can (re-)position itself as a central resource within the life sciences and beyond to the public and society at large,” add the authors. “Online access will also help to narrow the gap between the South and the North as a fantastic example of unhindered access to our knowledge of the global biological diversity, which is increasingly under pressure from human populations.”

###

For the realization of this project Plazi and Pensoft were partially supported by the EC-FP7 EU BON project (ENV 308454) (Building the European Biodiversity Observation Network).

###

Original source:

Dikow T, Agosti D (2015) Utilizing online resources for taxonomy: a cybercatalog of Afrotropical apiocerid flies (Insecta: Diptera: Apioceridae). Biodiversity Data Journal 3: e5707. doi: 10.3897/BDJ.3.e5707

One new fly species, zero dead bodies: First insect description solely from photographs

Whether dead specimens must be collected to verify a new species has been the subject of heated discussion for quite a while now. Amid opinions ranging from specimen collection being “no longer required” to reliance on anything but physical evidence being mere “malpractice,” science is now witnessing the first description of an insect species based solely on high-resolution photographs.

The unequivocally new bee fly species belongs to an extremely rare genus and was described by Drs. Stephen A. Marshall from the University of Guelph, Canada, and Neal Evenhuis from the Bishop Museum, Hawaii. Their research along with their commentary on the controversial topic are published in the open-access journal ZooKeys.

While the authors in no way denounce dead specimen collection and dissection, and even speak of it as the “gold standard” in new species description, they stress that, given the ever-increasing difficulty of obtaining permits to collect in many areas and the resulting low probability of collecting and preserving specimens, there ought to be an alternative.

The newly described bee fly species, called Marleyimyia xylocopae, is a huge fly with a remarkable resemblance to a co-occurring carpenter bee. The new species might be a parasite of the bee, but not much is known about its behaviour. Therefore, the scientists stress that more observations are needed, something that will be encouraged by the availability of a name and an associated image.

Speaking of their own experience while studying their presently described new species, the scientists point out that relying on several high-resolution photographs has not only increased their knowledge of the biodiversity of the area and the genus, but has also provided some “interesting ecological and biological information”.

“As these image collections become curated just as dead specimens are curated today, the digital specimens will find their way into the work of practicing taxonomists, and they will need names,” the team explained. “It is unrealistic to think that distinct and diagnosable new taxa known only from good photographs and appropriate associated metadata should be organized and referred to only as ‘undescribed species’ when they can and should be organized and named using the existing rules of nomenclature.”

###

Original source:

Marshall SA, Evenhuis NL (2015) New species without dead bodies: a case for photo-based descriptions, illustrated by a striking new species of Marleyimyia Hesse (Diptera, Bombyliidae) from South Africa. ZooKeys 525: 117-127. doi: 10.3897/zookeys.525.6143

Saucer-like shields protect 2 new ‘door head’ ant species from Africa and their nests

Shaped like saucers, or concave shields, and covered with camouflaging layers of debris, the heads of two “door head” ants set them apart as new species. The ants use these peculiar features to block the entrances of their nests against intruders such as other predatory ants and invertebrates.

Being only the second case of such highly specialized morphologies discovered in Africa, the new representatives of the genus Carebara have been retrieved from sifted leaf-litter collected in rainforests in Western Kenya and the Ivory Coast.

Because of difficulties usually met while studying and identifying ants through dry specimens retrieved from standardised, passive collection methods, the two new species have so far been taxonomically misplaced. The new discovery was made by an international research team, led by Dr. Georg Fischer and Prof. Evan Economo, Okinawa Institute of Science and Technology Graduate University, Japan. The findings are available in the open access journal ZooKeys.

The “door head” ant individuals are a special worker subcaste that stands out among the colony’s other workers, who are responsible for vital tasks such as foraging and brood care. Dr. Georg Fischer and his colleagues analysed the newly described species with next-generation DNA sequencing to show that all the different subcastes belong to the same species, despite their highly differing morphologies.

To ensure the safety of their nestmates, the queen and the larvae, the two new species have evolved the special worker subcaste with heads covered by a layer of debris such as soil or even organic material, so that they blend in with their surroundings. While the shape of their heads allows them to fit perfectly into the nest entrance, the special armor shields their vulnerable eyes, antennae and mouthparts, and greatly reduces the chance of enemies intruding into the nest.

The new Carebara species have been given the names C. phragmotica and C. lilith. The former is derived from the term phragmosis, in relation to the special function of their head shape, while the latter comes from the name of a female demon in Jewish mythology.

###

Original source:

Fischer G, Azorsa F, Hita Garcia F, Mikheyev AS, Economo EP (2015) Two new phragmotic ant species from Africa: morphology and next-generation sequencing solve a caste association problem in the genus Carebara Westwood. ZooKeys 525: 77-105. doi: 10.3897/zookeys.525.6057

Night calls reveal two new rainforest arboreal frog species from western New Guinea

Tracked by their calls at night after heavy rains, two species of narrow-mouthed frogs have been recorded as new. During the examinations it turned out that one of the studied specimens is a hermaphrodite and another represents the first record of the genus Cophixalus for Misool Island.

The field work, conducted by Steve Richards, South Australian Museum, Adelaide, and his team, took place in the Raja Ampat Islands, in the Indonesian part of New Guinea. Their findings, compiled by Dr. Rainer Guenther, Museum für Naturkunde, Berlin, are available in the open access journal Zoosystematics and Evolution.

Belonging to the narrow-mouthed frog genus Cophixalus that occurs mainly in New Guinea and northern Australia, the two new species have been differentiated by their morphological features along with the specificity of their advertisement calls, produced by males to attract their partners. Both are characterised by small and slender bodies, measuring less than 23 mm in length.

Curiously enough, when dissected, one of the male specimens assigned to the new species C. salawatiensis revealed a female reproductive system with well-developed eggs. At the same time, neither its sound-producing organs nor its calls differed in any way from those of the other observed males of the same species. Therefore, it is to be considered a hermaphrodite.

Both new frog species have been retrieved from logged lowland rainforests. There the scientists noted that after heavy rains at night the males perched on the leaves of bushes and produced sounds characteristic of each species.

All specimens have been placed in the collection of the Museum Zoologicum Bogoriense (MZB) in Cibinong (Bogor), Indonesia.

###

Original source:

Guenther R, Richards S, Tjaturadi B, Krey K (2015) Two new species of the genus Cophixalus from the Raja Ampat Islands west of New Guinea (Amphibia, Anura, Microhylidae). Zoosystematics and Evolution 91(2): 199-213. doi: 10.3897/zse.91.5411

Orchid known from flower stalls as ‘Big Pink’ proves to be an undescribed wild species

Easy as it might seem, seeking new species among cultivated plants can actually be quite tricky. While looking into the undescribed orchid known at the market as ‘Big Pink’, Bobby Sulistyo and his team could well have found yet another man-made hybrid. Instead, they are now describing as new a wild orchid species that has been sitting at the flower stalls since 2013. The story behind their discovery is published in the open access journal PhytoKeys.

While a cultivated plant might be quite a motivator and serve as a starting point for scientific quests around the world, the assumption that one has found a new species at the florist’s could easily be wrong. Not only is the place of origin written on the label often doubtful, but there is always the chance of accidentally describing a man-made hybrid as a new species.

Such could have been the case for Bobby Sulistyo and his team when they discovered that the relatives of ‘Big Pink’ they were surveying could also produce human-assisted hybrids, something previously assumed impossible. Moreover, both specimens they had at hand were of uncertain origin.

However, the scientists conducted a series of sophisticated DNA analyses and concluded, first, that ‘Big Pink’ is a separate species within its genus and, second, that there is no evidence of it being an artificial hybrid. Eventually, the species was found in the wild as well. As a result, the orchid species was given the official name Dendrochilum hampelii.

In the wild, ‘Big Pink’ is found at around 1,200 m above sea level in the Philippines, where it harmlessly plants its roots on tree trunks and branches among mosses.

So far, little is known about the orchid’s distribution in nature, so the researchers suggest that its conservation status be considered Data Deficient according to the IUCN Red List of Threatened Species (IUCN 2012).

###

Original source:

Sulistyo B, Boos R, Cootes J, Gravendeel B (2015) Dendrochilum hampelii (Coelogyninae, Epidendroideae, Orchidaceae) traded as ‘Big Pink’ is a new species, not a hybrid: evidence from nrITS, matK and ycf1 sequence data. PhytoKeys 56: 83-97. doi:10.3897/phytokeys.56.5432

Tiny, record-breaking Chinese land snails fit almost 10 times into the eye of a needle

Minuscule snails defy current knowledge and scientific terminology about terrestrial “microsnails”. While examining soil samples collected from the base of limestone rocks in Guangxi Province, Southern China, scientists Barna Páll-Gergely and Takahiro Asami from Shinshu University, Adrienne Jochum, University and Natural History Museum of Bern, and András Hunyadi found several minute, empty, light grey shells, which measured an astounding height of less than 1 mm.

The single known shell of Angustopila dominikae, named after the wife of the first author, measured a mere 0.86 mm in shell height. Thus, it is considered to be perhaps the World’s smallest land snail species when focusing on the largest diameter of the shell. With very few reported instances of species demonstrating this degree of tininess, the team have described a total of seven new land snail species in their paper, published in the open access journal ZooKeys.

Another of the herein described new species, called Angustopila subelevata, measured 0.83-0.91 mm (mean = 0.87 mm) in height.

Two of the authors have previously described other species of tiny land snails from China and Korea in the same journal.

In their present paper, Dr. Páll-Gergely and his team also discuss the challenges faced by scientists surveying small molluscs, since finding living specimens is still very difficult. Thus, the evolutionary relationships between these species, as well as the number of existing species, are still poorly known.

“Extremes in body size of organisms not only attract attention from the public, but also incite interest regarding their adaptation to their environment,” remind the researchers. “Investigating tiny-shelled land snails is important for assessing biodiversity and natural history as well as for establishing the foundation for studying the evolution of dwarfism in invertebrate animals.”

“We hope that these results provide the taxonomic groundwork for future studies concerning the evolution of dwarfism in invertebrates,” they conclude.

###

Original source:

Páll-Gergely B, Hunyadi A, Jochum A, Asami T (2015) Seven new hypselostomatid species from China, including some of the world’s smallest land snails (Gastropoda, Pulmonata, Orthurethra). ZooKeys 523: 31-62. doi: 10.3897/zookeys.523.6114

Diggers from down under: 11 new wasp species discovered in Australia

After being mostly neglected for more than a hundred years, a group of digger wasps from Australia has been given a major overhaul in terms of species descriptions and identification methods. This approach has led to an almost 50% rise in the number of recognized species of these wasps on the continent. The study was published in the open access journal ZooKeys.

In the US they go by names such as “Great Golden Digger” or “Great Black Wasp”, and for good reason. However, some of these digger wasp species impress not only with their looks, but also with their wide distribution. Members of the wasp genus Sphex can be found in almost every area of the world. Two researchers from the Museum für Naturkunde in Berlin, Thorleif Dörfel and Dr. Michael Ohl, have now reexamined the species diversity of Sphex in Australia.

More than a century has passed since the last revision of this group Down Under. Using pinned, dried individuals from museum collections all over the world, Dörfel and Ohl inspected over 900 specimens and recorded the morphological characters that they deemed most useful for species differentiation.

The lifestyle of some species in the genus Sphex is very different from what most people imagine on hearing the term “wasp”. Not being eusocial, each female constructs a separate, subterranean nest for her offspring, which is then filled with grasshoppers (or other insects, depending on the wasp species) that have been paralyzed by a sting to serve as a food supply for the larvae. These wasps avoid contact with humans and generally do not show aggressive behavior toward us.

With 23 species known from Australia before this study, now the number has risen to 34. Most of these newly discovered species come from large quantities of material which had not been identified up to species level before. Dörfel and Ohl’s work also provides an up-to-date identification key, both in a regular and in an interactive form, that covers all known Australian species of the genus. Specifically designed to be easily usable and containing many helpful images, it can be utilized by anyone with even minimal prior training.

“Many insect groups are in urgent need of a revision or reclassification”, explained Thorleif Dörfel. “Our understanding of ecosystems depends on the ability to identify the species that are a part of them. The focus of this study was merely a single continent, but we are currently preparing a follow-up project in which we plan to examine representatives of this wasp genus from every major geographic area. Hopefully, this is going to help everybody who works on these animals, whether now or in another one hundred years.”

###

Original source:

Dörfel TH, Ohl M (2015) A revision of the Australian digger wasps in the genus Sphex (Hymenoptera, Sphecidae). ZooKeys 521: 1-104. doi: 10.3897/zookeys.521.5995

Bush Blitz: The largest Australian nature discovery project finds 4 new bee species

Four new native bee species have been recognised as part of the largest Australian nature discovery project, ‘Bush Blitz’. The South Australian bee specialists used molecular and morphological evidence to show that they are new. Three of the species have narrow heads and long mouth parts, adaptations to foraging on the flowers of emu-bushes, which have narrow constrictions at the base. The new species are described in the open access journal ZooKeys.

Bees are important pollinators of crops and native plants, but habitat loss and pesticides have been shown to cause a serious decline in their populations in Europe and the United States of America. Meanwhile, the conservation status of native Australian bees is largely unknown because solid baseline data are unavailable and about one third of the species are as yet unknown to science. Furthermore, identification of Australian bees is hampered by a lack of keys for about half of the named species.

With their present publication, bee specialists Katja Hogendoorn (University of Adelaide), Remko Leijs and Mark Stevens (South Australian Museum) are now trying to make Australian native bees more accessible to the scientific community. The study introduces a new Barcoding of Life project, ‘AUSBS’, which will be built to contain the barcode sequences of the identified Australian native bees.

In future, this database can help scientists who have molecular tools, but insufficient knowledge of bees, to identify known species. Yet, that is not the only use of the database. “Bee taxonomists can access and use the molecular information to answer specific problems, for example, how certain species are related or whether or not a male and female belong to the same species”, says Dr. Hogendoorn. “And combined with morphological information, the molecular database can help to identify new species”, she adds.

In their publication, the researchers demonstrate the utility of the database. After careful evaluation of the DNA sequence data and subsequent morphological comparison of the collected bees to museum type specimens, they recognised four new species in the genus Euhesma, which they subsequently described.

Three of the species belong to the group of bees that specialise on the flowers of emu-bushes. These bees have evolved narrow faces and very long mouth parts to collect the nectar through a narrow constriction at the base of the flowers. A similar evolution has already been observed in other groups of bees. The fourth species belongs to a different group within this large genus and has a normally shaped head.

So far, the project includes 271 sequences of 120 species that were collected during the Bush Blitz surveys, Australia’s largest nature discovery project. The researchers intend to build on the existing DNA database to cover as many as possible of the Australian species. “It is hoped that this will stimulate native bee research”, says Dr. Hogendoorn. “With about 750 Australian bee species still undescribed and many groups in need of revision there is an enormous job to do”, she concludes.

###

Original source:

Hogendoorn K, Stevens M, Leijs R (2015) DNA barcoding of euryglossine bees and the description of new species of Euhesma Michener (Hymenoptera, Colletidae, Euryglossinae). ZooKeys 520: 41-59. doi: 10.3897/zookeys.520.6185