EIVE 1.0 – The largest system of ecological indicator values in Europe

EIVE 1.0 is the most comprehensive system of ecological indicator values of vascular plants in Europe to date. It can be used as an important tool for continental-scale analyses of vegetation and floristic data.

Guest blog post by Jürgen Dengler, Florian Jansen & François Gillet

Geographic coverage of the 31 ecological indicator value systems that entered the calculation of the consensus system of EIVE 1.0 (image from the original article).

It took seven years and hundreds of hours of work by an international team of 34 authors to develop and publish the most comprehensive system of ecological indicator values (EIVs) of vascular plants in Europe to date.

EIVE 1.0 is now available as an open access database and described in the accompanying paper (Dengler et al. 2023).

EIVE 1.0 provides the five most-used ecological indicators, M – moisture, N – nitrogen, R – reaction, L – light and T – temperature, for a total of 14,835 vascular plant taxa in Europe, or between 13,748 and 14,714 for the individual indicators. For each of these taxa, EIVE contains three values: the EIVE niche position indicator, the EIVE niche width indicator and the number of regional EIV systems on which the assessment was based. Both niche position and niche width are given on a continuous scale from 0 to 10, not as categorical ordinal values as in the source systems.

Evidently, EIVE can be an important tool for continental-scale analyses of vegetation and floristic data in Europe.

It will make it possible to analyse the nearly 2 million vegetation plots currently contained in the European Vegetation Archive (EVA; Chytrý et al. 2016) in new ways.

Since EVA contains hardly any environmental variables measured in situ apart from elevation, slope inclination and aspect, the numerous macroecological studies to date have had to rely on coarse modelled environmental data (e.g. climate) instead. This is particularly problematic for soil variables such as pH, moisture or nutrients, which can change dramatically within a few metres.

Here, the approximation of site conditions by mean ecological indicator values can improve the predictive power substantially (Scherrer and Guisan 2019). Likewise, in broad-scale vegetation classification studies, mean EIVE values per plot would allow a better characterisation of the distinguished vegetation units. Lastly, one should not forget that most countries in Europe do not have a national EIV system, and here EIVE could fill the gap.
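As a minimal sketch of what a mean indicator value per plot could look like, the following Python snippet computes an unweighted mean of niche positions and, as one possible variant, a mean weighted by inverse niche width. The taxon names, the numbers and the function names `plot_mean` and `plot_mean_weighted` are hypothetical and only illustrate the idea; they are not taken from the EIVE database or from the authors' R workflow.

```python
# Hypothetical values on the 0-10 EIVE scale (illustrative only, not real EIVE data).
# taxon -> (niche position, niche width) for a single indicator, e.g. moisture
EIVE_M = {
    "Taxon A": (3.0, 1.0),
    "Taxon B": (5.0, 2.0),
    "Taxon C": (8.0, 4.0),
}


def plot_mean(taxa, table):
    """Unweighted mean niche position of the taxa in a plot that have a value."""
    values = [table[t][0] for t in taxa if t in table]
    return sum(values) / len(values) if values else None


def plot_mean_weighted(taxa, table):
    """Mean niche position weighted by inverse niche width.

    Taxa with a narrow niche (small width) get more weight, on the assumption
    that they are more informative about site conditions.
    """
    pairs = [table[t] for t in taxa if t in table and table[t][1] > 0]
    if not pairs:
        return None
    weights = [1.0 / width for _, width in pairs]
    weighted_sum = sum(w * pos for w, (pos, _) in zip(weights, pairs))
    return weighted_sum / sum(weights)


# "Taxon D" has no indicator value and is simply skipped.
plot = ["Taxon A", "Taxon B", "Taxon C", "Taxon D"]
print(plot_mean(plot, EIVE_M))
print(plot_mean_weighted(plot, EIVE_M))
```

In this toy plot the weighted mean is pulled towards the narrow-niched "Taxon A", which is the intended effect of the weighting.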

Violin plots showing largely continuous value distributions of the niche position and niche width values of the five indicators in EIVE 1.0 (image from the original article).

Almost on the same day as EIVE 1.0, another supranational system of ecological indicator values in Europe was published by Tichý et al. (2023), using a similar approach.

Thus, it will be important for vegetation scientists in Europe to understand the pros and cons of both systems to allow the wise selection of the most appropriate tool:

  • EIVE 1.0 is based on 31 regional EIV systems, while Tichý et al. (2023) uses 12.
  • Both systems provide indicator values for moisture, nitrogen/nutrients, reaction, light and temperature, while Tichý et al. (2023) additionally has a salinity indicator.
  • Tichý et al. (2023) aimed at using the same scales as Ellenberg et al. (1991), which means that the scales vary between indicators (1–9, 0–9, 1–12), while EIVE has a uniform interval scale of 0–10 for all indicators.
  • Only EIVE provides niche width in addition to niche position. Niche width is an important aspect of the niche and might be used to improve the calculation of mean indicator values per plot (e.g. by weighting with inverse niche width).
  • The taxonomic coverage is larger in EIVE than in Tichý et al. (2023): 14,835 vs. 8,908 accepted taxa and 11,148 vs. 8,679 species.
  • EIVE provides indicator values for accepted subspecies, while Tichý et al. (2023) is restricted to species and aggregates. Separate indicator values for subspecies might be important for two reasons: (a) subspecies often strongly differ in at least one niche dimension; (b) many of the taxa now considered as subspecies have been treated at species level in the regional EIV systems.
  • Tichý et al. (2023) added 431 species not contained in any of the source systems, based on vegetation-plot data from the European Vegetation Archive (EVA; Chytrý et al. 2016), while EIVE calculated the European indicator values only for taxa occurring in at least one source system.
  • While both systems present maps that suggest good coverage across Europe, the source systems of Tichý et al. (2023) were largely from Central Europe, NW Europe and Italy. Unlike EIVE, these authors did not use source systems from the more “distal” parts of Europe, such as Sweden, Faroe Islands, Russia, Georgia, Romania, Poland and Spain, and they used only a small subset of indicators of the EIV systems of Ukraine, Greece and the Alps.
  • In a validation with GBIF-derived data on temperature niches, Dengler et al. (2023) showed that EIVE has a slightly stronger correlation than the indicators of Tichý et al. (2023) (r = 0.886 vs. 0.852).

The correlation of EIVE-T values of species with GBIF-derived temperature niche data was high and even higher when restricting the calculation to those species whose consensus value was based on at least four sources (image from the original article).
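The kind of validation described above can be sketched as a Pearson correlation between indicator values and an independent temperature measure, computed once for all taxa and once for taxa backed by at least four sources. The records below are invented for illustration, and the helper names `pearson_r` and `r_for` are assumptions; this is not the authors' validation code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical records:
# (temperature indicator value, reference temperature, number of source systems)
records = [
    (2.0, 4.1, 5),
    (4.0, 7.9, 6),
    (5.0, 9.0, 2),
    (6.0, 12.2, 4),
    (8.0, 15.8, 1),
    (9.0, 18.1, 7),
]


def r_for(records, min_sources=1):
    """Correlation restricted to taxa with at least `min_sources` sources."""
    kept = [(value, temp) for value, temp, n in records if n >= min_sources]
    x, y = zip(*kept)
    return pearson_r(x, y)


print(r_for(records))                  # all taxa
print(r_for(records, min_sources=4))   # only taxa based on four or more sources
```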

How did EIVE manage to integrate all EIV systems in Europe that contained at least one of the selected indicators for vascular plants, while Tichý et al. (2023) used only a small subset?

This difference is mainly due to a more complex workflow in EIVE (which also was one of the reasons why the preparation took so long). First, Tichý et al. (2023) restricted their search to EIV systems and indicators that had the same number of categories as the “original” Ellenberg system.

Second, from these they discarded those that showed too low a correlation with Ellenberg. By contrast, EIVE’s workflow allowed the use of any system with an ordinal (or even metric) scale, irrespective of the number of categories or the initial match with Ellenberg et al. (1991).

EIVE also did not treat one system (Ellenberg) as the master against which all others are assessed, but considered each of them equally valid. The individual EIV systems are indeed often quite inconsistent: even if they refer to Ellenberg, the same value of an indicator in one system might mean something different in another. Nevertheless, our iterative linear optimisation enabled us to adjust all 31 systems for the five indicators to a common basis.

This in turn allowed us to derive EIVE as the consensus system of all the source systems. In our validation of the temperature indicator, EIVE performed better than Tichý et al. (2023) and much better than most of the regional EIV systems. This might be attributable to the so-called wisdom of the crowd, an idea going back to the statistician Francis Galton, who found that averaging numerous independent assessments (even by laymen) of a continuous quantity can lead to very good estimates of the true value.
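To make the consensus idea more concrete, here is a deliberately simplified sketch: each source system is rescaled linearly from its own scale onto 0–10, and the rescaled values are averaged per taxon, keeping track of how many systems contributed. This is not the iterative linear optimisation used for EIVE, which is described in Dengler et al. (2023); the system names, scales, values and function names are hypothetical.

```python
# Hypothetical source systems: name -> (scale minimum, scale maximum, {taxon: value}).
systems = {
    "system_1_to_9": (1, 9, {"Taxon A": 3, "Taxon B": 7}),
    "system_1_to_12": (1, 12, {"Taxon A": 4, "Taxon C": 12}),
    "system_1_to_5": (1, 5, {"Taxon A": 2, "Taxon B": 4}),
}


def rescale(value, lo, hi):
    """Map a value linearly from the interval [lo, hi] onto 0-10."""
    return 10 * (value - lo) / (hi - lo)


def consensus(systems):
    """Return {taxon: (mean rescaled value, number of contributing systems)}."""
    collected = {}
    for lo, hi, values in systems.values():
        for taxon, value in values.items():
            collected.setdefault(taxon, []).append(rescale(value, lo, hi))
    return {
        taxon: (sum(vals) / len(vals), len(vals))
        for taxon, vals in collected.items()
    }


for taxon, (value, n_sources) in sorted(consensus(systems).items()):
    print(taxon, round(value, 2), n_sources)
```

A simple range rescaling like this assumes that the end points of all scales mean the same thing, which is exactly the assumption the article says does not hold in practice; that is why the published workflow adjusts the systems to each other iteratively instead.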

Apart from the indicator values themselves, EIVE has a second main feature that might not be so obvious at first glance, but which actually took the EIVE team, including several taxonomists, more time than the workflow to generate the indicator values themselves: the taxonomic backbone. EIVE for vascular plants is fully based on the taxonomic concept (including the synonymic relationships) of the Euro+Med Plantbase.

However, since Euro+Med lacks a substantial share of the taxa that are frequently recorded in vegetation plots, we expanded it into what we call “Euro+Med augmented” to make our backbone fully usable in vegetation science. We particularly added hybrids, neophytes and aggregates, three groups of plants hitherto only very marginally covered in Euro+Med. All additions were made by experts consistently with the taxonomic concept of Euro+Med and are fully documented. Likewise, many additional synonym relationships that were missing in Euro+Med had to be added.

Finally, we implemented the so-called “concept synonymy” (see Jansen and Dengler 2010), which allows the assignment of the same name from different sources to different accepted names (“taxonomic concepts”). This applies mainly to nested taxa that are treated at different levels in different sources, e.g. once as species with several subspecies, once as aggregate with several species. However, there are also some cases of misapplied names (i.e. names that were not used in agreement with their nomenclatural type in certain EIV systems). Such cases generally cannot be solved by the various tools for automatic taxonomic cleaning, but require experts who make a case-by-case decision.
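One way to picture concept synonymy is a lookup that is keyed by source and name rather than by name alone, with a general synonym table as the fallback. The sketch below uses invented names and an invented function `resolve`; it illustrates the principle only and does not reproduce the EIVE R workflow or any Euro+Med content.

```python
# General synonymy: name -> accepted name, valid regardless of the source.
general_synonyms = {
    "Genus alba": "Genus alba",
    "Genus candida": "Genus alba",
}

# Concept synonymy: (source, name) -> accepted name.
# The same name can point to different accepted taxa depending on the source,
# e.g. when one source uses a species name in the sense of an aggregate.
concept_synonyms = {
    ("System X", "Genus alba"): "Genus alba aggr.",
}


def resolve(source, name):
    """Return the accepted name for `name` as used in `source`, or None."""
    key = (source, name)
    if key in concept_synonyms:          # a source-specific decision takes priority
        return concept_synonyms[key]
    return general_synonyms.get(name)    # otherwise fall back to general synonymy


print(resolve("System X", "Genus alba"))   # source-specific concept
print(resolve("System Y", "Genus alba"))   # general synonymy
```

The source-specific table is where expert case-by-case decisions would be recorded, which is why such cases are hard to handle with purely automatic name matching.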

The whole taxonomic workflow of EIVE is fully transparent, with R code that “digests”:

(a) the names as they are in the source systems,

(b) the official Euro+Med database and

(c) tables that document our additions and modifications (with reasons and references).

This comprehensive documentation will allow continuous and efficient improvement in the future, be it because of taxonomic novelties adopted in Euro+Med or because EIVE’s experts decide to change certain interpretations. That way, “Euro+Med augmented” and the accompanying R-based workflow can also be a valuable tool for other projects that wish to harmonise plant taxonomic information from various sources at a continental scale, e.g. in vegetation-plot databases such as GrassPlot (Dengler et al. 2018) and EVA (Chytrý et al. 2016).

The publication of EIVE 1.0 is not the endpoint, but rather a starting point for future developments in a community-based approach.

Together with interested colleagues from outside, the EIVE core team plans to prepare better and more comprehensive releases of EIVE in the future, including updates to its taxonomic backbone.

Future releases of EIVE will be published in fixed versions, typically together with a paper that describes the changes in the content.

As steps for the next two years, we anticipate that we will first add further taxa (bryophytes, lichens, macroalgae) and some additional indicators, both of which are relatively easy with our established R-based workflow. Then we plan EIVE 2.0 that will use the approx. 2 million vegetation plots in EVA (Chytrý et al. 2016) to re-calibrate EIVE for all taxa (see http://euroveg.org/requests/EVA-data-request-form-2022-02-10-Dengleretal.pdf).

We invite you to get into contact with us if you have:

(a) a new or overlooked indicator value system for any taxonomic group in Europe and adjacent areas (including comprehensive datasets of measured environmental data in vegetation plots);

(b) suggestions for improvements of our taxonomic backbone;

(c) a paper idea in the EIVE context that you would like to realise together with the EIVE core team (since everything is OA, you can, of course, use EIVE 1.0 for any possible purpose without notifying us as long as you cite EIVE properly).

Last but not least, any test of the validity and performance of EIVE, alone or in comparison with Tichý et al. (2023), with in situ measured environmental variables, locally or even continentally, would be most welcome.

***

This Behind the paper post refers to the article Ecological Indicator Values for Europe (EIVE) 1.0 by Jürgen Dengler, Florian Jansen, Olha Chusova, Elisabeth Hüllbusch, Michael P. Nobis, Koenraad Van Meerbeek, Irena Axmanová, Hans Henrik Bruun, Milan Chytrý, Riccardo Guarino, Gerhard Karrer, Karlien Moeys, Thomas Raus, Manuel J. Steinbauer, Lubomir Tichý, Torbjörn Tyler, Ketevan Batsatsashvili, Claudia Bita-Nicolae, Yakiv Didukh, Martin Diekmann, Thorsten Englisch, Eduardo Fernandez Pascual, Dieter Frank, Ulrich Graf, Michal Hájek, Sven D. Jelaska, Borja Jiménez-Alfaro, Philippe Julve, George Nakhutsrishvili, Wim A. Ozinga, Eszter-Karolina Ruprecht, Urban Šilc, Jean-Paul Theurillat, and François Gillet published in Vegetation Classification and Survey (https://doi.org/10.3897/VCS.98324).

***

Follow the Vegetation Classification and Survey journal on Facebook and Twitter.

***

Brief personal summaries: 

Jürgen Dengler is a Professor of Vegetation Ecology at the Zurich University of Applied Sciences (ZHAW) in Wädenswil, Switzerland. Among other things, he cofounded the European Vegetation Archive (EVA), the global vegetation-plot database “sPlot” and the “GrassPlot” database of the Eurasian Dry Grassland Group. His major research interests are grassland ecology, grassland conservation, biodiversity patterns, macroecology, vegetation change, broad-scale vegetation classification, methodological developments in vegetation ecology and ecoinformatics.

Florian Jansen is a Professor of Landscape Ecology at the University of Rostock, Germany. His research interests are vegetation ecology and dynamics, mire ecology including greenhouse gas emissions, and numerical ecology with R. He (co-)founded the German Vegetation Database vegetweb.de, the European Vegetation Archive (EVA), and the global vegetation-plot database “sPlot”. He wrote the R package eHOF for modelling species response curves along one-dimensional ecological gradients.

François Gillet is an Emeritus Professor of Community Ecology at the University of Franche-Comté in Besançon, France. His major research interests are vegetation diversity, ecology and dynamics, grassland and forest ecology, integrated synusial phytosociology, numerical ecology with R, and dynamic modelling of social-ecological systems.

***

References: 

Chytrý, M., Hennekens, S.M., Jiménez-Alfaro, B., Knollová, I., Dengler, J., Jansen, F., Landucci, F., Schaminée, J.H.J., Aćić, S., (…) & Yamalov, S. 2016. European Vegetation Archive (EVA): an integrated database of European vegetation plots. Applied Vegetation Science 19: 173–180.

Dengler, J., Wagner, V., Dembicz, I., García-Mijangos, I., Naqinezhad, A., Boch, S., Chiarucci, A., Conradi, T., Filibeck, G., (…) & Biurrun, I. 2018. GrassPlot – a database of multi-scale plant diversity in Palaearctic grasslands. Phytocoenologia 48: 331–347.

Dengler, J., Jansen, F., Chusova, O., Hüllbusch, E., Nobis, M.P., Van Meerbeek, K., Axmanová, I., Bruun, H.H., Chytrý, M., (…) & Gillet, F. 2023. Ecological Indicator Values for Europe (EIVE) 1.0. Vegetation Classification and Survey 4: 7–29.

Ellenberg, H., Weber, H.E., Düll, R., Wirth, V., Werner, W. & Paulißen, D. 1991. Zeigerwerte von Pflanzen in Mitteleuropa. Scripta Geobotanica 18: 1–248.

Jansen, F. & Dengler, J. 2010. Plant names in vegetation databases – a neglected source of bias. Journal of Vegetation Science 21: 1179–1186.

Midolo, G., Herben, T., Axmanová, I., Marcenò, C., Pätsch, R., Bruelheide, H., Karger, D.N., Acic, S., Bergamini, A., Bergmeier, E., Biurrun, I., Bonari, G., Carni, A., Chiarucci, A., De Sanctis, M., Demina, O., (…), Dengler, J., (…) & Chytrý, M. 2023. Disturbance indicator values for European plants. Global Ecology and Biogeography 32: 24–34.

Scherrer, D. & Guisan, A. 2019. Ecological indicator values reveal missing predictors of species distributions. Scientific Reports 9: Article 3061.

Tichý, L., Axmanová, I., Dengler, J., Guarino, R., Jansen, F., Midolo, G., Nobis, M.P., Van Meerbeek, K., Aćić, S., (…) & Chytrý, M. 2023. Ellenberg-type indicator values for European vascular plant species. Journal of Vegetation Science 34: e13168.

One Biodiversity Knowledge Hub to link them all: BiCIKL 2nd General Assembly

The FAIR Data Place – the key and final product of the partnership – is meant to provide scientists with all types of biodiversity data “at their fingertips”

The Horizon 2020-funded project BiCIKL has reached its halfway stage, and the partners gathered in Plovdiv (Bulgaria) from the 22nd to the 25th of October for the Second General Assembly, organised by Pensoft.

The BiCIKL project will launch a new European community of key research infrastructures, researchers, citizen scientists and other stakeholders in the biodiversity and life sciences based on open science practices through access to data, tools and services.

BiCIKL’s goal is to create a centralised place to connect all key biodiversity data by interlinking 15 research infrastructures and their databases. The 3-year European Commission-supported initiative kicked off in 2021 and involves 14 key natural history institutions from 10 European countries.

BiCIKL is keeping pace as expected with 16 out of the 48 final deliverables already submitted, another 9 currently in progress/under review and due in a few days. Meanwhile, 21 out of the 48 milestones have been successfully achieved.

Prof. Lyubomir Penev (BiCIKL’s project coordinator and CEO and founder of Pensoft) opens the 2nd General Assembly of BiCIKL in Plovdiv, Bulgaria.

The hybrid format of the meeting enabled a wider range of participants, which resulted in robust discussions on the next steps of the project, such as the implementation of additional technical features of the FAIR Data Place (FAIR being an abbreviation for Findable, Accessible, Interoperable and Reusable).

This FAIR Data Place online platform – the key and final product of the partnership and the BiCIKL initiative – is meant to provide scientists with all types of biodiversity data “at their fingertips”.

This data includes biodiversity information, such as detailed images, DNA, physiology and past studies concerning a specific species and its ‘relatives’, to name a few. The issue is that all those types of biodiversity data have so far been scattered across various databases, which in turn have lacked meaningful and efficient interconnections.

Additionally, the FAIR Data Place, developed within the BiCIKL project, is to give researchers access to plenty of training modules to guide them through the different services.

Halfway through the duration of BiCIKL, the project is at a turning point, where crucial discussions between the partners are playing a central role in the refinement of the FAIR Data Place design. Most importantly, they are tasked with ensuring that their technologies work efficiently with each other, in order to seamlessly exchange, update and share the biodiversity data every one of them is collecting and taking care of.

The partners agree that by Year 3 of the BiCIKL project, once those infrastructures and databases are efficiently interconnected, scientists studying the Earth’s biodiversity across the world will be in a much better position to build on existing research and improve the way and the pace at which nature is being explored and understood. At the end of the day, knowledge is the stepping stone for the preservation of biodiversity and humankind itself.


“Needless to say, it’s an honour and a pleasure to be the coordinator of such an amazing team spanning as many as 14 partnering natural history and biodiversity research institutions from across Europe, but also involving many global long-year collaborators and their infrastructures, such as Wikidata, GBIF, TDWG, Catalogue of Life to name a few,”

said BiCIKL’s project coordinator Prof. Lyubomir Penev, CEO and founder of Pensoft.

“I see our meeting in Plovdiv as a practical demonstration of our eagerness and commitment to tackle the long-standing and technically complex challenge of breaking down the silos in the biodiversity data domain. It is time to start building freeways between all biodiversity data, across (digital) space, time and data types. After the last three days that we spent together in inspirational and productive discussions, I am as confident as ever that we are close to providing scientists with much more straightforward routes to not only generate more biodiversity data, but also build on the already existing knowledge to form new hypotheses and information ready to use by decision- and policy-makers. One cannot stress enough how important the role of biodiversity data is in preserving life on Earth. These data are indeed the groundwork for all that we know about the natural world”  

Prof. Lyubomir Penev added.
Christos Arvanitidis (CEO of LifeWatch ERIC) at the 2nd General Assembly of the BiCIKL project.

Christos Arvanitidis, CEO of LifeWatch ERIC, added:

“The point is: do we want an integrated structure or do we prefer federated structures? What are the pros and cons of the two options? It’s essential to keep the community united and allied because we can’t afford any information loss and the stakeholders should feel at home with the Project and the Biodiversity Knowledge Hub.”


Joe Miller, Executive Secretary and Director at GBIF, commented:

“We are a brand new community, and we are in the middle of the growth process. We would like to already have answers, but it’s good to have this kind of robust discussion to build on a good basis. We must find the best solution to have linkages between infrastructures and be able to maintain them in the future because the Biodiversity Knowledge Hub is the location to gather the community around best practices, data and guidelines on how to use the BiCIKL services… In order to engage even more partners to fill the eventual gaps in our knowledge.”


Joana Pauperio (biodiversity curator at EMBL-EBI) at the 2nd General Assembly of the BiCIKL project.

“BiCIKL is leading data infrastructure communities through some exciting and important developments”  

said Dr Guy Cochrane, Team Leader for Data Coordination and Archiving and Head of the European Nucleotide Archive at EMBL’s European Bioinformatics Institute (EMBL-EBI).

“In an era of biodiversity change and loss, leveraging scientific data fully will allow the world to catalogue what we have now, to track and understand how things are changing and to build the tools that we will use to conserve or remediate. The challenge is that the data come from many streams – molecular biology, taxonomy, natural history collections, biodiversity observation – that need to be connected and intersected to allow scientists and others to ask real questions about the data. In its first year, BiCIKL has made some key advances to rise to this challenge,”

he added.

Deborah Paul, Chair of the Biodiversity Information Standards – TDWG said:

“As a partner, we, at the Biodiversity Information Standards – TDWG, are very enthusiastic that our standards are implemented in BiCIKL and serve to link biodiversity data. We know that joining forces and working together is crucial to building efficient infrastructures and sharing knowledge.”


The project will continue with the first Round Table of experts in December and the publication of the projects that participated in the Open Call and will be funded at the beginning of next year.

***

Learn more about BiCIKL on the project’s website at: bicikl-project.eu

Follow BiCIKL Project on Twitter and Facebook. Join the conversation on Twitter at #BiCIKL_H2020.


Call for Expression of Interest for biodiversity data-related scientific projects from BiCIKL

The purpose of this call is to solicit, select and implement four to six biodiversity data-related scientific projects that will make use of the added value services developed by the leading Research Infrastructures that make the BiCIKL project.

The BiCIKL project invites submissions of Expression of Interest (EoI) to the First BiCIKL Open Call for projects. The purpose of this call is to solicit, select and implement four to six biodiversity data-related scientific projects that will make use of the added value services developed by the leading Research Infrastructures that make the BiCIKL project.

By opening this call, BiCIKL aims to better understand how it could support scientific questions that arise from across the biodiversity world in the future, while addressing specific scientific or technical biodiversity data challenges presented by the applicants.

“We need and want to assess real-world problems and make the best possible use of our data and technical capabilities. This will greatly assist in defining the long-term development goals of the participating Research Infrastructures and improve the way they can technically and operationally work together to deliver greater scientific value,”

explain the project partners.

The BiCIKL project – a Horizon 2020-funded project involving 14 European institutions, representing major global players in biodiversity research and natural history, and coordinated by Pensoft – establishes a European starting community of key research infrastructures, researchers, citizen scientists and other biodiversity and life sciences stakeholders based on open science practices through access to data, tools and services.

Find more about the Call and submit your Expression of Interest

***

Follow BiCIKL on Twitter and Facebook.

Join the conversation on Twitter via #BiCIKL_H2020.

New BiCIKL project to build a freeway between pieces of biodiversity knowledge

Within Biodiversity Community Integrated Knowledge Library (BiCIKL), 14 key research and natural history institutions commit to link infrastructures and technologies to provide flawless access to biodiversity data.

In a recently started Horizon 2020-funded project, 14 European institutions from 10 countries, representing both the continent’s and global key players in biodiversity research and natural history, deploy and improve their own and partnering infrastructures to bridge gaps between each other’s biodiversity data types and classes. By linking their technologies, they are set to provide flawless access to data across all stages of the research cycle.

Three years in, BiCIKL (abbreviation for Biodiversity Community Integrated Knowledge Library) will have created the first-of-its-kind Biodiversity Knowledge Hub, where a researcher will be able to retrieve a full set of linked and open biodiversity data, thereby accessing the complete story behind an organism of interest: its name, genetics, occurrences, natural history, as well as authors and publications mentioning any of those.

Ultimately, the project’s products will solidify Open Science and FAIR (Findable, Accessible, Interoperable and Reusable) data practices by empowering and streamlining biodiversity research.

Together, the project partners will redesign the way biodiversity data is found, linked, integrated and re-used across the research cycle. By the end of the project, BiCIKL will provide the community with a more transparent, trustworthy and efficient highly automated research ecosystem, allowing for scientists to access, explore and put into further use a wide range of data with only a few clicks.

“In recent years, we’ve made huge progress on how biodiversity data is located, accessed, shared, extracted and preserved, thanks to a vast array of digital platforms, tools and projects looking after the different types of data, such as natural history specimens, species descriptions, images, occurrence records and genomics data, to name a few. However, we’re still missing an interconnected and user-friendly environment to pull all those pieces of knowledge together. Within BiCIKL, we all agree that it’s only after we puzzle out how to best bridge our existing infrastructures and the information they are continuously sourcing that future researchers will be able to realise their full potential,” 

explains BiCIKL’s project coordinator Prof. Lyubomir Penev, CEO and founder of Pensoft, a scholarly publisher and technology provider company.

Continuously fed with data sourced by the partnering institutions and their infrastructures, BiCIKL’s key final output, the Biodiversity Knowledge Hub, is set to persist long after the project has concluded. Moreover, by accelerating biodiversity research that builds on – rather than duplicates – existing knowledge, it will provide access to exponentially growing contextualised biodiversity data.


48 years of Australian collecting trips in one data package

From 1973 to 2020, Australian zoologist Dr Robert Mesibov kept careful records of the “where” and “when” of his plant and invertebrate collecting trips. Now, he has made those valuable biodiversity data freely and easily accessible via the Zenodo open-data repository, so that future researchers can rely on this “authority file” when using museum specimens collected from those events in their own studies. The new dataset is described in the open-access, peer-reviewed Biodiversity Data Journal.

While checking museum records, Dr Robert Mesibov found there were occasional errors in the dates and places for specimens he had collected many years before. He was not surprised.

“It’s easy to make mistakes when entering data on a computer from paper specimen labels”, said Mesibov. “I also found specimen records that said I was the collector, but I know I wasn’t!”

One solution to this problem was what librarians and others have long called an “authority file”.

“It’s an authoritative reference, in this case with the correct details of where I collected and when”, he explained.

“I kept records of almost all my collecting trips from 1973 until I retired from field work in 2020. The earliest records were on paper, but I began storing the key details in digital form in the 1990s.”

The 48-year record has now been made publicly available via the Zenodo open-data repository after conversion to the Darwin Core data format, which is widely used for sharing biodiversity information. With this “authority file”, described in detail in the open-access, peer-reviewed Biodiversity Data Journal, future researchers will be able to rely on sound, interoperable and easy to access data, when using those museum specimens in their own studies, instead of repeating and further spreading unintentional errors.

“There are 3829 collecting events in the authority file”, said Mesibov, “from six Australian states and territories. For each collecting event there are geospatial and date details, plus notes on the collection.”

Mesibov hopes the authority file will be used by museums to correct errors in their catalogues.

“It should also save museums a fair bit of work in future”, he explained. “No need to transcribe details on specimen labels into digital form in a database, because the details are already in digital form in the authority file.”

Mesibov points out that in the 19th and 20th centuries, lists of collecting events were often included in the reports of major scientific expeditions.

“Those lists were authority files, but in the pre-digital days it was probably just as easy to copy collection data from specimen labels.”

“In the 21st century there’s a big push to digitise museum specimen collections”, he said. “Museum databases often have lookup tables with scientific names and the names of collectors. These lookup tables save data entry time and help to avoid errors in digitising.”

“Authority files for collecting events are the next logical step,” said Mesibov. “They can be used as lookup tables for all the important details of individual collections: where, when, by whom and how.”
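A rough sketch of how an authority file could serve as a lookup table is shown below: a museum record is compared, field by field, with the authoritative collecting event that has the same identifier. The column names `eventID`, `eventDate`, `decimalLatitude` and `decimalLongitude` are Darwin Core terms, but whether the published file is keyed this way is an assumption, and the rows and function names are invented for illustration.

```python
import csv
import io

# Invented example rows; not taken from the published authority file.
AUTHORITY_CSV = """eventID,eventDate,decimalLatitude,decimalLongitude
E0001,1973-05-12,-41.20,146.30
E0002,1998-11-03,-42.88,147.33
"""

CHECKED_FIELDS = ("eventDate", "decimalLatitude", "decimalLongitude")


def load_authority(text):
    """Read the authority file into a dict keyed by eventID."""
    return {row["eventID"]: row for row in csv.DictReader(io.StringIO(text))}


def mismatches(record, authority, fields=CHECKED_FIELDS):
    """Compare a museum record with the authority file.

    Returns {field: (museum value, authority value)} for fields that differ,
    or None if the collecting event is not in the authority file.
    Values are compared as text, so "-41.2" and "-41.20" count as different.
    """
    reference = authority.get(record["eventID"])
    if reference is None:
        return None
    return {
        field: (record.get(field), reference[field])
        for field in fields
        if record.get(field) != reference[field]
    }


authority = load_authority(AUTHORITY_CSV)
museum_record = {
    "eventID": "E0001",
    "eventDate": "1973-05-21",   # transposed digits, a typical transcription error
    "decimalLatitude": "-41.20",
    "decimalLongitude": "146.30",
}
print(mismatches(museum_record, authority))
```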

###

Research paper:

Mesibov RE (2021) An Australian collector’s authority file, 1973–2020. Biodiversity Data Journal 9: e70463. https://doi.org/10.3897/BDJ.9.e70463

###

Robert Mesibov’s webpage: https://www.datafix.com.au/mesibov.html

Robert Mesibov’s ORCID page: https://orcid.org/0000-0003-3466-5038

Call for data papers describing datasets from Russia to be published in Biodiversity Data Journal

GBIF partners with FinBIF and Pensoft to support publication of new datasets about biodiversity from across Russia

Original post via GBIF

In collaboration with the Finnish Biodiversity Information Facility (FinBIF) and Pensoft Publishers, GBIF has announced a new call for authors to submit and publish data papers on Russia in a special collection of Biodiversity Data Journal (BDJ). The call extends and expands upon a successful effort in 2020 to mobilize data from European Russia.

Between now and 15 September 2021, the article processing fee (normally €550) will be waived for the first 36 papers, provided that the publications are accepted and that the data paper describes a dataset with more than 5,000 records that are new to GBIF.org, with high-quality data and metadata, and with geographic coverage in Russia (see the definition of terms below).

The manuscript must be prepared in English and submitted in accordance with BDJ’s instructions to authors by 15 September 2021. Late submissions will not be eligible for APC waivers.

Sponsorship is limited to the first 36 accepted submissions meeting these criteria on a first-come, first-served basis. The call for submissions can therefore close prior to the stated deadline of 15 September 2021. Authors may contribute to more than one manuscript, but artificial division of the logically uniform data and data stories, or “salami publishing”, is not allowed.

BDJ will publish a special issue including the selected papers by the end of 2021. The journal is indexed by Web of Science (Impact Factor 1.331), Scopus (CiteScore: 2.1) and listed in РИНЦ / eLibrary.ru.

For non-native speakers, please ensure that your English is checked either by native speakers or by professional English-language editors prior to submission. You may credit these individuals as a “Contributor” through the AWT interface. Contributors are not listed as co-authors but can help you improve your manuscripts.

In addition to the BDJ instructions to authors, it is required that datasets referenced from the data paper a) are cited with their DOI, b) appear in the paper’s list of references, and c) have “Russia 2021” in Project Data: Title and “N-Eurasia-Russia2021” in Project Data: Identifier in the dataset’s metadata.

Authors should explore the GBIF.org section on data papers and Strategies and guidelines for scholarly publishing of biodiversity data. Manuscripts and datasets will go through a standard peer-review process. When submitting a manuscript to BDJ, authors are requested to select the Biota of Russia collection.

To see an example, view this dataset on GBIF.org and the corresponding data paper published by BDJ.

Questions may be directed either to Dmitry Schigel, GBIF scientific officer, or Yasen Mutafchiev, managing editor of Biodiversity Data Journal.

The 2021 extension of the collection of data papers will be edited by Vladimir Blagoderov, Pedro Cardoso, Ivan Chadin, Nina Filippova, Alexander Sennikov, Alexey Seregin, and Dmitry Schigel.

This project is a continuation of the successful call for data papers from European Russia in 2020. The funded papers are available in the Biota of Russia special collection and the datasets are shown on the project page.

***

Definition of terms

Datasets with more than 5,000 records that are new to GBIF.org

Datasets should contain at least 5,000 records that are new to GBIF.org. While the focus is on additional records for the region, records already published in GBIF may meet the criteria of ‘new’ if they are substantially improved, particularly through the addition of georeferenced locations. Artificial reduction of records from otherwise uniform datasets to the necessary minimum (“salami publishing”) is discouraged and may result in rejection of the manuscript. New submissions describing updates of datasets already presented in earlier published data papers will not be sponsored.

Justification for publishing datasets with fewer records (e.g. sampling-event datasets, sequence-based data, checklists with endemics etc.) will be considered on a case-by-case basis.

Datasets with high-quality data and metadata

Authors should start by publishing a dataset comprised of data and metadata that meets GBIF’s stated data quality requirement. This effort will involve work on an installation of the GBIF Integrated Publishing Toolkit.

Only when the dataset is prepared should authors turn to working on the manuscript text. The extended metadata you enter in the IPT while describing your dataset can be converted into a manuscript with a single click of a button in the ARPHA Writing Tool (see also Creation and Publication of Data Papers from Ecological Metadata Language (EML) Metadata). Authors can then complete, edit and submit manuscripts to BDJ for review.

Datasets with geographic coverage in Russia

In correspondence with the funding priorities of this programme, at least 80% of the records in a dataset should have coordinates that fall within the priority area of Russia. However, authors of the paper may be affiliated with institutions anywhere in the world.
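As a rough illustration, the two numeric criteria (at least 5,000 new records, and at least 80% of records with coordinates inside the priority area) could be pre-checked with a short Python sketch before submission. The record keys `is_new`, `lat` and `lon`, the function name and the rectangular test area are assumptions made for this example; a real check would test coordinates against an actual boundary polygon for Russia.

```python
def check_eligibility(records, in_priority_area, min_new=5000, min_share=0.8):
    """Summarise how a dataset fares against the two numeric criteria.

    records: list of dicts with hypothetical keys "is_new", "lat", "lon".
    in_priority_area: caller-supplied function (lat, lon) -> bool.
    """
    new_records = sum(1 for r in records if r.get("is_new"))
    inside = sum(
        1 for r in records
        if r.get("lat") is not None and r.get("lon") is not None
        and in_priority_area(r["lat"], r["lon"])
    )
    # Share is taken over all records, so records without coordinates
    # count against the threshold.
    share = inside / len(records) if records else 0.0
    return {
        "new_records": new_records,
        "share_in_area": share,
        "eligible": new_records >= min_new and share >= min_share,
    }


def toy_area(lat, lon):
    """Toy rectangle for the demo only; NOT the boundary of Russia."""
    return 50 <= lat <= 70 and 30 <= lon <= 60


records = [
    {"is_new": True, "lat": 55.0, "lon": 37.0},
    {"is_new": True, "lat": 60.0, "lon": 30.0},
    {"is_new": True, "lat": 48.0, "lon": 2.0},
    {"is_new": False, "lat": None, "lon": None},
    {"is_new": True, "lat": 56.0, "lon": 44.0},
]
# The threshold is lowered to 3 so the tiny demo dataset can be evaluated.
result = check_eligibility(records, toy_area, min_new=3)
```

In this demo, three of the five records fall inside the toy area, so the dataset has enough new records but misses the 80% threshold.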

***

Check out the Biota of Russia dynamic data paper collection so far.

Follow Biodiversity Data Journal on Twitter and Facebook to keep yourself posted about the new research published.

Open Science RIO Journal invites early research outcomes for the free-to-publish collection “Observations, prevention and impact of COVID-19”

Given today’s ravaging COVID-19 (coronavirus) pandemic, which at the time of writing has spread to over 220 countries, its continuously rising death toll and the widespread fear, it may seem from the outside that scientists and decision-makers are scratching their heads more than ever in the face of the unknown. In reality, however, we are witnessing an unprecedented global community gradually waking up to the only possible solution: collaboration.

On one hand, we have nationwide collective actions, including cancelled travel plans and mass gatherings; social distancing; and lockdowns, that have already proved successful at changing what the World Health Organisation (WHO) has determined as “the course of a rapidly escalating and deadly epidemic” in Hong Kong, Singapore and China. On the other hand, we have the world’s best scientists and laboratories all steering their expertise and resources towards the better understanding of the virus and, ultimately, developing a vaccine for mass production as quickly as possible. 

While there is little doubt that the best specialists in the world will eventually invent an efficient vaccine – just like they did following the Western African Ebola virus epidemic (2013–2016) and on several other similar occasions in the years before – the question at hand is rather when this is going to happen and how many human lives it is going to cost.

Again, it all comes down to collective efforts. It only makes sense that if research teams and labs around the globe join their efforts and expertise, thereby avoiding duplicate work, their endeavours will bear fruit sooner rather than later. Similarly to employees from across the world, who have been demonstrating their ability to perform their day-to-day tasks and responsibilities from the safety of their homes just as efficiently as they would have done from their conventional offices, in today’s high-tech, online-friendly reality, no more should scientists be restricted by physical and geographical barriers either. 

“Observations, prevention and impact of COVID-19”: Special Collection in RIO Journal

To inspire and facilitate collaboration across the world, the SPARC-recognised Open Science innovator Research Ideas and Outcomes (RIO Journal) decided to bring together scientific findings in a collection of publications that is easy to discover, read, cite and build on.

Furthermore, due to its revolutionary approach to publishing, where early and brief research outcomes (i.e. ideas, raw data, software descriptions, posters, presentations, case studies and many others) are all considered as precious scientific gems, hence deserving a formal publication in a renowned academic journal, RIO places a special focus on these contributions. 

Accepted manuscripts dealing with research relevant to the COVID-19 pandemic across disciplines, including medicine, ethics, politics and economics, at a local, regional, national or international scale, as well as those meant to encourage crucial discussions, will be published free of charge in recognition of the urgency of the current situation. Especially encouraged are submissions focused on the long-term effects of COVID-19.

Why publish in RIO Journal? 

Launched in 2015, RIO Journal has since established its place at the forefront of Open Science, which earned it the SPARC Innovator Award in 2016. Supported by a renowned advisory board and subject editors, today the journal stands as a leading Open Science proponent.

Furthermore, thanks to the technologically advanced infrastructure and services it provides, in addition to a long list of indexers and databases where publications are registered, the manuscripts submitted to RIO Journal are not only rapidly processed and published, but once they are online, they immediately become easy to discover, cite and build on by any researcher, anywhere in the world.

On top of that, Pensoft’s targeted and manually provided science communication services make sure that published research of social value reaches the wider audience, including key decision-makers and journalists, by means of press releases and social media promotion.

***

For more information about RIO’s globally unique features, visit the journal’s website.

Follow RIO Journal on Twitter and Facebook.

FAIR biodiversity data in Pensoft journals thanks to a routine data auditing workflow

Data audit workflow provided for data papers submitted to Pensoft journals.

To avoid publication of openly accessible, yet unusable datasets, fated to result in irreproducible and inoperable biological diversity research at some point down the road, Pensoft audits the data described in data paper manuscripts upon their submission to applicable journals in the publisher’s portfolio, including Biodiversity Data Journal, ZooKeys, PhytoKeys, MycoKeys and many others.

Once the dataset is clean and the paper is published, biodiversity data, such as taxa, occurrence records, observations, specimens and related information, become FAIR (findable, accessible, interoperable and reusable), so that they can be merged, reformatted and incorporated into novel and visionary projects, regardless of whether they are accessed by a human researcher or a data-mining computation.

As part of the pre-review technical evaluation of a data paper submitted to a Pensoft journal, the associated datasets are subjected to a data audit meant to identify any issues that could make the data inoperable. This check is conducted regardless of whether the datasets are provided as supplementary material within the data paper manuscript or linked from the Global Biodiversity Information Facility (GBIF) or another external repository. The features that undergo the audit can be found in a data quality checklist made available from the website of each journal alongside key recommendations for submitting authors.

Once the check is complete, the submitting author receives an audit report with recommendations for improvement, similar to the comments he/she would receive after the peer review of the data paper. If there are major issues with the dataset, the data paper can be rejected prior to assignment to a subject editor and resubmitted once the necessary corrections are applied. At this step, authors who have already published their data via an external repository are also reminded to correct the data there accordingly.

“It all started back in 2010, when we joined forces with GBIF on a quite advanced idea in the domain of biodiversity: a data paper workflow as a means to recognise both the scientific value of rich metadata and the efforts of the data collectors and curators. Together we figured that those data could be published most efficiently as citable academic papers,” says Pensoft’s founder and Managing Director Prof. Lyubomir Penev.

“From there, with the kind help and support of Dr Robert Mesibov, the concept evolved into a data audit workflow, meant to ‘proofread’ the data in those data papers the way a copy editor would go through the text,” he adds.

“The data auditing we do is not a check on whether a scientific name is properly spelled, or a bibliographic reference is correct, or a locality has the correct latitude and longitude”, explains Dr Mesibov. “Instead, we aim to ensure that there are no broken or duplicated records, disagreements between fields, misuses of the Darwin Core recommendations, or any of the many technical issues, such as character encoding errors, that can be an obstacle to data processing.”
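To give a feel for the kinds of technical checks described above, here is a minimal Python sketch. It is not Pensoft’s audit workflow; the three checks, the mojibake markers and the sample rows are assumptions chosen for illustration. It flags exact duplicate records, disagreement between the Darwin Core fields eventDate and year (assuming eventDate starts with a four-digit year), and a few character sequences that typically signal encoding errors.

```python
import csv
import io

# Sequences that often appear when UTF-8 text is decoded with the wrong
# encoding: the replacement character, "Ã" and "â€" (illustrative list).
MOJIBAKE_MARKERS = ("\ufffd", "\u00c3", "\u00e2\u20ac")


def audit(rows):
    """Return (record number, message) pairs for suspicious records."""
    issues = []
    seen = {}
    for number, row in enumerate(rows, start=1):
        # Duplicated records: identical values in every field.
        key = tuple(sorted(row.items()))
        if key in seen:
            issues.append((number, f"exact duplicate of record {seen[key]}"))
        else:
            seen[key] = number
        # Disagreement between fields: eventDate versus year.
        event_date = row.get("eventDate", "")
        year = row.get("year", "")
        if event_date and year and event_date[:4] != year:
            issues.append((number, "eventDate and year disagree"))
        # Character encoding errors: look for typical mojibake markers.
        for field, value in row.items():
            if any(marker in value for marker in MOJIBAKE_MARKERS):
                issues.append((number, f"possible encoding error in {field}"))
    return issues


# Invented sample data; "C\u00c3\u00b3rdoba" is a mis-decoded "Córdoba".
sample = """\
occurrenceID,eventDate,year,locality
occ-1,2001-05-02,2001,C\u00f3rdoba
occ-2,2001-05-02,2002,C\u00f3rdoba
occ-3,1999-07-11,1999,C\u00c3\u00b3rdoba
occ-1,2001-05-02,2001,C\u00f3rdoba
"""
rows = list(csv.DictReader(io.StringIO(sample)))
issues = audit(rows)
for number, message in issues:
    print(f"record {number}: {message}")
```

None of these checks judges whether a name or a coordinate is correct; in line with the quote above, they only look for problems that would get in the way of processing the data.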

At Pensoft, the publication of openly accessible, easy to access, find, re-use and archive data is seen as a crucial responsibility of researchers aiming to deliver high-quality and viable scientific output intended to stand the test of time and serve the public good.

CASE STUDY: Data audit for the “Vascular plants dataset of the COFC herbarium (University of Cordoba, Spain)”, a data paper in PhytoKeys

To explain how and why biodiversity data should be published in full compliance with the best (open) science practices, the team behind Pensoft and long-term collaborators published a guidelines paper, titled “Strategies and guidelines for scholarly publishing of biodiversity data”, in the open science journal Research Ideas and Outcomes (RIO Journal).