
Pensoft joins new Horizon Europe project to help tackle terrestrial invasive alien species

Pensoft will play a vital role in public awareness, engagement and promoting effective strategies for monitoring and managing IAS.

*The Chinese muntjac (Muntiacus reevesi) is an invasive alien species for Europe with established populations across the western part of the continent. Photo by Mario Shimbov (Pensoft).*

As one of the partners in charge of maximising the project’s impact, Pensoft will work on OneSTOP’s visual branding, communication, dissemination and exploitation, and the development of a data management plan for the project.

Invasive alien species (IAS) pose one of the most significant threats to global biodiversity, contributing to species extinctions, ecosystem degradation, and economic losses exceeding $400 billion annually.

To tackle this, the EU enforces Regulation (EU) 1143/2014 and the Biodiversity Strategy for 2030, aiming to prevent IAS introduction, enhance early detection, and manage their spread. Member States coordinate efforts with scientific support and citizen engagement to minimise their impact and protect Europe’s biodiversity. Addressing this urgent challenge, the EU Horizon project OneSTOP has officially launched as part of a coordinated European effort to combat biological invasions in terrestrial environments.

Comprehensive Approach to Tackling Invasive Alien Species

OneSTOP is one of two ambitious projects funded under the Horizon Europe programme, the other being GuardIAS, which focuses on marine and freshwater habitats. The two collaborative initiatives held their joint official kick-off meeting in January at the Joint Research Centre in Ispra, Italy. Together, these projects aim to develop innovative solutions for detecting, preventing, and managing invasive alien species across all ecosystem realms.

Coordinated by Dr Quentin Groom from Meise Botanic Garden, Belgium, and Prof Helen Roy from the UK Centre for Ecology and Hydrology, OneSTOP will integrate advanced scientific research, cutting-edge detection technologies, and policy-driven strategies to enhance biosecurity across Europe.

*The OneSTOP project consortium at the project’s kick-off meeting held on 20-24 January 2025 in Ispra, Italy.*

The project is structured around four key objectives:

Improve species detection and response time by incorporating computer vision, environmental DNA (eDNA) analysis and citizen science initiatives.
Facilitate swift action against invasive species threats by openly sharing data in international standards for biodiversity data with stakeholders who need it.
Support policy-makers in making informed decisions about where and how to allocate resources for invasive species management by developing data-driven systems.
Ensure stakeholder collaboration and knowledge exchange by implementing Living Labs at the regional level and an international policy forum, thereby encouraging socio-political action.

OneSTOP aligns with the European Alien Species Information Network (EASIN) mission to protect EU biodiversity by improving IAS management through advanced biosecurity technologies and enhanced data integration. By fostering collaboration with the Joint Research Centre (JRC) and supporting Member States with innovative tools, the project strengthens the EU’s capacity to detect, respond to, and mitigate IAS threats in line with existing regulations.

Pensoft’s role in OneSTOP

As the leader of Work Package 1, Pensoft is responsible for shaping OneSTOP’s visual identity and developing a comprehensive strategy for communication, dissemination, and impact. This includes crafting a data and knowledge management plan to ensure the project’s findings are effectively shared and utilised. By fostering collaboration with key biosecurity networks, these efforts will strengthen OneSTOP’s long-term influence.

A key part of this work is to raise awareness about invasive alien species (IAS) and their pathways, so that policymakers, researchers, and the public understand their impact and the importance of prevention. Pensoft will help translate complex scientific findings into accessible content, including infographics, policy briefs, and interactive visualisations, fostering collaboration and informed decision-making across sectors. Knowledge transfer materials will be shared through various channels, including OneSTOP’s five Living Labs across Europe, where stakeholders will be actively engaged in outreach and citizen science initiatives.

Pensoft will play a vital role in strengthening public awareness, fostering engagement, and promoting effective strategies for monitoring and managing IAS.

International Consortium

The project brings together twenty partners from fifteen countries, operating in various sectors and contributing diverse expertise:

Meise Botanic Garden – Belgium
Aarhus University – Denmark
UK Centre for Ecology & Hydrology – United Kingdom
Biopolis – Portugal
Coventry University – United Kingdom
The Cyprus Institute – Cyprus
Research Institute for Nature and Forest – Belgium
Institute of Botany of the Czech Academy of Sciences – Czech Republic
Lincoln University – New Zealand
Platform Kinetics – United Kingdom
Pensoft Publishers – Bulgaria
Stellenbosch University – South Africa
University of Exeter – United Kingdom
University of Vienna – Austria
Greenformation – Hungary
Helmholtz Centre for Environmental Research – Germany
Ovidius University of Constanta – Romania
Natural Resources Institute Finland – Finland
The Binary Forest – Belgium
Experimental Station of Arid Areas of the Spanish National Research Council – Spain

The OneSTOP project website is coming soon!

For more information visit the OneSTOP project website, and make sure to follow the project’s progress via our social media channels on BlueSky and LinkedIn.

Better data practices advance biodiversity knowledge

A framework to retrieve, refine and align secondary biodiversity data with FAIR standards.

Guest blog post by Nubia Marques et al.

In a world increasingly defined by data-driven decisions, biodiversity research stands to benefit from standardized and accessible data. Despite their importance for research, biodiversity datasets often fail to meet FAIR (Findable, Accessible, Interoperable, Reusable) standards, leading to concerns about data quality, reliability, and accessibility.

How to ensure biodiversity data are FAIR, linked, open and future-proof?

To address this, we propose a framework to retrieve, refine and align secondary biodiversity data with FAIR standards, utilizing the Darwin Core model. We followed four steps:

data localization (systematic review)
quality validation
standardization using the Darwin Core standard
sharing and archiving in an appropriate repository.

Our approach integrates data validation and quality control steps to ensure that secondary data sets can be trusted.
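To give a feel for the validation and standardization steps, here is a minimal Python sketch. It is not the workflow used in the study: the source column names in `FIELD_MAP`, the specific checks, and the example record are all assumptions made for illustration. Only the target field names (`scientificName`, `decimalLatitude`, `decimalLongitude`, `eventDate`, `individualCount`) are actual Darwin Core terms.

```python
from datetime import date

# Hypothetical mapping from a source table's column names to Darwin Core terms.
FIELD_MAP = {
    "species": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "date": "eventDate",
    "count": "individualCount",
}


def to_darwin_core(raw):
    """Rename source columns to Darwin Core terms, dropping unmapped columns."""
    return {FIELD_MAP[key]: value for key, value in raw.items() if key in FIELD_MAP}


def validate(record):
    """Return a list of problems found in a Darwin Core style record."""
    problems = []
    if not str(record.get("scientificName") or "").strip():
        problems.append("missing scientificName")
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if not isinstance(lat, (int, float)) or not -90 <= lat <= 90:
        problems.append("decimalLatitude out of range")
    if not isinstance(lon, (int, float)) or not -180 <= lon <= 180:
        problems.append("decimalLongitude out of range")
    try:
        # Accept only full ISO 8601 calendar dates (YYYY-MM-DD) in this sketch.
        date.fromisoformat(record.get("eventDate"))
    except (TypeError, ValueError):
        problems.append("eventDate is not an ISO 8601 date")
    return problems


# Invented example record, not taken from the published dataset.
raw = {"species": "Eudocimus ruber", "lat": -2.5, "lon": -44.3,
       "date": "2019-07-14", "count": 12, "notes": "mangrove edge"}
record = to_darwin_core(raw)
print(record)
print(validate(record))
```

A record that passes returns an empty list of problems. In practice, records that fail would be flagged for manual review rather than silently discarded, so that the provenance of the secondary data is preserved.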

Our study in Biodiversity Data Journal focused on ecotonal estuarine ecosystems near the easternmost Amazon, where we recovered data from 46,000 individuals representing 3,871 taxa across eight biotic groups (birds, amphibians, reptiles, mammals, fish, phytoplankton, benthos, and plants) from 1985 to 2022. These data were used to illustrate how our strategy improves validation, making the data more reliable for macroecological modeling and conservation management. As data becomes more standardized, researchers around the world will be better equipped to collaborate, identify trends, protect ecosystems, and advance sustainability efforts.

Relationships between numbers of taxa and occurrences gathered through an extensive review of secondary biodiversity data from the Golfão Maranhense area, in the estuarine regions of eastern Amazonia.

Accessible biodiversity data empowers stakeholders and provides critical insights into ecosystem health and species conservation. However, without standardized formats, this data is often fragmented, incomplete, or difficult to compare. By creating a consistent framework for collecting, storing, and sharing data, we are opening the door to more informed decision-making and innovation in biodiversity conservation.

The key to conserving biodiversity is collaboration and transparency. By prioritizing accessible and standardized data, we ensure that vital information reaches those who need it most – whether it’s for scientific study, habitat management or policymaking.

Let’s continue to make biodiversity data a tool for global change!

Research article:

Marques N, Soares CDdeM, Casali DdeM, Guimarães E, Fava F, Abreu JMdaS, Moras L, Silva LGda, Matias R, Assis RLde, Fraga R, Almeida S, Lopes V, Oliveira V, Missagia R, Carvalho E, Carneiro N, Alves R, Souza-Filho P, Oliveira G, Miranda M, Tavares VdaC (2024) Retrieving biodiversity data from multiple sources: making secondary data standardised and accessible. Biodiversity Data Journal 12: e133775. https://doi.org/10.3897/BDJ.12.e133775

How the names of organisms help to turn ‘small data’ into ‘Big Data’

Innovation in ‘Big Data’ helps address problems that were previously overwhelming. What we know about organisms is in hundreds of millions of pages published over 250 years. New software tools of the Global Names project find scientific names, index digital documents quickly, and correct and update names. These advances help in “making small data big” by linking together the content of many research efforts. The study was published in the open access journal Biodiversity Data Journal.

The ‘Big Data’ vision of science is transformed by computing resources to capture, manage, and interrogate the deluge of information coming from new technologies, infrastructural projects to digitise physical resources (such as our literature from the Biodiversity Heritage Library), or digital versions of specimens and records about specimens by museums.

Increased bandwidth has made dialogue among distributed data centres feasible and this is how new insights into biology are arising. In the case of biodiversity sciences, data centres range in size from the large GenBank for molecular records and the Global Biodiversity Information Facility for records of occurrences of species, to a long tail of tens of thousands of smaller datasets and websites which carry information compiled by individuals, research projects, funding agencies, and local, state, national and international governmental agencies.

The large biological repositories do not yet approach the scale of astronomy and nuclear physics, but the very large number of sources in the long tail of useful resources does present biodiversity informaticians with a major challenge – how to discover, index, organize and interconnect the information contained in a very large number of locations.

In this regard, biology is fortunate that, from the middle of the 18th century, the community has accepted the use of Latin binomials such as Homo sapiens or Ba humbugi for species. All names are listed by taxonomists. Name recognition tools can call on large expert compilations of names (Catalogue of Life, ZooBank, Index Fungorum, Global Names Index) to find matches in sources of digital information. This allows for the rapid indexing of content.

Even when we do not know a name, we can ‘discover’ it because scientific names have certain distinctive characteristics (written in italics, most often two successive words in a latinised form, with the first one capitalised). These properties allow names not yet present in compilations of names to be discovered in digital data sources.
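As a rough illustration of how such characteristics can be used, the Python sketch below looks for a capitalised word followed by a lowercase word with a Latin-looking ending. This is not the Global Names discovery algorithm: the regular expression and the short list of endings in `LATIN_ENDINGS` are assumptions chosen for the example, and italics are ignored because plain text does not carry them.

```python
import re

# A capitalised genus-like word followed by a lowercase epithet-like word.
CANDIDATE = re.compile(r"\b([A-Z][a-z]+) ([a-z]{2,})\b")

# Illustrative, far from complete, list of common latinised endings.
LATIN_ENDINGS = ("us", "a", "um", "is", "i", "ae", "ens")


def find_candidate_names(text):
    """Return word pairs that look like Latin binomials."""
    names = []
    for genus, epithet in CANDIDATE.findall(text):
        if epithet.endswith(LATIN_ENDINGS):
            names.append(f"{genus} {epithet}")
    return names


text = "The snail Ba humbugi was described in Fiji, while Homo sapiens is widespread."
print(find_candidate_names(text))
```

On this sentence the sketch picks out "Ba humbugi" and "Homo sapiens" and skips "The snail". A heuristic this simple will still produce false positives (any capitalised word followed by a word ending in "a", for instance) and will miss names with unusual endings, which is why real tools combine such cues with large compilations of known names.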

The idea of a names-based cyberinfrastructure is to use names to interconnect large and small sites of expert knowledge distributed across the Internet. This is the concept behind the Global Names project, which carried out the work described in this paper.

The effectiveness of such an infrastructure is compromised by the changes to names over time because of taxonomic and phylogenetic research. Names are often misspelled, or there might be errors in the way names are presented. Meanwhile, increasing numbers of species have no names, but are distinguished by their molecular characteristics.

In order to assess the challenge that these problems may present to the realization of a names-based cyberinfrastructure, we compared names from GenBank and DRYAD (a digital data repository) with names from Catalogue of Life to assess how well matched they are.

As a result, we found out that fewer than 15% of the names in pair-wise comparisons of these data sources could be matched. However, with a names parser to break the scientific names into all of their component parts, those parts that present the greatest number of problems could be removed to produce a simplified or canonical version of the name. Thanks to such tools, name-matching was improved to almost 85%, and in some cases to 100%.
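The effect of reducing names to a canonical form can be sketched in a few lines of Python. This is a deliberately naive stand-in for a real names parser, not the tool used in the study: `canonical_form` and `match_rate` are hypothetical helpers, and the three example names are chosen only to show the mechanism, not to reproduce the reported percentages.

```python
import re


def canonical_form(name):
    """Naively reduce a name string to 'Genus epithet [epithet ...]'."""
    # Drop parenthesised parts such as subgenera or bracketed authorship.
    name = re.sub(r"\([^)]*\)", " ", name)
    tokens = name.replace(",", " ").split()
    if not tokens:
        return ""
    genus = tokens[0].capitalize()
    epithets = []
    for token in tokens[1:]:
        # Stop at the first token that looks like an author, rank marker or year.
        if not (token.isalpha() and token.islower()):
            break
        epithets.append(token)
    return " ".join([genus] + epithets)


def match_rate(names, reference):
    """Fraction of names found verbatim in the reference list."""
    reference_set = set(reference)
    if not names:
        return 0.0
    return sum(1 for name in names if name in reference_set) / len(names)


source = ["Homo sapiens Linnaeus, 1758", "Panthera leo (Linnaeus, 1758)", "Ba humbugi"]
reference = ["Homo sapiens", "Panthera leo", "Ba humbugi"]

print(match_rate(source, reference))
print(match_rate([canonical_form(n) for n in source], reference))
```

Here only one of the three raw strings matches the reference list, while all three match once authorship and year are stripped. The sketch would mishandle lowercase author particles such as "van" or "de", along with many other real-world cases, which is exactly the kind of problem purpose-built name parsers are designed to handle.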

The study confirms the potential for the use of names to link distributed data and to make small data big. Nonetheless, it is clear that we need to continue to invest in more and better names-management software specially designed to address the problems in the biodiversity sciences.

###

Original source:

Patterson D, Mozzherin D, Shorthouse D, Thessen A (2016) Challenges with using names to link digital biodiversity information. Biodiversity Data Journal 4: e8080. https://doi.org/10.3897/BDJ.4.e8080

Additional information:

The study was supported by the National Science Foundation.