
Bulgaria joins the Global Biodiversity Information Facility (GBIF)

Led by Pensoft and its CEO Prof. Lyubomir Penev, the partnership marks a major step for Bulgarian science and regional biodiversity leadership.

Bulgaria has officially joined the Global Biodiversity Information Facility (GBIF). This major event for Bulgarian science was initiated by a memorandum signed by the Minister of Environment and Water, Manol Genov.

GBIF is an international network and data infrastructure funded by governments around the world that provides international open access to a modern and comprehensive database of all species of living organisms on the planet.

Joining GBIF is an important step for initiatives such as the Bulgarian Barcode of Life (BgBOL), as it will facilitate the integration of genetic data on species diversity into the global scientific community and support the creation of a more accurate and accessible bioinformatic database. This will increase the scientific visibility and relevance of Bulgarian efforts in molecular taxonomy and conservation.

Newly established Bulgarian Barcode of Life to support biodiversity conservation in the country

World map showing GBIF network participants: green for voting participants, blue for associate participants, gray for non-participants. — Prof. Lyubomir Penev

“First of all, I’d like to congratulate all fellow scientists working in the domain of biology and ecology in Bulgaria with this wonderful achievement,” says Prof. Dr. Lyubomir Penev, founder and CEO of the scientific publisher and technology provider Pensoft, as well as a key participant in the talks and preparations for Bulgaria’s joining GBIF. He is also Chair of BgBOL.

“Becoming a full member of GBIF has been a long-anticipated milestone we have discussed and worked on for several years. Coming not long after we initiated the Bulgarian Barcode of Life, the Bulgarian membership in GBIF gives us yet another uncontested evidence that the nation is on the right path to preserving our uniquely rich fauna and flora,” he adds.

“As close partners of GBIF for over 15 years now, Pensoft is looking forward to sharing our know-how with Bulgarian institutions and scientists, so that they can fully utilise the GBIF infrastructure and tools, in order to streamline the visibility and overall efficiency of biodiversity data collected from Bulgaria.”

GBIF is managed by a Secretariat based in Copenhagen and brings together countries and organisations that collaborate through national and institutional coordinators (also called participant nodes). The mechanism provides common standards, good practices and open access tools for institutions around the world to share information on the location and recording of species and specimens. According to GBIF, a total of 107 countries and organisations currently participate in the network, a significant number of which are European.

The GBIF network, as seen in a screenshot from https://www.gbif.org/the-gbif-network taken on 10/06/2025.

With Bulgaria joining GBIF, biodiversity data generated in the country can be streamlined through the network’s infrastructure, so that Bulgaria does not need to build and maintain its own separate infrastructure. This also saves significant financial resources.

As a full voting member, Bulgaria will ensure that biodiversity data from the country are shared and accessible through the platform. In this way, it will contribute to global knowledge on biodiversity and, in turn, to the solutions that promote its conservation and sustainable use.

Bulgaria’s page on GBIF, with an orange heatmap indicating occurrence records across the country, as seen in a screenshot from https://www.gbif.org/country/BG/summary taken on 10/06/2025.

Improvements in data management by Bulgaria will also contribute to better reporting and fulfilment of obligations to the Convention on Biological Diversity (CBD) as well as to the Intergovernmental Platform on Biodiversity and Ecosystem Services (IPBES). As a member of GBIF, Bulgaria will be able to apply for funding for flagship activities in Bulgarian institutions and neighbouring Balkan countries. This will enable the country to expand its leadership role in the Balkans in biodiversity research and data accumulation.

GBIF and Pensoft signed a Memorandum of Cooperation

The partnership between GBIF and Pensoft dates back to 2009 when the global network and the publisher signed their first Memorandum of Understanding intended to solidify their cooperation as leaders in the technological advancement relevant to biodiversity knowledge. Over the next few years, Pensoft integrated its whole biodiversity journal portfolio with the GBIF infrastructure to enable multiple automated workflows, including export of all species occurrence data published in scientific articles straight to the GBIF platform. Most recently, over 20 biodiversity journals powered by Pensoft’s scholarly publishing platform ARPHA launched their own hosted portals on GBIF to make it easier to access and use biodiversity data associated with published research, aligning with principles of Findable, Accessible, Interoperable, and Reusable (FAIR) data.

Journals published on ARPHA now archived in the Biodiversity Heritage Library

To date, the content available on BHL includes 16,000 legacy articles and also extends to future articles.

Content from more than 30 biodiversity journals published on the ARPHA Platform will now be archived in the Biodiversity Heritage Library (BHL), the world’s largest open-access digital library for biodiversity literature and archives.

A global consortium of natural history, botanical, research, and national libraries, BHL digitises and freely shares essential biodiversity materials. A critical resource for researchers, it provides vital access to material that might otherwise be difficult to obtain.

Under the agreement, over 16,000 articles published on Pensoft’s self-developed ARPHA Platform are now available on BHL. Both legacy content and new articles are made available on the platform, complete with full-text PDFs and all relevant metadata.

More content published on ARPHA will gradually be added to the BHL archive.

The publications will be included in the Library’s full-text search, allowing researchers to easily locate relevant biodiversity literature. Crucially, the scientific names within the articles will be indexed using the Global Names Architecture, enabling seamless discovery of information about specific taxa across the BHL collection.

This automated workflow is facilitated by the ARPHA platform and uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to enable exposure and harvesting of repository metadata.
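To make the harvesting step more concrete, here is a minimal sketch of how a client might request and read OAI-PMH records. The endpoint URL, the sample response and the helper names `list_records_url` and `parse_records` are assumptions made for illustration; they are not the actual ARPHA or BHL configuration. The `ListRecords` verb, the `metadataPrefix` and `from` arguments, and the `resumptionToken` element come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, used only for illustration (not a real ARPHA or BHL URL).
BASE_URL = "https://journal.example.org/oai"

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed since this datestamp.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token) from a response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{OAI_NS}record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = record.findtext(f".//{DC_NS}title")
        records.append((identifier, title))
    # A resumption token signals that more pages of results are available.
    token = root.findtext(f".//{OAI_NS}resumptionToken")
    return records, token


# Invented sample response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:article/1</identifier>
        <datestamp>2025-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented article title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2025-01-01"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would fetch the URL over HTTP and keep requesting pages until no resumption token is returned; the sketch leaves the network call out so that it can run offline.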

“Pensoft is pleased to collaborate with BHL in our joint mission to support global biodiversity research through free access to knowledge. Thanks to this integration, content in our journals will become even more accessible and readily discoverable, helping researchers find the biodiversity information they need,” said Prof. Lyubomir Penev, CEO and founder of Pensoft and ARPHA.

The news comes soon after BHL announced it is about to face a major shift in its operation. From 2026, the Smithsonian Institution – one of BHL’s 10 founding members – will cease to host the administrative and technical components of BHL. As the consortium explores a range of options, the BHL team is confident that “the transition opens the door to a reimagined and more sustainable future for BHL.”

Biodiversity Knowledge Hub makes an appearance at the European Geosciences Union General Assembly 2025

The Biodiversity Knowledge Hub fosters interoperability between diverse resources to make it easier to use and combine information.

*Gabriel Peoluze (LifeWatch ERIC) presents the Biodiversity Knowledge Hub poster at EGU 2025 (Vienna, Austria).*

On Monday, 28 April, the first day of the European Geosciences Union General Assembly 2025 (EGU 2025), participants had the chance to discover one of the most promising initiatives in biodiversity informatics: the Biodiversity Knowledge Hub (BKH). BKH was presented as part of a dedicated poster session, titled “Biodiversity Knowledge Hub: Addressing the impacts of environmental change by linking Research Infrastructures, Global Aggregators and Community Networks”.

Understanding and addressing the impacts of environmental change on biodiversity and ecosystems demands access to reliable FAIR data (as in Findable, Accessible, Interoperable, Reusable). However, the current landscape is often fragmented, making it difficult to combine and use these resources effectively.

Enter the Horizon-funded project Biodiversity Community Integrated Knowledge Library (BiCIKL): a pioneering initiative that demonstrates the transformative power of interdisciplinary collaboration. Coordinated by Pensoft, BiCIKL ran between 2021 and 2024.

New BiCIKL project to build a freeway between pieces of biodiversity knowledge

The Vision of BiCIKL

Within BiCIKL, 14 European institutions from ten countries teamed up with the aim to integrate biodiversity data across research infrastructures, scientific repositories, and expert communities.

Through this integration, BiCIKL bridged the gap between isolated knowledge systems and delivered actionable insights to guide conservation and resilience efforts. The project embodies the principles of open science by demonstrating how interdisciplinary collaboration can turn fragmented data into cohesive, usable knowledge for researchers, policymakers, and practitioners.

How to ensure biodiversity data are FAIR, linked, open and future-proof?

The Biodiversity Knowledge Hub

At the heart of BiCIKL’s success is the Biodiversity Knowledge Hub (BKH): an innovative platform that provides seamless access to biodiversity data, tools, and workflows. The BKH fosters interoperability between diverse resources, thus making it easier to combine information from different sources. Whether for advanced research analytics or policymaking in support of sustainable development, the BKH empowers users with tools tailored to their needs.

A few of the standout features of the BKH include:

Modular design to allow continuous expansion and adaptability to new challenges in biodiversity and climate resilience.
Interoperable systems that connect a variety of databases, repositories, and services to deliver integrated knowledge.
Community building by welcoming a broad network of stakeholders to ensure the platform’s long-term sustainability and growth.

Watch the Biodiversity Knowledge Hub video on YouTube.

Setting a New Benchmark in Biodiversity Informatics

Through its collaborative approach, BiCIKL set a new standard for how biodiversity and climate resilience initiatives can be harmonised globally. By showcasing best practices in data integration, capacity building, and stakeholder engagement, BiCIKL became much more than a project: it turned into a blueprint for future biodiversity knowledge infrastructures.

The Biodiversity Knowledge Hub serves to demonstrate how harmonised standards and active collaboration are key to unlocking the full potential of biodiversity data. In doing so, its mission is to create scalable, long-term solutions that are crucial for addressing today’s pressing environmental challenges.

The poster presentation at EGU25 outlined the methodologies and technologies driving the BKH, emphasizing its role as a pioneering model for integrated biodiversity knowledge and action. As environmental pressures continue to mount, the work of BiCIKL and the Biodiversity Knowledge Hub offers a hopeful path forward—one where knowledge flows freely, collaborations flourish, and data-driven solutions guide our way to a more resilient future.

Visit the Biodiversity Community Integrated Knowledge Library (BiCIKL) project’s website at: https://bicikl-project.eu/.

Don’t forget to also explore the Biodiversity Knowledge Hub (BKH) for yourself at: https://biodiversityknowledgehub.eu/ and watch the BKH’s introduction video.

Revisit highlights from the BiCIKL project on X/Twitter from the project’s hashtag: #BiCIKL_H2020 and handle: @BiCIKL_H2020.

BiCIKL project sums up outcomes and future prospects at a Final GA in Cambridge

Pensoft joins the Biodiversity Meets Data Horizon project to support biodiversity monitoring and conservation

As part of the new consortium, Pensoft is to use innovative communication tools in support of evidence-based biodiversity conservation across Europe.

The European Union (EU) has been working to protect nature for decades, with the Natura 2000 network now safeguarding over 18% of EU land and 9% of its marine territory. Yet, biodiversity is still in trouble, with only 50% of bird species and 15% of habitats in good conservation status.

To turn the tide, the EU’s Biodiversity Strategy for 2030 will expand the existing Natura 2000 areas, implement the EU’s first-ever Nature Restoration Law, and introduce concrete measures to achieve global biodiversity targets. Success will depend on enhancing biodiversity monitoring, making better use of data and gaining a clearer picture of how nature is changing.

Addressing this urgent challenge, the EU Horizon project BMD (short for Biodiversity Meets Data) will offer a centralised platform (Single Access Point or SAP) for improved biodiversity monitoring across Europe.

Pensoft’s role

Pensoft will play a role in Biodiversity Meets Data’s impact by planning and implementing the communication, dissemination and exploitation of project results, as well as helping with the training and capacity building for BMD’s end-users, which will be led by LifeWatch ERIC. Pensoft will adopt a multi-format approach to knowledge transfer with tailored outputs for the scientific community, decision-makers, industry representatives and the general public.

Furthermore, the BMD SAP will also incorporate elements of the Biodiversity Knowledge Hub (BKH), developed under the BiCIKL project, coordinated by Pensoft.

“It’s incredibly rewarding to see the continuity in our projects, with the legacy of the BiCIKL project continuing with Biodiversity Meets Data. This seamless progression not only builds on our past successes but also ensures that our work continues to deliver long-lasting value to the biodiversity community,” said Prof. Dr. Lyubomir Penev, CEO and Founder of Pensoft, and project coordinator of BiCIKL (abbreviated from Biodiversity Community Integrated Knowledge Library).

The BMD project consortium at the project’s kick-off meeting in early March 2025 (Leiden, the Netherlands).

International consortium

Coordinated by Naturalis Biodiversity Center, the project brings together 14 partner organisations from 11 countries to develop innovative solutions for biodiversity management.

Naturalis Biodiversity Center – the Netherlands
Royal Botanic Garden Edinburgh – the United Kingdom
Meise Botanic Garden – Belgium
Helmholtz Centre for Environmental Research – Germany
e-Science European Infrastructure for Biodiversity and Ecosystem Research – Spain
Pensoft Publishers – Bulgaria
The European Land Conservation Network – the Netherlands
University of Tartu – Estonia
Stichting Catalogue of Life – the Netherlands
The International Hellenic University – Greece
The Senckenberg Nature Research Society – Germany
The Environment Agency Austria – Austria
The National Research Council – Italy
SIB Swiss Institute of Bioinformatics – Switzerland

For more information:

Visit the BMD project website at https://bmd-project.eu/, and make sure to follow the project’s progress via our social media channels on Bluesky and LinkedIn.

More than 20 journals published by Pensoft with their own hosted data portals on GBIF to streamline and FAIR-ify biodiversity research

The portals currently host data on over 1,000 datasets and almost 325,000 occurrence records across the 25 journals.

In collaboration with the Global Biodiversity Information Facility (GBIF), Pensoft has established hosted data portals for 25 open-access peer-reviewed journals published on the ARPHA Platform.

A screenshot of the Check List data portal: a close-up of a turtle on a forest floor, overlaid with a web portal design for biodiversity data browsing.

The initiative aims to make it easier to access and use biodiversity data associated with published research, aligning with principles of Findable, Accessible, Interoperable, and Reusable (FAIR) data.

The data portals offer seamless integration of published articles and associated data elements with GBIF-mediated records. Now, researchers, educators, and conservation practitioners can discover and use the extensive species occurrence and other data associated with the papers published in each journal.

A video displaying an interactive map with occurrence data on the BDJ portal.

The collaboration between Pensoft and GBIF was recently piloted with the Biodiversity Data Journal (BDJ). Today, the BDJ hosted portal provides seamless access and exploration for nearly 300,000 occurrences of biological organisms from all over the world that have been extracted from the journal’s all-time publications. In addition, the portal provides direct access to more than 800 datasets published alongside papers in BDJ, as well as to almost 1,000 citations of the journal articles associated with those publications.

“The release of the BDJ portal and subsequent ones planned for other Pensoft journals should inspire other publishers to follow suit in advancing a more interconnected, open and accessible ecosystem for biodiversity research,” said Dr. Vince Smith, Editor-in-Chief of BDJ and head of digital, data and informatics at the Natural History Museum, London.

Joining the @ejtaxonomy, the @BioDataJournal is the latest #ScientificJournal to launch a GBIF hosted portal! 🐟

This @Pensoft-published journal is the first of many under the masthead expected to participate in the GBIF programme. ⚡

Read more: 🔗https://t.co/IA3IWydRLy pic.twitter.com/pbulurX9Kn
— GBIF @biodiversity.social/@gbif (@GBIF) March 10, 2025

“The programme will provide a scalable solution for more than thirty of the journals we publish thanks to our partnership with Plazi, and will foster greater connectivity between scientific research and the evidence that supports it,” said Prof. Lyubomir Penev, founder and chief executive officer of Pensoft.

On the new portals, users can search the data and refine their queries by criteria such as taxonomic classification and conservation status. They also have access to statistical information about the hosted data.

Together, the hosted portals provide data on almost 325,000 occurrence records, as well as over 1,000 datasets published across the journals.

The Biodiversity Data Journal launches its own data portal on GBIF

With this simple website designed to lower technical demands, data managers and other stakeholders can easily focus on data exploration and reuse.

The Biodiversity Data Journal (BDJ) became the second open-access peer-reviewed scholarly title to make use of the hosted portals service provided by the Global Biodiversity Information Facility (GBIF): an international network and data infrastructure aimed at providing anyone, anywhere, open access to data about all types of life on Earth.

The Biodiversity Data Journal portal, hosted on the GBIF platform, is to support biodiversity data use and engagement at national, institutional, regional and thematic scales by facilitating access and reuse of data by users with various expertise in data use and management.

Having piloted the GBIF hosted portal solution with arguably the most revolutionary biodiversity journal in its exclusively open-access scholarly portfolio, Pensoft is to soon replicate the effort with at least 20 other journals in the field. This would mean that the publisher will more than double the number of the currently existing GBIF-hosted portals.

As of the time of writing, the BDJ portal provides seamless access and exploration for nearly 300,000 occurrences of biological organisms from all over the world that have been extracted from the journal’s all-time publications. In addition, the portal provides direct access to more than 800 datasets published alongside papers in BDJ, as well as to almost 1,000 citations of the journal articles associated with those publications.

Using the search categories featured in the portal, users can narrow their query by geography, location, taxon, IUCN Global Red List Category, geological context and many others. The dashboard also lets users access multiple statistics about the data, and even explore potentially related records with the help of the clustering feature (e.g. a specimen sequenced by another institution or type material deposited at different institutions). Additionally, the BDJ portal provides basic information about the journal itself and links to the news section from its website.
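For readers who prefer to script such queries, the filters described above have rough counterparts in GBIF's public occurrence search API. The sketch below only builds a query URL and makes no network call. The helper name `occurrence_search_url` is hypothetical, and the assumption that the portal's filters map onto the `country`, `taxonKey` and `iucnRedListCategory` parameters is ours rather than something stated by Pensoft or GBIF. Note that this query is not restricted to BDJ-related records.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(country=None, taxon_key=None, iucn_category=None, limit=20):
    """Build a GBIF occurrence search URL from optional filters.

    country       -- ISO 3166-1 alpha-2 code, e.g. "BG" for Bulgaria
    taxon_key     -- numeric key of a taxon in the GBIF backbone taxonomy
    iucn_category -- IUCN Red List category code, e.g. "EN" for Endangered
    limit         -- maximum number of records per page
    """
    filters = {
        "country": country,
        "taxonKey": taxon_key,
        "iucnRedListCategory": iucn_category,
        "limit": limit,
    }
    # Drop filters that were not supplied so they do not appear in the URL.
    query = {name: value for name, value in filters.items() if value is not None}
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(query)}"


if __name__ == "__main__":
    # Endangered species recorded in Bulgaria, five records per page.
    print(occurrence_search_url(country="BG", iucn_category="EN", limit=5))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response is JSON with a `results` list of occurrence records.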

A video displaying an interactive map with occurrence data on the BDJ portal.

Launched in 2013 with the aim to bring together openly available data and narrative into a peer-reviewed scholarly paper, the Biodiversity Data Journal has remained at the forefront of scholarly publishing in the field of biodiversity research. Over the years, it has been amongst the first to adopt many novelties developed by Pensoft, including the entirely XML-based ARPHA Writing Tool (AWT) that has underpinned the journal’s submission and review process for several years now. Besides the convenience of an entirely online authoring environment, AWT provides multiple integrations with key databases, such as GBIF and BOLD, to allow direct export and import at the authoring stage, thereby further facilitating the publication and dissemination of biodiversity data. More recently, BDJ also piloted the “Nanopublications for Biodiversity” workflow and format as a novel solution to future-proof biodiversity knowledge by sharing “pixels” of machine-actionable scientific statements.

A decade of empowering biodiversity science: celebrating 10 years of Biodiversity Data Journal

“I am thrilled to see the Biodiversity Data Journal’s (BDJ) hosted portal active, ten years since it became the first journal to submit taxon treatments and Darwin Core occurrence records automatically to GBIF! Since its launch in 2013, BDJ has been unrivalled amongst taxonomy and biodiversity journals in its unique workflows that provide authors with import and export functions for structured biodiversity data to/from GBIF, BOLD, iDigBio and more. I am also glad to announce that more than 30 Pensoft biodiversity journals will soon be present as separate hosted portals on GBIF thanks to our long-time collaboration with Plazi, ensuring proper publication, dissemination and re-use of FAIR biodiversity data,” said Prof. Dr. Lyubomir Penev, founder and CEO of Pensoft, and founding editor of BDJ.

“The release of the BDJ portal and subsequent ones planned for other Pensoft journals should inspire other publishers to follow suit in advancing a more interconnected, open and accessible ecosystem for biodiversity research,” said Vince Smith, editor-in-chief of BDJ and head of digital, data and informatics at the Natural History Museum, London.

Better data practices advance biodiversity knowledge

A framework to retrieve, refine and align secondary biodiversity data with FAIR standards.

Guest blog post by Nubia Marques et al.

In a world increasingly defined by data-driven decisions, biodiversity research stands to benefit from standardized and accessible data. Despite their importance for research, biodiversity datasets often fail to meet FAIR (Findable, Accessible, Interoperable, Reusable) standards, leading to concerns about data quality, reliability, and accessibility.

To address this, we propose a framework to retrieve, refine and align secondary biodiversity data with FAIR standards, utilizing the Darwin Core model. We followed four steps:

data localization (systematic review)
quality validation
standardization using the Darwin Core standard
sharing and archiving in an appropriate repository.

Our approach integrates data validation and quality control steps to ensure that secondary data sets can be trusted.
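As an illustration of the validation and standardization steps, the sketch below maps one raw record to Darwin Core terms and runs a few basic quality checks. The raw column names (`species`, `lat`, `lon`, `date`), the sample record and the specific checks are assumptions chosen for the example; they are not the procedure used in the study. The target field names (`scientificName`, `decimalLatitude`, `decimalLongitude`, `eventDate`) are standard Darwin Core terms.

```python
from datetime import date


def to_darwin_core(raw):
    """Map a raw record (hypothetical column names) to Darwin Core terms."""
    return {
        "scientificName": raw["species"].strip(),
        "decimalLatitude": float(raw["lat"]),
        "decimalLongitude": float(raw["lon"]),
        "eventDate": raw["date"],
    }


def validate(record):
    """Return a list of quality problems; an empty list means the record passed."""
    problems = []
    if not record["scientificName"]:
        problems.append("missing scientificName")
    if not -90 <= record["decimalLatitude"] <= 90:
        problems.append("decimalLatitude out of range")
    if not -180 <= record["decimalLongitude"] <= 180:
        problems.append("decimalLongitude out of range")
    try:
        # Darwin Core recommends ISO 8601 dates such as 2019-07-04.
        date.fromisoformat(str(record["eventDate"]))
    except ValueError:
        problems.append("eventDate is not an ISO 8601 date")
    return problems


if __name__ == "__main__":
    # Invented record, for illustration only.
    raw = {"species": " Ardea alba ", "lat": "-2.53", "lon": "-44.30", "date": "2019-07-04"}
    record = to_darwin_core(raw)
    print(record, validate(record))
```

Records that return a non-empty problem list would be corrected or set aside before the sharing and archiving step.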

Our study in Biodiversity Data Journal focused on ecotonal estuarine ecosystems near the easternmost Amazon, where we recovered data from 46,000 individuals representing 3,871 taxa across eight biotic groups (birds, amphibians, reptiles, mammals, fish, phytoplankton, benthos, and plants) from 1985 to 2022. These data were used to illustrate how our strategy improves validation, making the data more reliable for macroecological modeling and conservation management. As data becomes more standardized, researchers around the world will be better equipped to collaborate, identify trends, protect ecosystems, and advance sustainability efforts.

Relationships between numbers of taxa and occurrences gathered through an extensive review of secondary biodiversity data from the Golfão Maranhense area, in the estuarine regions of eastern Amazonia.

Accessible biodiversity data empowers stakeholders and provides critical insights into ecosystem health and species conservation. However, without standardized formats, this data is often fragmented, incomplete, or difficult to compare. By creating a consistent framework for collecting, storing, and sharing data, we are opening the door to more informed decision-making and innovation in biodiversity conservation.

The key to conserving biodiversity is collaboration and transparency. By prioritizing accessible and standardized data, we ensure that vital information reaches those who need it most – whether it’s for scientific study, habitat management or policymaking.

Let’s continue to make biodiversity data a tool for global change!

Research article:

Marques N, Soares CDdeM, Casali DdeM, Guimarães E, Fava F, Abreu JMdaS, Moras L, Silva LGda, Matias R, Assis RLde, Fraga R, Almeida S, Lopes V, Oliveira V, Missagia R, Carvalho E, Carneiro N, Alves R, Souza-Filho P, Oliveira G, Miranda M, Tavares VdaC (2024) Retrieving biodiversity data from multiple sources: making secondary data standardised and accessible. Biodiversity Data Journal 12: e133775. https://doi.org/10.3897/BDJ.12.e133775

MAkiNg Technology work for moNitoring polliNAtors: Pensoft joins ANTENNA

Pensoft is to maximise the project’s impact by informing stakeholders about results and raising public awareness about pollinators.

Pensoft joins the newly funded Biodiversa+ project ANTENNA focused on making technology work for monitoring pollinators and is tasked with the communication, dissemination and exploitation activities.

The overarching goal of ANTENNA is to fill key monitoring gaps through advancing innovative technologies that will underpin and complement EU-wide pollinator monitoring schemes, and to provide tested transnational pipelines from monitoring activities to curated datasets and enhanced indicators that support pollinator-relevant policy and end-users.

The ANTENNA project answers the BiodivMon call, which was launched in September 2022 by Biodiversa+ in collaboration with the European Commission. The BiodivMon call sought proposals for three-year research projects to improve transnational monitoring of biodiversity and ecosystem change, emphasising innovation and harmonisation of biodiversity data collection and management methodologies, addressing knowledge gaps on biodiversity status and trends to combat biodiversity loss, and the effective use of existing biodiversity monitoring data.

Supporting the work of Work Package #5: “Project coordination, and communication”, Pensoft is dedicated to maximising the project’s impact by employing a mix of channels to inform stakeholders about the results from ANTENNA and raise public awareness about pollinators.

Pensoft is also tasked with creating and maintaining a clear and recognisable project brand, promotional materials, website, social network profiles, internal communication platform, and online libraries. Another key responsibility is the development, implementation and regular updating of the project’s communication, dissemination and exploitation plans that ANTENNA is set to follow for the next four years.

On 14-15 March 2024, ANTENNA held its official kick-off meeting. Project partners came together in Halle, Germany, for two days to outline objectives, discuss strategies, and lay the groundwork for this venture.

Specifically, the combined expertise of the consortium will address the following objectives:

Advance automated sample sorting and image recognition tools from individual prototypes to systems that can be adopted by practitioners
Expand pollinator monitoring to under-researched pollinator taxa, ecosystems, and pressures
Quantify the added value of novel monitoring systems in comparison and combination with ‘traditional’ methods in terms of cost effectiveness
Provide a framework for integrative monitoring by combining multiple data streams. The framework will also support the development of near real-time forecasting models as bases for early warning systems.
Upscale local demonstrations into the implementation of large-scale transnational pipelines and provide context-specific guidance for policy-makers and other users who need to select monitoring methods and indicators.

Consortium*:

Helmholtz-Centre for Environmental Research (UFZ), Germany
Naturalis Biodiversity Center, Netherlands
Aarhus University, Denmark
Consejo Superior de Investigaciones Científicas (CSIC), Spain
University of the Aegean, Greece
Universidad Politécnica de Madrid, Spain
Trinity College Dublin, Ireland

*Pensoft Publishers is a subcontractor tasked by the UFZ with multiple communication, dissemination and exploitation activities as part of Work Package 5.

Stay up to date with the ANTENNA project’s progress on X/Twitter (@ANTENNA_project) and LinkedIn (/antenna-project).

Brand new computer language describes organismal traits to create computable species descriptions

Describing traits with Phenoscript is like programming a computer code for how an organism looks.

The beetle species *Grebennikovius basilewskyi*. Numbers next to arrows indicate patterns of phenotype statements explained in the section “Phenoscript: main patterns of phenotype statements”. Arrow numbers from T1 to T5 illustrate individual body parts. See more in the research study.

One of the most beautiful aspects of Nature is the endless variety of shapes, colours and behaviours exhibited by organisms. These traits help organisms survive and find mates, like how a male peacock’s colourful tail attracts females or his wings allow him to fly away from danger. Understanding traits is crucial for biologists, who study them to learn how organisms evolve and adapt to different environments.

To do this, scientists first need to describe these traits in words, like saying a peacock’s tail is “vibrant, iridescent, and ornate”. This approach works for small studies, but when looking at hundreds or even millions of different animals or plants, it’s impossible for the human brain to keep track of everything.

Computers could help, but not even the latest AI technology is able to grasp human language to the extent needed by biologists. This hampers research significantly because, although scientists can handle large volumes of DNA data, linking this information to physical traits is still very difficult.

To solve this problem, researchers from the Finnish Museum of Natural History, Giulio Montanaro and Sergei Tarasov, along with collaborators, have created a special language called Phenoscript. This language is designed to describe traits in a way that both humans and computers can understand. Describing traits with Phenoscript is like writing computer code for how an organism looks.

Phenoscript uses something called semantic technology, which helps computers understand the meaning behind words, much like how modern search engines know the difference between the fruit “apple” and the tech company “Apple” based on the context of your search.

“This language is still being tested, but it shows a lot of promise. As more scientists start using Phenoscript, it will revolutionise biology by making vast amounts of trait data available for large-scale studies, boosting the emerging field of phenomics,” explains Montanaro.

In their research article, newly published in the open-access, peer-reviewed Biodiversity Data Journal, the researchers make use of the new language for the first time, as they create semantic phenotypes for four species of dung beetles from the genus Grebennikovius. Then, to demonstrate the power of the semantic approach, they apply simple semantic queries to the generated phenotypic descriptions.
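
To give a flavour of what a “simple semantic query” over computable descriptions might look like, here is a minimal sketch in plain Python. It does not use Phenoscript syntax or the authors’ tooling: the species labels, the trait terms and the helper functions `query` and `species_with_part_colour` are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: hypothetical species and traits, not data from the study.
# Each phenotype statement is stored as a (subject, predicate, object) triple.
TRIPLES = [
    ("species_A", "has_part", "species_A_pronotum"),
    ("species_A_pronotum", "is_a", "pronotum"),
    ("species_A_pronotum", "has_colour", "black"),
    ("species_B", "has_part", "species_B_pronotum"),
    ("species_B_pronotum", "is_a", "pronotum"),
    ("species_B_pronotum", "has_colour", "brown"),
]


def query(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def species_with_part_colour(triples, part_type, colour):
    """Find species that have a part of the given type with the given colour."""
    found = []
    for species, _, part in query(triples, predicate="has_part"):
        is_type = query(triples, subject=part, predicate="is_a", obj=part_type)
        has_colour = query(triples, subject=part, predicate="has_colour", obj=colour)
        if is_type and has_colour:
            found.append(species)
    return found


print(species_with_part_colour(TRIPLES, "pronotum", "black"))
```

Because every statement has the same machine-readable shape, the same question can be asked across any number of species without re-reading free-text descriptions, which is the general idea behind querying semantic phenotypes.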

Finally, the team looks further ahead to modernising the way scientists work with species information. Their next aim is to integrate semantic species descriptions with the concept of nanopublications, “which encapsulates discrete pieces of information into a comprehensive knowledge graph”. As a result, data that has become part of this graph can be queried directly, thereby ensuring that it remains Findable, Accessible, Interoperable and Reusable (FAIR) through a variety of semantic resources.

***

Research paper:

Montanaro G, Balhoff JP, Girón JC, Söderholm M, Tarasov S (2024) Computable species descriptions and nanopublications: applying ontology-based technologies to dung beetles (Coleoptera, Scarabaeinae). Biodiversity Data Journal 12: e121562. https://doi.org/10.3897/BDJ.12.e121562

***

The present study is the latest addition to the special topical collection “Linking FAIR biodiversity data through publications: The BiCIKL approach”, launched and supported by the recently concluded Horizon 2020 project Biodiversity Community Integrated Knowledge Library (BiCIKL). The collection aims to bring together scientific publications that demonstrate the advantages of, and novel approaches to, accessing and (re-)using linked biodiversity data.

***

What expert recommendations did the BiCIKL consortium give to policy makers and research funders to ensure that biodiversity data is FAIR, linked, open and, indeed, future-proof? Find out in the blog post summarising key lessons learnt from the Horizon 2020 project.

How to ensure biodiversity data are FAIR, linked, open and future-proof?

***

Follow Biodiversity Data Journal on Facebook and X.

How to ensure biodiversity data are FAIR, linked, open and future-proof?

Now concluded Horizon 2020-funded project BiCIKL shares lessons learned with policy-makers and research funders

Within the Biodiversity Community Integrated Knowledge Library (BiCIKL) project, 14 European institutions from ten countries spent the last three years developing services and high-tech digital tools to improve the findability, accessibility, interoperability and reusability (FAIR-ness) of various types of data about the world’s biodiversity. These types of data include peer-reviewed scientific literature, occurrence records, natural history collections, DNA data and more.

By ensuring all those data are readily available and efficiently interlinked, the project consortium intends to provide better tools to the scientific community, so that it can more rapidly and effectively study, assess, monitor and preserve Earth’s biological diversity in line with the objectives of policies such as the EU Biodiversity Strategy for 2030 and the European Green Deal. The targets of these policies require openly available, precise and harmonised data to underpin the design of effective measures for restoration and conservation, the BiCIKL consortium points out.

Since 2021, the project partners at BiCIKL have been working together to elaborate existing workflows and links, as well as create brand new ones, so that their data resources, platforms and tools can seamlessly communicate with each other, thereby taking the burden off the shoulders of scientists and letting them focus on their actual mission: paving the way to healthy and sustainable ecosystems across Europe and beyond.

New BiCIKL project to build a freeway between pieces of biodiversity knowledge

Now that the three-year project is officially over, the wider scientific community is yet to reap the fruits of the consortium’s efforts. In fact, the end of the BiCIKL project marks the actual beginning of a Europe-wide and global revolution in the way biodiversity scientists access, use and produce data. It is time for the research community, as well as all actors involved in the study of biodiversity and the implementation of regulations necessary to protect and preserve it, to embrace the lessons learned, adopt the good practices identified and build on the existing knowledge.

This is why BiCIKL’s major final research outputs include two policy briefs that summarise and highlight important recommendations addressed to key policy makers, research institutions and funders of research. After all, it is the regulatory bodies that are best equipped to share and implement best practices and guidelines.

Most recently, the BiCIKL consortium published two particularly important policy briefs, both addressed to the likes of the European Commission’s Directorate-General for Environment; the European Environment Agency; the Joint Research Centre; as well as science and policy interface platforms, such as the EU Biodiversity Platform; and also organisations and programmes, e.g. Biodiversa+ and EuropaBON, which are engaged in biodiversity monitoring, protection and restoration. The policy briefs are also to be of particular use to national research funds in the European Union.

🆕Policy Brief by @Bicikl_H2020 addresses the likes of @EU_ENV @EUEnvironment @EU_ScienceHub @BiodiversaPlus & @EuropaBon_H2020🇪🇺 to pave the way for a Linked Open Data-based, AI-assisted Overarching #Biodiversity Supergraph.

📃See: https://t.co/JMr2IbgFLT #FAIRdata #BiCIKL_H2020
— RIO Journal (@RIOJournal) May 7, 2024

One of the newly published policy briefs, titled “Uniting FAIR data through interlinked, machine-actionable infrastructures”, highlights the potential benefits derived from enhanced connectivity and interoperability among various types of biodiversity data. The publication includes a list of recommendations addressed to policy-makers, as well as nine key action points. Understandably, the main themes include wider international cooperation; inclusivity and collaboration at scale; standardisation; and bringing science and policy closer to industry. Another major outcome of the BiCIKL project, the Biodiversity Knowledge Hub portal, is noted as central to many of these objectives and tasks in its role as a knowledge broker that will continue to be maintained and updated with additional FAIR data-compliant services as a living legacy of the collaborative efforts at BiCIKL.

🆕Policy brief by @Bicikl_H2020 advises how to liberate #scientific #data from #scholarly publications and publish #FAIRdata, in order to foster excellence & innovation in #biodiversity science, #monitoring & #conservation.

📃See: https://t.co/yHMc7TPeTm #BiCIKL_H2020 #FAIRdata
— RIO Journal (@RIOJournal) May 8, 2024

The second policy brief, titled “Liberate the power of biodiversity literature as FAIR digital objects”, shares key actions that can liberate data published in non-machine-actionable formats and on non-interoperable platforms, so that those data can also be efficiently accessed and used. It also describes ways to publish future data according to the best FAIR and linked data practices. The recommendations highlighted in the policy brief are intended to support decision-making in Europe; expedite research by making biodiversity data immediately and globally accessible; provide curated data ready for use by AI applications; and bridge gaps in the life cycle of research data through digital-born data. Several new and innovative workflows, linkages, integrative mechanisms and services developed within BiCIKL are mentioned as key advancements created to access and disseminate data available from scientific literature.

While all policy briefs and factsheets – both primarily targeted at non-expert decision-makers who play a central role in biodiversity research and conservation efforts – are openly and freely available on the project’s website, the most important contributions were published as permanent scientific records in a BiCIKL-branded dedicated collection in the peer-reviewed open-science journal Research Ideas and Outcomes (RIO). There, the policy briefs are provided as both a ready-to-print document (available as supplementary material) and an extensive academic publication.

Currently, the collection “Towards interlinked FAIR biodiversity knowledge: The BiCIKL perspective” in the RIO journal contains 60 publications, including policy briefs, project reports, methods papers and conference abstracts, which demonstrate and highlight key milestones and project outcomes from BiCIKL’s journey over the last three years. The collection also features over 15 scientific publications authored by people not necessarily involved in BiCIKL, but whose research uses linked open data and tools created in BiCIKL. These publications appeared in a dedicated article collection in the Biodiversity Data Journal.

BiCIKL keeps on adding project outcomes in own collection in RIO Journal

***

Visit the Biodiversity Community Integrated Knowledge Library (BiCIKL) project’s website at: https://bicikl-project.eu/.

Don’t forget to also explore the Biodiversity Knowledge Hub (BKH) for yourself at: https://biodiversityknowledgehub.eu/ and watch the BKH’s introduction video.

Highlights from the BiCIKL project are also accessible on Twitter/X from the project’s hashtag: #BiCIKL_H2020 and handle: @BiCIKL_H2020.

BiCIKL project sums up outcomes and future prospects at a Final GA in Cambridge