Pensoft among the first 27 publishers to share prices & services via the Journal Comparison Service by Plan S

All journals published by Pensoft – each using the publisher’s self-developed ARPHA Platform – provide extensive and transparent information about their costs and services in line with the Plan S principles.

In support of transparency and openness in scholarly publishing and academia, the scientific publisher and technology provider Pensoft joined the Journal Comparison Service (JCS) initiative by cOAlition S, an alliance of national funders and charitable bodies working to increase the volume of free-to-read research. 

As a result, all journals published by Pensoft – each using the publisher’s self-developed ARPHA Platform – provide extensive and transparent information about their costs and services in line with the Plan S principles.

The JCS was launched to aid libraries and library consortia – the ones negotiating and participating in Open Access agreements with publishers – by providing them with everything they need to know in order to determine whether the prices charged by a certain journal are fair and correspond to the quality of the service.

According to cOAlition S, an increasing number of libraries and library consortia from Europe, Africa, North America, and Australia have registered with the JCS over the past year since the launch of the portal in September 2021.

While access to the JCS is only open to librarians, individual researchers may also make use of the data provided by the participating publishers and their journals. 

This is possible through an integration with the Journal Checker Tool, where researchers can simply enter the name of the journal of interest, their funder and affiliation (if applicable) to check whether the scholarly outlet complies with the Open Access policy of the author’s funder. A full list of all academic titles that provide data to the JCS is also publicly available. Being on the list means that a journal and its publisher not only support cOAlition S, but also demonstrate that they stand for openness and transparency in scholarly publishing.

“We are delighted that Pensoft, along with a number of other publishers, have shared their price and service data through the Journal Comparison Service. Not only are such publishers demonstrating their commitment to open business models and cultures but are also helping to build understanding and trust within the research community.”

said Robert Kiley, Head of Strategy at cOAlition S. 

***

About cOAlition S:

On 4 September 2018, a group of national research funding organisations, with the support of the European Commission and the European Research Council (ERC), announced the launch of cOAlition S, an initiative to make full and immediate Open Access to research publications a reality. It is built around Plan S, which consists of one target and 10 principles. Read more on the cOAlition S website.

About Plan S:

Plan S is an initiative for Open Access publishing that was launched in September 2018. The plan is supported by cOAlition S, an international consortium of research funding and performing organisations. Plan S requires that, from 2021, scientific publications that result from research funded by public grants must be published in compliant Open Access journals or platforms. Read more on the cOAlition S website.

Digitising beans to feed the world

In 2018, NHM London’s digitisation team started a project to digitise non-type herbarium material from the legume family. A recent data paper in the Biodiversity Data Journal reports on the outcomes.

You can find the original blog post by the Natural History Museum of London, reposted here with minor edits.

Legumes are a group of plants that include soybeans, peas, chickpeas, peanuts and lentils. They are a significant source of protein, fibre, carbohydrates, and minerals in our diet and some, like the cowpea, are resistant to droughts.

In 2018, the Natural History Museum of London’s (NHM London) digitisation team started a project in collaboration with project leader Royal Botanic Gardens Kew and the Royal Botanic Garden Edinburgh.

The project’s outcomes were published in a data paper in the Biodiversity Data Journal. Within the project, the digitisation team aimed to collectively digitise non-type herbarium material from the legume family. This includes rosewood trees (Dalbergia), padauk trees (Pterocarpus) and the Phaseolinae subtribe that contains many of the beans cultivated for human and animal food.

This project was made possible through the Department for Environment Food & Rural Affairs (DEFRA)-allocated Official Development Assistance (ODA) funding, distributed by the UK government in its “global efforts to defeat poverty, tackle instability and create prosperity in developing countries”.

ODA-listed countries covered by the project, by region:
African: Guinea, Ethiopia, Sudan, Kenya, Uganda, Tanzania, Mozambique, Malawi and Madagascar
Asian: Bangladesh, Myanmar, Nepal, New Guinea and India
Southern and Central American: Guatemala, Honduras, El Salvador, Nicaragua, Bolivia, Argentina and Brazil

The legume groups Dalbergia, Pterocarpus and Phaseolinae were chosen for digitisation to support the development of dry beans as a sustainable and resilient crop, and to aid conservation and sustainable use of rosewood and padauk trees. Some of these beans, especially cowpea and pigeon pea, are sustainable and resilient crops, as they can be grown in poor-quality soils and are resistant to drought stress. This makes them particularly suitable for agricultural production where growing other crops would be difficult.

Digitally discoverable herbarium specimens can provide important information about the distribution of individual species, as well as highlighting which species occur naturally together.

While there have been collaborative efforts between herbaria in the past, these have tended to prioritise digitisation of type specimens: the example specimens for which a species is named.

Types are important to identification, but being individual specimens, they don’t offer insights into species distribution over time. By focusing on the non-types across the world and over the last 200 years, we have released a brand-new resource to the global scientific community.

Searching for beans

This collection was digitised by creating an inventory record for each specimen, attaching images of each herbarium sheet, and then transcribing more data and georeferencing the specimens, providing an accurate locality in space and time for their collection. 

We originally had four months and three members of staff to digitise over 11,000 specimens. The Covid-19 lockdown was ironically rather lucky for this project as it enabled us to have more time to transcribe and georeference all of the records. 

say the researchers behind the digitisation project.
Map showing breakdown of records by country.

“We were able to assign country-level data to 10,857 out of the total number of 11,222 records. We were also able to transcribe the collectors’ names from the majority of our specimen labels (10,879 out of 11,222). Only 770 out of the 2,226 individuals identified during this project collected their specimens in ODA listed countries. The highest contributors were: Richard Beddome (130 specimens), Charles Clarke (110), Hans Schlieben (98) and Nathaniel Wallich (79). The breakdown of records by ODA country can be seen in the chart below.”

Map showing breakdown of records by country and pie chart showing distribution by ODA listed countries.
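For readers who want to see the figures quoted above as proportions, here is a minimal sketch in Python. It uses only the counts reported by the team; the helper name `share` and the dictionary labels are illustrative assumptions, not part of the original data paper.

```python
# Counts quoted by the digitisation team; names below are illustrative only.
TOTAL_RECORDS = 11_222        # all digitised specimen records
TOTAL_COLLECTORS = 2_226      # individuals identified during the project


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


coverage = {
    "records with country-level data": share(10_857, TOTAL_RECORDS),
    "records with a transcribed collector name": share(10_879, TOTAL_RECORDS),
    "collectors active in ODA-listed countries": share(770, TOTAL_COLLECTORS),
}

for label, pct in coverage.items():
    print(f"{label}: {pct}%")
```

On these counts, roughly 97% of records carry country-level data and a collector name, while about a third of the identified collectors worked in ODA-listed countries.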

From our data, we can see the peak decade of collection was the 1930s, with almost half (4,583 specimens or 49.43%) collected between 1900 and 1950 (Fig. 10).

This peak can be attributed to three of our most prolific collectors: Arthur Kerr, John Gossweiler and Georges Le Testu, all of whom were most active in the 1930s. The oldest specimen (BM013713473) was collected by Mark Catesby (1683-1749) in the Bahamas in 1726.

they explain.

An interesting, but perhaps unsurprising, finding is that our collection is strongly male-dominated.

There are only two women (Caroline Whitefoord and Ynes Mexia) in the list of our top 50 plant collectors and they are not close to the most prolific collectors.

We identified more women in the rest of our records, but their contribution is on average less than 25 specimens per person in the dataset consisting of more than 10,000 specimens. In contrast, the top five male collectors contributed 10% of our collection. 

they continue.

Releasing Rosewoods

Both the Pterocarpus and Dalbergia genera include species that are used as expensive, good-quality timber, which makes them prone to illegal logging. Many species, such as Pterocarpus tinctorius, are also listed on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. By releasing this new resource of information on all these plants from three of the biggest herbaria in the world, we can share this data with the people who are taking care of biodiversity in these countries. The data can be used to identify hotspots where the trees grow naturally and to protect these areas. These data would also allow much closer attention to be paid to areas that could be targets for illegal logging activity.

Pterocarpus tinctorius is a species of padauk tree that is listed as endangered on the IUCN Red List.
Cowpea (Vigna unguiculata) is a food and animal feed crop grown in the semi-arid tropics.

The ODA-listed countries are economically impoverished and disproportionately prone to be disadvantaged with the changing climate whether from flood or drought or increase in temperature.

Using data to identify good, nutritious plant species that can be grown in such conditions can therefore benefit local communities, potentially reducing dependence on imports, aid and on less resilient crops. 

the team adds in conclusion.

***

This dataset is now openly available on the Museum’s Data Portal and a data paper about this work has been released in the Biodiversity Data Journal.

***

Stay in touch with the Digitisation team by following us on Instagram and Twitter

Don’t forget to also follow the Biodiversity Data Journal on Twitter and Facebook.

One Biodiversity Knowledge Hub to link them all: BiCIKL 2nd General Assembly

The FAIR Data Place – the key and final product of the partnership – is meant to provide scientists with all types of biodiversity data “at their fingertips”

The Horizon 2020-funded project BiCIKL has reached its halfway stage and the partners gathered in Plovdiv (Bulgaria) from the 22nd to the 25th of October for the Second General Assembly, organised by Pensoft.

The BiCIKL project will launch a new European community of key research infrastructures, researchers, citizen scientists and other stakeholders in the biodiversity and life sciences based on open science practices through access to data, tools and services.

BiCIKL’s goal is to create a centralised place to connect all key biodiversity data by interlinking 15 research infrastructures and their databases. The 3-year European Commission-supported initiative kicked off in 2021 and involves 14 key natural history institutions from 10 European countries.

BiCIKL is keeping pace as expected with 16 out of the 48 final deliverables already submitted, another 9 currently in progress/under review and due in a few days. Meanwhile, 21 out of the 48 milestones have been successfully achieved.

Prof. Lyubomir Penev (BiCIKL’s project coordinator and CEO and founder of Pensoft) opens the 2nd General Assembly of BiCIKL in Plovdiv, Bulgaria.

The hybrid format of the meeting enabled a wider range of participants, which resulted in robust discussions on the next steps of the project, such as the implementation of additional technical features of the FAIR Data Place (FAIR being an abbreviation for Findable, Accessible, Interoperable and Reusable).

This FAIR Data Place online platform – the key and final product of the partnership and the BiCIKL initiative – is meant to provide scientists with all types of biodiversity data “at their fingertips”.

This data includes biodiversity information, such as detailed images, DNA, physiology and past studies concerning a specific species and its ‘relatives’, to name a few. The issue is that all those types of biodiversity data have so far been scattered across various databases, which in turn have lacked meaningful and efficient interconnections.

Additionally, the FAIR Data Place, developed within the BiCIKL project, is to give researchers access to plenty of training modules to guide them through the different services.

Halfway through the duration of BiCIKL, the project is at a turning point, where crucial discussions between the partners are playing a central role in the refinement of the FAIR Data Place design. Most importantly, they are tasked with ensuring that their technologies work efficiently with each other, in order to seamlessly exchange, update and share the biodiversity data every one of them is collecting and taking care of.

By Year 3 of the BiCIKL project, the partners agree, when those infrastructures and databases become efficiently interconnected to each other, scientists studying the Earth’s biodiversity across the world will be in a much better position to build on existing research and improve the way and the pace at which nature is being explored and understood. At the end of the day, knowledge is the stepping stone for the preservation of biodiversity and humankind itself.


“Needless to say, it’s an honour and a pleasure to be the coordinator of such an amazing team spanning as many as 14 partnering natural history and biodiversity research institutions from across Europe, but also involving many global long-year collaborators and their infrastructures, such as Wikidata, GBIF, TDWG, Catalogue of Life to name a few,”

said BiCIKL’s project coordinator Prof. Lyubomir Penev, CEO and founder of Pensoft.

“I see our meeting in Plovdiv as a practical demonstration of our eagerness and commitment to tackle the long-standing and technically complex challenge of breaking down the silos in the biodiversity data domain. It is time to start building freeways between all biodiversity data, across (digital) space, time and data types. After the last three days that we spent together in inspirational and productive discussions, I am as confident as ever that we are close to providing scientists with much more straightforward routes to not only generate more biodiversity data, but also build on the already existing knowledge to form new hypotheses and information ready to use by decision- and policy-makers. One cannot stress enough how important the role of biodiversity data is in preserving life on Earth. These data are indeed the groundwork for all that we know about the natural world”  

Prof. Lyubomir Penev added.
Christos Arvanitidis (CEO of LifeWatch ERIC) at the 2nd General Assembly of the BiCIKL project.

Christos Arvanitidis, CEO of LifeWatch ERIC, added:

“The point is: do we want an integrated structure or do we prefer federated structures? What are the pros and cons of the two options? It’s essential to keep the community united and allied because we can’t afford any information loss and the stakeholders should feel at home with the Project and the Biodiversity Knowledge Hub.”


Joe Miller, Executive Secretary and Director at GBIF, commented:

“We are a brand new community, and we are in the middle of the growth process. We would like to already have answers, but it’s good to have this kind of robust discussion to build on a good basis. We must find the best solution to have linkages between infrastructures and be able to maintain them in the future because the Biodiversity Knowledge Hub is the location to gather the community around best practices, data and guidelines on how to use the BiCIKL services… In order to engage even more partners to fill the eventual gaps in our knowledge.”


Joana Pauperio (biodiversity curator at EMBL-EBI) at the 2nd General Assembly of the BiCIKL project.

“BiCIKL is leading data infrastructure communities through some exciting and important developments”  

said Dr Guy Cochrane, Team Leader for Data Coordination and Archiving and Head of the European Nucleotide Archive at EMBL’s European Bioinformatics Institute (EMBL-EBI).

“In an era of biodiversity change and loss, leveraging scientific data fully will allow the world to catalogue what we have now, to track and understand how things are changing and to build the tools that we will use to conserve or remediate. The challenge is that the data come from many streams – molecular biology, taxonomy, natural history collections, biodiversity observation – that need to be connected and intersected to allow scientists and others to ask real questions about the data. In its first year, BiCIKL has made some key advances to rise to this challenge,”

he added.

Deborah Paul, Chair of the Biodiversity Information Standards – TDWG, said:

“As a partner, we, at the Biodiversity Information Standards – TDWG, are very enthusiastic that our standards are implemented in BiCIKL and serve to link biodiversity data. We know that joining forces and working together is crucial to building efficient infrastructures and sharing knowledge.”


The project will continue with the first Round Table of experts in December and the publication of the projects that participated in the Open Call and will be funded at the beginning of next year.

***

Learn more about BiCIKL on the project’s website at: bicikl-project.eu

Follow BiCIKL Project on Twitter and Facebook. Join the conversation on Twitter at #BiCIKL_H2020.


#TDWG2022 recap: TDWG and Pensoft welcomed 400 biodiversity information experts from 41 countries in Sofia

For the 37th time, experts from across the world came together to share and discuss the latest developments surrounding biodiversity data and how they are being gathered, used, shared and integrated across time, space and disciplines.

Between 17th and 21st October, about 400 scientists and experts took part in a hybrid meeting dedicated to the development, use and maintenance of biodiversity data, technologies, and standards across the world.

This year, the conference was hosted by Pensoft in collaboration with the National Museum of Natural History (Bulgaria) and the Institute of Biodiversity and Ecosystem Research at the Bulgarian Academy of Sciences. It ran under the theme “Stronger Together: Standards for linking biodiversity data”.

For the 37th time, the global scientific and educational association Biodiversity Information Standards (TDWG) brought together experts from all over the globe to share and discuss the latest developments surrounding biodiversity data and how they are being gathered, used, shared and integrated across time, space and disciplines.

This was the first time the event happened in a hybrid format. It was attended by 160 people on-site, while another 235 people joined online. 

The TDWG 2022 conference saw plenty of networking and engaging discussions with as many as 160 on-site attendees and another 235 people, who joined the event remotely.

The conference abstracts, submitted by the event’s speakers ahead of the meeting, provide a sneak peek into their presentations and are all publicly available in the TDWG journal Biodiversity Information Science and Standards (BISS).

“It’s wonderful to be in the Balkans and Bulgaria for our Biodiversity Information and Standards (TDWG) 2022 conference! Everyone’s been so welcoming and thoughtfully engaged in conversations about biodiversity information and how we can all collaborate, contribute and benefit,”

said Deborah Paul, Chair of TDWG, a biodiversity informatics specialist and community liaison at the University of Illinois, Prairie Research Institute’s Illinois Natural History Survey and also an active participant in the Society for the Preservation of Natural History Collections (SPNHC), the Entomological Collections Network (ECN), ICEDIG, the Research Data Alliance (RDA), and The Carpentries.

“Our TDWG mission is to create, maintain and promote the use of open, community-driven standards to enable sharing and use of biodiversity data for all,”

she added.
Prof Lyubomir Penev (Pensoft) and Deborah Paul (TDWG) at TDWG 2022.

“We are proud to have been selected to be the hosts of this year’s TDWG annual conference and are definitely happy to have joined and observed so many active experts network and share their know-how and future plans with each other, so that they can collaborate and make further progress in the way scientists and informaticians work with biodiversity information,”  

said Pensoft’s founder and CEO Prof. Lyubomir Penev.

“As a publisher of multiple globally renowned scientific journals and books in the field of biodiversity and ecology, at Pensoft we assume it to be our responsibility to be amongst the first to implement those standards and good practices, and serve as an example in the scholarly publishing world. Let me remind you that it is the scientific publications that present the most reliable knowledge the world and science has, due to the scrutiny and rigour in the review process they undergo before seeing the light of day,”

he added.

***

In a nutshell, the main task and dedication of the TDWG association is to develop and maintain standards and data-sharing protocols that support the infrastructures (e.g., The Global Biodiversity Information Facility – GBIF), which aggregate and facilitate use of these data, in order to inform and expand humanity’s knowledge about life on Earth.

It is the goal of everyone at TDWG to let scientists interested in the world’s biodiversity do their work efficiently and in a manner that can be understood, shared and reused.

It is the goal of everyone volunteering their time and expertise to TDWG to enable the scientists interested in the world’s biodiversity to do their work efficiently and in a manner that can be understood, shared and reused by others. After all, biodiversity data underlie everything we know about the natural world.

If there are optimised and universal standards in the way researchers store and disseminate biodiversity data, all those biodiversity scientists will be able to find, access and use the knowledge in their own work much more easily. As a result, they will be much better positioned to contribute new knowledge that will later be used in nature and ecosystem conservation by key decision-makers.

On Monday, the event opened with welcoming speeches by Deborah Paul and Prof. Lyubomir Penev in their roles of the Chair of TDWG and the main host of this year’s conference, respectively.

The opening ceremony continued with a keynote speech by Prof. Pavel Stoev, Director of the Natural History Museum of Sofia and co-host of TDWG 2022. 

Prof. Pavel Stoev (Natural History Museum of Sofia) with a presentation about the known and unknown biodiversity of Bulgaria during the opening plenary session of TDWG 2022.

He walked the participants through the fascinating biodiversity of Bulgaria, but also the worrying trends in the country associated with declining taxonomic expertise. 

He ended his talk on a hopeful note, sharing news of the recently established national unit of DiSSCo, whose aim – even if a tad too optimistic – is to digitise one million natural history items in four years, 250,000 of them with photographs. So far, one year into the project, the Bulgarian team has managed to digitise more than 32,000 specimens and provide images for 10,000 specimens.

The plenary session concluded with a keynote presentation by renowned ichthyologist and biodiversity data manager Dr. Richard L. Pyle, who is also a manager of ZooBank – the key international database for newly described species.

Keynote presentation by Dr Richard L. Pyle (Bishop Museum, USA) at the opening plenary session of TDWG 2022.

In his talk, he highlighted the gaps in the ways taxonomy is being used, thereby impeding biodiversity research and cutting off a lot of opportunities for timely scientific progress.

“There are simple things we can do to change how we use taxonomy as a tool that would dramatically improve our ability to conduct science and understand biodiversity. There is enormous value and utility within existing databases around the world to understand biodiversity, how threatened it is, what impacts human activity has (especially climate change), and how to optimise the protection and preservation of biodiversity,”

he said in a joint interview for the Bulgarian News Agency and Pensoft.

“But we do not have easy access to much of this information because the different databases are not well integrated. Taxonomy offers us the best opportunity to connect this information together, to answer important questions about biodiversity that we have never been able to answer before. The reason meetings like this are so important is that they bring people together to discuss ways of using modern informatics to greatly increase the power of the data we already have, and prioritise how we fill the gaps in data that exist. Taxonomy, and especially taxonomic data integration, is a very important part of the solution.”

Pyle also commented on the work in progress at ZooBank ten years into the platform’s existence and its role in the next (fifth) edition of the International Code of Zoological Nomenclature, which is currently being developed by the International Commission on Zoological Nomenclature (ICZN).

“We already know that ZooBank will play a more important role in the next edition of the Code than it has for these past ten years, so this is exactly the right time to be planning new services for ZooBank. Improvements at ZooBank will include things like better user-interfaces on the web to make it easier and faster to use ZooBank, better data services to make it easier for publishers to add content to ZooBank as part of their publication workflow, additional information about nomenclature and taxonomy that will both support the next edition of the Code, and also help taxonomists get their jobs done more efficiently and effectively. Conferences like the TDWG one are critical for helping to define what the next version of ZooBank will look like, and what it will do.”

***

During the week, the conference participants had the opportunity to enjoy a total of 140 presentations, as well as multiple social activities, including a field trip to Rila Monastery and a traditional Bulgarian dinner.

TDWG 2022 conference participants document their species observations on their way to Rila Monastery.

While going about the conference venue and field trip localities, the attendees were also actively uploading their species observations made during their stay in Bulgaria on iNaturalist in a TDWG2022-dedicated BioBlitz. The challenge concluded with a total of 635 observations and 228 successfully identified species.

Amongst the social activities going on during TDWG 2022 was a BioBlitz, where the conference participants could upload their observations made in Bulgaria to iNaturalist and help each other successfully identify the specimens.

***

In his interview for the Bulgarian News Agency and Pensoft, Dr Vincent Smith, Head of the Informatics Division at the Natural History Museum, London (United Kingdom), co-founder of DiSSCo, the Distributed System of Scientific Collections, and the Editor-in-Chief of Biodiversity Data Journal, commented: 

“Biodiversity provides the support systems for all life on Earth. Yet the natural world is in peril, and we face biodiversity and climate emergencies. The consequences of these include accelerating extinction, increased risk from zoonotic disease, degradation of natural capital, loss of sustainable livelihoods in many of the poorest yet most biodiverse countries of the world, challenges with food security, water scarcity and natural disasters, and the associated challenges of mass migration and social conflicts.

Solutions to these problems can be found in the data associated with natural science collections. DiSSCo is a partnership of the institutions that digitise their collections to harness their potential. By bringing them together in a distributed, interoperable research infrastructure, we are making them physically and digitally open, accessible, and usable for all forms of research and innovation. 

At present rates, digitising all of the UK collection – which holds more than 130 million specimens collected from across the globe and is being taken care of by over 90 institutions – is likely to take many decades, but new technologies like machine learning and computer vision are dramatically reducing the time it will take, and we are presently exploring how robotics can be applied to accelerate our work.”

Dr Vincent Smith, Head of the Informatics Division at the Natural History Museum, London, co-founder of DiSSCo, and Editor-in-Chief of Biodiversity Data Journal at the TDWG 2022 conference.

In his turn, Dr Donat Agosti, CEO and Managing director at Plazi – a not-for-profit organisation supporting and promoting the development of persistent and openly accessible digital taxonomic literature – said:

“All the data about biodiversity is in our libraries, that include over 500 million pages, and everyday new publications are being added. No person can read all this, but machines allow us to mine this huge, very rich source of data. We do not know how many species we know, because we cannot analyse with all the scientists in this library, nor can we follow new publications. Thus, we do not have the best possible information to explore and protect our biological environment.”

Dr Donat Agosti demonstrating the importance of publishing biodiversity data in a structured and semantically enhanced format in one of his presentations at TDWG 2022.

***

At the closing plenary session, Gail Kampmeier – TDWG Executive member and one of the first zoologists to join TDWG in 1996 – joined via Zoom to walk the conference attendees through the 37-year history of the association, originally named the Taxonomic Databases Working Group, but later transformed to Biodiversity Information Standards, as it expanded its activities to the whole range of biodiversity data. 

“While this presentation is about TDWG’s history as an organisation, its focus will be on the heart of TDWG: its people. We would like to show how the organisation has evolved in terms of gender balance, inclusivity actions, and our engagement to promote and enhance diversity at all levels. But more importantly, where do we—as a community—want to go in the future?”,

reads the conference abstract she co-authored with her TDWG colleague Dr Visotheary Ung (CNRS-MNHN).

Then, in the final talk of the session, Deborah Paul took to the stage to present the association’s progress and key achievements in 2022.

She gave a special shout-out to the TDWG journal: Biodiversity Information Science and Standards (BISS), where for the 6th consecutive year, the participants of the annual conference submitted and published their conference abstracts ahead of the event. 

Deborah Paul reminded the audience that – apart from the conference abstracts – the TDWG journal: Biodiversity Information Science and Standards (BISS) also welcomes full-length articles that demonstrate the development or application of new methods and approaches in biodiversity informatics.

Launched in 2017 on Pensoft’s publishing platform ARPHA, the journal provides the unique opportunity to have both abstracts and full-length research papers published in a modern, technologically advanced scholarly journal. In her speech, Deborah Paul reminded the audience that the BISS journal welcomes research articles that demonstrate the development or application of new methods and approaches in biodiversity informatics in the form of case studies.

Amongst the achievements of TDWG and its community, a special place was reserved for the Horizon 2020-funded BiCIKL project (abbreviation for Biodiversity Community Integrated Knowledge Library), involving many of the association’s members. 

Having started in 2021, the 3-year project, coordinated by Pensoft, brings together 14 partnering institutions from 10 countries under the common goal to create a centralised place to connect all key biodiversity data by interlinking a total of 15 research infrastructures and their databases.

Deborah Paul also reported on the progress of the Horizon 2020-funded project BiCIKL, which involves many of the TDWG members. BiCIKL’s goal is to create a centralised place to connect all key biodiversity data by interlinking 15 key research infrastructures and their databases.

In fact, following the week-long TDWG 2022 conference in Sofia, a good many of the participants set off straight for another Bulgarian city and another event hosted by Pensoft. The Second General Assembly of BiCIKL took place between 22nd and 24th October in Plovdiv.

***

You can also explore highlights and live tweets from TDWG 2022 on Twitter via #TDWG2022.
The Pensoft team at TDWG 2022 were happy to become the hosts of the 37th TDWG conference.

‘Who is in your database and why does it matter?’

The uncertainty about a person’s identity hampers research, hinders the discovery of expertise, and obstructs the ability to give attribution or credit for work performed. 

Collection discovery through disambiguation

Guest blog post by Sabine von Mering, Heather Rogers, Siobhan Leachman, David P. Shorthouse, Deborah Paul & Quentin Groom

Worldwide, natural history institutions house billions of physical objects in their collections, they create and maintain data about these items, and they share their data with aggregators such as the Global Biodiversity Information Facility (GBIF), the Integrated Digitized Biocollections (iDigBio), the Atlas of Living Australia (ALA), Genbank and the European Nucleotide Archive (ENA). 

Even though these data often include the names of the people who collected or identified each object, such statements may be ambiguous, as the names frequently lack any globally unique, machine-readable concept of their shared identity.

Despite the data being available online, barriers exist to effectively use the information about who collects or provides the expertise to identify the collection objects. People have similar names, change their name over the course of their lifetime (e.g. through marriage), or there may be variability introduced through the label transcription process itself (e.g. local look-up lists). 

As a result, researchers and collections staff often spend a lot of time deducing who is the person or people behind unknown collector strings while collating or tidying natural history data. The uncertainty about a person’s identity hampers research, hinders the discovery of expertise, and obstructs the ability to give attribution or credit for work performed. 

Disambiguation activities (the act of churning strings into verifiable things using all available evidence) need not be done in isolation. In addition to presenting a workflow on how to disambiguate people in collections, we also make the case that working in collaboration with colleagues and the general public presents new opportunities and introduces new efficiencies. There is tacit knowledge everywhere.

More often than not, data about people involved in biodiversity research are scattered across different digital platforms. However, by linking information sources to each other using person identifiers, we can better trace the connections in these networks, so that we can weave a more interoperable narrative about every actor.

That said, inconsistent naming conventions or lack of adequate accreditation often frustrate the realization of this vision. This sliver of natural history could be churned to gold with modest improvements in long-term funding for human resources, adjustments to digital infrastructure, space for the physical objects themselves alongside their associated documents, and sufficient training on how to disambiguate people’s names.

“He aha te mea nui o te ao. He tāngata, he tāngata, he tāngata.”

“What is the most important thing in the world? It is people, it is people, it is people.”

(Māori proverb)

The process of properly disambiguating those who have contributed to natural history collections takes time. 

The disambiguation process involves the extra challenge of trying to deduce “who is who” for legacy data, compared to undertaking this activity for people alive today. Retrospective disambiguation can require considerable detective work, especially for scarcely known people or if the community has a different naming convention. Provided the results of this effort are well-communicated and openly shared, mercifully, it need only be done once.

At the core of our research is the question of how to solve the issue of assigning proper credit

In our recent Methods paper, we discuss several methods for this, as well as available routes for publishing records online that include not only the names of people expressed as text, but also their unique, resolvable identifiers.

Disambiguation is a cycle. Enrichment of the data feeds off itself leading to further disambiguation. As more names are disambiguated and more biographical data are accumulated, it becomes easier to disambiguate more names. 

First and foremost, we should maintain our own public biographical data by making full use of ORCID. In addition to preserving our own scientific legacy and that of the institutions that employ us, we have a responsibility to avoid generating unnecessary disambiguation work for others. 

For legacy data, where the people connected to the collections are deceased, Wikidata can be used to openly document rich bibliographic and demographic data, each statement with one or more verifiable references. Wikidata can also act as a bridge to link other sources of authority such as VIAF or ORCID identifiers. It has many tools and services to bulk import, export, and to query information, making it well-suited as a universal democratiser of information about people often walled-off in collection management systems (CMS). 
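As a minimal sketch of the kind of query this makes possible, the Python snippet below builds a SPARQL query for the public Wikidata Query Service that lists people who have both an ORCID iD (Wikidata property P496) and a VIAF ID (property P214). The helper names and the result limit are illustrative assumptions, not something taken from the Methods paper, and the snippet only constructs the request URL rather than sending it.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_person_identifier_query(limit: int = 10) -> str:
    """Return a SPARQL query for humans that have both an ORCID iD and a VIAF ID.

    Hypothetical helper, for illustration only.
    """
    return (
        "SELECT ?person ?personLabel ?orcid ?viaf WHERE {\n"
        "  ?person wdt:P31 wd:Q5 ;      # instance of: human\n"
        "          wdt:P496 ?orcid ;    # ORCID iD\n"
        "          wdt:P214 ?viaf .     # VIAF ID\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET request URL that asks for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_person_identifier_query(limit=5)
url = build_request_url(query)
print(query)
```

Restricting such a query to people linked to a particular collection would need additional statements, which depend on how the institution has modelled its data on Wikidata.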

A network of the top twenty most used identifiers for biologists on Wikidata.

Once unique identifiers for people are integrated in collection management systems, these may be shared with the global collections and research community using the new Darwin Core terms, recordedByID or identifiedByID along with the well-known, yet text-based terms, recordedBy or identifiedBy. 
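To illustrate how the text-based and identifier-based terms sit side by side, here is a small Python sketch that fills recordedBy and recordedByID on an occurrence record. The record, the names and the helper function are hypothetical. The ORCID iD is ORCID's documented sample identifier and the Wikidata URI is only a placeholder. The " | " separator follows the Darwin Core recommendation for lists of values.

```python
# Darwin Core recommends "space, vertical bar, space" between list values.
LIST_SEPARATOR = " | "


def with_people(record: dict, term: str, people: list) -> dict:
    """Return a copy of `record` with `term` and `term + "ID"` filled in.

    `people` is a list of (name, identifier_uri) pairs, kept in the same
    order in both fields so that names and identifiers line up.
    Hypothetical helper, for illustration only.
    """
    enriched = dict(record)
    enriched[term] = LIST_SEPARATOR.join(name for name, _ in people)
    enriched[term + "ID"] = LIST_SEPARATOR.join(uri for _, uri in people)
    return enriched


# Made-up occurrence record and collectors (placeholders, not real specimen data).
occurrence = {"occurrenceID": "urn:example:specimen:1"}
collectors = [
    ("J. Carberry", "https://orcid.org/0000-0002-1825-0097"),   # ORCID sample iD
    ("A. Placeholder", "http://www.wikidata.org/entity/Q42"),   # placeholder URI
]

enriched = with_people(occurrence, "recordedBy", collectors)
print(enriched["recordedBy"])    # J. Carberry | A. Placeholder
print(enriched["recordedByID"])
```

The same helper could be called with "identifiedBy" to produce identifiedBy and identifiedByID. Keeping the original text string alongside the identifier preserves what was written on the label.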

Approximately 120 datasets published through GBIF now make use of these identifier-based terms, which are additionally resolved in Bionomia every few weeks alongside co-curated attributions newly made there. This roundtrip of data – emerging as ambiguous strings of text from the source, affixed with resolvable identifiers elsewhere, absorbed into the source as new digital annotations, and then re-emerging with these fresh, identifier-based enhancements – is an exciting approach to co-manage collections data.

Round tripping. In Bionomia, people identifiers from Wikidata and ORCID are used to enrich data published via GBIF, thus linking natural history specimens to the world’s collectors.

Disambiguation work is particularly important in recognising contributors who have been historically marginalized. For example, gender bias in specimen data can be seen in the case of Wilmatte Porter Cockerell, a prolific collector of botanical, entomological and fossil specimens. Cockerell’s collections are often attributed to her husband as he was also a prolific collector and the two frequently collected together. 

On some labels, her identity is further obscured as she is simply recorded as “& wife” (see example on GBIF). Since Wilmatte Cockerell was her husband’s second wife, it can take some effort to confirm if a specimen can be attributed to her and not her husband’s first wife, who was also involved in collecting specimens. By ensuring that Cockerell is disambiguated and her contributions are appropriately attributed, the impact of her work becomes more visible, enabling it to be properly and fairly credited.

Thus, disambiguation work helps to not only give credit where credit is due, thereby making data about people and their biodiversity collections more findable, but it also creates an inclusive and representative narrative of the landscape of people involved with scientific knowledge creation, identification, and preservation. 

A future – once thought to be a dream – where the complete scientific output of a person is connected as Linked Open Data (LOD) is now

Both the tools and infrastructure are at our disposal and the demand is palpable. All institutions can contribute to this movement by sharing data that include unique identifiers for the people in their collections. We recommend that institutions develop a strategy, perhaps starting with employees and curatorial staff, people of local significance, or those who have been marginalized, and to additionally capitalize on existing disambiguation activities elsewhere. This will have local utility and will make a significant, long-term impact. 

The more we participate in these activities, the greater the chance that we will uncover positive feedback loops, which will act to lighten the workload for all involved, including our future selves!

The disambiguation of people in collections is an ongoing process, but it becomes easier with practice. We also encourage collections staff to consider modifying their existing workflows and policies to include identifiers for people at the outset, when new data are generated or when new specimens are acquired. 

There is more work required at the global level to define, update, and ratify standards and best practices to help accelerate data exchange or roundtrips of this information; there is room for all contributions. Thankfully, there is a diverse, welcoming, energetic, and international community involved in these activities. 

We see a bright future for you, our collections, and our research products – well within reach – when the identities of people play a pivotal role in the construction of a knowledge graph of life.

Would you like to participate, and do you need support getting the disambiguation of your collection started? Please contact our TDWG People in Biodiversity Data Task Group.

A good start is also to check Bionomia to find out what metrics exist now for your institution or collection and affiliated people.

The next steps for collections: 7 objectives that can help to disambiguate your institution’s collection:

1. Promote the use of person identifiers in local, national or international outreach, publishing and research activities

2. Increase the number of collection management systems that use person identifiers

3. Increase the number of living collectors registered and using an ORCID identifier when contributing to collections

4. Undertake disambiguation in the national languages of many countries

5. Increase the number of identified people on Wikidata linked to collections

6. Increase the number of people in collections with expertise in person disambiguation

7. Collaborate towards an exchange standard for attribution data

A real example of how a name string is disambiguated and the steps taken in documenting it. Wikidata item of Jean-André Soulié

***

Methods publication:

Groom Q, Bräuchler C, Cubey RWN, Dillen M, Huybrechts P, Kearney N, Klazenga N, Leachman S, Paul DL, Rogers H, Santos J, Shorthouse DP, Vaughan A, von Mering S, Haston EM (2022) The disambiguation of people names in biological collections. Biodiversity Data Journal 10: e86089. https://doi.org/10.3897/BDJ.10.e86089

***

Follow Biodiversity Data Journal on Twitter and Facebook.

High-schoolers join scholars to lift the lid on Hong Kong’s soil biodiversity

Most often, the students would find millipedes. They even helped identify two species that are new to Hong Kong’s fauna.

Soil and its macrofauna are an integral part of many ecosystems, playing an important role in decomposition and nutrient recycling. However, soil biodiversity remains understudied globally.

To help fill this gap and reveal the diversity of soil fauna in Hong Kong, a team of scientists from The Chinese University of Hong Kong initiated a citizen science project involving universities, non-governmental organisations and secondary school students and teachers.

“Involving citizens as part of the new knowledge generation process is important in promoting the understanding of biodiversity. Training younger-generation citizens to learn about biodiversity is of utmost importance and crucial to conservation engagement”

– say the researchers in their study, which was published in the open-access Biodiversity Data Journal.

The soil sampling methodology that the students employed in this study.
Video by Sheung Yee Lai, Ka Wai Ting, Tze Kiu Chong and Wai Lok So.

Working side by side with university academics, taxonomists and non-governmental organisation members, students from 21 schools/institutes were recruited to collect soil animals near their campuses for a year and record their observations.

Between October 2019 and October 2020, they monitored and sampled species across 21 sites of urban and semi-natural habitats in Hong Kong, collecting a total of 3,588 individual samples. Their efforts yielded 150 soil macrofaunal species, identified as arthropods (including insects, spiders, centipedes and millipedes), worms, and snails.

Most often, the students found millipedes (23 out of 150 species). They even helped identify two millipede species that are new to Hong Kong’s fauna: Monographis queenslandica and Alloproctoides remyi. The former is usually found in Australia – the researchers suggest it might have been introduced to the area many decades ago from Queensland or vice versa – and the latter has been observed in Reunion and Mauritius.

Two polyxenid millipede species collected in this study turned out to have never before been recorded from Hong Kong.
Monographis queenslandica (left) and Alloproctoides remyi (right).
Image by Sheung Yee Lai, Ka Wai Ting and Wai Lok So.

Millipedes like these two species can accelerate litter decomposition and regulate the soil carbon and phosphorus cycling, while earthworms can modify the soil structure and regulate water and organic matter cycling.

“Before the beginning of this project, the understanding of soil biodiversity in Hong Kong, including the understanding of its contained millipede species, was inadequate”

the researchers write in their paper.

Now, they believe that the identified macrofauna species and their 646 DNA barcodes have established a solid foundation for further research in soil biodiversity in the area.

Their project also serves an additional purpose. Unlike most conventional scientific studies, which are usually carried out by the government, non-governmental organisations or academics in universities alone, this study utilised a citizen science approach through creating a big community engaged with biodiversity. In doing so, it helped educate the public and raise awareness on the use of basic science techniques in understanding local biodiversity.

So, it may have inspired a new generation of future scientists: some students started millipede cultures in their own schools, and one school used the millipede breeding model to participate in a science and technology competition.

This study is proof that local institutes and high schools can unite with research teams at universities and perform scientific work, the study’s authors believe.

It “has raised public awareness and potentially opens up opportunities for the general public to engage in scientific research in the future.” 

The team hopes that their approach could inspire future biodiversity sampling and monitoring studies to engage more citizen scientists.

***

Research article:

So WL, Ting KW, Lai SY, Huang EYY, Ma Y, Chong TK, Yip HY, Lee HT, Cheung BCT, Chan MK, Consortium HKSB, Nong W, Law MMS, Lai DYF, Hui JHL (2022) Revealing the millipede and other soil-macrofaunal biodiversity in Hong Kong using a citizen science approach. Biodiversity Data Journal 10: e82518. https://doi.org/10.3897/BDJ.10.e82518

***

Follow Biodiversity Data Journal on Twitter and Facebook.

Pensoft’s ARPHA Publishing Platform integrates with OA Switchboard to streamline reporting to funders of open research

By the time authors open their inboxes to the message their work is online, a similar notification will have also reached their research funder.

Image credit: OA Switchboard.

By the time authors – who have acknowledged third-party financial support in their research papers submitted to a journal using the Pensoft-developed publishing platform: ARPHA – open their inboxes to the congratulatory message that their work has just been published and made available to the wide world, a similar notification will have also reached their research funder.

This automated workflow is already in effect at all journals (co-)published by Pensoft and those published under their own imprint on the ARPHA Platform, as a result of the new partnership with the OA Switchboard: a community-driven initiative with the mission to serve as a central information exchange hub between stakeholders about open access publications, while making things simpler for everyone involved.

All the submitting author needs to do to ensure that their research funder receives a notification about the publication is to select the supporting agency or the scientific project (e.g. a project supported by Horizon Europe) in the manuscript submission form, using a handy drop-down menu. In either case, the message will be sent to the funding body as soon as the paper is published in the respective journal.

“At Pensoft, we are delighted to announce our integration with the OA Switchboard, as this workflow is yet another excellent practice in scholarly publishing that supports transparency in research. Needless to say, funding and financing are cornerstones in scientific work and scholarship, so it is equally important to ensure funding bodies are provided with full, prompt and convenient reports about their own input.”

comments Prof Lyubomir Penev, CEO and founder of Pensoft and ARPHA.

 

“Research funders are one of the three key stakeholder groups in OA Switchboard and are represented in our founding partners. They seek support in demonstrating the extent and impact of their research funding and delivering on their commitment to OA. It is great to see Pensoft has started their integration with OA Switchboard with a focus on this specific group, fulfilling an important need,”

adds Yvonne Campfens, Executive Director of the OA Switchboard.

***

About the OA Switchboard:

A global not-for-profit and independent intermediary established in 2020, the OA Switchboard provides a central hub for research funders, institutions and publishers to exchange OA-related publication-level information. Connecting parties and systems, and streamlining communication and the neutral exchange of metadata, the OA Switchboard provides direct, indirect and community benefits: simplicity and transparency, collaboration and interoperability, and efficiency and cost-effectiveness.

About Pensoft:

Pensoft is an independent academic publishing company, well known worldwide for its novel cutting-edge publishing tools, workflows and methods for text and data publishing of journals, books and conference materials.

All journals (co-)published by Pensoft are hosted on Pensoft’s full-featured ARPHA Publishing Platform and published in a way that ensures their content is as FAIR as possible, meaning that it is effortlessly readable, discoverable, harvestable, citable and reusable by both humans and machines.

***

Follow Pensoft on Twitter, Facebook and Linkedin.
Follow OA Switchboard on Twitter and Linkedin.

Lizards go north: Balkan wall lizard population found all the way in the Czech Republic

The northernmost population of the Balkan lizard, recently discovered in the Czech Republic, has proven to be genetically unique and variable.

The Czech Republic is a zoologically well-studied area, and its reptile fauna is not very rich. Therefore, the recent discovery of a new reptile species for the country, the Balkan wall lizard (Podarcis tauricus), came as a big surprise. This lizard inhabits areas of the Central and Western Balkans as far as Crimea, with isolated areas of occurrence in Hungary and northern Romania, so how did it get as far north as the Czech Republic? Fortunately, the genetics in much of the lizard’s range are relatively well-studied. Finding out where lizards from the Czech Republic fit genetically could reveal the origins of this northernmost population.

Podarcis tauricus in the wild – Váté písky near Bzenec, Czech Republic.

An analysis published by Czech herpetologists in the journal Biodiversity Data Journal shows that the lizards from the Czech population are genetically variable; therefore, the population was not established by the introduction of a single gravid female.

Geographical distribution of Podarcis tauricus. The green arrow shows the northernmost known locality (Váté písky, Czech Republic).

The population also has genetic “markers” not yet found elsewhere, although it is clearly related to populations from the Central and Western Balkans and Hungary. These findings suggest that this could be an original, possibly relict population.

Haplotype network, designed from 24 haplotypes of the cytb locus from 167 individuals of Podarcis tauricus and Podarcis gaigeae (Psonis et al. 2017; this study). Colours correspond to the country of the specimen’s geographical origin and each circle corresponds to a haplotype. The circle size is proportional to the number of individuals with the same haplotype. The number of individuals per haplotype is indicated. Due to the unequal size of cytb sequences from Psonis et al. (2017), only a fragment of 257 bp, which was common to all 167 sequences, was used for the haplotype network reconstruction. For this region of the cytb locus, the sequences of our individuals from the Czech Republic are identical to those of 18 individuals from Albania, Hungary, Kosovo and Serbia.

However, we cannot rule out recent introductions or spontaneous northward dispersal of the lizard associated with global climate change. Exotic species of animals and plants appear in the Czech Republic through various routes and tracing their origin is not always easy. Both intentional and unintentional introductions have been recorded for some reptiles, while some previously southern vertebrate and invertebrate species spread to the north spontaneously.

The first genetic data on the origin of the northernmost population of the Balkan wall lizard suggest that the lizard can spread to the north naturally; however, further investigations are needed to support this tentative conclusion. 

Research article:

Rehák I, Fischer D, Kratochvíl L, Rovatsos M (2022) Origin and haplotype diversity of the northernmost population of Podarcis tauricus (Squamata, Lacertidae): Do lizards respond to climate change and go north? Biodiversity Data Journal 10: e82156. https://doi.org/10.3897/BDJ.10.e82156

Citizen scientists from three continents help discover a new, giant slug from Europe

The animal, as big as a medium-sized carrot, was discovered on a citizen-science expedition and jointly described by its participants.

You might think that Europe is so well studied that no large animals remain undiscovered. Yet today, a new species of giant keelback slug from Montenegro was announced in the open-access Biodiversity Data Journal. The animal, as big as a medium-sized carrot, was discovered on a citizen-science expedition and jointly described by its participants.

A living specimen of Limax pseudocinereoniger on a researcher’s hand.

The international team of citizen scientists from Italy, the Netherlands, Serbia, South Africa, and the United States found the slug in July 2019 while exploring the spectacular Tara Canyon, Europe’s deepest gorge, on inflatable rafts. The brownish-grey animals, with a sharp ridge along the back, and 20 cm in length when fully stretched, were hiding under rocky overhangs in the narrowest part of the ravine.

A living specimen of Limax pseudocinereoniger seen from the side. Photo by Pierre Escoubas

At first, the newly discovered slugs seemed superficially indistinguishable from the ash-black keelback slug (Limax cinereoniger), which also lives in the Tara Canyon. The team had to use a portable DNA lab to work out that there is a 10% difference between the two slugs in the so-called DNA barcode. Moreover, when they dissected a few of them, they found differences in the reproductive organs as well. This was enough to decide that a new species had been discovered, and they named it Limax pseudocinereoniger to indicate its similarity to L. cinereoniger.
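For readers curious what a percentage difference between two DNA barcodes means in practice, the Python sketch below computes an uncorrected pairwise distance: the share of aligned sites at which two sequences differ. This is an assumption about the general idea, not a description of how the team arrived at the 10% figure, and the two sequences are made-up toy fragments, not real Limax barcodes.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences.

    Sites where either sequence has a gap ('-') or an unknown base ('N')
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy 20-base fragments that differ at 2 of 20 sites (illustrative only).
barcode_a = "ATGCATGCATGCATGCATGC"
barcode_b = "ATGCATGAATGCATGCTTGC"

print(f"{p_distance(barcode_a, barcode_b):.0%}")  # 10%
```

Real barcode comparisons run over several hundred bases and often apply a substitution model, so this only shows the basic arithmetic.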

The field trip was run by Taxon Expeditions, which organises real scientific expeditions for the general public, with the aim to make scientific discoveries. Rick de Vries, a web editor and illustrator from Amsterdam who found the first specimen of L. pseudocinereoniger, says: “It’s an incredible thrill to hold an animal in your hands and to know that it is still unknown to science”.

Citizen scientists studying specimens in the team’s field lab in Montenegro.

Zoologist Iva Njunjić, one of the authors of the paper, thinks that more unknown species are likely to be found in Tara Canyon and the Durmitor National Park, of which it is part. “Using a combination of DNA analysis and anatomy will probably reveal more species that are identical on the outside but actually belong to different species,” she says.

In 2023, Taxon Expeditions plans to take a new team of citizen scientists to Montenegro with a mission to discover new species and document the hidden biodiversity.

Taxon Expeditions was founded by Iva Njunjić and Menno Schilthuizen of Naturalis Biodiversity Center and specialises in ‘taxonomy tourism’ trips in Brunei, Italy, Montenegro, Panama, and the Netherlands.

Original source:

Schilthuizen M, Thompson CG, de Vries R, van Peursen ADP, Paterno M, Maestri S, Marcolongo L, Esposti CD, Delledonne M, Njunjić I (2022) A new giant keelback slug of the genus Limax from the Balkans, described by citizen scientists. Biodiversity Data Journal 10: e69685. https://doi.org/10.3897/BDJ.10.e69685

Endemic frogs in Himalayan region exhibit site fidelity

The Murree Hills Frog and Hazara Torrent Frog show minimum movement out of their habitat, which makes them more unique from an ecology and conservation perspective

Amongst tetrapods, amphibians include the highest number of threatened and data-deficient species, which has put them in the limelight of research in animal ecology and conservation. Endemic species have evolved and adapted to a particular set of environmental conditions. Hence, they are more vulnerable to environmental changes and are susceptible to population declines because of their restricted distribution ranges.

Murree Hills Frog (Nanorana vicina). Photo by Herpetology Lab, Arid Agriculture University, Rawalpindi

The Murree Hills Frog and Hazara Torrent Frog are endemic to Pakistan and South Asian countries. They are associated with the torrential streams and nearby clear water pools situated at high elevation. These frogs are susceptible to threats like habitat degradation, urbanization, and climate change. A recent study published in the open-access journal Biodiversity Data Journal reports that these endemic frogs do not show much movement within and outside their habitat.

Hazara Torrent Frog (Allopaa hazarensis). Photo by Herpetology Lab, Arid Agriculture University, Rawalpindi

“We have, for the first time, used radio-transmitters (VHF) on frogs endemic to Himalayan region to understand their ecology,” explains Dr. Muhammad Rais, Assistant Professor at the Herpetology Lab in the Arid Agriculture University, Rawalpindi, and lead author of the study. “Surprisingly, the Murree Hills Frog and Hazara Torrent Frog depend heavily for their survival on particular stream(s).”

“We suggest carrying out additional long term studies by incorporating multiple adjacent stream systems to better understand dispersal and colonization in these frogs,” he says in conclusion.

Research article:

Akram A, Rais M, Saeed M, Ahmed W, Gill S, Haider J (2022) Movement Paradigm for Hazara Torrent Frog Allopaa hazarensis and Murree Hills Frog Nanorana vicina (Anura: Dicroglossidae). Biodiversity Data Journal 10: e84365. https://doi.org/10.3897/BDJ.10.e84365