Publications

Abstract

Robust validation of both research data and its accompanying metadata is essential for ensuring adherence to FAIR principles. Current approaches often handle these aspects separately, hindering a holistic quality assessment. Building upon previous BioHackathon work establishing ARCs (Annotated Research Context) as RO-Crates (ARC RO-Crate), we aim to develop and demonstrate an integrated validation strategy for FAIR digital objects. The strategy distinguishes between validating the metadata descriptor and validating the payload data files. For the metadata descriptor, validation will ensure structural and semantic compliance with the base RO-Crate specification and the ARC-ISA family of RO-Crate profiles, using and extending the RO-Crate validator tool. For the payload data files, validation targets the actual content: data files often require domain-specific structural and value constraints, which in turn require explicit schema definitions. For this, we will integrate Frictionless to check data content against community standards (e.g. MIAPPE, as demonstrated in the HORIZON project AGENT). Crucially, this project will also explore mechanisms for specifying requirements for expected data structures within the ARC RO-Crate itself. This aims to provide a more self-contained description of data, investigating how such internal requirements can be linked to data validation frameworks, complementing the crate’s metadata validation. The overall goal is to provide a powerful, holistic validation mechanism for ARC RO-Crates, enhancing their reliability, trustworthiness, and FAIRness. A MIAPPE-compliant plant phenomics dataset will serve as a use case. This integrated validation approach aims to streamline quality control for researchers and will be packaged as a deployable microservice, offering broad applicability across diverse research workflows.
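The two validation levels described in this abstract can be illustrated with a minimal sketch. It uses only the Python standard library and is not the RO-Crate validator tool or the Frictionless API. The descriptor check relies on the RO-Crate 1.1 convention that the entity `ro-crate-metadata.json` points to the root dataset via `about`. The `SCHEMA` format, the column names, and the function names are assumptions made for illustration.

```python
import csv
import io


def check_descriptor(crate):
    """Return structural problems found in an RO-Crate metadata document (a dict)."""
    entities = {e["@id"]: e for e in crate.get("@graph", []) if "@id" in e}
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        return ["missing metadata descriptor entity"]
    about = descriptor.get("about")
    root_id = about.get("@id") if isinstance(about, dict) else None
    if root_id is None or root_id not in entities:
        return ["descriptor does not reference a root data entity"]
    return []


def check_payload(csv_text, schema):
    """Check CSV rows against per-column constraints (hypothetical schema format)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [column for column in schema if column not in header]
    if missing:
        return [f"missing column: {column}" for column in missing]
    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        for column, rule in schema.items():
            value = row[column] or ""
            if value == "":
                if rule.get("required"):
                    problems.append(f"line {line_no}: {column} is empty")
            elif "type" in rule:
                try:
                    rule["type"](value)  # e.g. float("12.5")
                except ValueError:
                    name = rule["type"].__name__
                    problems.append(
                        f"line {line_no}: {column}={value!r} is not a valid {name}"
                    )
    return problems


# Hypothetical constraints for a small plant phenotyping table.
SCHEMA = {
    "plant_id": {"required": True},
    "height_cm": {"required": True, "type": float},
}

crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset"},
    ],
}
data = "plant_id,height_cm\nP1,12.5\nP2,tall\n,7.0\n"

print(check_descriptor(crate))
print(check_payload(data, SCHEMA))
```

Keeping the two checks as separate functions mirrors the distinction the abstract draws between the metadata descriptor and the payload; a real service would delegate each to the dedicated tool.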

Authors: Eli Chadwick, Matthijs Brouwer, Kevin Schneider, Daniel Arend, Finn Bacall, Etienne Bardet, Sebastian Beier, Dominik Brilhaus, Xiaoming Hu, Emma Le Roy Pardonche, Timo Mühlhaus, Stuart Owen, Cyril Pommier, Heinrich Lukas Weil

Date Published: 16th Dec 2025

Publication Type: Report

Abstract

Modern research projects increasingly require hybrid metadata approaches that balance adherence to both cross-domain and domain-specific community standards with flexibility for project- or resource-specific metadata. The FAIRDOM-SEEK platform [1] is a widely used research data management system designed to support diverse domains, from systems biology to health research, by integrating standardized metadata models (e.g., the ISA framework [2]) with customizable extensions. To address this need, we introduce the Extended Metadata feature in SEEK, which allows researchers to extend core metadata schemas with user-defined fields, hierarchies, and semantic annotations while ensuring interoperability with domain-specific standards. We demonstrate this capability through two use cases:
1. NFDI4Health Local Data Hubs (LDH) [3],[4]: In the context of the German National Research Data Infrastructure for Personal Health Data (NFDI4Health [5]), we have developed Local Data Hubs (LDH) based on the SEEK platform. These hubs support federated data structuring and sharing for sensitive health data from clinical trials, epidemiological studies, and public health research, and allow local platforms to be connected to the central metadata repository of NFDI4Health, the German Health Study Hub. Given the complexity of the NFDI4Health metadata schema (MDS) [6], the SEEK-based LDH software uses the Extended Metadata feature to fully represent the schema, allowing for flexible project-defined metadata extensions.
2. FAIR Data Station (FAIR-DS) [7]: Based on the ISA framework, with the addition of Observation units from MIAPPE [8], the FAIR-DS is a web application that enables users to create and manage metadata according to FAIR principles. Using packages and terms configured through the UI, it generates Excel spreadsheets, which are then populated to gather the metadata. FAIR-DS is then used to validate the metadata and generate RDF datasets representing the content. SEEK has been updated so that Extended Metadata and Sample Types can be configured automatically from these RDF datasets, and the content can be imported and updated in a single action.
The Extended Metadata feature allows additional metadata attributes, tailored to specific data types, to be defined, ensuring compliance with standards. When creating a resource, users can select an Extended Metadata type from a dropdown menu, dynamically triggering the rendering of the associated metadata input forms within the web interface. This enables seamless integration of resource-specific metadata (e.g., clinical trial study metadata) alongside core descriptive fields. Currently, only instance administrators can create, manage (enable/disable), and delete additional attributes for specific resource types (e.g., ISA items such as Investigation, Study, and Assay, as well as Projects and Models) based on specific schemas (e.g., the NFDI4Health MDS). Attribute types range from simple (e.g., string, text, date, integer, Boolean) to complex (e.g., controlled vocabularies linked to ontologies, nested hierarchical structures), with validation rules for mandatory or optional fields. Regular expressions can be specified to enforce correct input formatting. Metadata schemas can be created through backend seed files, JSON uploads, or FAIR-DS RDF imports. These schemas are programmatically accessible via the SEEK REST API, enabling automated metadata creation and retrieval. This ensures interoperability with external tools while adhering to FAIR data principles.
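The attribute rules mentioned in this abstract (mandatory versus optional fields, typed values, controlled vocabularies, regular expressions) can be sketched as follows. The `ATTRIBUTES` structure, its keys, and the field names are hypothetical and chosen for illustration; this is not SEEK's actual Extended Metadata JSON format or its validation code.

```python
import re
from datetime import date

# Hypothetical attribute definitions, loosely modelled on the description above.
ATTRIBUTES = [
    {"title": "study_id", "type": "string", "required": True,
     "pattern": r"[A-Z]{3}-\d{4}"},
    {"title": "start_date", "type": "date", "required": True},
    {"title": "participants", "type": "integer", "required": False},
    {"title": "study_design", "type": "controlled_vocabulary", "required": False,
     "terms": ["interventional", "observational"]},
]


def validate(metadata, attributes):
    """Return a list of error messages for one metadata record."""
    errors = []
    for attr in attributes:
        name = attr["title"]
        value = metadata.get(name)
        if value is None or value == "":
            if attr["required"]:
                errors.append(f"{name}: required")
            continue
        kind = attr["type"]
        if kind == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"{name}: not an integer")
        elif kind == "date":
            try:
                date.fromisoformat(value)
            except (TypeError, ValueError):
                errors.append(f"{name}: not an ISO date")
        elif kind == "controlled_vocabulary":
            if value not in attr["terms"]:
                errors.append(f"{name}: not in vocabulary")
        elif kind == "string" and "pattern" in attr:
            if not re.fullmatch(attr["pattern"], str(value)):
                errors.append(f"{name}: does not match pattern")
    return errors


print(validate({"study_id": "ABC-2024", "start_date": "2024-05-01"}, ATTRIBUTES))
print(validate({"study_id": "abc", "start_date": "01/05/2024",
                "study_design": "survey"}, ATTRIBUTES))
```

Declaring the rules as data rather than code is what makes it plausible to load them from seed files, JSON uploads, or RDF imports, as the abstract describes.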

Authors: Xiaoming Hu, Stuart Owen, Frank Meineke, Finn Bacall, Carole Goble, Wolfgang Müller, Martin Golebiewski

Date Published: 2025

Publication Type: Conference Paper

Abstract

The FAIRDOMHub is a repository for publishing FAIR (Findable, Accessible, Interoperable and Reusable) Data, Operating procedures and Models (https://fairdomhub.org/) for the Systems Biology community. It is a web-accessible repository for storing and sharing systems biology research assets. It enables researchers to organize, share and publish data, models and protocols, to interlink them in the context of the systems biology investigations that produced them, and to interrogate them via APIs. By using the FAIRDOMHub, researchers can achieve more effective exchange with geographically distributed collaborators during projects, ensure results are sustained and preserved, and generate reproducible publications that adhere to the FAIR guiding principles of data stewardship.

Authors: K. Wolstencroft, O. Krebs, J. L. Snoep, N. J. Stanford, F. Bacall, M. Golebiewski, R. Kuzyakiv, Q. Nguyen, S. Owen, S. Soiland-Reyes, J. Straszewski, D. D. van Niekerk, A. R. Williams, L. Malmstrom, B. Rinn, W. Muller, C. Goble

Date Published: 4th Jan 2017

Publication Type: Journal Article

Abstract

RightField is a Java application that provides a mechanism for embedding ontology annotation support for scientific data in Microsoft Excel or Open Office spreadsheets. The result is semantic annotation by stealth, with an annotation process that is less error-prone, more efficient, and more consistent with community standards. By automatically generating RDF statements for each cell, RightField provides a rich Linked Data querying environment that allows scientists to search their data and other Linked Data resources interchangeably, and caters for queries across heterogeneous spreadsheets. RightField was developed for Systems Biologists but has since been adopted more widely. It is open source (BSD license) and freely available from http://www.rightfield.org.uk
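To make the idea of per-cell RDF statements concrete, here is a minimal sketch that emits N-Triples lines for one annotated spreadsheet cell. It is not RightField's actual output or vocabulary: the `http://example.org/vocab#` namespace, the property names, the cell IRI scheme, and the function names are all assumptions for illustration, as is the ontology term IRI in the example.

```python
from urllib.parse import quote

# Hypothetical vocabulary namespace; RightField's real predicates may differ.
EX = "http://example.org/vocab#"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def cell_triples(doc_iri, sheet, cell, term_iri, value):
    """Return N-Triples lines linking one cell to an ontology term and its value."""
    # Percent-encode the sheet name so that spaces do not break the IRI.
    subject = f"<{doc_iri}#{quote(sheet)}!{cell}>"
    return [
        f"{subject} <{EX}annotatedWith> <{term_iri}> .",
        f'{subject} <{EX}cellValue> "{escape_literal(value)}" .',
    ]


for line in cell_triples(
    "http://example.org/data.xlsx",
    "Sheet 1",
    "B2",
    "http://purl.obolibrary.org/obo/CHEBI_17234",  # illustrative term IRI
    "glucose",
):
    print(line)
```

Because every cell becomes a subject with its own IRI, statements from different spreadsheets can be loaded into one store and queried together, which is the kind of cross-spreadsheet querying the abstract refers to.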

Authors: Katy Wolstencroft, Stuart Owen, Matthew Horridge, Wolfgang Mueller, Finn Bacall, Jacky Snoep, Franco du Preez, Quyen Nguyen, Olga Krebs, Carole Goble

Date Published: 2012

Publication Type: Journal Article

Copyright © 2008 - 2025 The University of Manchester and HITS gGmbH