This investigation serves as supplementary material for a SWAT4HCLS publication that describes minimum metadata and provenance requirements for reproducible enrichment analysis results.
Functional enrichment analysis is an essential downstream step in high-throughput omics studies, such as transcriptomics and proteomics. Using the Gene Ontology (GO) and its annotations (GOA), researchers can identify over-represented functional patterns, leading to better interpretation of the omics data and new biological insights. However, GO reflects the current understanding of gene product function and evolves with our changing biological knowledge. When performing such analyses, it is therefore crucial to record the GO version used, together with related parameters such as statistical cut-offs and annotation sources. A survey of the literature on functional enrichment results reveals that this provenance information is rarely available, which reduces the reproducibility and interpretability of results and prevents objective comparisons between related studies. In this work, we propose minimal metadata requirements for functional enrichment reproducibility. Our model complies with the FAIR principles and is based on the PROV Ontology (PROV-O).
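To make the idea of recording provenance alongside an enrichment run concrete, the following is a minimal sketch in Python. It is not the metadata model proposed in the publication: the class name `EnrichmentProvenance`, every field name, and all values shown are hypothetical placeholders chosen for illustration, covering only the items named above (GO version, annotation source, statistical cut-off).

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class EnrichmentProvenance:
    """Hypothetical provenance record; field names are illustrative only."""
    go_version: str                # identifier of the GO release used
    goa_version: str               # identifier of the annotation (GOA) release used
    annotation_sources: List[str]  # where the annotations were obtained
    p_value_cutoff: float          # significance threshold applied
    correction_method: str         # multiple-testing correction, if any
    tool: str                      # enrichment tool name
    tool_version: str              # enrichment tool version

    def missing_fields(self) -> List[str]:
        """Return the names of fields that were left empty."""
        return [
            name
            for name, value in asdict(self).items()
            if value in ("", [], None)
        ]


# All values below are placeholders, not real release identifiers.
record = EnrichmentProvenance(
    go_version="GO-RELEASE-PLACEHOLDER",
    goa_version="GOA-RELEASE-PLACEHOLDER",
    annotation_sources=["ANNOTATION-SOURCE-PLACEHOLDER"],
    p_value_cutoff=0.05,
    correction_method="CORRECTION-PLACEHOLDER",
    tool="TOOL-PLACEHOLDER",
    tool_version="",  # deliberately left empty to show the check below
)

# Serialise the record so it can be stored next to the enrichment results.
print(json.dumps(asdict(record), indent=2))
print("Missing:", record.missing_fields())
```

Storing such a record next to the result files, and checking it for empty fields before publication, is one simple way to keep the analysis parameters from being lost. Mapping the fields to PROV-O terms is left out of this sketch.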
Created: 28th Nov 2022 at 14:01
Last updated: 13th Jan 2023 at 09:21