Publications


Abstract: We have developed pISA-tree, a straightforward and flexible data management solution for organisation of life science project-associated research data and metadata. It enables on-the-fly creation of an enriched directory tree structure (project/Investigation/Study/Assay) via a series of sequential batch files, in a standardised manner based upon the ISA metadata framework. Metadata, according to the system-provided metadata templates, is generated in parallel at each level. The system supports reproducible research and is in accordance with the Open Science initiative and FAIR principles. Compared with similar frameworks, it does not require any systems administration and maintenance, as it can be run on a personal computer or network drive. It is complemented by two R packages, pisar and seekr, where the former facilitates integration of the pISA-tree datasets into bioinformatic pipelines and the latter enables synchronisation with the FAIRDOMHub public repository using the SEEK API. Source code and detailed documentation of pISA-tree and its supporting R packages are available from https://github.com/NIB-SI/pISA-tree. We demonstrate the usability of pISA-tree with two examples of medium-sized life science projects. It is also suitable for, and currently used to manage, larger projects including several partners from different countries. Since pISA-tree was initiated by end-user requirements with an emphasis on practicality, it will facilitate adoption of FAIR data management practices and open science principles.

Authors: Marko Petek, Maja Zagorščak, Andrej Blejec, Živa Ramšak, Anna Coll, Špela Baebler, Kristina Gruden

Date Published: 21st Nov 2021

Publication Type: Journal

Abstract: Background: Although the reference genome of the Solanum tuberosum group Phureja double-monoploid (DM) clone is available, the genetic diversity of the highly heterozygous tetraploid group Tuberosum, representing most cultivated varieties, remains largely unexplored. This lack of knowledge hinders further progress in potato research and its subsequent applications in breeding. Results: For the DM genome assembly, two only partially overlapping gene models exist, each differing in a unique set of genes and in intron/exon structure predictions. The first step was to merge the two and manually curate the merged gene model, creating a union of genes in the Phureja scaffold. We next compiled available RNA-Seq datasets (approx. 1.5 billion reads) for three tetraploid potato genotypes (cultivar Désirée, cultivar Rywal, and breeding clone PW363) with diverse breeding pedigrees. Short-read transcriptomes were assembled using the CLC, Trinity, Velvet, and rnaSPAdes de novo assemblers with different settings to test for optimal outcome. In addition, for cultivar Rywal, PacBio Iso-Seq full-length transcriptome sequencing was also performed. A revised EvidentialGene redundancy-reducing pipeline was employed to produce accurate and complete cultivar-specific transcriptomes from the assemblers' output, as well as to attain the pan-transcriptome. Because its dataset was the most diverse in terms of tissues (stem, seedlings and roots) and experimental conditions, the cv. Désirée transcriptome was the most complete (95.8% BUSCO completeness). For cv. Rywal and breeding clone PW363, data were available for leaf samples only, and the resulting transcriptomes were less complete than that of cv. Désirée (89.8% and 89.3% BUSCO completeness, respectively). Cross-comparison of these cultivar-specific transcriptomes and the merged DM gene model suggests that the core potato transcriptome comprises 16,339 genes. The pan-transcriptome contains a total of 95,779 transcripts, of which 54,614 transcripts are not present in the Phureja genome. These represent the variants of the novel genes found in the potato pan-genome. Conclusions: Our analysis shows that the available gene model of double-monoploid potato from group Phureja is, to some degree, not complete. The generated transcriptomes and pan-transcriptome represent a valuable resource for potato gene variability exploration, high-throughput -omics analyses, and future breeding programmes.

Authors: Marko Petek, Maja Zagorščak, Živa Ramšak, Sheri Sanders, Elizabeth Tseng, Mohamed Zouine, Anna Coll, Kristina Gruden

Date Published: No date defined

Publication Type: Not specified
