hands-on tutorial
FAIRification of an RNAseq dataset
RNA sequencing is chosen here as an example of how to FAIRify data for a popular assay in the Life Sciences. RNAseq data can be shared and curated in designated public repositories using established ontologies (and controlled vocabularies) for describing protocols and biological material (metadata).
Two international repositories are commonly used to locate and download RNAseq (meta)data: ArrayExpress and GEO. Other repositories for raw sequence data exist (e.g. SRA, ENA, DDBJ), but ArrayExpress and GEO specifically house and index expression data, including rich metadata detailing samples, data processing and final results files such as gene expression matrices.
Data submitted to a public repository become openly accessible and searchable, and are annotated with rich metadata by the submitter and the curation team. Note that both repositories are listed in the FAIRsharing registry, which can help you find public repositories for all types of Life Science data.
This lesson takes you through a publicly available RNAseq dataset in ArrayExpress and shows how it meets the FAIR principles, using the checklist published by Wilkinson et al. (2016).
DOI: https://gxy.io/GTN:T00435
Licence: Creative Commons Attribution 4.0 International
Keywords: FAIR, clinical data
Resource type: hands-on tutorial
Version: 7
Status: Active
Prerequisites:
FAIR and its Origins
Metadata
Data Registration
Access
Persistent Identifiers
Learning objectives:
To be able to map each of the FAIR principles to a dataset in the public domain
Date created: 2024-05-26
Date modified: 2024-05-26
Date published: 2024-05-27