
Date: 31 January 2023

Polygenic risk scores (PRS) estimate an individual’s genetic predisposition to a trait or complex disease. A PRS is calculated as the sum of an individual’s risk alleles, each weighted by its effect size as estimated from genome-wide association study (GWAS) data on the phenotype. PRS help in understanding the shared aetiology of certain traits, and in risk prediction and prevention of certain diseases.
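The weighted sum described above can be illustrated with a minimal Python sketch. This is not the CINECA pipeline: the variant IDs, effect sizes, dosages, and the function name `polygenic_risk_score` are all hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical example data: the variant IDs and values below are made up.
effect_sizes = {"rs1": 0.12, "rs2": -0.05, "rs3": 0.30}  # GWAS effect size per risk allele
dosages = {"rs1": 2, "rs2": 1, "rs3": 0}  # risk-allele counts for one individual (0, 1 or 2)


def polygenic_risk_score(dosages, effect_sizes):
    """Sum the risk-allele counts, each weighted by its effect size."""
    # Only variants present in both inputs contribute to the score.
    shared = dosages.keys() & effect_sizes.keys()
    return sum(effect_sizes[v] * dosages[v] for v in shared)


score = polygenic_risk_score(dosages, effect_sizes)
print(round(score, 2))  # 0.12*2 + (-0.05)*1 + 0.30*0 = 0.19
```

Real PRS workflows typically add steps this sketch leaves out, such as variant quality control, allele harmonisation between the GWAS summary statistics and the target genotypes, and handling of linkage disequilibrium.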

As part of the CINECA project, we have been developing a demonstrator of federated genetic analyses built on a computational pipeline for PRS analysis. We have implemented this workflow in Nextflow, a modular and reproducible workflow manager that can be deployed and autoscaled across many computing environments, including Slurm-based clusters and Kubernetes deployed on AWS. In this webinar, we will give an overview of the PRS pipeline, using the CINECA UK1 synthetic dataset, derived from the 1000 Genomes Project, as a demonstrator. As part of this work, we will further extend the demonstrator to other datasets and GWAS workflows, and integrate it to run in a federated setting following the GA4GH guidelines on using Beacons for data discovery, along with AAI passports for data access authorisation and deployments.

The CINECA (Common Infrastructure for National Cohorts in Europe, Canada, and Africa) project aims to develop a federated, cloud-enabled infrastructure that makes population-scale genomic and biomolecular data accessible across international borders, in order to accelerate research and improve the health of individuals across continents.

About the speakers

Will Rayner is the head of the Data and Analytics group at the Institute of Translational Genomics in the Computational Health Department at Helmholtz Munich. He is interested in all aspects of data management and data privacy and has been leading the CINECA WP4 PRS use case.

Anshika Chowdhary is a data informatician at the Institute of Translational Genomics at Helmholtz Zentrum Munich. She has been working on the development of the workflows for the PRS use case in WP4, and on analysis of the eQTL catalog on the datasets at HMGU.

Contact: Marta Lloret Llinares - marta.lloret@ebi.ac.uk

Keywords: DNA & RNA (dna-rna), Proteins (proteins), UniProt: The Universal Protein Resource, Fermentation, Microbial ecosystems webinar, Antimicrobial resistance, Ensembl, BLAST, Open Targets Platform, Cross domain (cross-domain), Chemical biology (chemical-biology), Drug discovery, Drug target identification, UniRule, ARBA, Automated annotation, MetaboLights: Metabolomics repository and reference database, Chemical Entities of Biological Interest, ChEBI, Metabolites, Molecular building blocks of life, Human Cell Atlas Data Coordination Platform, Single-cell transcriptomics, HCA data portal, Programmatic access, API, Python, Complex Portal, macromolecular assembly, InterPro, Boolean modelling, Europe PubMed Central, Literature (literature), Open access, Protein Data Bank in Europe - Knowledge Base, 3D structure, AlphaFold Database, DeepMind, Artificial intelligence, AI, Structure prediction, cancer, Boolean, Ensembl Genomes, European Nucleotide Archive, Data archive, Raw sequencing data, RNAcentral, Non-coding RNA, ncRNA, GPU, Data protection, Job dispatcher, Bioimage analysis resource, Accessibility, Missense variation, Biostatistics, Rfam, non-coding RNA, Infernal software, Sequence annotation, Root microbiome, Abiotic stress, land management, Plant genotype, Plant webinar series, HPC, database development, cross-linked databases, Plant database, data infrastructure, Plant breeding, Data standards, data management, data sharing, Hyb-Seq method, Flowering plants, Crop improvement, Pangenomics, Pangenomes, Virtual humans, Drug-target identification, plant-microbe interactions, Spatial transcriptomics, Plant research, Drug targets, Machine learning, Mathematical modelling, plant science, Data integration, plant-environment interaction, Phenotyping, field phenotyping, Deep phenotyping, EOSC-Life, NHGRI-EBI GWAS Catalog, clinical data, genome-wide association, plants, European Variation Archive, EVA, Variant clusters, Variant data annotation, Constraint-based metabolic modelling, UniProt knowledgebase, protein variant impact, disease-associated protein variants, Bioethics, FAIR principles, ELSI, cohort data, translational research, BioModels database, Mathematical modeling, Reproducibility, Systems biology models, workflows, federated analysis, polygenic risk scores

Organizer: European Bioinformatics Institute (EBI)

Host institutions: EMBL-EBI

Capacity: 1000

Event types:

Workshops and courses

Scientific topics: Computational biology


Federated analysis for polygenic risk score calculations
