Mutation calling, viral genome reconstruction and lineage/clade assignment from SARS-CoV-2 sequencing data

Abstract

Sequence-based monitoring of global infectious disease crises, such as the COVID-19 pandemic, requires capacity to generate and analyze large volumes of sequencing data in near real time. These data have proven essential for surveilling the emergence and spread of new viral variants, and for understanding the evolutionary dynamics of the virus.

About This Material

This is a hands-on tutorial from the Galaxy Training Network (GTN), usable either for individual self-study or as teaching material in a classroom.

Questions this will address

  • How can a complete analysis, including viral consensus sequence reconstruction and lineage assignment, be performed?
  • How can such an analysis be kept manageable for large numbers of samples, yet flexible enough to handle different types of input data?
  • What are key results beyond consensus genomes and lineage assignments that need to be understood to avoid inappropriate conclusions about samples?
  • How can the need for high-throughput data analysis during an ongoing infectious disease outbreak or pandemic be balanced against the need for proper quality control and data inspection?

Learning Objectives

  • Discover and obtain recommended Galaxy workflows for SARS-CoV-2 sequence data analysis through public workflow registries
  • Choose and run a workflow to discover mutations in a batch of viral samples from sequencing data obtained through a range of different protocols and platforms
  • Run a workflow to summarize and visualize the mutation discovery results for a batch of samples
  • Run a workflow to construct viral consensus sequences for the samples in a batch
  • Know different SARS-CoV-2 lineage classification systems, and use pangolin and Nextclade to assign samples to predefined lineages
  • Combine information from different analysis steps to be able to draw appropriate conclusions about individual samples and batches of viral data

Licence: Creative Commons Attribution 4.0 International

Keywords: Variant Analysis, covid19, one-health, virology

Target audience: Students

Resource type: e-learning

Version: 17

Status: Active

Prerequisites:

  • From NCBI's Sequence Read Archive (SRA) to Galaxy: SARS-CoV-2 variant analysis
  • Introduction to Galaxy Analyses
  • Mapping
  • Quality Control
  • Using dataset collections

Date modified: 2024-10-01

Date published: 2021-06-30

Authors: Bérénice Batut, Wolfgang Maier

Contributors: Beatriz Serrano-Solano, Björn Grüning, Bérénice Batut, Helena Rasche, Jasper Ouwerkerk, Saskia Hiltemann, Teresa Müller, Wolfgang Maier

Scientific topics: Genetic variation

