Introduction to Galaxy and Single Cell RNA Sequence analysis

This learning path aims to teach you the basics of Galaxy and the analysis of single cell RNA-seq data. You will learn how to use Galaxy for analysis, along with a Galaxy feature that is important for iterative single cell analysis. You’ll be guided through the general theory of single cell analysis and then perform a basic analysis of 10X Chromium data. For support throughout these tutorials, join our Galaxy single cell chat group on Matrix to ask questions!

New to Galaxy and/or the field of scRNA-seq? Follow this learning path to get familiar with the basics!

Licence: Creative Commons Attribution 4.0 International

Keywords: Single Cell

Authors: Wendi Bacon, Pavankumar Videm

Status: Active

Learning objectives:

Module 1: Introduction to Galaxy

  • Learn how to upload a file
  • Learn how to use a tool
  • Learn how to view results
  • Learn how to view histories
  • Learn how to extract and run a workflow
  • Learn how to share a history
  • Learn how to set name tags
  • Learn how name tags are propagated

Module 2: Theory of Single-Cell RNA-seq

  • Understand the pitfalls in scRNA-seq sequencing and amplification, and how they are overcome.
  • Know the types of variation in an analysis and how to control for them.
  • Grasp what dimension reduction is, and how it might be performed.
  • Become familiar with the main types of clustering techniques and when to use them.

Module 3: Time to analyse data!

  • Demultiplex single-cell FASTQ data from 10X Genomics
  • Learn about transparent matrix formats
  • Understand the importance of high and low quality cells
  • Describe an AnnData object to store single-cell data
  • Explain the preprocessing steps for single-cell data
  • Evaluate quality of single-cell data and apply steps to select and filter cells and genes based on QC
  • Execute data normalization and scaling
  • Identify highly variable genes
  • Construct and run a dimensionality reduction using Principal Component Analysis
  • Perform a graph-based clustering for cells
  • Identify marker genes for the clusters
  • Construct and run a cell type annotation for the clusters
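To make the early preprocessing objectives more concrete, here is a minimal sketch in plain Python of three of the steps listed above: filtering cells on QC metrics, normalising and log-transforming counts, and ranking genes by variability. It is an illustration only, not how the Galaxy tools in the tutorials are implemented. The toy count matrix, the gene and cell names, the thresholds, and all function names are hypothetical and chosen purely for demonstration.

```python
import math

# Toy count matrix: one row per cell, one column per gene (invented values).
GENES = ["GeneA", "GeneB", "GeneC", "GeneD"]
COUNTS = {
    "cell1": [10, 0, 5, 5],
    "cell2": [0, 0, 1, 0],  # very few counts: treated here as low quality
    "cell3": [4, 4, 0, 12],
    "cell4": [20, 10, 0, 10],
}


def filter_cells(counts, min_counts, min_genes):
    """Keep cells with enough total counts and enough detected genes."""
    kept = {}
    for cell, row in counts.items():
        total = sum(row)
        detected = sum(1 for value in row if value > 0)
        if total >= min_counts and detected >= min_genes:
            kept[cell] = row
    return kept


def normalise_and_log(counts, target_sum=10_000):
    """Scale each cell to target_sum total counts, then apply log(1 + x)."""
    result = {}
    for cell, row in counts.items():
        total = sum(row)
        result[cell] = [math.log1p(value / total * target_sum) for value in row]
    return result


def gene_variances(matrix):
    """Population variance of each gene (column) across all cells."""
    rows = list(matrix.values())
    n_cells = len(rows)
    variances = []
    for column in zip(*rows):
        mean = sum(column) / n_cells
        variances.append(sum((v - mean) ** 2 for v in column) / n_cells)
    return variances


def top_variable_genes(matrix, gene_names, n_top):
    """Rank genes by variance and return the names of the n_top highest."""
    ranked = sorted(zip(gene_variances(matrix), gene_names), reverse=True)
    return [name for _, name in ranked[:n_top]]


# Hypothetical thresholds; real values depend on the dataset and QC plots.
filtered = filter_cells(COUNTS, min_counts=5, min_genes=2)
normalised = normalise_and_log(filtered)
hvgs = top_variable_genes(normalised, GENES, n_top=2)
print(sorted(filtered), hvgs)
```

Ranking by raw variance is a deliberate simplification for readability. Established single-cell toolkits typically model the relationship between mean expression and dispersion when selecting highly variable genes, and the later objectives (PCA, graph-based clustering, marker genes, annotation) build on the output of these first steps.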

1

Module 1: Introduction to Galaxy

• beginner 2 materials

Get a first look at the Galaxy platform for data analysis. We start with a short introduction (video slides & practical) to familiarise you with the Galaxy interface, and then proceed with a short tutorial on how to tag - and organise! - your history.

Time estimation: 1 hour

2

Module 2: Theory of Single-Cell RNA-seq

•• intermediate 1 material

When analysing sequencing data, you should always start with a quality control step to clean your data and make sure your data is good enough to answer your research question. After this step, you will often proceed with a mapping (alignment) or genome assembly step, depending on whether you have a reference genome to work with.

Time estimation: 30 minutes

3

Module 3: Time to analyse data!

2 materials
