Genome assembly, annotation and comparative genomics - EBP-Nor workshop 2024
Training materials from the 3-day EBP-Nor Genome Assembly, Annotation and Comparative Genomics workshop given at the Norwegian Biodiversity & Genomics Conference 2024.
The repository includes slides and tutorials that cover the whole process, from the FASTQ files of your sequenced genome(s) to comparative gene analysis.
Genome Assembly:
The Genome Assembly section introduces participants to whole-genome assembly processes using real datasets. It includes:
- GenomeScope2: Estimating genome size and heterozygosity.
- Smudgeplot: Inferring ploidy levels from sequencing data.
- HiFiAdapterFilt: Filtering adapters from PacBio HiFi reads.
- hifiasm: Assembly of PacBio HiFi sequencing data.
- YaHS: Scaffolding genome assemblies using Hi-C data.
- gfastats: Assembly statistics and quality metrics.
- BUSCO: Evaluating genome assembly completeness.
- Merqury: Assessing genome assembly accuracy.
- FCS-GX and GRIT Rapid Curation suite: Contamination removal and manual curation, respectively.
- PretextView: Visualization of genome assemblies.
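To give a feel for the kind of metric that assembly statistics tools in the list above report, here is a minimal sketch of an N50 calculation in plain Python. It is not taken from the workshop materials and does not call gfastats; the function names and the tiny in-memory FASTA example are assumptions made for illustration.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50: the length of the shortest sequence in the smallest
    set of longest sequences that together cover at least half the total."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical toy assembly with three scaffolds (illustration only).
example = [
    ">scaffold_1", "ACGTACGTAC",
    ">scaffold_2", "ACGTAC",
    ">scaffold_3", "ACGT",
]
lengths = list(fasta_lengths(example))
print("lengths:", lengths)
print("N50:", n50(lengths))
```

For a real assembly you would pass an open FASTA file handle to `fasta_lengths` instead of the toy list.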
Genome Annotation:
Participants learn how to annotate genomes effectively through:
- RepeatMasker: Masking genomic repeats.
- miniprot: Mapping protein sets to genome assemblies.
- GALBA: Ab initio gene prediction.
- EvidenceModeler (EVM): Combining gene annotation evidence.
- BUSCO: Evaluating annotation completeness.
- Functional annotation: Assigning biological functions to genes.
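As a quick sanity check on an annotation produced by tools like those listed above, it can help to count how many features of each type a GFF3 file contains. The sketch below is illustrative only and not part of the workshop scripts; it assumes standard nine-column, tab-separated GFF3 and uses a made-up in-memory example.

```python
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by type (the third tab-separated column)."""
    counts = Counter()
    for line in lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        counts[fields[2]] += 1
    return counts


# Hypothetical annotation of a single gene (illustration only).
example_gff = [
    "##gff-version 3",
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\texample\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1",
    "chr1\texample\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1",
]
feature_counts = count_feature_types(example_gff)
for feature_type, n in sorted(feature_counts.items()):
    print(feature_type, n)
```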
Comparative Genomics:
This section provides practical training in comparative genomic analyses and visualization:
- OrthoFinder: Identifying orthologous gene groups.
- R visualizations: Basic visualization of OrthoFinder results.
- CAFE5: Analysis of gene family evolution.
- GO enrichment analyses with g:Profiler: Interpretation of functional enrichments using custom GO annotations.
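Much of the comparative analysis listed above starts from a table of gene counts per orthogroup and species. The sketch below shows one way to read such a table and pick out single-copy orthogroups. It assumes a tab-separated layout with an `Orthogroup` column, one column per species and a `Total` column, similar to OrthoFinder's gene count table; the species names and counts are invented for illustration, and the workshop itself uses R for visualization.

```python
import csv
import io


def read_gene_counts(handle):
    """Read a tab-separated orthogroup gene count table.

    Returns the species names and a dict: orthogroup -> {species: count}.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    species = [c for c in reader.fieldnames if c not in ("Orthogroup", "Total")]
    counts = {
        row["Orthogroup"]: {s: int(row[s]) for s in species}
        for row in reader
    }
    return species, counts


def single_copy_orthogroups(counts):
    """Return orthogroups with exactly one gene in every species."""
    return [og for og, per_species in counts.items()
            if all(n == 1 for n in per_species.values())]


# Hypothetical table with three species (illustration only).
table = (
    "Orthogroup\tspA\tspB\tspC\tTotal\n"
    "OG0000000\t3\t1\t2\t6\n"
    "OG0000001\t1\t1\t1\t3\n"
    "OG0000002\t0\t2\t1\t3\n"
)
species, counts = read_gene_counts(io.StringIO(table))
single_copy = single_copy_orthogroups(counts)
print("species:", species)
print("single-copy orthogroups:", single_copy)
```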
Licence: Creative Commons Attribution Share Alike 4.0 International
Keywords: Assembly, Annotation, Comparative genomics
Target audience: Bioinformatician
Resource type: Tutorial, Slides, Scripts
Status: Active
Scientific topics: Bioinformatics, Genomics, Comparative genomics