Genome assembly and annotation

This tutorial introduces de novo genome assembly, basic assembly quality control, genome annotation, and reference-based mapping, with hands-on exercises using Flye, SPAdes, Bakta, and Snippy.

Learning outcomes

  • Describe the primary steps in de novo assembly
  • Implement basic assembly quality control and metrics
  • List the primary outputs of genome annotation
  • Describe the primary steps in reference mapping
  • Compare the pros and cons of assembly and mapping approaches
  • Undertake genome assembly using Flye or SPAdes
  • Undertake genome annotation using Bakta
  • Undertake reference-based mapping using Snippy

Prerequisites

Approximate time to finish tutorial

  • Lecture: 1.5 hours
  • Tutorials: 1.5 hours
  • Pre/post surveys: 10 minutes

Order of tutorial

Please complete the pre-learning quiz, then watch the presentation.
The presentation includes points at which to stop and do the exercises linked below. The answers to the questions in each exercise are linked within it.
Once you have finished the tutorial, take the post-learning quiz.

Genome Assembly Pre-tutorial Survey

Presentation

Tasks from slides with sample answers

What is the sequencing depth of the two positions highlighted in blue? (see slides for image)

Answer: G: 7, A: 8
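
Sequencing depth at a position is the number of aligned reads that cover it. The following is a minimal Python sketch of that count. The read coordinates are hypothetical and are not taken from the slide image; each read is assumed to be described by a 0-based, half-open interval on the reference.

```python
def depth_per_position(read_intervals, reference_length):
    """Count how many reads cover each 0-based reference position."""
    depth = [0] * reference_length
    for start, end in read_intervals:  # half-open interval [start, end)
        for pos in range(max(start, 0), min(end, reference_length)):
            depth[pos] += 1
    return depth


# Hypothetical alignments of three reads against a 10 bp reference.
reads = [(0, 5), (2, 8), (4, 10)]
depth = depth_per_position(reads, 10)
mean_depth = sum(depth) / len(depth)

print(depth)       # [1, 1, 2, 2, 3, 2, 2, 2, 1, 1]
print(mean_depth)  # 1.7
```

In practice these numbers are reported by alignment tools rather than computed by hand; the sketch only illustrates what the value means.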


What is the resulting sequence after the de Bruijn-based joining of these two reads (k=3)? TTAACCA and CCAAAAT

Answer: TTAACCAAAT
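
The joining step can be sketched in Python as follows. This is an illustrative toy, not how Flye or SPAdes are implemented. It splits the reads into k-mers, keeps each distinct k-mer as one edge between its prefix and suffix (k-1)-mers, and spells the sequence along an Eulerian path. The function names are hypothetical, and the sketch assumes the graph is connected and has an Eulerian path. Because each distinct k-mer is stored once, the two AAA k-mers in the second read collapse into a single edge, so the result is one base shorter than a plain overlap merge (TTAACCAAAAT). The AAA self-loop also means the graph has more than one Eulerian path; following edges in the order they were first seen in the reads is an assumption made here, and it yields a result consistent with the sample answer.

```python
from collections import defaultdict, deque


def kmers(read, k):
    """Return every k-mer of a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_edges(reads, k):
    """One edge per distinct k-mer: prefix (k-1)-mer -> suffix (k-1)-mer."""
    seen = set()
    edges = []
    for read in reads:
        for kmer in kmers(read, k):
            if kmer not in seen:  # repeated k-mers collapse into one edge
                seen.add(kmer)
                edges.append((kmer[:-1], kmer[1:]))
    return edges


def eulerian_path(edges):
    """Hierholzer's algorithm; edges are followed in first-seen order (an assumption)."""
    graph = defaultdict(deque)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for src, dst in edges:
        graph[src].append(dst)
        balance[src] += 1
        balance[dst] -= 1
    # Start at the node with one more outgoing than incoming edge, if there is one.
    start = next((n for n, b in balance.items() if b == 1), edges[0][0])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].popleft())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the path: first node plus the last base of each later node."""
    path = eulerian_path(de_bruijn_edges(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


print(assemble(["TTAACCA", "CCAAAAT"], 3))  # TTAACCAAAT
```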


Which of these is a good maximum number of contigs in an assembly: 100, 500, or 1000?

Answer: 100
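
Contig count is usually read alongside other summary metrics. The sketch below computes a few of them in Python from a list of contig lengths. The lengths are hypothetical, and N50 is included as a commonly reported metric on the assumption that it is relevant here; the source does not name it. N50 is the length of the contig at which the running total, counted from the longest contig down, first reaches half of the assembly length.

```python
def assembly_stats(contig_lengths):
    """Summarise an assembly from its contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    n50 = 0
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:  # reached at least half of the assembly
            n50 = length
            break
    return {
        "contigs": len(lengths),
        "total_length": total,
        "longest": lengths[0] if lengths else 0,
        "n50": n50,
    }


# Hypothetical contig lengths; a real run would read them from the assembly FASTA.
stats = assembly_stats([500, 400, 300, 200, 100])
print(stats)  # {'contigs': 5, 'total_length': 1500, 'longest': 500, 'n50': 400}
```

A lower contig count with a higher N50 generally indicates a more contiguous assembly, though neither number says anything about correctness.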


Are tRNAs a type of coding or non-coding gene?

Answer: Non-coding


Worksheets

UNIX shell approaches

Galaxy approaches

Genome Assembly Post-tutorial Survey

Other tools and videos