Pathogen genomics data analysis course

This workshop contains a set of tutorials for undertaking basic data analytics for pathogen genomics research (although many topics covered are also applicable elsewhere).
It is designed to be lightweight, with each tutorial capable of running on a laptop. The primary target audience is public health researchers and technicians in low-resource settings, but there is material here for anybody starting out in bioinformatics and/or pathogen genomics data analysis.

This work is licensed under CC BY-NC-SA 4.0.

That means you may reuse, distribute, adapt and build upon the material for non-commercial purposes, provided that proper credit is given to the relevant creator (i.e. Conor Meehan for most material). If you adapt or modify the material, you must license the modified material under the same terms.
If you do reuse or adapt the material, please let me know, as I am always eager to hear whether the material is useful. My contact details can be found on my Nottingham Trent University staff page.

Funding for the basic creation of this course comes from the Microbiology Society.

Each tutorial has a lecture or explanation of the topic with associated practice worksheets and sample answers.
It is suggested that you work through these in order. If you wish to do the terminal and/or R-based worksheets, complete at least the concepts in computer programming, UNIX shell and R tutorials before the other topics. If you do not, there are Galaxy and web-based alternatives for many of the worksheets.

Concepts in computer programming

The UNIX shell

Introduction to R

Experimental design and statistics

Data handling and figure generation

Biological Databases and BLAST

Genome assembly and annotation (viral and bacterial)

Introduction to phylogenetics

Introduction to Bayesian phylodynamics

Predicting pathogenic features (AMR, virulence factors, plasmids)

Plasmodium genomics and drug resistance

Bacterial genomic epidemiology and strain typing

Microbiome amplicon analyses


Python tutorials