Typing bacteria using MLST (via UNIX)

  • In this worksheet you will learn how to use the MLST tool to type a bacterial genome via UNIX/Conda

Suggested prerequisites

Dataset

Steps

  1. Create a directory for your analyses and step into it
mkdir mlst_demo
cd mlst_demo
  1. Copy your assembled genome into this folder or download the sample data
    • You can download it directly into your current working directory with the wget command (wget can be installed via conda).
wget https://conmeehan.github.io/PathogenDataCourse/Datasets/DRR187559_scaffolds.fasta
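Before typing, it can be worth confirming that the file in your folder is a non-empty FASTA file. The sketch below is one possible check and is not part of the MLST tool; the helper name count_fasta_records is hypothetical, and it only counts header lines (those starting with >), so it does not validate the sequences themselves.

```shell
# Hypothetical helper (not part of MLST): count sequence records in a FASTA file
# by counting header lines, which start with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

# Only run the check if the sample file is present and non-empty.
if [ -s DRR187559_scaffolds.fasta ]; then
    echo "Scaffolds found: $(count_fasta_records DRR187559_scaffolds.fasta)"
else
    echo "DRR187559_scaffolds.fasta is missing or empty; re-run the download" >&2
fi
```

If you are using your own assembly, swap in its file name.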
  1. Install MLST using conda
    • It is recommended to always install packages in their own environments, so here we will create an environment and install MLST in one step.
mamba create -n mlst -c bioconda mlst -y
mamba activate mlst
  1. Run MLST on the scaffolds file
    • MLST auto detects the correct species scheme to use for your data
    • You can specify a scheme by adding the --scheme option. You can see a list of all schemes using mlst --longlist
    • --threads is the number of threads to dedicate to the process. My computer has 8 threads so I am dedicating 7
    • The > redirect will put the resulting allele calls to the _mlst.tsv file
mlst --threads 7 DRR187559_scaffolds.fasta >DRR187559_mlst.tsv
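If you are unsure how many threads your machine has, the thread count can be worked out rather than typed by hand. This is a minimal sketch, assuming getconf _NPROCESSORS_ONLN is available on your system (it normally is on Linux and macOS); the variable names cores and threads are illustrative. It keeps one core free, mirroring the 7-of-8 choice above.

```shell
# Ask the system how many CPU cores are online (assumed to work on Linux and macOS).
cores=$(getconf _NPROCESSORS_ONLN)

# Keep one core free for other work, but never request fewer than 1 thread.
if [ "$cores" -gt 1 ]; then
    threads=$((cores - 1))
else
    threads=1
fi
echo "Dedicating $threads of $cores threads to mlst"

# Run mlst only when the tool and the input file are both available.
if command -v mlst >/dev/null 2>&1 && [ -s DRR187559_scaffolds.fasta ]; then
    mlst --threads "$threads" DRR187559_scaffolds.fasta > DRR187559_mlst.tsv
fi
```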
  1. View the resulting file
    • You will see the input file name, the scheme used and the sequence type (S. aureus, ST 764 in this case), and then the list of the MLST genes with the allele number found for each in your input genome
      cat DRR187559_mlst.tsv
      
    • Guidance on tweaking this output and running multiple genomes at once can be found on the GitHub page for MLST
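The raw file has no header row, so a labelled summary can be easier to read. The sketch below is one way to do this with awk; it assumes the default tab-separated layout (file name, scheme, sequence type, then one column per gene), and the helper name summarise_mlst is hypothetical rather than part of MLST.

```shell
# Hypothetical helper: label the first three columns of mlst output and
# count the remaining gene(allele) columns.
# Assumes the default layout: file <TAB> scheme <TAB> ST <TAB> gene(allele) ...
summarise_mlst() {
    awk -F'\t' '{ printf "file=%s scheme=%s ST=%s loci=%d\n", $1, $2, $3, NF - 3 }' "$1"
}

# Summarise the result file only if it exists and is non-empty.
if [ -s DRR187559_mlst.tsv ]; then
    summarise_mlst DRR187559_mlst.tsv
fi
```

Because the function prints one line per input row, it should also work on a results file that holds several genomes.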
  2. Deactivate your mamba environment when finished
    mamba deactivate