- In this worksheet you will learn how to calculate digital DNA:DNA hybridisation (dDDH) values between a set of genomes
Suggested prerequisite(s)
- An understanding of how dDDH works: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-14-60
- Extensive background information for the various TYGS tools can be found here: https://tygs.dsmz.de/background/show
    - TYGS performs many calculations beyond dDDH, so reading this page is suggested for a more complete picture of its features
 
Dataset
- This demonstration uses the FASTA files of a set of closely related bacterial whole-genome sequences
    - Download this dataset and unzip it to a folder on your computer.
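
If you prefer to unzip the dataset from a script and confirm which genome files it contains, the following is a minimal sketch. The archive name `genomes.zip`, the output folder `genomes`, and the list of FASTA file extensions are all assumptions made for illustration; replace them with the names used by the dataset you downloaded.

```python
import zipfile
from pathlib import Path

# Extensions commonly used for FASTA files (an assumption; adjust to match the dataset).
FASTA_SUFFIXES = {".fasta", ".fa", ".fna", ".fas"}


def unzip_and_list_fasta(zip_path, out_dir):
    """Extract zip_path into out_dir and return the FASTA files found, sorted by path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    # Search sub-folders too, in case the archive contains a top-level directory.
    return sorted(
        p for p in out_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in FASTA_SUFFIXES
    )


if __name__ == "__main__":
    # "genomes.zip" and "genomes" are hypothetical names, not taken from the worksheet.
    archive_path = Path("genomes.zip")
    if archive_path.exists():
        for fasta in unzip_and_list_fasta(archive_path, "genomes"):
            print(fasta.name)
```

The printed list should match the files you later select in the upload dialog.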
 
Steps
- In your web browser, navigate to https://tygs.dsmz.de/user_requests/new
    - This is the homepage for the Type (Strain) Genome Server, as run by the DSMZ collection.
 
- TYGS takes a set of genomes and compares them with each other and with type strains
- To do this, click on the ‘Choose files’ button and then select all the genomes in the folder downloaded above
    - To select them all, click the first genome file, hold down the Shift key, and then click the last genome file
 
- We will include type strains in our comparisons so leave the rest of the options as they are
- Type your email address into the box at the bottom of the page and then click ‘Submit query’
- The analysis can take a while (sometimes about an hour, but usually less)
    - If you want to start analysing instead of waiting you can download the final report produced by TYGS here
 
- Once finished, click the link sent to your email and then click ‘results’ at the top right.
- You can download a full PDF report (like the one linked in the previous step) by clicking ‘Download PDF Report’
- Click on the ‘Identification’ button to see the species assignments based on comparisons to the type strains
    - This is table 2 in the PDF file
 
- Click on ‘Pairwise comparisons’ to see the dDDH results
    - This is table 3 in the PDF file
- For easier analysis, download this table as an Excel file by clicking the button at the top right
    - A copy of this Excel file can be downloaded here
 
 
- The recommended value to use is the dDDH (d4, in %), which is the 5th column in both the webserver table and the PDF.
    - A value over 70% indicates a likely intra-species match
 
- Use the filter function at the top left of the table on the webserver (or the sort function in Excel) to find the top match for each of our 6 strains
    - Do all have a dDDH <70% to a type strain?
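
The same check can be done in a script. The sketch below assumes the pairwise comparison table has been saved from Excel as CSV and that its headers are `Query strain`, `Subject strain` and `dDDH (d4, in %)`; these header names are assumptions, so check them against your own file. The values in `EXAMPLE_CSV` are made up for illustration and are not results from the worksheet dataset. The sketch does not distinguish type strains from the other uploaded genomes, so rows would need to be filtered first if only type-strain matches are wanted.

```python
import csv
import io

# dDDH (d4) percentage above which a match is likely intra-species (from the worksheet).
SPECIES_THRESHOLD = 70.0

# Column headers are assumptions about the exported table; check them against your file.
QUERY_COL = "Query strain"
SUBJECT_COL = "Subject strain"
D4_COL = "dDDH (d4, in %)"


def top_matches(rows):
    """Return {query: (subject, d4)} holding the highest-d4 row for each query strain."""
    best = {}
    for row in rows:
        query, subject, d4 = row[QUERY_COL], row[SUBJECT_COL], float(row[D4_COL])
        if query not in best or d4 > best[query][1]:
            best[query] = (subject, d4)
    return best


# Hypothetical values for illustration only.
EXAMPLE_CSV = """Query strain,Subject strain,"dDDH (d4, in %)"
strain_A,Type strain X,85.2
strain_A,Type strain Y,31.4
strain_B,Type strain X,45.0
strain_B,Type strain Y,62.7
"""

if __name__ == "__main__":
    rows = csv.DictReader(io.StringIO(EXAMPLE_CSV))
    for query, (subject, d4) in top_matches(rows).items():
        verdict = "likely same species" if d4 > SPECIES_THRESHOLD else "below the 70% threshold"
        print(f"{query}: best match {subject} (d4 = {d4}%), {verdict}")
```

To run it on your own export, replace `io.StringIO(EXAMPLE_CSV)` with an opened file, for example `open("pairwise.csv", newline="")`, where `pairwise.csv` is a hypothetical file name.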