This tutorial outlines the main types of online biological databases and their uses, and introduces homology terminology and sequence searching with BLAST.
Learning outcomes
- Identify online database types and retrieve data
- Describe types of homologs and gene evolution
- Recognise the types of BLAST and their uses
- Execute a BLAST search either via the terminal (BLAST+), the NCBI website portal or the Galaxy portal
Prerequisites
- A text editor such as Notepad++ (Windows) or BBEdit (Mac) is recommended for viewing FASTA files; most default Linux editors can also open them.
- If you are going to do the BLAST+ worksheet, it is recommended that you first complete the Concepts in Computer Programming, UNIX tutorial (basics), and Setting up and using conda tutorials.
Order of tutorial
Please do the pre-learning quiz, then watch the presentation.
During the presentation there are points to stop and do exercises, which are linked below. The answers to the questions in the exercises are linked within each one.
Once you have finished the tutorial, take the post-learning quiz.
Biological DBs and BLAST Pre-tutorial Survey
Presentation
Tasks from slides with sample answers
Motif searching
Write this Prosite motif:
- Any residue except Glycine (G)
- A cysteine (C)
- Any of the following residues: A, L, W, S
- Two positions of any residue
- A proline (P)
Click here for answer
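In PROSITE pattern syntax, elements are separated by hyphens, `{...}` excludes residues, `[...]` lists allowed residues, and `x(n)` means n positions of any residue, so one way to write the motif above is `{G}-C-[ALWS]-x(2)-P`. The sketch below is a minimal illustration of how such a pattern could be turned into a Python regular expression and tested against a sequence. The function name `prosite_to_regex` and the example sequences are made up for this illustration, and the converter only handles the syntax used in this task (it ignores anchors such as `<` and `>`).

```python
import re

def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern into a Python regular expression.

    Hypothetical helper for illustration; supports x, [..], {..},
    single residues and (n) or (n,m) repeats only.
    """
    regex_parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the element itself from an optional repeat such as (2) or (2,4)
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            continue  # skip empty elements
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            part = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            part = "[^" + core[1:-1] + "]"  # any residue except those listed
        else:
            part = core                     # [ALWS] or a single residue like C
        if repeat:
            part += "{" + repeat + "}"
        regex_parts.append(part)
    return "".join(regex_parts)

motif = "{G}-C-[ALWS]-x(2)-P"
regex = prosite_to_regex(motif)
print(regex)  # [^G]C[ALWS].{2}P

# Made-up example sequences
print(bool(re.search(regex, "MACLQRPK")))  # True: A-C-L-QR-P matches
print(bool(re.search(regex, "MGCLQRPK")))  # False: G precedes the C
```

The conversion is a direct element-by-element mapping, which keeps it easy to check by hand against the motif description.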
Homology terminology
Paralogs are defined as:
- Homologs, usually with different functions, that arose from a gene duplication event
- Homologs, usually with the same function, that arose from a speciation event
- Homologs, usually with the same function, that arose from a horizontal gene transfer event
Click here for answer
1. Homologs, usually with different functions, that arose from a gene duplication event
BLAST types and cut-offs
BLASTx is used for:
- A protein sequence against a protein database
- A nucleotide sequence against a nucleotide database
- A protein sequence against a nucleotide database
- A nucleotide sequence against a protein database
Click here for answer
4. A nucleotide sequence against a protein database
Which of these is the most stringent e-value cut-off?
- 0.0
- 1e-30
- 1e-3
Click here for answer
1. 0.0
Worksheets
- Using BLAST+ (UNIX environment) to search for homologous sequences
- BLAST on the NCBI website
- BLAST using Galaxy
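As a rough companion to the BLAST+ worksheet, the sketch below shows how the choice of BLAST program follows from the query and database types, and how an e-value cut-off narrows a set of hits. It is an illustration under stated assumptions, not the worksheet's own method. The file name `query.fasta`, the database name `my_protein_db`, the helper names and the sample hits are all hypothetical. The flags `-query`, `-db`, `-evalue` and `-outfmt 6` are standard BLAST+ options, and the parser assumes the default 12-column tabular output, where the e-value is the 11th column.

```python
# Query type / database type -> BLAST+ program.
# tblastx (translated nucleotide vs translated nucleotide) is left out
# because it takes the same input types as blastn.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}

def build_blast_command(query_type, db_type, query_fasta, db_name, evalue=1e-3):
    """Return a BLAST+ command as a list of arguments (hypothetical helper)."""
    program = PROGRAMS[(query_type, db_type)]
    return [program, "-query", query_fasta, "-db", db_name,
            "-evalue", str(evalue), "-outfmt", "6"]

def filter_hits(tabular_lines, max_evalue):
    """Keep hits from default tabular output whose e-value is <= max_evalue."""
    kept = []
    for line in tabular_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        if float(fields[10]) <= max_evalue:  # column 11 is the e-value
            kept.append(fields)
    return kept

# Hypothetical file and database names
command = build_blast_command("nucleotide", "protein",
                              "query.fasta", "my_protein_db", evalue=1e-30)
print(" ".join(command))

# Made-up hits in default tabular layout, for illustration only
sample_hits = [
    "q1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t0.0\t380",
    "q1\tsubjB\t60.0\t180\t72\t2\t5\t184\t10\t189\t2e-35\t140",
    "q1\tsubjC\t35.0\t90\t58\t3\t20\t109\t40\t129\t5e-4\t42",
    "q1\tsubjD\t30.0\t50\t35\t1\t1\t50\t1\t50\t0.5\t25",
]
for cutoff in (0.0, 1e-30, 1e-3):
    print(cutoff, len(filter_hits(sample_hits, cutoff)))
```

With the made-up hits above, the lower the cut-off, the fewer hits pass, which is what "more stringent" means in practice. If BLAST+ is installed, the command list could be passed to `subprocess.run` to perform the actual search.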