typora-copy-images-to: ./
sequence-fold: 80 xclip -o | tr -d "\n" | fold -w80
command-fold: 75 xclip -o | tr -d "\n" | fold -sw75 | sed -e 's/$/\\/'
last-update: April 2nd, 2018

Introduction to BLAST

by Alper Yilmaz for the GTU Bioinformatics Program course

2019-03-14 (a PDF version of this document is available at goo.gl/bhrkqQ and an HTML version at https://goo.gl/siUoax)

[TOC]

Paralog vs. Homolog

Homology-Ortholog-Paralog

image source

Local vs. Global Alignment

| Global Sequence Alignment | Local Sequence Alignment |
| --- | --- |
| Attempts to align the entire length of both sequences (end-to-end alignment). | Finds the local regions with the highest level of similarity between the two sequences. |
| Contains all letters from both the query and the target sequence. | Aligns a substring of the query sequence to a substring of the target sequence. |
| Suitable when the two sequences have approximately the same length and are quite similar. | Any two sequences can be locally aligned, because local alignment finds stretches with a high level of matches without considering how the rest of the sequences align. |
| Suitable for aligning two closely related sequences. | Suitable for aligning more divergent or distantly related sequences. |
| Usually used to compare homologous genes, such as two genes with the same function (human vs. mouse), or two proteins with similar function. | Used to find conserved patterns in DNA sequences, or conserved domains or motifs in two proteins. |
| A general global alignment technique is the Needleman–Wunsch algorithm. | A general local alignment technique is the Smith–Waterman algorithm. |

Local vs Global Alignment

image and table source
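The contrast in the table can be made concrete with a minimal Python sketch of both dynamic-programming recurrences. It only computes the best score (no traceback), and the scoring values (match +1, mismatch -1, linear gap -1) and function names are illustrative assumptions, not parameters taken from this course material.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: every letter of a and b must be aligned."""
    # First row: aligning a prefix of b against nothing costs one gap per letter.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # first column: prefix of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]  # bottom-right cell: end-to-end alignment


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: best-scoring pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # borders are 0: an alignment may start anywhere
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets a poor prefix be dropped instead of dragged along.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best  # best cell anywhere: alignment may end anywhere


# Two sequences that share only the stretch "ACGT".
print(local_score("TTTTACGT", "ACGTCCCC"))   # 4: the shared ACGT substring
print(global_score("TTTTACGT", "ACGTCCCC"))  # lower, since the flanks must be aligned too
```

The only differences between the two functions are the zero borders, the floor at 0, and where the final score is read from, which is why local alignment can ignore the poorly matching flanks.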

BLAST

Functions of BLAST

  • Identifying species
  • Locating domains
  • Establishing phylogeny
  • Comparison

The Index

BLAST Index

Types of BLAST programs

BLAST programs

image source

How BLAST works

How BLAST works

image source

BLAST mechanism

image source

Scoring

Nucleotide

Nucleotide scoring

image source

Alignment score

AACGTTTCCAGTCCAAATAGCTAGGC
===--===   =-===-==-======
AACCGTTC   TACAATTACCTAGGC

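Hits (+1): 18
Misses (-2): 5
Gaps (existence -2, extension -1): 1 gap of length 3
Score = 18 * 1 + 5 * (-2) - 2 - 2 * 1 = 4

Here the gap is charged the existence penalty (-2) once, plus the extension penalty (-1) for each of its two remaining positions.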
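The arithmetic above can be checked with a short Python sketch that walks the two aligned rows column by column. The function name is hypothetical, and it assumes the convention implied by the calculation: the existence penalty covers the first position of a gap and each additional position costs the extension penalty. Other tools may define gap costs differently.

```python
def score_alignment(top, bottom, match=1, mismatch=-2, gap_open=-2, gap_extend=-1):
    """Score two pre-aligned rows of equal length; a space marks a gap position.

    Assumption: gap_open is charged for the first position of a gap and
    gap_extend for every further position of the same gap.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    score = 0
    in_gap = False
    for x, y in zip(top, bottom):
        if x == " " or y == " ":
            score += gap_extend if in_gap else gap_open
            in_gap = True
        else:
            score += match if x == y else mismatch
            in_gap = False
    return score


top = "AACGTTTCCAGTCCAAATAGCTAGGC"
bottom = "AACCGTTC   TACAATTACCTAGGC"
print(score_alignment(top, bottom))  # 4
```

This simplified sketch does not distinguish a gap in the top row from one in the bottom row, which is enough for the single-gap example here.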

Amino Acid

BLOSUM62 Substitution Matrix

BLOSUM62

image source
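For amino acids, scoring works the same way as for nucleotides, except that each aligned pair is looked up in the substitution matrix instead of using a fixed hit or miss value. The sketch below is a minimal illustration: it hard-codes only the handful of BLOSUM62 entries needed for the example (the dictionary and function names are hypothetical), so a full matrix would be needed for real sequences.

```python
# Excerpt of BLOSUM62; only the residue pairs used in the example are listed.
# The matrix is symmetric, so each pair is stored once.
BLOSUM62_EXCERPT = {
    ("K", "K"): 5, ("I", "I"): 4, ("W", "W"): 11, ("D", "D"): 6,
    ("K", "R"): 2, ("I", "L"): 2, ("D", "E"): 2,
}


def pair_score(a, b, table=BLOSUM62_EXCERPT):
    """Look up a residue pair in either order; KeyError if it is not in the excerpt."""
    if (a, b) in table:
        return table[(a, b)]
    return table[(b, a)]


def ungapped_score(s, t):
    """Sum substitution scores over two equal-length peptides (no gaps)."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(s, t))


print(ungapped_score("KIWD", "KIWD"))  # 26: identical residues, 5 + 4 + 11 + 6
print(ungapped_score("KIWD", "RLWE"))  # 17: conservative substitutions still score positively
```

Note that identical residues do not all score the same: a conserved tryptophan (W) contributes 11, while a conserved isoleucine (I) contributes 4.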