bio-0.5.0.1: A bioinformatics library

Bio.Alignment.QAlign

Contents

Description

Quality-aware alignments

Generally, quality data are ignored for alignment/pattern searching like Smith-Waterman, Needleman-Wunsch, or BLAST(p|n|x). I believe that accounting for quality will at the very least affect things like BLAST statistics, and e.g. is crucial for good EST annotation using Blastx.

This module performs sequences alignments, takes quality values into account.

See also http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn052v1.

Synopsis

Smith-Waterman

Locally optimal alignment with affine gaps, i.e. best infix match.

local_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double

Calculate local edit distance (Smith-Waterman alignment score)

local_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)

Calculate local alignment (Smith-Waterman) (can we replace uncurry max' with fst - a local alignment must always end on a subst, no?)

Needleman-Wunsch

Globally optimal alignment with affine gaps, the whole sequences are matched.

global_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double

Calculate global edit distance (Needleman-Wunsch alignment score)

global_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)

Calculate global alignment (Needleman-Wunsch)

Overlapping alignment.

The suffix of one sequence matches a prefix of another.

overlap_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double

Calucalte best overlap score, where gaps at the edges are free The starting point is like for local score (0 cost for initial indels), the result is the maximum anywhere in the last column or bottom row of the matrix.

overlap_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)

Calucalte best overlap score, where gaps at the edges are free The starting point is like for local score (0 cost for initial indels), the result is the maximum anywhere in the last column or bottom row of the matrix.

Matrix construction

qualMx :: Qual -> Qual -> (Chr, Chr) -> Double

Interactive testing of alignments