EMBOSS: prettyplot


Program prettyplot

Function

Displays aligned sequences, with colouring and boxing

Description

prettyplot reads in a set of aligned DNA or protein sequences. It displays them graphically, with conserved regions highlighted in various ways.

Usage

Here are two sample sessions with prettyplot. In the first, the program prompts for the alignment and the graph type:

% prettyplot -resbreak=10 -boxcol -consensus -plurality=3
Displays aligned sequences, with colouring and boxing
Input sequence set: globin.msf
Graph type [x11]:

In the second, the alignment is given on the command line:

% prettyplot globin.msf -plurality=3 -docolour
Displays aligned sequences, with colouring and boxing
Graph type [x11]:
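
The -resbreak=10 qualifier used in the first session puts a space after every ten residues, while -residuesperline (default 50) sets how many residues appear on each line. As a rough illustration of that layout only, here is a short Python sketch. It is not prettyplot's own code, and the function name layout_residues is hypothetical.

```python
def layout_residues(seq, residues_per_line=50, resbreak=None):
    """Split seq into lines, inserting a space every `resbreak` residues.

    resbreak=None follows the documented default: the same value as
    residues_per_line, which gives no breaks within a line.
    """
    if resbreak is None:
        resbreak = residues_per_line
    if residues_per_line < 1 or resbreak < 1:
        raise ValueError("residues_per_line and resbreak must be 1 or more")

    lines = []
    for start in range(0, len(seq), residues_per_line):
        chunk = seq[start:start + residues_per_line]
        # Cut the line into groups of `resbreak` residues and join with spaces.
        groups = [chunk[i:i + resbreak] for i in range(0, len(chunk), resbreak)]
        lines.append(" ".join(groups))
    return lines
```

Calling layout_residues(sequence, resbreak=10) returns one string per output line, each holding up to five space-separated groups of ten residues.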

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-msf]               seqset     File containing a sequence alignment
*  -graph              graph      Graph type

   Optional qualifiers:
   -residuesperline    integer    The number of residues to be displayed on
                                  each line
   -resbreak           integer    Residues before a space
   -[no]ccolours       bool       Colour residues by their consensus value.
   -cidentity          string     Colour to display identical residues (RED)
   -csimilarity        string     Colour to display similar residues (GREEN)
   -cother             string     Colour to display other residues (BLACK)
   -docolour           bool       Colour residues by table oily, amide etc.
   -[no]title          bool       Display the title
   -shade              string     Set to BPLW for normal shading. For example,
                                  with pair = 1.5,1.0,0.5 and shade = BPLW:
                                  Residue score      Colour
                                  1.5 or over ...... BLACK (B)
                                  1.0 to 1.5 ....... BROWN (P)
                                  0.5 to 1.0 ....... WHEAT (L)
                                  under 0.5 ........ WHITE (W)
                                  The only four letters allowed are B, P, L
                                  and W, in any order.
   -pair               string     Score values representing identical, similar
                                  and related residues
   -identity           integer    Only match those which are identical in all
                                  sequences.
   -[no]box            bool       Display prettyboxes
   -boxcol             bool       Colour the background in the boxes
   -boxcolval          string     Colour to be used for background. (GREY)
   -[no]name           bool       Display the sequence names
   -maxnamelen         integer    Margin size for the sequence name.
   -[no]number         bool       Display the residue number
   -[no]listoptions    bool       Display the date and options used
   -plurality          float      Plurality check value (totweight/2)
   -consensus          bool       Display the consensus
   -[no]collision      bool       Allow collisions in calculating consensus
   -alternative        integer    Use alternative collisions routine
                                  0) Normal collision check. (default)
                                  1) checks identical scores with the max
                                  score found. So if any other residue matches
                                  the identical score then a collision has
                                  occurred.
                                  2) If another residue has a greater than or
                                  equal to matching score and these do not
                                  match then a collision has occurred.
                                  3) Checks all those not in the current
                                  consensus. If any of these give a top score
                                  for matching or identical scores then a
                                  collision has occurred.
   -matrixfile         matrix     Matrix file
   -showscore          integer    Print residue scores
   -portrait           bool       Set page to Portrait

   Advanced qualifiers:
   -data               bool       (no help text) bool value

   General qualifiers:
   -help               bool       Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
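
To make the -shade and -pair description above concrete, here is a small Python sketch of the score-to-shade lookup for pair = 1.5,1.0,0.5. It is an illustration under assumptions, not prettyplot's source: the function name shade_for_score is hypothetical, and treating a score exactly on a boundary as belonging to the higher band is one reading of "1.5 or over".

```python
# Letter-to-colour mapping as listed in the -shade help text.
COLOURS = {"B": "BLACK", "P": "BROWN", "L": "WHEAT", "W": "WHITE"}


def shade_for_score(score, pair="1.5,1.0,0.5", shade="BPLW"):
    """Return the colour name for a residue score.

    pair  -- three comma-separated thresholds, highest first
    shade -- four letters from B, P, L, W; one per score band, highest first
    """
    thresholds = [float(value) for value in pair.split(",")]
    if len(thresholds) != 3:
        raise ValueError("pair must hold exactly three values")
    if len(shade) != 4 or any(letter not in COLOURS for letter in shade):
        raise ValueError("shade must be four letters taken from B, P, L, W")

    # The first three letters belong to the three thresholds, in order.
    for threshold, letter in zip(thresholds, shade):
        if score >= threshold:
            return COLOURS[letter]
    # Anything below the lowest threshold takes the fourth letter.
    return COLOURS[shade[3]]
```

Reordering the letters changes which colour each band receives; for instance, shade="WLPB" would draw the highest-scoring residues in white.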


Qualifier | Description | Allowed values | Default

Mandatory qualifiers
[-msf] (Parameter 1) | File containing a sequence alignment | Readable sequences | Required
-graph | Graph type | EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png | EMBOSS_GRAPHICS value, or x11

Optional qualifiers
-residuesperline | The number of residues to be displayed on each line | Any integer value | 50
-resbreak | Residues before a space | Integer 1 or more | Same as -residuesperline to give no breaks
-[no]ccolours | Colour residues by their consensus value. | Yes/No | Yes
-cidentity | Colour to display identical residues (RED) | Any string is accepted | RED
-csimilarity | Colour to display similar residues (GREEN) | Any string is accepted | GREEN
-cother | Colour to display other residues (BLACK) | Any string is accepted | BLACK
-docolour | Colour residues by table oily, amide etc. | Yes/No | No
-[no]title | Display the title | Yes/No | Yes
-shade | Set to BPLW for normal shading. For pair = 1.5,1.0,0.5 and shade = BPLW, residue scores map to colours as follows: 1.5 or over, BLACK (B); 1.0 to 1.5, BROWN (P); 0.5 to 1.0, WHEAT (L); under 0.5, WHITE (W). The only four letters allowed are B, P, L and W, in any order. | Any string is accepted | An empty string is accepted
-pair | Score values representing identical, similar and related residues | Any string is accepted | 1.5,1.0,0.5
-identity | Only match those which are identical in all sequences. | Integer 0 or more | 0
-[no]box | Display prettyboxes | Yes/No | Yes
-boxcol | Colour the background in the boxes | Yes/No | No
-boxcolval | Colour to be used for background. (GREY) | Any string is accepted | GREY
-[no]name | Display the sequence names | Yes/No | Yes
-maxnamelen | Margin size for the sequence name. | Any integer value | 10
-[no]number | Display the residue number | Yes/No | Yes
-[no]listoptions | Display the date and options used | Yes/No | Yes
-plurality | Plurality check value (totweight/2) | Any numeric value | Half the total sequence weighting
-consensus | Display the consensus | Yes/No | No
-[no]collision | Allow collisions in calculating consensus | Yes/No | Yes
-alternative | Use alternative collisions routine. 0) Normal collision check (default). 1) Checks identical scores with the max score found. So if any other residue matches the identical score then a collision has occurred. 2) If another residue has a greater than or equal to matching score and these do not match then a collision has occurred. 3) Checks all those not in the current consensus. If any of these give a top score for matching or identical scores then a collision has occurred. | Integer from 0 to 3 | 0
-matrixfile | Matrix file | Comparison matrix file in EMBOSS data path | EBLOSUM62 for protein, EDNAFULL for DNA
-showscore | Print residue scores | Any integer value | -1
-portrait | Set page to Portrait | Yes/No | No

Advanced qualifiers
-data | (no help text) bool value | Yes/No | No
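
The -plurality default of half the total sequence weighting can be illustrated with a deliberately simplified consensus calculation for one alignment column. The Python sketch below is not prettyplot's algorithm. It counts identical residues only (no comparison matrix and no collision checks), it assumes the winning residue's summed weight must exceed the plurality value, and the function name column_consensus and the "-" placeholder for "no consensus" are hypothetical.

```python
from collections import defaultdict


def column_consensus(column, weights=None, plurality=None):
    """Return the consensus residue for one alignment column, or "-".

    column    -- the residues found in this column, one per sequence
    weights   -- per-sequence weights (assumed 1.0 each if omitted)
    plurality -- cut-off; defaults to half the total sequence weighting
    """
    if not column:
        return "-"
    if weights is None:
        weights = [1.0] * len(column)
    if len(weights) != len(column):
        raise ValueError("one weight per sequence is required")
    if plurality is None:
        plurality = sum(weights) / 2.0

    # Sum the weights of the sequences that carry each residue.
    totals = defaultdict(float)
    for residue, weight in zip(column, weights):
        totals[residue] += weight

    best = max(totals, key=totals.get)
    # Assumption: the top residue must score strictly above the cut-off.
    return best if totals[best] > plurality else "-"
```

With four equally weighted sequences the default cut-off is 2, so a column reading AAAG yields A while AAGG yields no consensus. Raising the cut-off, as -plurality=3 does in the sample session, makes the consensus stricter.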

Input file format

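Any valid sequence USA (Uniform Sequence Address) that refers to a set of aligned sequences, such as the file globin.msf used in the sample session.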

Output file format

An image of the alignment is drawn on the graphics device selected with -graph (the EMBOSS_GRAPHICS value, or x11 if that is not set).

Data files

Prettyplot uses a comparison matrix file to calculate similarity to the consensus.

For protein sequences the EBLOSUM62 substitution matrix is used; for nucleotide sequences, EDNAFULL is used. A different matrix can be selected with -matrixfile.

EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.

Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".

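The directories are searched in the following order:

    . (your current directory)
    .embossdata (under your current directory)
    ~/ (your home directory)
    ~/.embossdata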
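
The lookup described above can be sketched in Python. This is a hedged illustration, not EMBOSS code. The function names candidate_dirs and find_data_file are hypothetical, and the order used here (current directory, its .embossdata subdirectory, home directory, its .embossdata subdirectory, then the directory named by EMBOSS_DATA) is an assumption based on the preceding paragraphs.

```python
import os
from pathlib import Path


def candidate_dirs(cwd=None, home=None, emboss_data=None):
    """List the directories to search, most specific first (assumed order)."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    if emboss_data is None:
        emboss_data = os.environ.get("EMBOSS_DATA")

    dirs = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    if emboss_data:
        # Standard EMBOSS data directory, tried last in this sketch.
        dirs.append(Path(emboss_data))
    return dirs


def find_data_file(name, **kwargs):
    """Return the first existing file called `name`, or None if not found."""
    for directory in candidate_dirs(**kwargs):
        path = directory / name
        if path.is_file():
            return path
    return None
```

Under these assumptions, find_data_file("EBLOSUM62") would return a user's own copy of the matrix in ./.embossdata ahead of the copy distributed with EMBOSS.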

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It exits with status 0 unless an error is reported.
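
For scripts that embed prettyplot, this exit-status convention is enough to detect a failed run. A minimal Python sketch follows, using a hypothetical helper name succeeded. It works for any command, and nothing in it is specific to EMBOSS.

```python
import subprocess


def succeeded(argv):
    """Run the command in argv and report whether it exited with status 0."""
    completed = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,   # never wait on an interactive prompt
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode == 0
```

Assuming prettyplot is installed and on the PATH, a script might call succeeded(["prettyplot", "globin.msf", "-plurality=3", "-docolour", "-graph=ps"]), reusing the qualifiers from the sample session and a device from the -graph list. If the program is not installed, subprocess.run raises FileNotFoundError rather than returning a status.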

Known bugs

Portrait mode does not cover the whole page! This is a "feature" in plplot.

See also

Program name   Description
abiview        Reads ABI file and display the trace
cirdna         Draws circular maps of DNA constructs
emma           Multiple alignment program - interface to ClustalW program
infoalign      Information on a multiple sequence alignment
lindna         Draws linear maps of DNA constructs
pepnet         Displays proteins as a helical net
pepwheel       Shows protein sequences as helices
plotcon        Plots the quality of conservation of a sequence alignment
prettyseq      Output sequence with translated ranges
remap          Display a sequence with restriction cut sites, translation etc
seealso        Finds programs sharing group names
showalign      Displays a multiple sequence alignment
showdb         Displays information on the currently available databases
showfeat       Show features of a sequence
showseq        Display a sequence with features, translation etc
textsearch     Search sequence documentation text. SRS and Entrez are faster!
tranalign      Align nucleic coding regions given the aligned proteins

Author(s)

This application was written by Ian Longden (il@sanger.ac.uk) Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

Many features were first implemented in the EGCG program "prettyplot" by Peter Rice.

The original suggestions for the PrettyPlot program were from Denis Duboule and Sigfried Labeit at EMBL. Gert Vriend added the star marking. Rita Grandori suggested the -NOCOLLISION option.

History

Completed 5th May 1999.

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments