EMBOSS: cusp


Program cusp

Function

Create a codon usage table

Description

Reads one or more coding sequences (CDS sequence only) and calculates a codon frequency table.

The output file can be used as a codon usage table in other applications.

Usage

Here is a sample session with cusp, using just one sequence. Normal use would be to extract a set of coding sequences and to use these as input.

% cusp -sbeg 135 -send 1292
Create a codon usage table
Input sequence: embl:paamir
Output file [paamir.cusp]: 

Command line arguments

   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
  [-outfile]           outfile    Output file name

   Optional qualifiers: (none)
   Advanced qualifiers:
   -cfile              codon      Codon usage table name

   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose


   Mandatory qualifiers          Allowed values                         Default
   [-sequence] (Parameter 1)     Readable sequence(s)                   Required
     Sequence database USA
   [-outfile] (Parameter 2)      Output file                            <sequence>.cusp
     Output file name

   Optional qualifiers           Allowed values                         Default
   (none)

   Advanced qualifiers           Allowed values                         Default
   -cfile                        Codon usage file in EMBOSS data path   Ehum.cut
     Codon usage table name

Input file format

Output file format

Here is the start of the output from the example run, which used a single CDS from Pseudomonas aeruginosa. This organism has a very high GC content and a strong coding bias, as shown by the codons for alanine, where those ending in G or C are used almost exclusively.

# CUSP codon usage file
# Codon Amino acid      Fract   /1000   Number
GCA     A               0.077   7.772   3
GCC     A               0.462   46.632  18
GCG     A               0.462   46.632  18
GCT     A               0.000   0.000   0

.........
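
As an illustration of how another application might read this file, here is a minimal Python sketch that parses the layout shown above. It assumes that data rows hold exactly five whitespace-separated fields and that lines starting with '#' are comments. The function name parse_cusp is hypothetical and is not part of EMBOSS.

```python
def parse_cusp(text):
    """Parse cusp-style output into {codon: (amino_acid, fract, per_1000, number)}.

    Assumption: comment lines start with '#', and every data row has
    exactly five whitespace-separated fields. Other lines are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 5:
            continue  # not a data row (for example, an elision marker)
        codon, amino_acid, fract, per_1000, number = fields
        table[codon] = (amino_acid, float(fract), float(per_1000), int(number))
    return table


# The rows below are copied from the example output.
sample = """\
# CUSP codon usage file
# Codon Amino acid      Fract   /1000   Number
GCA     A               0.077   7.772   3
GCC     A               0.462   46.632  18
GCG     A               0.462   46.632  18
GCT     A               0.000   0.000   0
"""

table = parse_cusp(sample)
print(table["GCC"])   # ('A', 0.462, 46.632, 18)
```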

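The 'Fract' column gives the fraction of the occurrences of an amino acid that are encoded by this codon triplet. In the example, GCC accounts for 18 of the 39 alanine codons, giving 0.462.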

The '/1000' column gives the number of times the codon occurs per 1000 codons of the input sequence(s). This is an extrapolation if the input contains fewer than 1000 codons. In the example, 3 GCA codons among the 386 codons of the selected region give 7.772.
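
The two derived columns can be reproduced from the raw counts. The Python sketch below is not cusp's own code. It assumes that Fract is the codon count divided by the total count for its amino acid, and that the /1000 value is the count scaled by the total number of codons, which is what reproduces the alanine rows of the example. The total of 386 codons is inferred from the -sbeg 135 -send 1292 range of the sample session, and usage_row is a hypothetical helper.

```python
ALANINE = ("GCA", "GCC", "GCG", "GCT")

# Counts copied from the Number column of the example output.
counts = {"GCA": 3, "GCC": 18, "GCG": 18, "GCT": 0}

# Assumption: positions 135..1292 are read as whole codons.
total_codons = (1292 - 135 + 1) // 3   # 386


def usage_row(codon, counts, family, total_codons):
    """Return (fract, per_1000, number) for one codon.

    family is the set of synonymous codons for the same amino acid.
    """
    number = counts[codon]
    family_total = sum(counts[c] for c in family)
    fract = number / family_total if family_total else 0.0
    per_1000 = 1000.0 * number / total_codons
    return fract, per_1000, number


for codon in ALANINE:
    fract, per_1000, number = usage_row(codon, counts, ALANINE, total_codons)
    print(f"{codon}\tA\t{fract:.3f}\t{per_1000:.3f}\t{number}")
```

Run as written, this prints the same four alanine rows as the example output (0.077 and 7.772 for GCA, 0.462 and 46.632 for GCC and GCG, zeros for GCT).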

If multiple sequences are input then the statistics are given for all of the sequences together, not individually.
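
Pooling of this kind can be pictured as summing the codon counts of every sequence before any statistics are derived. The following Python sketch illustrates the idea and is not taken from the cusp source. The function name pooled_codon_counts is hypothetical, the two short sequences are invented for the demonstration, and the handling of a trailing incomplete codon (ignored here) is an assumption.

```python
from collections import Counter


def pooled_codon_counts(sequences):
    """Count codons over all sequences together, reading each from its first base."""
    totals = Counter()
    for seq in sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3   # assumption: drop a trailing partial codon
        totals.update(seq[i:i + 3] for i in range(0, usable, 3))
    return totals


# Two made-up coding sequences, used only to show the pooling.
pooled = pooled_codon_counts(["ATGGCCGCG", "ATGGCGTAA"])
print(pooled["GCG"], sum(pooled.values()))   # 2 6
```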

Data files

cusp reads a codon usage file, but only as a template. It does not use any of the data in that file, so any codon usage file will give the same results.

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

Always exits with status 0.

Known bugs

None.

See also

Program name   Description
cai            CAI codon adaptation index
chips          Codon usage statistics
codcmp         Codon usage table comparison
syco           Synonymous codon usage Gribskov statistic plot

Author(s)

This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk)

History

Spring 2000 - written

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments