EMBOSS: dan


Program dan

Function

Calculates DNA or RNA/DNA melting temperature

Description

Dan calculates the melting temperature (Tm) and the percent G+C of a nucleic acid sequence, optionally plotting them. The melting temperature profile is computed from free energy values derived from nearest-neighbor thermodynamics (Breslauer et al. Proc. Natl. Acad. Sci. USA 83, 3746-3750 and Baldino et al. Methods in Enzymol. 168, 761-777).
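The windowing and the G+C percentage can be illustrated with a minimal Python sketch. It is not dan's implementation: it only slides a window along a sequence and reports G+C per window, and it does not attempt the Tm calculation, which depends on the nearest-neighbor data files. The function name gc_windows and the choice to report full-length windows only are assumptions made for illustration. The test sequence is the first 27 bases of PAAMIR, reassembled from the example windows shown under "Output file format".

```python
def gc_windows(seq, window=20, shift=1):
    """Yield (start, subsequence, end, gc_percent) for each full window.

    Positions are 1-based and inclusive, mirroring the layout of the
    example output. Partial windows at the end of the sequence are
    skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    for i in range(0, len(seq) - window + 1, shift):
        sub = seq[i:i + window]
        gc = 100.0 * sum(base in "GC" for base in sub) / window
        yield i + 1, sub, i + window, gc


# First 27 bases of PAAMIR, taken from the example output in this document.
PAAMIR_START = "GGTACCGCTGGCCGAGCATCTGCTCGA"

for start, sub, end, gc in gc_windows(PAAMIR_START):
    print(f"{start:4d} {sub} {end:4d} GC%={gc:.1f}")
```

With the default window size of 20 and shift increment of 1, the G+C values printed for these eight windows agree with the GC% column of the example output.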

Usage

Here is a sample session with dan.

% dan
Input sequence: embl:paamir
Enter window size [20]: 
Enter Shift Increment [1]: 
Enter DNA concentration (nM) [50.]: 
Enter salt concentration (mM) [50.]: 
Output file [paamir.dan]: 

An example of producing a plot of Tm:

% dan -plot
Input sequence(s): embl:paamir
Enter window size [20]: 
Enter Shift Increment [1]: 
Enter DNA concentration (nM) [50.]: 
Enter salt concentration (mM) [50.]: 
Enter minimum temperature [55.]: 
Graph type [x11]: 

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-sequence]          seqall     Sequence database USA
   -windowsize         integer    The values of melting point and other
                                  thermodynamic properties of the sequence are
                                  determined by taking a short length of
                                  sequence known as a window and determining
                                  the properties of the sequence in that
                                  window. The window is incrementally moved
                                  along the sequence with the properties being
                                  calculated at each new position.
   -shiftincrement     integer    This is the amount by which the window is
                                  moved at each increment in order to find the
                                  melting point and other properties along
                                  the sequence.
   -dnaconc            float      Enter DNA concentration (nM)
   -saltconc           float      Enter salt concentration (mM)
*  -mintemp            float      Enter a minimum value for the temperature
                                  scale (y-axis) of the plot.
*  -graph              xygraph    Graph type
*  -formamide          float      This specifies the percent formamide to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -mismatch           float      This specifies the percent mismatch to be
                                  used in calculations (it is ignored unless
                                  -product is used).
*  -prodlen            integer    This specifies the product length to be used
                                  in calculations (it is ignored unless
                                  -product is used).
*  -outfile            report     If a plot is not being produced then data on
                                  the melting point etc. in each window along
                                  the sequence is output to the file.

   Optional qualifiers (* if not always prompted):
*  -temperature        float      If -thermo has been specified then this
                                  specifies the temperature at which to
                                  calculate the DeltaG, DeltaH and DeltaS
                                  values.

   Advanced qualifiers:
   -plot               bool       If this is not specified then the file of
                                  output data is produced, else a plot of the
                                  melting point along the sequence is
                                  produced.
   -rna                bool       This specifies that the sequence is an RNA
                                  sequence and not a DNA sequence.
   -product            bool       This prompts for percent formamide, percent
                                  of mismatches allowed and product length.
   -thermo             bool       Output the DeltaG, DeltaH and DeltaS values
                                  of the sequence windows to the output data
                                  file.

   General qualifiers:
  -help                bool       report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose


Qualifiers with their allowed values and defaults:

Mandatory qualifiers
  [-sequence] (Parameter 1)
      Sequence database USA
      Allowed values: Readable sequence(s)
      Default: Required
  -windowsize
      The values of melting point and other thermodynamic properties of the sequence are determined by taking a short length of sequence known as a window and determining the properties of the sequence in that window. The window is incrementally moved along the sequence with the properties being calculated at each new position.
      Allowed values: Integer from 1 to 100
      Default: 20
  -shiftincrement
      This is the amount by which the window is moved at each increment in order to find the melting point and other properties along the sequence.
      Allowed values: Integer 1 or more
      Default: 1
  -dnaconc
      Enter DNA concentration (nM)
      Allowed values: Number from 1.000 to 100000.000
      Default: 50.
  -saltconc
      Enter salt concentration (mM)
      Allowed values: Number from 1.000 to 1000.000
      Default: 50.
  -mintemp
      Enter a minimum value for the temperature scale (y-axis) of the plot.
      Allowed values: Number from 0.000 to 150.000
      Default: 55.
  -graph
      Graph type
      Allowed values: EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png
      Default: EMBOSS_GRAPHICS value, or x11
  -formamide
      This specifies the percent formamide to be used in calculations (it is ignored unless -product is used).
      Allowed values: Number from 0.000 to 100.000
      Default: 0.
  -mismatch
      This specifies the percent mismatch to be used in calculations (it is ignored unless -product is used).
      Allowed values: Number from 0.000 to 100.000
      Default: 0.
  -prodlen
      This specifies the product length to be used in calculations (it is ignored unless -product is used).
      Allowed values: Any integer value
      Default: Window size (20)
  -outfile
      If a plot is not being produced then data on the melting point etc. in each window along the sequence is output to the file.
      Allowed values: Report file

Optional qualifiers
  -temperature
      If -thermo has been specified then this specifies the temperature at which to calculate the DeltaG, DeltaH and DeltaS values.
      Allowed values: Number from 0.000 to 100.000
      Default: 25.

Advanced qualifiers
  -plot
      If this is not specified then the file of output data is produced, else a plot of the melting point along the sequence is produced.
      Allowed values: Yes/No
      Default: No
  -rna
      This specifies that the sequence is an RNA sequence and not a DNA sequence.
      Allowed values: Yes/No
      Default: No
  -product
      This prompts for percent formamide, percent of mismatches allowed and product length.
      Allowed values: Yes/No
      Default: No
  -thermo
      Output the DeltaG, DeltaH and DeltaS values of the sequence windows to the output data file.
      Allowed values: Yes/No
      Default: No
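For scripted use, the qualifiers and ranges listed above can be checked before dan is invoked. The following Python sketch only builds an argument list. The helper name build_dan_command is hypothetical, and the range checks simply restate the allowed values documented here. It does not run dan itself.

```python
def build_dan_command(sequence, outfile, windowsize=20, shiftincrement=1,
                      dnaconc=50.0, saltconc=50.0, rna=False):
    """Return an argument list for dan using qualifiers documented above.

    Raises ValueError when a value falls outside the documented range.
    """
    if not 1 <= windowsize <= 100:
        raise ValueError("windowsize must be an integer from 1 to 100")
    if shiftincrement < 1:
        raise ValueError("shiftincrement must be 1 or more")
    if not 1.0 <= dnaconc <= 100000.0:
        raise ValueError("dnaconc must be from 1 to 100000 nM")
    if not 1.0 <= saltconc <= 1000.0:
        raise ValueError("saltconc must be from 1 to 1000 mM")

    cmd = [
        "dan",
        "-sequence", sequence,
        "-windowsize", str(windowsize),
        "-shiftincrement", str(shiftincrement),
        "-dnaconc", str(dnaconc),
        "-saltconc", str(saltconc),
        "-outfile", outfile,
    ]
    if rna:
        # RNA input must be flagged explicitly, otherwise DNA is assumed.
        cmd.append("-rna")
    return cmd


print(" ".join(build_dan_command("embl:paamir", "paamir.dan")))
```

On a system where EMBOSS is installed, the returned list could be passed to subprocess.run. That step is left out here so the sketch runs without dan being present.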

Input file format

Any DNA or RNA sequence USA.

Output file format

If a plot is not being produced, dan reports the sequence of each oligomer window, its melting temperature under the specified conditions and its GC content.

This is the start of the output file from the example.


DAN of: PAAMIR   from: 1  to: 2167

   1 GGTACCGCTGGCCGAGCATC   20 Tm=64.9 GC%=70.0
   2 GTACCGCTGGCCGAGCATCT   21 Tm=63.7 GC%=65.0
   3 TACCGCTGGCCGAGCATCTG   22 Tm=63.7 GC%=65.0
   4 ACCGCTGGCCGAGCATCTGC   23 Tm=66.9 GC%=70.0
   5 CCGCTGGCCGAGCATCTGCT   24 Tm=66.7 GC%=70.0
   6 CGCTGGCCGAGCATCTGCTC   25 Tm=65.5 GC%=70.0
   7 GCTGGCCGAGCATCTGCTCG   26 Tm=65.5 GC%=70.0
   8 CTGGCCGAGCATCTGCTCGA   27 Tm=63.7 GC%=65.0
etc.

The first non-blank line is the title containing the program name, the sequence name and the start and end positions of the sequence to be considered.

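Subsequent lines contain columns of data for each window into the sequence as it is moved along, giving:

  1. The start position of the window
  2. The sequence in the window
  3. The end position of the window
  4. The melting temperature of the window
  5. The GC% of the window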

If the qualifier '-product' is used to make the program prompt for percent formamide, percent of mismatches allowed and product length, then the output includes the melting temperature of the specified product:


DAN of: PAAMIR   from: 1  to: 2167

   1 GGTACCGCTGGCCGAGCATC   20 Tm=64.9 GC%=70.0 Tm(prod)=54.9
   2 GTACCGCTGGCCGAGCATCT   21 Tm=63.7 GC%=65.0 Tm(prod)=52.8
   3 TACCGCTGGCCGAGCATCTG   22 Tm=63.7 GC%=65.0 Tm(prod)=52.8
   4 ACCGCTGGCCGAGCATCTGC   23 Tm=66.9 GC%=70.0 Tm(prod)=54.9
   5 CCGCTGGCCGAGCATCTGCT   24 Tm=66.7 GC%=70.0 Tm(prod)=54.9
   6 CGCTGGCCGAGCATCTGCTC   25 Tm=65.5 GC%=70.0 Tm(prod)=54.9
   7 GCTGGCCGAGCATCTGCTCG   26 Tm=65.5 GC%=70.0 Tm(prod)=54.9
   8 CTGGCCGAGCATCTGCTCGA   27 Tm=63.7 GC%=65.0 Tm(prod)=52.8
etc.
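
Report lines in either of the two layouts shown above can be read back with a short parser. The Python sketch below is an illustration only: the regular expression is inferred from the example lines in this document (start position, window sequence, end position, Tm, GC% and an optional Tm(prod)), the name parse_dan_line is hypothetical, and the extra columns written with '-thermo' are not handled because their layout is not shown here.

```python
import re

# Pattern inferred from the example output lines; not an official format spec.
LINE_RE = re.compile(
    r"^\s*(?P<start>\d+)\s+(?P<seq>[A-Za-z]+)\s+(?P<end>\d+)"
    r"\s+Tm=(?P<tm>-?\d+(?:\.\d+)?)"
    r"\s+GC%=(?P<gc>\d+(?:\.\d+)?)"
    r"(?:\s+Tm\(prod\)=(?P<tm_prod>-?\d+(?:\.\d+)?))?\s*$"
)


def parse_dan_line(line):
    """Parse one window line; return a dict, or None for non-data lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # title line, blank line, etc.
    return {
        "start": int(m["start"]),
        "sequence": m["seq"],
        "end": int(m["end"]),
        "tm": float(m["tm"]),
        "gc_percent": float(m["gc"]),
        "tm_prod": float(m["tm_prod"]) if m["tm_prod"] else None,
    }


example = "   1 GGTACCGCTGGCCGAGCATC   20 Tm=64.9 GC%=70.0 Tm(prod)=54.9"
print(parse_dan_line(example))
```

Returning None for lines that do not match lets a caller skip the title line and blank lines without special cases.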

If the qualifier '-thermo' is given then the DeltaG, DeltaH and DeltaS of the sequence in the window are also output.
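
The three quantities are linked by the standard Gibbs relation DeltaG = DeltaH - T * DeltaS, which is why a temperature has to be supplied with '-temperature'. The Python sketch below shows that arithmetic only. Whether dan applies exactly this expression, and the units assumed here (kcal/mol for DeltaH, cal/(mol K) for DeltaS, degrees Celsius for the temperature), are assumptions of the sketch, and the input numbers are made up for illustration.

```python
def delta_g(delta_h_kcal, delta_s_cal_per_k, temperature_c=25.0):
    """Return DeltaG in kcal/mol from DeltaH, DeltaS and a temperature.

    Units are assumptions of this sketch: DeltaH in kcal/mol,
    DeltaS in cal/(mol K), temperature in degrees Celsius.
    """
    kelvin = temperature_c + 273.15
    # Divide by 1000 to convert the entropy term from cal to kcal.
    return delta_h_kcal - kelvin * delta_s_cal_per_k / 1000.0


# Hypothetical values, not taken from dan output.
print(round(delta_g(-10.0, -25.0, 25.0), 3))
```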

Data files

The EMBOSS data files "Edna.melt" and "Erna.melt" are used to read in the entropy/enthalpy/energy data for DNA and RNA respectively.

EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.

Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".

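The directories are searched in the following order:

  1. . (your current directory)
  2. .embossdata (under your current directory)
  3. ~/ (your home directory)
  4. ~/.embossdata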
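
The lookup can be pictured as a first-match search over a list of directories. The Python sketch below is not EMBOSS code. It assumes the order current directory, ".embossdata" under it, home directory, ".embossdata" under home, and then falls back to the distributed data directory. That final fallback and the function name find_data_file are assumptions made for illustration.

```python
import os


def find_data_file(name, cwd, home, emboss_data=None):
    """Return the first existing path for a data file, or None.

    Search order (an assumption of this sketch): cwd, cwd/.embossdata,
    home, home/.embossdata, then the distributed data directory if given.
    """
    candidates = [
        cwd,
        os.path.join(cwd, ".embossdata"),
        home,
        os.path.join(home, ".embossdata"),
    ]
    if emboss_data:
        candidates.append(emboss_data)
    for directory in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None


# Example call; prints None unless a matching file exists.
print(find_data_file("Edna.melt", os.getcwd(), os.path.expanduser("~"),
                     os.environ.get("EMBOSS_DATA")))
```

Passing the directories in as arguments, rather than reading them inside the function, keeps the search easy to exercise against temporary directories.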

Notes

None.

References

  1. Breslauer, K.J., Frank, R., Blocker, H., and Marky, L.A. (1986). "Predicting DNA Duplex Stability from the Base Sequence." Proceedings of the National Academy of Sciences USA 83, 3746-3750.
  2. Baldino, M., Jr. (1989). "High Resolution In Situ Hybridization Histochemistry." In Methods in Enzymology, (P.M. Conn, ed.), 168, 761-777, Academic Press, San Diego, California, USA.

Warnings

RNA sequences must be submitted to this application with the '-rna' qualifier on the command line, otherwise the sequence will be assumed to be DNA.

Diagnostic Error Messages

None.

Exit status

0 if successful.

Known bugs

None.

See also

Program name   Description
banana         Bending and curvature plot in B-DNA
btwisted       Calculates the twisting in a B-DNA sequence
chaos          Create a chaos game representation plot for a sequence
compseq        Counts the composition of dimer/trimer/etc words in a sequence
freak          Residue/base frequency table or plot
isochore       Plots isochores in large DNA sequences
wordcount      Counts words of a specified size in a DNA sequence

Author(s)

This program was originally included in EGCG under the names "MELT" and "MELTPLOT", written by Rodrigo Lopez.

This application was written by Alan Bleasby (ableasby@hgmp.mrc.ac.uk)

History

Written (1999) - Alan Bleasby

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments