EMBOSS: etandem
Input sequences are converted into ACGT or N (so ambiguity codes are ignored).
The score is +1 for a match, -1 for a mismatch.
The first copy of a repeat is ignored when scoring.
The highest score is kept for each start position and repeat size.
The lowest score to be reported is set by the threshold score, which can be set on the command line with the -threshold qualifier; the default is 20. For perfect repeats, the score is the length of the repeat, not counting the first copy. Reduce the threshold score a little if you wish to allow mismatches. Each mismatch scores -1 instead of +1, so a repeat with one mismatch scores 2 less than a perfect match of the same number of bases.
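The scoring rules above can be illustrated with a small Python sketch. It is not the etandem implementation: the function names are hypothetical, each base is compared with the first copy of the repeat rather than with a consensus, and the special handling of N (see the -mismatch qualifier) is not modelled.

```python
def normalise(seq: str) -> str:
    """Map every character other than A, C, G or T to N."""
    return "".join(c if c in "ACGT" else "N" for c in seq.upper())


def tandem_score(seq: str, size: int) -> int:
    """Score seq as a run of tandem copies of its first `size` bases.

    Assumption for illustration: later bases are compared with the base at
    the same offset in the first copy, scoring +1 for a match and -1 for a
    mismatch. The first copy itself contributes nothing to the score.
    """
    seq = normalise(seq)
    unit = seq[:size]
    score = 0
    for i in range(size, len(seq)):  # start after the first copy
        score += 1 if seq[i] == unit[i % size] else -1
    return score


perfect = "ACCCTA" * 5                  # five perfect copies, 30 bases
one_mismatch = "ACCCTA" * 4 + "ACCGTA"  # one mismatch in the last copy

print(tandem_score(perfect, 6))       # 24: 30 bases minus the first copy
print(tandem_score(one_mismatch, 6))  # 22: one mismatch costs 2
```

With the default threshold of 20, both of these toy repeats would still be reported under this simplified scoring; a second mismatch would bring the score down to exactly 20.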
Running with a wide range of repeat sizes is inefficient. That is why equicktandem was written - to give a rapid estimate of the major repeat sizes.
    % etandem
    Input sequence: embl:hhtetra
    Output file [hhtetra.tan]:
    Minimum repeat size [10]: 6
    Maximum repeat size [6]:
    Mandatory qualifiers:
      [-sequence]          sequence   Sequence USA
      [-outfile]           outfile    Output file name
       -minrepeat          integer    Minimum repeat size
       -maxrepeat          integer    Maximum repeat size

    Optional qualifiers: (none)

    Advanced qualifiers:
       -threshold          integer    Threshold score
       -mismatch           bool       Allow N as a mismatch
       -uniform            bool       Allow uniform consensus

    General qualifiers:
       -help               bool       report command line options. More
                                      information on associated and general
                                      qualifiers can be found with -help -verbose
Mandatory qualifiers | Description | Allowed values | Default |
---|---|---|---|
[-sequence] (Parameter 1) | Sequence USA | Readable sequence | Required |
[-outfile] (Parameter 2) | Output file name | Output file | <sequence>.etandem |
-minrepeat | Minimum repeat size | Integer, 2 or higher | 10 |
-maxrepeat | Maximum repeat size | Integer, same as -minrepeat or higher | Same as -minrepeat |
Optional qualifiers | Description | Allowed values | Default |
(none) | | | |
Advanced qualifiers | Description | Allowed values | Default |
-threshold | Threshold score | Any integer value | 20 |
-mismatch | Allow N as a mismatch | Yes/No | No |
-uniform | Allow uniform consensus | Yes/No | No |
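The qualifiers listed above can be combined into a non-interactive command line. The following Python sketch only assembles the argument list, using the qualifier names, defaults and allowed ranges from the table; the helper name `etandem_argv` is hypothetical, and actually running the command (for example with `subprocess.run`) assumes that etandem is installed and on the PATH.

```python
import shlex
from typing import List, Optional


def etandem_argv(sequence: str, outfile: str, minrepeat: int = 10,
                 maxrepeat: Optional[int] = None,
                 threshold: int = 20) -> List[str]:
    """Build an etandem argument list from the documented qualifiers."""
    if minrepeat < 2:
        raise ValueError("-minrepeat must be 2 or higher")
    if maxrepeat is None:
        maxrepeat = minrepeat  # documented default: same as -minrepeat
    if maxrepeat < minrepeat:
        raise ValueError("-maxrepeat must be the same as -minrepeat or higher")
    return [
        "etandem",
        "-sequence", sequence,
        "-outfile", outfile,
        "-minrepeat", str(minrepeat),
        "-maxrepeat", str(maxrepeat),
        "-threshold", str(threshold),
    ]


# Same settings as the interactive session: repeat size fixed at 6.
argv = etandem_argv("embl:hhtetra", "hhtetra.tan", minrepeat=6)
print(shlex.join(argv))
```

Keeping the result as a list rather than a single string avoids shell quoting problems if the sequence or file name contains spaces.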
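The columns of the report show, in order: the score, the start position, the end position, the repeat size, the number of copies, the percent identity and the consensus sequence of the repeat.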
An example of the output is:
    120  793  936  6  24   93.8  acccta
     90  283  420  6  23   84.8  taaccc
     38  432  485  6   9   90.7  ccctaa
     26  494  529  6   6   94.4  ccctaa
     24  568  597  6   5  100.0  aaccct
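Output in this form is easy to read into a script. The Python sketch below parses the example rows into named fields; the field names are assumptions inferred from the example values (for every row, end - start + 1 equals the repeat size times the count), and the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class Repeat(NamedTuple):
    score: int
    start: int
    end: int
    size: int
    count: int
    identity: float  # percent
    consensus: str


def parse_line(line: str) -> Repeat:
    """Parse one whitespace-separated line of the report."""
    score, start, end, size, count, identity, consensus = line.split()
    return Repeat(int(score), int(start), int(end), int(size),
                  int(count), float(identity), consensus)


example = """\
120 793 936 6 24 93.8 acccta
90 283 420 6 23 84.8 taaccc
38 432 485 6 9 90.7 ccctaa
26 494 529 6 6 94.4 ccctaa
24 568 597 6 5 100.0 aaccct
"""

repeats: List[Repeat] = [parse_line(l) for l in example.splitlines() if l.strip()]

for r in repeats:
    # Consistency check: the span covers exactly `count` copies of `size` bases.
    assert r.end - r.start + 1 == r.size * r.count
    print(r.start, r.end, r.consensus, r.score)
```

The last row also agrees with the scoring description: five perfect copies of a 6-base repeat score 30 minus the 6 bases of the first copy, which is 24.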
Program name | Description |
---|---|
einverted | Finds DNA inverted repeats |
equicktandem | Finds tandem repeats |
palindrome | Looks for inverted repeats in a nucleotide sequence |
This application was modified for inclusion in EMBOSS by Peter Rice (pmr@sanger.ac.uk) Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.