EMBOSS: complex


Program complex

Function

Find the linguistic complexity in nucleotide sequences

Description

Usage

Here is a sample session with complex.

% complex -omnia
Input sequence: embl:*
Output sequence [hscad5.fasta]: 
Window length [100]: 
Step size [5]: 
Minimum word length [4]: 
Maximum word length [6]: 
Output file [hscad5.complex]: 
do embComWriteFile

HSCAD5 3170
HSD 781
HSEGL1 3919
HSFAU 518
HSFOS 6210
HSEF2 3075
HSHT 1658
CEZK637 40699
PDRHOD 1675
ECLAC 7477
ECLACA 1832
ECLACI 1113
ECLACY 1500
ECLACZ 3078
PAAMIB 1212
PAAMIE 1065
PAAMIR 2167
PAAMIS 1130
MMAM 366
RNOPS 1493
RNU68037 1218
HHTETRA 1272
100 5 0 4 6

Command line arguments

   Mandatory qualifiers (* if not always prompted):
  [-sequence]          seqall     Sequence database USA
*  -outseq             seqoutall  Output sequence(s) USA
   -lwin               integer    Window length
   -step               integer    the displacement of the window over the
                                  sequence
   -jmin               integer    Minimum word length
   -jmax               integer    Maximum word length
   -outfile            outfile    Output file name
*  -ujtable            outfile    UjTable temporary file name

   Optional qualifiers: (none)
   Advanced qualifiers:
   -omnia              bool       calculate over a set of sequences
   -sim                integer    calculate the linguistic complexity by
                                  comparison with a number of simulations
                                  having a uniform distribution of bases
   -freq               bool       execute the simulation of a sequence based
                                  on the base frequency of the original
                                  sequence
   -print              bool       generate a file named UjTable containing the
                                  values of Uj for each word j in the real
                                  sequence(s) and in any simulated sequences


Mandatory qualifiers                                          Allowed values          Default
[-sequence]     Sequence database USA                         Readable sequence(s)    Required
(Parameter 1)
-outseq         Output sequence(s) USA                        Writeable sequence(s)   <sequence>.format
-lwin           Window length                                 Any integer value       100
-step           the displacement of the window over the       Any integer value       5
                sequence
-jmin           Minimum word length                           Integer from 2 to 20    4
-jmax           Maximum word length                           Integer from 2 to 50    6
-outfile        Output file name                              Output file             <sequence>.complex
-ujtable        UjTable temporary file name                   Output file             complex.ujtable

Optional qualifiers                                           Allowed values          Default
(none)

Advanced qualifiers                                           Allowed values          Default
-omnia          calculate over a set of sequences             Yes/No                  No
-sim            calculate the linguistic complexity by        Any integer value       0
                comparison with a number of simulations
                having a uniform distribution of bases
-freq           execute the simulation of a sequence based    Yes/No                  No
                on the base frequency of the original
                sequence
-print          generate a file named UjTable containing the  Yes/No                  No
                values of Uj for each word j in the real
                sequence(s) and in any simulated sequences
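
The following Python sketch illustrates how the -lwin, -step, -jmin and -jmax qualifiers could interact in a windowed complexity calculation. It is not taken from the complex source code. It assumes one commonly used definition of linguistic complexity, in which Uj is the number of distinct words of length j in a window divided by the maximum number possible, and the window value is the product of Uj over all word lengths. Averaging the window values over the sequence is also an assumption, and the function names are hypothetical.

```python
def window_complexity(window, jmin=4, jmax=6):
    """Product over j of Uj = distinct j-words / maximum possible j-words.

    Assumed definition for illustration; complex may compute Uj differently.
    """
    value = 1.0
    for j in range(jmin, jmax + 1):
        n_positions = len(window) - j + 1
        if n_positions <= 0:
            continue  # window too short to hold a word of length j
        distinct = len({window[i:i + j] for i in range(n_positions)})
        # At most 4**j different nucleotide words exist, and a window
        # cannot contain more words than it has start positions.
        max_distinct = min(4 ** j, n_positions)
        value *= distinct / max_distinct
    return value


def sequence_complexity(seq, lwin=100, step=5, jmin=4, jmax=6):
    """Average the window value over windows of length lwin moved by step."""
    seq = seq.upper()
    last_start = max(len(seq) - lwin, 0)
    starts = range(0, last_start + 1, step)
    values = [window_complexity(seq[s:s + lwin], jmin, jmax) for s in starts]
    return sum(values) / len(values)
```

Under this definition a sequence made of a single repeated base scores close to 0, while a window in which every word is different scores 1.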

Input file format

Output file format

Here is the main output file from the example.

Sequence TEMBL:HHTETRA contains repeats and is included in the test database for repeat analysis. This is consistent with its complexity value (0.3114), which is the lowest in the output.


Length of window : 100 
jmin : 4 
jmax : 6 
step : 5 
Execution without simulation 
----------------------------------------------------------------------------
|                  |                  |                  |                  |
|     number of    |      name of     |     length of    |      value of    |
|     sequence     |     sequence     |     sequence     |     complexity   |
|                  |                  |                  |                  |
----------------------------------------------------------------------------
         1                      HSCAD5           3170             0.6921 
         2                         HSD            781             0.6991 
         3                      HSEGL1           3919             0.6618 
         4                       HSFAU            518             0.6739 
         5                       HSFOS           6210             0.6681 
         6                       HSEF2           3075             0.6925 
         7                        HSHT           1658             0.7314 
         8                     CEZK637          40699             0.6307 
         9                      PDRHOD           1675             0.6201 
        10                       ECLAC           7477             0.7137 
        11                      ECLACA           1832             0.6916 
        12                      ECLACI           1113             0.7480 
        13                      ECLACY           1500             0.6801 
        14                      ECLACZ           3078             0.7278 
        15                      PAAMIB           1212             0.6596 
        16                      PAAMIE           1065             0.6418 
        17                      PAAMIR           2167             0.6562 
        18                      PAAMIS           1130             0.6989 
        19                        MMAM            366             0.7163 
        20                       RNOPS           1493             0.6571 
        21                    RNU68037           1218             0.6381 
        22                     HHTETRA           1272             0.3114 
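
For downstream scripts, the table above can be read with a few lines of Python. This is a minimal sketch that assumes the layout shown in this example (four whitespace-separated fields per data row: number, name, length, complexity). The function name parse_complex_table is hypothetical and is not part of EMBOSS.

```python
def parse_complex_table(text):
    """Return one dict per data row of a complex output file.

    Header, ruler and parameter lines are skipped because they do not
    split into exactly four fields of the form int, name, int, float.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        try:
            number = int(fields[0])
            length = int(fields[2])
            value = float(fields[3])
        except ValueError:
            continue  # four fields, but not a data row
        rows.append({
            "number": number,
            "name": fields[1],
            "length": length,
            "complexity": value,
        })
    return rows
```

Applied to the contents of hscad5.complex from the sample session, min(rows, key=lambda r: r["complexity"]) would pick out the HHTETRA row.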

Data files

Notes

References

Warnings

Diagnostic Error Messages

Exit status

Known bugs

See also

Program name   Description
banana         Bending and curvature plot in B-DNA
btwisted       Calculates the twisting in a B-DNA sequence
dan            Calculates DNA RNA/DNA melting temperature

Author(s)

This application was written by Donata Colangelo (areadc37@area.ba.cnr.it)

History

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments