EMBOSS: seqmatchall
The larger the specified word size, the faster the comparison will proceed. Regions whose stretches of identity are shorter than the word size will be missed. You should therefore choose a word size that is small enough to find those regions of similarity you are interested in within a reasonable time-frame.
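The trade-off described above can be illustrated with a small sketch. This is not the seqmatchall implementation; `shared_words` and the two toy sequences are hypothetical, chosen so that the sequences share exactly one 8-base region of identity. The sketch only shows why a region shorter than the word size produces no hits.

```python
def shared_words(seq_a: str, seq_b: str, word_size: int) -> list[tuple[int, int]]:
    """Return (pos_a, pos_b) pairs where an identical word of word_size starts.

    Illustrative only: a plain word index, not the EMBOSS algorithm.
    """
    if word_size < 2:
        raise ValueError("word size must be 2 or more")
    # Index every word of seq_a by the positions at which it starts.
    index: dict[str, list[int]] = {}
    for i in range(len(seq_a) - word_size + 1):
        index.setdefault(seq_a[i:i + word_size], []).append(i)
    # Look up every word of seq_b in that index.
    hits = []
    for j in range(len(seq_b) - word_size + 1):
        for i in index.get(seq_b[j:j + word_size], []):
            hits.append((i, j))
    return hits


# Toy sequences (assumed data) sharing the 8-base region ACGTACGG.
a = "GGGGACGTACGGTTTT"
b = "CCCCACGTACGGCCCC"

for size in (4, 8, 9):
    print(size, shared_words(a, b, size))
```

With a word size of 9 the shared region is one base too short, so the loop reports no hits for that size, while smaller word sizes find it at the cost of indexing and checking more words.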
    % seqmatchall
    Does an all-against-all comparison of a set of sequences
    Input sequence set: embl:eclac*
    Word size [4]: 15
    Output file [outfile.seqmatchall]:
    Mandatory qualifiers:
      [-sequence]   seqset    Sequence set USA
       -wordsize    integer   Word size
      [-outfile]    outfile   Output file name

    Optional qualifiers: (none)
    Advanced qualifiers: (none)
    General qualifiers:
       -help        bool      report command line options. More
                              information on associated and general
                              qualifiers can be found with -help -verbose
Mandatory qualifiers | | Allowed values | Default |
---|---|---|---|
[-sequence] (Parameter 1) | Sequence set USA | Readable sequences | Required |
-wordsize | Word size | Integer 2 or more | 4 |
[-outfile] (Parameter 2) | Output file name | Output file | <sequence>.seqmatchall |
Optional qualifiers | | Allowed values | Default |
(none) | | | |
Advanced qualifiers | | Allowed values | Default |
(none) | | | |
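As a sketch of non-interactive use, the qualifiers listed above can be assembled into a command line from a script. The helper `seqmatchall_command` and the output file name `eclac.seqmatchall` are hypothetical; only the `-sequence`, `-wordsize` and `-outfile` qualifiers come from the documentation above. Running the command is left commented out because it assumes EMBOSS is installed and the `embl` database is configured.

```python
import shlex


def seqmatchall_command(sequence: str, outfile: str, wordsize: int = 4) -> list[str]:
    """Build an argument list for seqmatchall from its documented qualifiers."""
    if wordsize < 2:
        # The qualifier table lists "Integer 2 or more" as the allowed values.
        raise ValueError("wordsize must be an integer of 2 or more")
    return [
        "seqmatchall",
        "-sequence", sequence,
        "-wordsize", str(wordsize),
        "-outfile", outfile,
    ]


# Same inputs as the interactive example; the output name is an assumption.
cmd = seqmatchall_command("embl:eclac*", "eclac.seqmatchall", wordsize=15)
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal

# To actually run it (requires a working EMBOSS installation):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell expansion of the `*` wildcard, which is meant for EMBOSS to interpret rather than the shell.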
The sequences must all be of the same type: either all protein or all nucleic acid.
ECLAC (the complete E. coli lac operon) matches ECLACI, ECLACZ, ECLACY and ECLACA (the individual genes), and there is a short overlap between ECLACY and the flanking genes ECLACZ and ECLACA.
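    1832 5645 7477 ECLAC  0    1832 ECLACA
    1113 48   1161 ECLAC  0    1113 ECLACI
    1500 4304 5804 ECLAC  0    1500 ECLACY
    3078 1286 4364 ECLAC  0    3078 ECLACZ
    158  1    159  ECLACA 1342 1500 ECLACY
    59   1    60   ECLACY 3019 3078 ECLACZ

The output is a list of regions of identity in pairs of sequences, each consisting of one line with 7 columns of data separated by TABs or space characters. As the example rows show, the columns are, in order: the length of the region of identity; the start position, end position and name of the first sequence; and the start position, end position and name of the second sequence. For example, the first line reports a region of 1832 identical bases running from 5645 to 7477 in ECLAC and from 0 to 1832 in ECLACA.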
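A minimal sketch of reading this output from a script follows. The column meanings (length, then start, end and name for each sequence) are inferred from the example rows rather than taken from a format specification, and the `Match` record and `parse_seqmatchall` function are hypothetical names.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """One output line; field meanings are inferred from the example rows."""
    length: int
    start1: int
    end1: int
    name1: str
    start2: int
    end2: int
    name2: str


def parse_seqmatchall(text: str) -> list[Match]:
    """Parse whitespace-separated 7-column lines, skipping anything else."""
    matches = []
    for line in text.splitlines():
        fields = line.split()  # handles both TABs and spaces
        if len(fields) != 7:
            continue  # blank or unexpected line
        length, s1, e1, n1, s2, e2, n2 = fields
        matches.append(Match(int(length), int(s1), int(e1), n1, int(s2), int(e2), n2))
    return matches


# The example output shown above.
sample = """\
1832 5645 7477 ECLAC 0 1832 ECLACA
1113 48 1161 ECLAC 0 1113 ECLACI
1500 4304 5804 ECLAC 0 1500 ECLACY
3078 1286 4364 ECLAC 0 3078 ECLACZ
158 1 159 ECLACA 1342 1500 ECLACY
59 1 60 ECLACY 3019 3078 ECLACZ
"""

matches = parse_seqmatchall(sample)
for m in matches:
    print(f"{m.name1}:{m.start1}-{m.end1} == {m.name2}:{m.start2}-{m.end2} ({m.length})")
```

In every example row the end position minus the start position equals the first column, which is the basis for reading the first column as the match length.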
Program name | Description |
---|---|
matcher | Finds the best local alignments between two sequences |
supermatcher | Finds a match of a large sequence against one or more sequences |
water | Smith-Waterman local alignment |
wordmatch | Finds all exact matches of a given size between 2 sequences |
polydot will give a graphical view of the same matches.