EMBOSS: maskfeat
The feature table is then searched for features whose type matches the feature type(s) to be masked. By default, the type is 'repeat*' (i.e. any type whose name starts with 'repeat'). You can specify any other feature type, or several types, that you wish to mask. To specify more than one type, separate the names with spaces or commas. Type names may be wildcarded with asterisks '*' to match groups of feature types that share a common part of their names.
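The matching rule described above can be sketched in Python. This is only an illustration of the documented behaviour, not the EMBOSS implementation: the function names `parse_type_patterns` and `type_matches` are hypothetical, and treating the comparison as case-sensitive is an assumption that the text does not confirm.

```python
import re


def parse_type_patterns(type_spec):
    """Split a -type style string on spaces and/or commas into patterns."""
    return [p for p in re.split(r"[,\s]+", type_spec.strip()) if p]


def wildcard_match(pattern, name):
    """Match name against pattern, where '*' stands for any run of characters.

    Only '*' is treated as special; everything else is matched literally.
    Case-sensitive matching is an assumption of this sketch.
    """
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def type_matches(feature_type, type_spec="repeat*"):
    """Return True if feature_type matches any pattern in type_spec."""
    return any(wildcard_match(p, feature_type)
               for p in parse_type_patterns(type_spec))


if __name__ == "__main__":
    for ftype in ("repeat_region", "CDS", "5'UTR"):
        print(ftype, type_matches(ftype, "*UTR repeat*"))
```

With the default pattern, `repeat_region` and `repeat_unit` would both be selected, while `CDS` would not.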
If you are unsure of the names of feature types in use, please consult http://www3.ebi.ac.uk/Services/WebFeat/ for a list of the EMBL feature types and see Appendix A of the Swissprot user manual in http://www.expasy.ch/txt/userman.txt for a list of the Swissprot feature types.
If any features matching the specified feature types are found, those regions of the sequence are masked by replacing them with masking characters. The default masking character is 'X' for a protein sequence and 'N' for a nucleic acid sequence, although you can specify your own masking character if required.
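As a rough illustration of the masking step, the sketch below overwrites a list of regions with a masking character. The function name `mask_regions`, the `is_protein` flag, and the use of 1-based inclusive coordinates are assumptions made for this example; it does not read a feature table and is not the EMBOSS code.

```python
def mask_regions(sequence, regions, mask_char=None, is_protein=False):
    """Return a copy of sequence with each region replaced by mask_char.

    regions is an iterable of (start, end) pairs, 1-based and inclusive
    (an assumption of this sketch). When mask_char is None, the default
    is 'X' for protein and 'N' for nucleic acid, as described in the text.
    """
    if mask_char is None:
        mask_char = "X" if is_protein else "N"
    residues = list(sequence)
    for start, end in regions:
        lo = max(start, 1) - 1          # convert to 0-based, clamp at start
        hi = min(end, len(residues))    # clamp at sequence end
        for i in range(lo, hi):
            residues[i] = mask_char
    return "".join(residues)


if __name__ == "__main__":
    print(mask_regions("acgtacgtac", [(3, 5)]))
    print(mask_regions("MKTAYIAK", [(1, 2), (7, 8)], is_protein=True))
```

The sequence length is unchanged, so coordinates of features outside the masked regions stay valid.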
% maskfeat em:AB000360
Mask off features of a sequence.
Output sequence [ab000360.fasta]:
Mandatory qualifiers:
  [-sequence]  seqall  Sequence database USA
  [-outseq]    seqout  Output sequence USA

Optional qualifiers:
  -type      string  By default any feature in the feature table with a type starting 'repeat' is masked. You can set this to be any feature type you wish to mask. See http://www3.ebi.ac.uk/Services/WebFeat/ for a list of the EMBL feature types and see Appendix A of the Swissprot user manual in http://www.expasy.ch/txt/userman.txt for a list of the Swissprot feature types. The type may be wildcarded by using '*'. If you wish to mask more than one type, separate their names with spaces or commas, eg: *UTR repeat*
  -maskchar  string  Character to use when masking. Default is 'X' for protein sequences, 'N' for nucleic sequences.

Advanced qualifiers: (none)

General qualifiers:
  -help  bool  report command line options. More information on associated and general qualifiers can be found with -help -verbose
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Mandatory qualifiers | | | |
[-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required |
[-outseq] (Parameter 2) | Output sequence USA | Writeable sequence | <sequence>.format |
Optional qualifiers | | | |
-type | By default any feature in the feature table with a type starting 'repeat' is masked. You can set this to be any feature type you wish to mask. See http://www3.ebi.ac.uk/Services/WebFeat/ for a list of the EMBL feature types and see Appendix A of the Swissprot user manual in http://www.expasy.ch/txt/userman.txt for a list of the Swissprot feature types. The type may be wildcarded by using '*'. If you wish to mask more than one type, separate their names with spaces or commas, eg: *UTR repeat* | Any string is accepted | repeat* |
-maskchar | Character to use when masking. Default is 'X' for protein sequences, 'N' for nucleic sequences. | Any string is accepted | 'X' for protein, 'N' for nucleic |
Advanced qualifiers | | | |
(none) | | | |
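For scripted use, the qualifiers listed above can be assembled into an argument list. The sketch below only builds the list; the helper name `build_maskfeat_command` is hypothetical, and actually running the result (for example with `subprocess.run`) assumes that EMBOSS is installed and that the named database is configured.

```python
def build_maskfeat_command(sequence_usa, outseq_usa,
                           feature_types=None, mask_char=None):
    """Build an argument list for maskfeat from the documented qualifiers.

    Only -sequence, -outseq, -type and -maskchar are used, since these are
    the qualifiers described in this document. Multiple feature types are
    joined with spaces into a single -type value.
    """
    argv = ["maskfeat", "-sequence", sequence_usa, "-outseq", outseq_usa]
    if feature_types:
        argv += ["-type", " ".join(feature_types)]
    if mask_char is not None:
        argv += ["-maskchar", mask_char]
    return argv


if __name__ == "__main__":
    # Mask UTR and repeat features, using '-' as the masking character.
    print(build_maskfeat_command("em:AB000360", "ab000360.fasta",
                                 feature_types=["*UTR", "repeat*"],
                                 mask_char="-"))
```

Passing the arguments as a list avoids shell quoting problems with the `*` wildcard in the `-type` value.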
The output from the above example is:
>AB000360 AB000360 Homo sapiens PIGC gene, complete cds.
ggatccctgctgcagagggggtaacggtgtctggcttgccaagcaatatttgttgtggtc
tatcatggaagaaataaagtcgggcaatatgaattttttttttctcaaatttgccggatg
gctgtggtgtttctgactcttagttttctcattgtgaaaaaggaatgattatcttcttcg
atcctctcaagagtttccttgttttgagtagattgatagctctttaaaggatgctaagct
cagctaatggaagaagagtctagtttctttgaggctttgattttggttaaactatagagc
tcatacctttctgtatggtgcagcttactattgtctttggattggtaacttaaaaaatac
aaataacatgcctttgagaaccaataaaaactatggatattatccctataaatttacaca
aatccagatataagcatgcaatgtgatatacctaagggatatgtgaaccactgagttaag
aactgctttagagggagatacaatgtgagacacaggctttgggataagactttggtttga
atcctggctctgctctgttaccttagggcaaagttacttaagcatcttgaatctcagctt
ttttaccaaagcaggactaatactaacttacaaggtggtgaggattaagtgaaagaagat
acataaggcacttagcacatagtaggtactcaataagcgatagctaacagatgtctatta
ttattcaaggaattataattttcaaatctgaaatgcagttttaatgtcccataaggtgac
taccacatacatttttctcagacttttagtaaactgagttgatttgactttatctcagta
ctactcttgacctttcacaactttcgtaggttcacagtctctctttttctaggaacttgg
ctgtgttgtcctgcctcagagacaaattcatctattgtaggcctagcccctgcctttgaa
aacaaggaaaggttggtagaacatcaacacagcatggaatttccagggaggtctcatttc
aaaacttcataaagaacaagaaccacctggacttctgtgagggcgatgattaaactggcc
tgagtttgaatgaaaggataatgtatgctcaacctgtgactaacaccaaggaggtcaagt
ggcagaaggtcttgtatgagcgacagccctttcctgataactatgtggaccggcgattcc
tggaagagctccggaaaaacatccatgctcggaaataccaatattgggctgtggtatttg
agtccagtgtggtgatccagcagctgtgcagtgtttgtgtttttgtggttatctggtggt
atatggatgagggtcttctggccccccattggcttttagggactggcctggcttcttcac
tgattgggtatgttttgtttgatctcattgatggaggtgaagggcggaagaagagtgggc
agacccggtgggctgacctgaagagtgccctagtcttcattactttcacttatgggtttt
caccagtgctgaagacccttacagagtctgtcagcactgacaccatctatgccatgtcag
tcttcatgctgttaggccatctcatcttttttgactatggtgccaatgctgccattgtat
ccagcacactatccttgaacatggccatctttgcttctgtatgcttggcatcacgtcttc
cccggtccctgcatgccttcatcatggtgacatttgccattcagatttttgccctgtggc
ccatgttgcagaagaaactaaaggcatgtactccccggagctatgtgggggtcacactgc
tttttgcattttcagccgtgggaggcctactgtccattagtgctgtgggagccgtactct
ttgcccttctgctgatgtctatctcatgtctgtgttcattctacctcattcgcttgcagc
tttttaaagaaaacattcatgggccttgggatgaagctgaaatcaaggaagacttgtcca
ggttcctcagttaaattaggacatccattacattattaaagcaagctgatagattagcct
cctaactagtatagaacttaaagacagagttccattctggaagcagcatgtcattgtggt
aagagaatagagatcaaaaccaaaaaaaatgaaccaaaggcttgggtggtgagggtgctt
atcctttctgttattttgtagatgaaaaaactttctggggacctcttgaattacatgctg
taacatatgaagtgatgtggtttctattaaaaaaataacacatccatcaagttgtctcat
gatttttccataaacaggaggcagacagaggggcatgaagagtgaagtaaNNNNNNNNNN
NNNNNNNNNNNNNNNNaaagtcacttctttctacccttttcaatgtgctaatgctctttt
atttatctagggctcaaatcttagaacacagggtgctatgctcagttttgttgcccaaga
tcacagaattggttacttaaccttgactcagagtttctaccttgttcttagggaagcata
tcacaactaattgcaaagcagagtgtgatgtgtcacaataagcagaatgctagggggaat
tc
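To check which positions were masked in output like the example above, one option is to scan the sequence for runs of the masking character. The following is a minimal sketch; the helper names `fasta_sequence` and `masked_runs` are hypothetical, and the approach cannot distinguish masked positions from residues that were already 'N' (or 'X') in the input.

```python
import re


def fasta_sequence(text):
    """Join the sequence lines of a single-entry FASTA text, skipping the header."""
    return "".join(line.strip() for line in text.splitlines()
                   if line.strip() and not line.startswith(">"))


def masked_runs(sequence, mask_char="N"):
    """Return 1-based inclusive (start, end) pairs for each run of mask_char."""
    pattern = re.compile(re.escape(mask_char) + "+")
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    toy = ">toy example\nacgtNNNN\nNNacgt\n"
    seq = fasta_sequence(toy)
    print(masked_runs(seq))
```

Because the sequence lines are joined before scanning, a masked region that spans a line break is reported as a single run.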
Program name | Description |
---|---|
biosed | Replace or delete sequence sections |
coderet | Extract CDS, mRNA and translations from feature tables |
cutseq | Removes a specified section from a sequence |
degapseq | Removes gap characters from sequences |
descseq | Alter the name or description of a sequence |
entret | Reads and writes (returns) flatfile entries |
extractfeat | Extract features from a sequence |
extractseq | Extract regions from a sequence |
listor | Writes a list file of the logical OR of two sets of sequences |
maskseq | Mask off regions of a sequence |
newseq | Type in a short new sequence |
noreturn | Removes carriage return from ASCII files |
notseq | Excludes a set of sequences and writes out the remaining ones |
nthseq | Writes one sequence from a multiple set of sequences |
pasteseq | Insert one sequence into another |
revseq | Reverse and complement a sequence |
seqret | Reads and writes (returns) sequences |
seqretsplit | Reads and writes (returns) sequences in individual files |
showfeat | Show features of a sequence |
splitter | Split a sequence into (overlapping) smaller sequences |
swissparse | Retrieves sequences from swissprot using keyword search |
trimest | Trim poly-A tails off EST sequences |
trimseq | Trim ambiguous bits off the ends of sequences |
union | Reads sequence fragments and builds one sequence |
vectorstrip | Strips out DNA between a pair of vector sequences |
yank | Reads a sequence range, appends the full USA to a list file |
maskseq simply masks a user-specified set of regions, without using annotated features.