contacts |
% contacts
Reads CCF files (clean coordinate files) and writes CON files (contact files) of intra-chain residue-residue contact data.
Location of CCF files (clean coordinate files) (input) [./]: structure
Name of data file with van der Waals radii [Evdw.dat]:
Threshold contact distance [1.0]: 1
Location of CON files (contact files) (output) [./]:
Name of log file for the build [contacts.log]:

1cs4 1ii7 D1CS4A_ D1II7A_
Standard (Mandatory) qualifiers:
[-cpdbdir] dirlist This option specifies the location of CCF files (clean coordinate files) (input). A 'clean coordinate file' contains protein coordinate and derived data for a single PDB file ('protein clean coordinate file') or a single domain from SCOP or CATH ('domain clean coordinate file'), in CCF format (EMBL-like). The files, generated by using PDBPARSE (PDB files) or DOMAINER (domains), contain 'cleaned-up' data that is self-consistent and error-corrected. Records for residue solvent accessibility and secondary structure are added to the file by using PDBPLUS.
-vdwfile datafile This option specifies the name of the data file with van der Waals radii of atoms for different amino acid residues.
-threshold float Contact between two residues is defined as when the van der Waals surface of any atom of the first residue comes within the threshold contact distance of the van der Waals surface of any atom of the second residue. The threshold contact distance is a user-defined distance with a default value of 1 Angstrom.
[-conoutdir] outdir This option specifies the location of CON files (contact files) (output). A 'contact file' contains contact data for a protein or a domain from SCOP or CATH, in the CON format (EMBL-like). The contacts may be intra-chain residue-residue, inter-chain residue-residue or residue-ligand. The files are generated by using CONTACTS, INTERFACE and FUNKY.
-conerrfile outfile The log file contains messages about any errors arising while contacts ran.

Additional (Optional) qualifiers:
-[no]ccfnaming boolean This option specifies whether to use pdbid code to name the output files. If set, the PDB identifier code (from the PDB file) is used to name the file. Otherwise, the output files have the same names as the input files.
-skip boolean Whether to calculate contacts between residues adjacent in sequence.
-ignore float If any two atoms from two different residues are at least this distance apart, then no further inter-atomic contacts will be checked for that residue pair. This speeds the calculation up considerably.

Advanced (Unprompted) qualifiers: (none)

Associated qualifiers:
"-conerrfile" associated qualifiers
-odirectory string Output directory

General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write standard output
-filter boolean Read standard input, write standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options. More information on associated and general qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report deaths
Standard (Mandatory) qualifiers | Description | Allowed values | Default |
---|---|---|---|
[-cpdbdir] (Parameter 1) | This option specifies the location of CCF files (clean coordinate files) (input). A 'clean coordinate file' contains protein coordinate and derived data for a single PDB file ('protein clean coordinate file') or a single domain from SCOP or CATH ('domain clean coordinate file'), in CCF format (EMBL-like). The files, generated by using PDBPARSE (PDB files) or DOMAINER (domains), contain 'cleaned-up' data that is self-consistent and error-corrected. Records for residue solvent accessibility and secondary structure are added to the file by using PDBPLUS. | Directory with files | ./ |
-vdwfile | This option specifies the name of the data file with van der Waals radii of atoms for different amino acid residues. | Data file | Evdw.dat |
-threshold | Contact between two residues is defined as when the van der Waals surface of any atom of the first residue comes within the threshold contact distance of the van der Waals surface of any atom of the second residue. The threshold contact distance is a user-defined distance with a default value of 1 Angstrom. | Any numeric value | 1.0 |
[-conoutdir] (Parameter 2) | This option specifies the location of CON files (contact files) (output). A 'contact file' contains contact data for a protein or a domain from SCOP or CATH, in the CON format (EMBL-like). The contacts may be intra-chain residue-residue, inter-chain residue-residue or residue-ligand. The files are generated by using CONTACTS, INTERFACE and FUNKY. | Output directory | ./ |
-conerrfile | The log file contains messages about any errors arising while contacts ran. | Output file | contacts.log |
Additional (Optional) qualifiers | Description | Allowed values | Default |
-[no]ccfnaming | This option specifies whether to use pdbid code to name the output files. If set, the PDB identifier code (from the PDB file) is used to name the file. Otherwise, the output files have the same names as the input files. | Boolean value Yes/No | Yes |
-skip | Whether to calculate contacts between residues adjacent in sequence. | Boolean value Yes/No | No |
-ignore | If any two atoms from two different residues are at least this distance apart, then no further inter-atomic contacts will be checked for that residue pair. This speeds the calculation up considerably. | Any numeric value | 20.0 |
Advanced (Unprompted) qualifiers | Description | Allowed values | Default |
(none) | | | |
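The contact definition used by -threshold and the cut-off used by -ignore can be illustrated with a minimal Python sketch. This is not the contacts source code: the atom representation, the `RADII` subset and the function name `residues_in_contact` are assumptions made for illustration, and the exact order in which the real program applies the -ignore test is not stated in this document.

```python
import math
from itertools import product

# Illustrative subset of van der Waals radii (Angstroms), keyed by atom name.
# Assumption: a real run would take these values from the -vdwfile data file.
RADII = {"N": 1.7, "CA": 1.9, "C": 1.7, "O": 1.4, "CB": 1.9}


def residues_in_contact(res_a, res_b, radii=RADII, threshold=1.0, ignore=20.0):
    """Return True if the two residues are in contact.

    Each residue is a list of (atom_name, (x, y, z)) tuples (hypothetical layout).
    Two atoms are in contact when the gap between their van der Waals
    surfaces is no more than `threshold`.
    """
    for (name_a, xyz_a), (name_b, xyz_b) in product(res_a, res_b):
        centre_distance = math.dist(xyz_a, xyz_b)
        if centre_distance >= ignore:
            # Atoms this far apart: stop checking the residue pair altogether.
            return False
        surface_gap = centre_distance - radii[name_a] - radii[name_b]
        if surface_gap <= threshold:
            return True
    return False


residue_1 = [("CA", (0.0, 0.0, 0.0))]
residue_2 = [("CA", (4.5, 0.0, 0.0))]
print(residues_in_contact(residue_1, residue_2))
```

In this sketch the early exit makes the result depend on the order in which atom pairs are visited, which is one plausible reading of the -ignore description rather than a confirmed detail.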
The format of the clean coordinate file is described in the pdbparse documentation.
XX Intra-chain residue-residue contact data. XX TY INTRA XX EX THRESH 1.0; IGNORE 20.0; NMOD 1; NCHA 1 XX NE 1 XX EN [1] XX ID PDB 1cs4; DOM .; LIG . XX CN MO 1; CN1 1; CN2 .; ID1 A; ID2 .; NRES1 52; NRES2 . XX S1 SEQUENCE 52 AA; 5817 MW; 47362A43 CRC32; ADIEGFTSLA SQCTAQELVM TLNELFARFD KLAAENHCLR IKILGDCYYC VS XX NC SM 163; LI . XX SM ASP 2 ; ILE 3 SM ASP 2 ; GLU 4 SM ASP 2 ; ASP 46 SM ASP 2 ; CYS 47 SM ILE 3 ; GLU 4 SM ILE 3 ; GLY 5 SM ILE 3 ; PHE 6 SM ILE 3 ; LEU 9 SM ILE 3 ; LEU 25 SM ILE 3 ; ASP 46 SM GLU 4 ; GLY 5 SM GLU 4 ; PHE 6 SM GLY 5 ; PHE 6 SM GLY 5 ; THR 7 SM GLY 5 ; SER 8 SM GLY 5 ; LEU 9 SM PHE 6 ; THR 7 SM PHE 6 ; SER 8 SM PHE 6 ; LEU 9 SM PHE 6 ; ALA 10 SM PHE 6 ; LEU 18 SM PHE 6 ; LEU 22 SM PHE 6 ; GLY 45 SM PHE 6 ; ASP 46 SM THR 7 ; SER 8 SM THR 7 ; LEU 9 SM THR 7 ; ALA 10 SM THR 7 ; SER 11 SM SER 8 ; LEU 9 SM SER 8 ; ALA 10 SM SER 8 ; SER 11 [Part of this file has been deleted for brevity] SM PHE 29 ; LYS 31 SM PHE 29 ; LEU 32 SM PHE 29 ; ALA 33 SM ASP 30 ; LYS 31 SM ASP 30 ; LEU 32 SM ASP 30 ; ALA 33 SM ASP 30 ; ALA 34 SM ASP 30 ; ARG 40 SM LYS 31 ; LEU 32 SM LYS 31 ; ALA 33 SM LYS 31 ; ALA 34 SM LYS 31 ; GLU 35 SM LEU 32 ; ALA 33 SM LEU 32 ; ALA 34 SM LEU 32 ; GLU 35 SM LEU 32 ; ASN 36 SM ALA 33 ; ALA 34 SM ALA 33 ; GLU 35 SM ALA 33 ; ASN 36 SM ALA 33 ; HIS 37 SM ALA 33 ; CYS 38 SM ALA 34 ; GLU 35 SM ALA 34 ; ASN 36 SM ALA 34 ; HIS 37 SM GLU 35 ; ASN 36 SM GLU 35 ; HIS 37 SM ASN 36 ; HIS 37 SM ASN 36 ; CYS 38 SM HIS 37 ; CYS 38 SM HIS 37 ; LEU 39 SM CYS 38 ; LEU 39 SM CYS 38 ; ARG 40 SM LEU 39 ; ARG 40 SM LEU 39 ; ILE 41 SM ARG 40 ; ILE 41 SM ARG 40 ; LYS 42 SM ARG 40 ; ILE 43 SM ILE 41 ; LYS 42 SM LYS 42 ; ILE 43 SM LYS 42 ; LEU 44 SM LYS 42 ; CYS 47 SM ILE 43 ; LEU 44 SM ILE 43 ; GLY 45 SM ILE 43 ; CYS 47 SM LEU 44 ; GLY 45 SM LEU 44 ; ASP 46 SM LEU 44 ; CYS 47 SM GLY 45 ; ASP 46 SM GLY 45 ; CYS 47 SM ASP 46 ; CYS 47 // |
XX Intra-chain residue-residue contact data. XX TY INTRA XX EX THRESH 1.0; IGNORE 20.0; NMOD 1; NCHA 1 XX NE 1 XX EN [1] XX ID PDB 1ii7; DOM .; LIG . XX CN MO 1; CN1 1; CN2 .; ID1 A; ID2 .; NRES1 65; NRES2 . XX S1 SEQUENCE 65 AA; 7396 MW; 0CFB92A3 CRC32; MKFAHLADIH LGYEQFHKPQ REEEFAEAFK NALEIAVQEN VDFILIAGDL FHSSRPSPGT LKKAI XX NC SM 151; LI . XX SM ASP 8 ; ILE 9 SM ASP 8 ; HIS 10 SM ASP 8 ; GLY 48 SM ASP 8 ; ASP 49 SM ILE 9 ; HIS 10 SM ILE 9 ; LEU 11 SM ILE 9 ; PHE 25 SM ILE 9 ; PHE 29 SM ILE 9 ; ILE 46 SM ILE 9 ; ASP 49 SM ILE 9 ; LEU 50 SM HIS 10 ; LEU 11 SM HIS 10 ; GLY 12 SM HIS 10 ; TYR 13 SM HIS 10 ; PHE 25 SM HIS 10 ; ASP 49 SM HIS 10 ; LEU 50 SM LEU 11 ; GLY 12 SM LEU 11 ; TYR 13 SM LEU 11 ; ALA 26 SM LEU 11 ; PHE 29 SM LEU 11 ; LEU 50 SM GLY 12 ; TYR 13 SM GLY 12 ; GLU 14 SM GLY 12 ; GLU 22 SM TYR 13 ; GLU 14 SM TYR 13 ; GLN 15 SM TYR 13 ; GLU 22 SM TYR 13 ; PHE 25 SM GLU 14 ; GLN 15 [Part of this file has been deleted for brevity] SM ASN 31 ; ILE 35 SM ALA 32 ; LEU 33 SM ALA 32 ; GLU 34 SM ALA 32 ; ILE 35 SM ALA 32 ; ALA 36 SM LEU 33 ; GLU 34 SM LEU 33 ; ILE 35 SM LEU 33 ; ALA 36 SM LEU 33 ; VAL 37 SM LEU 33 ; ILE 44 SM GLU 34 ; ILE 35 SM GLU 34 ; ALA 36 SM GLU 34 ; VAL 37 SM GLU 34 ; GLN 38 SM ILE 35 ; ALA 36 SM ILE 35 ; VAL 37 SM ILE 35 ; GLN 38 SM ILE 35 ; GLU 39 SM ALA 36 ; VAL 37 SM ALA 36 ; GLN 38 SM ALA 36 ; GLU 39 SM ALA 36 ; ASN 40 SM ALA 36 ; VAL 41 SM ALA 36 ; ILE 44 SM VAL 37 ; GLN 38 SM VAL 37 ; GLU 39 SM VAL 37 ; ASN 40 SM GLN 38 ; GLU 39 SM GLN 38 ; ASN 40 SM GLU 39 ; ASN 40 SM GLU 39 ; VAL 41 SM ASN 40 ; VAL 41 SM ASN 40 ; ASP 42 SM VAL 41 ; ASP 42 SM VAL 41 ; PHE 43 SM VAL 41 ; ILE 44 SM ASP 42 ; PHE 43 SM PHE 43 ; ILE 44 SM PHE 43 ; LEU 45 SM ILE 44 ; LEU 45 SM ILE 44 ; ILE 46 SM LEU 45 ; ILE 46 SM LEU 45 ; ALA 47 SM ILE 46 ; ALA 47 SM ILE 46 ; GLY 48 SM ILE 46 ; LEU 50 SM ALA 47 ; GLY 48 SM GLY 48 ; ASP 49 SM GLY 48 ; LEU 50 SM ASP 49 ; LEU 50 // |
XX Intra-chain residue-residue contact data. XX TY INTRA XX EX THRESH 1.0; IGNORE 20.0; NMOD 1; NCHA 1 XX NE 1 XX EN [1] XX ID PDB 1CS4; DOM D1CS4A_; LIG . XX CN MO 1; CN1 1; CN2 .; ID1 A; ID2 .; NRES1 225; NRES2 . XX S1 SEQUENCE 225 AA; 25486 MW; 437C8290 CRC32; MHHHHHHAME MKADINAKQE DMMFHKIYIQ KHDNVSILFA DIEGFTSLAS QCTAQELVMT LNELFARFDK LAAENHCLRI KILGDCYYCV SGLPEARADH AHCCVEMGMD MIEAISLVRE MTGVNVNMRV GIHSGRVHCG VLGLRKWQFD VWSNDVTLAN HMEAGGKAGR IHITKATLSY LNGDYEVEPG CGGERNAYLK EHSIETFLIL RCTQKRKEEK AMIAK XX NC SM 843; LI . XX SM MET 22 ; MET 23 SM MET 22 ; PHE 24 SM MET 22 ; HIS 25 SM MET 22 ; LYS 26 SM MET 23 ; PHE 24 SM MET 23 ; HIS 25 SM PHE 24 ; HIS 25 SM PHE 24 ; LYS 26 SM HIS 25 ; LYS 26 SM HIS 25 ; ILE 27 SM HIS 25 ; LEU 144 SM LYS 26 ; ILE 27 SM LYS 26 ; TYR 28 SM ILE 27 ; TYR 28 SM ILE 27 ; ILE 29 SM TYR 28 ; ILE 29 SM TYR 28 ; GLY 140 SM TYR 28 ; VAL 141 SM TYR 28 ; TRP 147 SM ILE 29 ; GLN 30 SM ILE 29 ; HIS 138 SM ILE 29 ; CYS 139 SM ILE 29 ; GLY 140 SM ILE 29 ; VAL 141 SM ILE 29 ; LEU 142 SM ILE 29 ; TRP 152 SM GLN 30 ; LYS 31 SM GLN 30 ; HIS 32 [Part of this file has been deleted for brevity] SM GLY 192 ; GLU 194 SM GLY 192 ; ARG 195 SM GLY 192 ; ASN 196 SM GLY 192 ; LEU 199 SM GLY 192 ; THR 206 SM GLY 193 ; GLU 194 SM GLY 193 ; ARG 195 SM GLY 193 ; ASN 196 SM GLY 193 ; LEU 199 SM GLY 193 ; LYS 200 SM GLU 194 ; ARG 195 SM GLU 194 ; ASN 196 SM ARG 195 ; ASN 196 SM ASN 196 ; ALA 197 SM ASN 196 ; TYR 198 SM ASN 196 ; LEU 199 SM ASN 196 ; LYS 200 SM ALA 197 ; TYR 198 SM ALA 197 ; LEU 199 SM ALA 197 ; LYS 200 SM ALA 197 ; GLU 201 SM TYR 198 ; LEU 199 SM TYR 198 ; LYS 200 SM TYR 198 ; GLU 201 SM TYR 198 ; HIS 202 SM LEU 199 ; LYS 200 SM LEU 199 ; GLU 201 SM LEU 199 ; HIS 202 SM LEU 199 ; SER 203 SM LEU 199 ; ILE 204 SM LEU 199 ; THR 206 SM LYS 200 ; GLU 201 SM LYS 200 ; HIS 202 SM LYS 200 ; SER 203 SM GLU 201 ; HIS 202 SM GLU 201 ; SER 203 SM HIS 202 ; SER 203 SM HIS 202 ; ILE 204 SM SER 203 ; ILE 204 SM SER 203 ; GLU 205 SM ILE 204 ; GLU 205 SM ILE 204 ; 
THR 206 SM GLU 205 ; THR 206 SM GLU 205 ; PHE 207 SM THR 206 ; PHE 207 SM PHE 207 ; LEU 208 SM PHE 207 ; ILE 209 SM LEU 208 ; ILE 209 SM LEU 208 ; LEU 210 SM ILE 209 ; LEU 210 // |
XX Intra-chain residue-residue contact data. XX TY INTRA XX EX THRESH 1.0; IGNORE 20.0; NMOD 1; NCHA 1 XX NE 1 XX EN [1] XX ID PDB 1II7; DOM D1II7A_; LIG . XX CN MO 1; CN1 1; CN2 .; ID1 A; ID2 .; NRES1 65; NRES2 . XX S1 SEQUENCE 65 AA; 7396 MW; 0CFB92A3 CRC32; MKFAHLADIH LGYEQFHKPQ REEEFAEAFK NALEIAVQEN VDFILIAGDL FHSSRPSPGT LKKAI XX NC SM 151; LI . XX SM ASP 8 ; ILE 9 SM ASP 8 ; HIS 10 SM ASP 8 ; GLY 48 SM ASP 8 ; ASP 49 SM ILE 9 ; HIS 10 SM ILE 9 ; LEU 11 SM ILE 9 ; PHE 25 SM ILE 9 ; PHE 29 SM ILE 9 ; ILE 46 SM ILE 9 ; ASP 49 SM ILE 9 ; LEU 50 SM HIS 10 ; LEU 11 SM HIS 10 ; GLY 12 SM HIS 10 ; TYR 13 SM HIS 10 ; PHE 25 SM HIS 10 ; ASP 49 SM HIS 10 ; LEU 50 SM LEU 11 ; GLY 12 SM LEU 11 ; TYR 13 SM LEU 11 ; ALA 26 SM LEU 11 ; PHE 29 SM LEU 11 ; LEU 50 SM GLY 12 ; TYR 13 SM GLY 12 ; GLU 14 SM GLY 12 ; GLU 22 SM TYR 13 ; GLU 14 SM TYR 13 ; GLN 15 SM TYR 13 ; GLU 22 SM TYR 13 ; PHE 25 SM GLU 14 ; GLN 15 [Part of this file has been deleted for brevity] SM ASN 31 ; ILE 35 SM ALA 32 ; LEU 33 SM ALA 32 ; GLU 34 SM ALA 32 ; ILE 35 SM ALA 32 ; ALA 36 SM LEU 33 ; GLU 34 SM LEU 33 ; ILE 35 SM LEU 33 ; ALA 36 SM LEU 33 ; VAL 37 SM LEU 33 ; ILE 44 SM GLU 34 ; ILE 35 SM GLU 34 ; ALA 36 SM GLU 34 ; VAL 37 SM GLU 34 ; GLN 38 SM ILE 35 ; ALA 36 SM ILE 35 ; VAL 37 SM ILE 35 ; GLN 38 SM ILE 35 ; GLU 39 SM ALA 36 ; VAL 37 SM ALA 36 ; GLN 38 SM ALA 36 ; GLU 39 SM ALA 36 ; ASN 40 SM ALA 36 ; VAL 41 SM ALA 36 ; ILE 44 SM VAL 37 ; GLN 38 SM VAL 37 ; GLU 39 SM VAL 37 ; ASN 40 SM GLN 38 ; GLU 39 SM GLN 38 ; ASN 40 SM GLU 39 ; ASN 40 SM GLU 39 ; VAL 41 SM ASN 40 ; VAL 41 SM ASN 40 ; ASP 42 SM VAL 41 ; ASP 42 SM VAL 41 ; PHE 43 SM VAL 41 ; ILE 44 SM ASP 42 ; PHE 43 SM PHE 43 ; ILE 44 SM PHE 43 ; LEU 45 SM ILE 44 ; LEU 45 SM ILE 44 ; ILE 46 SM LEU 45 ; ILE 46 SM LEU 45 ; ALA 47 SM ILE 46 ; ALA 47 SM ILE 46 ; GLY 48 SM ILE 46 ; LEU 50 SM ALA 47 ; GLY 48 SM GLY 48 ; ASP 49 SM GLY 48 ; LEU 50 SM ASP 49 ; LEU 50 // |
1cs4 1ii7 D1CS4A_ D1II7A_ |
contacts reads a directory of domain or protein coordinate files and, for each file in the input directory, writes a contact file of intra-chain residue-residue contact data in EMBL-like format. Each output file contains residue contact data for every chain of every model in a protein coordinate file or, where a domain coordinate file is read, contact data for the single SCOP domain. The paths and extensions of the coordinate (input) and contact (output) files are specified by the user. The SCOP domain or PDB identifier codes are used, as appropriate, to name the output files. A log file is also written.
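For scripted use, a non-interactive command line can be assembled from the qualifiers documented above. The following Python sketch only builds the argument list; the helper name `build_contacts_command` is hypothetical, and running the result assumes an EMBOSS installation that provides contacts on the PATH.

```python
import shlex


def build_contacts_command(ccf_dir, con_dir, threshold=1.0,
                           vdw_file="Evdw.dat", log_file="contacts.log"):
    """Assemble an argv list for a prompt-free contacts run.

    Only qualifiers listed in this document are used; -auto turns off prompts.
    """
    return [
        "contacts",
        "-cpdbdir", str(ccf_dir),
        "-vdwfile", vdw_file,
        "-threshold", str(threshold),
        "-conoutdir", str(con_dir),
        "-conerrfile", log_file,
        "-auto",
    ]


command = build_contacts_command("structure", "./")
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` where the program is available; building the list separately keeps the qualifiers easy to inspect before anything is executed.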
The EMBL-like format used for the contact files (below) uses the following records:
(1) ID - either the 4-character PDB identifier code (where clean protein coordinate files are used as input) or the 7-character domain identifier code taken from SCOP (where domain coordinate files are used as input; see the documentation for the EMBOSS application scope for further information).
(2) DE - bibliographic information. The text "Residue-residue contact data" is always given.
(4) EX - experimental information. The value of the threshold contact distance is given as a floating point number after 'THRESH'. The number of models and the number of polypeptide chains are given after 'NMOD' and 'NCHA' respectively. For domain coordinate files, a 1 is always given. Following the EX record, the file has a section containing CN, IN and SM records (see below) for each chain. The sections for each chain of a model are given after the MO record.
(5) MO - model number. The number given in brackets after this record indicates the start of a section of model-specific data.
(6) CN - chain number. The number given in brackets after this record indicates the start of a section of chain-specific data.
(7) IN - chain-specific data. The character given after ID is the PDB chain identifier taken from the input file (a '.' is given where a chain identifier was not specified in the original PDB file or, for domain coordinate files, where the domain comprises more than one chain). The number of amino acid residues comprising the chain (or the chains from which a domain is comprised) is given after NR. The number of residue-residue contacts is given after NSMCON.
(8) SM - line of residue contact data. Pairs of amino acid identifiers and residue numbers are delimited by a ';'. Residue numbers are taken from the clean coordinate file and give a correct index into the sequence (i.e. they are not necessarily the same as in the original PDB file).
(9) XX - used for spacing.
(10) // - given on the last line of the file only.
Note - SM records are used for contacts between either side-chain or main-chain atoms as defined above. In a future implementation, SS will be used for side-chain only contacts, MM will be used for main-chain only contacts, and there will probably be several other forms of contact too.
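The record layout listed above lends itself to a simple line-oriented reader. The Python sketch below is an illustration only: it assumes one record per line with the two-letter code first, and the function names `iter_records` and `parse_fields` are hypothetical. It follows the records as described in this section; the files in the usage example use additional record codes that it passes through without interpreting.

```python
def iter_records(text):
    """Yield (code, payload) for each record line, skipping XX spacer lines."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "//":
            break  # end-of-file marker
        code, _, payload = line.partition(" ")
        if code == "XX":
            continue  # spacing only
        yield code, payload.strip()


def parse_fields(payload):
    """Split 'KEY value; KEY value;' into a dict of strings."""
    fields = {}
    for chunk in payload.split(";"):
        parts = chunk.split()
        if len(parts) >= 2:
            fields[parts[0]] = " ".join(parts[1:])
    return fields


SAMPLE = """\
ID D1HBBB_
XX
EX THRESH 10.0; NMOD 1; NCHA 1;
XX
IN ID B; NR 146; NSMCON 2514;
XX
SM VAL 1 ; HIS 2
//
"""

records = list(iter_records(SAMPLE))
ex_fields = parse_fields(dict(records)["EX"])
print(float(ex_fields["THRESH"]), int(ex_fields["NCHA"]))
```

Values are kept as strings until the caller converts them, so an unexpected field does not cause the reader itself to fail.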
Excerpt from contacts output file
ID D1HBBB_
XX
DE Residue-residue side-chain contact data
XX
EX THRESH 10.0; NMOD 1; NCHA 1;
XX
MO [1]
XX
CN [1]
XX
IN ID B; NR 146; NSMCON 2514;
XX
SM VAL 1 ; HIS 2
SM VAL 1 ; LEU 3
SM VAL 1 ; THR 4
SM VAL 1 ; PRO 5
SM VAL 1 ; GLU 6
SM VAL 1 ; GLU 7
SM VAL 1 ; LYS 8
SM VAL 1 ; VAL 11
SM VAL 1 ; PHE 71
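As an illustration of how the SM records in an excerpt like this might be consumed, the Python sketch below turns each record into a pair of (residue name, residue number) tuples and counts the contacts made by each residue. The function names are hypothetical, and one SM record per line is assumed.

```python
from collections import Counter


def parse_sm_line(line):
    """Parse 'SM VAL 1 ; HIS 2' into (('VAL', 1), ('HIS', 2))."""
    body = line.split(None, 1)[1]      # drop the leading 'SM' code
    left, right = body.split(";")      # the two residues are delimited by ';'
    (aa1, num1), (aa2, num2) = left.split(), right.split()
    return (aa1, int(num1)), (aa2, int(num2))


def contact_counts(lines):
    """Count how many contacts each residue takes part in."""
    counts = Counter()
    for line in lines:
        if line.split()[:1] == ["SM"]:
            first, second = parse_sm_line(line)
            counts[first] += 1
            counts[second] += 1
    return counts


sm_lines = [
    "SM VAL 1 ; HIS 2",
    "SM VAL 1 ; LEU 3",
    "SM VAL 1 ; PHE 71",
]
counts = contact_counts(sm_lines)
print(counts[("VAL", 1)], counts[("PHE", 71)])
```

Keeping the residue number alongside the residue name matters because the number indexes into the sequence of the clean coordinate file, so the same pairs can later be mapped onto the sequence.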
DE File of van der Waals radii for atoms in proteins
XX
NR 24
XX
AA ALA
XX
ID A
XX
NN 12
XX
AT N ; 1.7
AT CA ; 1.9
AT C ; 1.7
AT O ; 1.4
AT CB ; 1.9
AT OXT ; 1.4
AT H ; 1.2
AT OH ; 1.4
AT HA ; 1.2
AT HB ; 1.2
AT HG ; 1.2
AT D ; 1.2
//
AA ARG
XX
ID R
XX
NN 31
XX
AT N ; 1.7
AT CA ; 1.9
AT C ; 1.7
AT O ; 1.4
AT N ; 1.7
** < data omitted for clarity > **
//
AA XAA
XX
ID X
XX
NN 6
XX
AT C ; 1.9
AT N ; 1.7
AT O ; 1.4
AT H ; 1.2
AT S ; 1.8
AT D ; 1.2
//
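A radii file laid out like the excerpt above could be loaded into a nested dictionary along these lines. This Python sketch assumes one record per line, with each residue block opened by an AA record and closed by '//'; the function name `parse_vdw` is hypothetical and the sample text is abbreviated, so its NN counts are not checked.

```python
def parse_vdw(text):
    """Return {residue_name: {atom_name: radius}} from AA and AT records."""
    radii = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "AA":
            # Start of a residue block, e.g. 'AA ALA'.
            current = radii.setdefault(parts[1], {})
        elif parts[0] == "AT" and current is not None:
            # Atom record, e.g. 'AT CA ; 1.9'.
            atom, _, value = line[2:].partition(";")
            current[atom.strip()] = float(value)
        elif parts[0] == "//":
            current = None  # end of the residue block
    return radii


SAMPLE = """\
DE File of van der Waals radii for atoms in proteins
XX
NR 24
XX
AA ALA
XX
ID A
XX
NN 12
XX
AT N ; 1.7
AT CA ; 1.9
AT O ; 1.4
//
AA XAA
XX
ID X
XX
NN 6
XX
AT C ; 1.9
AT S ; 1.8
//
"""

vdw = parse_vdw(SAMPLE)
print(vdw["ALA"]["CA"], vdw["XAA"]["S"])
```

If an atom name is repeated within a block, the later value overwrites the earlier one in this sketch.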
Excerpt of log file
//
DS002__
WARN Could not open for reading cpdb file s002.pxyz
//
DS003__
WARN Could not open for reading cpdb file s003.pxyz
Program name | Description |
---|---|
contactcount | Counts specific versus non-specific contacts in a directory of cleaned protein chain contact files |
domainalign | Generates DAF files (domain alignment files) of structure-based sequence alignments for nodes in a DCF file (domain classification file) |
domainrep | Reorder DCF file (domain classification file) so that the representative structure of each user-specified node is given first |
domainreso | Removes low resolution domains from a DCF file (domain classification file) |
interface | Reads CCF files (clean coordinate files) and writes CON files (contact files) of inter-chain residue-residue contact data |
libgen | Generates various types of discriminating elements for each alignment in a directory |
psiphi | Calculates phi and psi torsion angles from cleaned EMBOSS-style protein co-ordinate file |
rocon | Reads a DHF file (domain hits file) of hits (sequences of unknown structural classification) and a DHF file of validation sequences (known classification) and writes a 'hits file' for the hits, which are classified and rank-ordered on the basis of score |
rocplot | Provides interpretation and graphical display of the performance of discriminating elements (e.g. profiles for protein families). rocplot reads file(s) of hits from discriminator-database search(es), performs ROC analysis on the hits, and writes graphs illustrating the diagnostic performance of the discriminating elements |
seqalign | Reads a DAF file (domain alignment file) and a DHF file (domain hits file) and writes a DAF file extended with the hits |
seqfraggle | Removes fragments from DHF files (domain hits files) or other files of sequences |
seqsearch | Generate database hits (sequences) for nodes in a DCF file (domain classification file) by using PSI-BLAST |
seqsort | Reads DHF files (domain hits files) of database hits (sequences) and removes hits of ambiguous classification |
seqwords | Generates DHF files (domain hits files) of database hits (sequences) for nodes in a DCF file (domain classification file) by keyword search of UniProt |
siggen | Generates a sparse protein signature from an alignment and residue contact data |
sigscan | Generates a DHF file (domain hits file) of hits (sequences) from scanning a signature against a sequence database |
A 'domain coordinate file' contains coordinate and other data for a single SCOP domain. The files are generated by domainer and are in EMBL-like and PDB formats.
siggen uses contacts files as input.