EMBOSS: newcpgseek


Program newcpgseek

Function

Reports CpG rich regions

Description

Reports CpG rich regions of a sequence as candidate CpG islands.

Usage

Here is a sample session with newcpgseek.

% newcpgseek
Input sequence: embl:rnu68037
CpG score [17]: 
Output file [rnu68037.newcpgseek]: 

Command line arguments

   Mandatory qualifiers:
  [-sequence]          seqall     Sequence database USA
   -score              integer    CpG score
  [-outfile]           outfile    Output file name

   Optional qualifiers: (none)
   Advanced qualifiers: (none)

Mandatory qualifiers                      Allowed values          Default
[-sequence]      Sequence database USA    Readable sequence(s)    Required
(Parameter 1)
-score           CpG score                Integer from 1 to 200   17
[-outfile]       Output file name         Output file             <sequence>.newcpgseek
(Parameter 2)

Optional qualifiers                       Allowed values          Default
(none)

Advanced qualifiers                       Allowed values          Default
(none)
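
The qualifiers listed above can also be supplied on the command line instead of answering the prompts. The following Python sketch builds such a command line and optionally runs it. The function names build_newcpgseek_command and run_newcpgseek are illustrative only and are not part of EMBOSS; the sketch assumes that newcpgseek is installed and on the PATH, and that supplying all three mandatory qualifiers is enough to avoid interactive prompting.

```python
import subprocess


def build_newcpgseek_command(sequence, outfile, score=17):
    """Return an argument list for newcpgseek (hypothetical helper).

    Uses only the qualifiers documented above: -sequence, -score, -outfile.
    """
    # The documented allowed range for -score is an integer from 1 to 200.
    if not 1 <= score <= 200:
        raise ValueError("score must be an integer from 1 to 200")
    return [
        "newcpgseek",
        "-sequence", sequence,
        "-score", str(score),
        "-outfile", outfile,
    ]


def run_newcpgseek(sequence, outfile, score=17):
    """Run newcpgseek; raises CalledProcessError on a non-zero exit status."""
    cmd = build_newcpgseek_command(sequence, outfile, score)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command equivalent to the sample session, without running it.
    print(" ".join(build_newcpgseek_command("embl:rnu68037",
                                            "rnu68037.newcpgseek")))
```

Building the argument list separately from running it makes it possible to inspect or log the command before anything is executed.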

Input file format

Output file format

Here is the output from the example run:

NEWCPGSEEK of RNU68037 from 1 to 1218
with score > 17 

 Begin    End  Score        CpG  %CG  CG/GC
*    96   1032   630         87  66.1   0.65
  1072   1100    26          3  62.1   0.00
  1183   1193    26          2  72.7   2.00
-------------------------------------------
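
A report in this layout can be read back into a script. The Python sketch below parses the data rows of the example output into records. The field names, the CpgRegion type and the parse_newcpgseek function are assumptions made for illustration and are not defined by EMBOSS. The meaning of the leading "*" is not described on this page, so the sketch only records whether it is present.

```python
import re
from typing import NamedTuple


class CpgRegion(NamedTuple):
    begin: int
    end: int
    score: int
    cpg: int
    pct_cg: float   # the "%CG" column
    cg_gc: float    # the "CG/GC" column
    starred: bool   # True if the row starts with "*"


# A data row is an optional "*", four integers, then two decimal numbers.
ROW = re.compile(
    r"^(\*?)\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$"
)

SAMPLE = """\
NEWCPGSEEK of RNU68037 from 1 to 1218
with score > 17

 Begin    End  Score        CpG  %CG  CG/GC
*    96   1032   630         87  66.1   0.65
  1072   1100    26          3  62.1   0.00
  1183   1193    26          2  72.7   2.00
-------------------------------------------
"""


def parse_newcpgseek(text):
    """Return a list of CpgRegion records, skipping header and rule lines."""
    regions = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # title, threshold, column header, blank or dashed line
        star, begin, end, score, cpg, pct, ratio = m.groups()
        regions.append(CpgRegion(int(begin), int(end), int(score), int(cpg),
                                 float(pct), float(ratio), star == "*"))
    return regions


if __name__ == "__main__":
    for region in parse_newcpgseek(SAMPLE):
        print(region)
```

Lines that do not match the row pattern are skipped silently, so a report with no regions yields an empty list.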

Data files

Notes

References

Warnings

Diagnostic Error Messages

Exit status

Known bugs

See also

Program name    Description
chaos           Create a chaos game representation plot for a sequence
chips           Codon usage statistics
codcmp          Codon usage table comparison
compseq         Counts the composition of dimer/trimer/etc words in a sequence
cpgplot         Plot CpG rich areas
cpgreport       Reports all CpG rich regions
cusp            Create a codon usage table
diffseq         Find differences (SNPs) between nearly identical sequences
dotmatcher      Displays a thresholded dotplot of two sequences
dotpath         Displays a non-overlapping wordmatch dotplot of two sequences
dottup          Displays a wordmatch dotplot of two sequences
einverted       Finds DNA inverted repeats
equicktandem    Finds tandem repeats
etandem         Looks for tandem repeats in a nucleotide sequence
freak           Residue/base frequency table or plot
geecee          Calculates the fractional GC content of nucleic acid sequences
isochore        Plots isochores in large DNA sequences
newcpgreport    Report CpG rich areas
palindrome      Looks for inverted repeats in a nucleotide sequence
polydot         Displays all-against-all dotplots of a set of sequences
redata          Search REBASE for enzyme name, references, suppliers etc
restrict        Finds restriction enzyme cleavage sites
showseq         Display a sequence with features, translation etc
silent          Silent mutation restriction enzyme scan
tfscan          Scans DNA sequences for transcription factors
wobble          Wobble base plot
wordcount       Counts words of a specified size in a DNA sequence

Author(s)

This application was written by Rodrigo Lopez (rls@ebi.ac.uk), European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

History

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments