Class Bio::KEGG::GENES
In: lib/bio/db/kegg/genes.rb
Parent: KEGGDB

Included Modules

Common::DblinksAsHash Common::PathwaysAsHash Common::OrthologsAsHash

Constants

DELIMITER = RS = "\n///\n"
TAGSIZE = 12
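
As an illustration of how these two constants are typically used, the following standalone sketch splits a flat file into entries on the record separator and splits a line into tag and body at column 12. It does not use BioRuby; `RECORD_SEPARATOR`, `TAG_WIDTH`, `sample_line`, `split_entries`, `split_tag` and the sample data are hypothetical names and values chosen for this example, and the reading of TAGSIZE as the tag column width is an assumption based on the column offsets used in the entry method.

```ruby
RECORD_SEPARATOR = "\n///\n" # same value as DELIMITER / RS above
TAG_WIDTH = 12               # same value as TAGSIZE above

# Build one flat-file line: tag padded to TAG_WIDTH columns, then the body.
def sample_line(tag, body)
  tag.ljust(TAG_WIDTH) + body
end

# Hypothetical two-entry flat file (made-up, minimal fields).
SAMPLE_FLATFILE = [
  sample_line("ENTRY", "b0001"), sample_line("NAME", "thrL"), "///",
  sample_line("ENTRY", "b0002"), sample_line("NAME", "thrA"), "///"
].join("\n") + "\n"

# Split a flat file into entry strings on the record separator.
def split_entries(text)
  text.split(RECORD_SEPARATOR).reject(&:empty?)
end

# Split one line into its tag (first TAG_WIDTH columns) and its body.
def split_tag(line)
  [line[0, TAG_WIDTH].to_s.strip, line[TAG_WIDTH..-1].to_s.strip]
end

entries = split_entries(SAMPLE_FLATFILE)
puts entries.size
p split_tag(entries[1].lines[1].chomp)
```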

Public Class methods

Creates a new Bio::KEGG::GENES object.

Arguments: entry, a single KEGG GENES entry (passed to the KEGGDB parent together with TAGSIZE)

Returns: Bio::KEGG::GENES object

[Source]

     # File lib/bio/db/kegg/genes.rb, line 115
     def initialize(entry)
       super(entry, TAGSIZE)
     end

Public Instance methods

Returns the length of the amino acid sequence described in the AASEQ lines.

Returns: Integer

[Source]

     # File lib/bio/db/kegg/genes.rb, line 393
     def aalen
       fetch('AASEQ')[/\d+/].to_i
     end

Returns the amino acid sequence described in the AASEQ lines.

Returns: Bio::Sequence::AA object

[Source]

     # File lib/bio/db/kegg/genes.rb, line 383
     def aaseq
       unless @data['AASEQ']
         @data['AASEQ'] = Bio::Sequence::AA.new(fetch('AASEQ').gsub(/\d+/, ''))
       end
       @data['AASEQ']
     end
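
The two AASEQ methods above rely on the same field layout: the first run of digits is read as the length, and the sequence is what remains once the digits are removed. The sketch below mimics that logic on a plain String without BioRuby. `SAMPLE_AASEQ`, `sketch_aalen` and `sketch_aaseq` are hypothetical names, the sample field is made up, and the explicit whitespace removal stands in for cleanup that is assumed to happen inside Bio::Sequence::AA.

```ruby
# Hypothetical AASEQ field body: a length followed by wrapped sequence lines.
SAMPLE_AASEQ = "21\nMKRISTTITT\nTITITTGNGA\nG"

# First run of digits in the field, as an Integer (same regexp as aalen).
def sketch_aalen(field)
  field[/\d+/].to_i
end

# Field with all digits removed (as in aaseq), then whitespace stripped.
# The whitespace step is an assumption about what the sequence class does.
def sketch_aaseq(field)
  field.gsub(/\d+/, '').gsub(/\s+/, '')
end

puts sketch_aalen(SAMPLE_AASEQ)
puts sketch_aaseq(SAMPLE_AASEQ)
```

The same pattern applies to ntlen and ntseq with the NTSEQ field.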

Chromosome described in the POSITION line.


Returns: String or nil

[Source]

     # File lib/bio/db/kegg/genes.rb, line 264
     def chromosome
       if position[/:/]
         position.sub(/:.*/, '')
       elsif ! position[/\.\./]
         position
       else
         nil
       end
     end

Codon usage data described in the CODON_USAGE lines. (Deprecated: the CODON_USAGE lines no longer exist.)

Returns: Hash

[Source]

     # File lib/bio/db/kegg/genes.rb, line 350
     def codon_usage(codon = nil)
       unless @data['CODON_USAGE']
         hash = Hash.new
         list = cu_list
         base = %w(t c a g)
         base.each_with_index do |x, i|
           base.each_with_index do |y, j|
             base.each_with_index do |z, k|
               hash["#{x}#{y}#{z}"] = list[i*16 + j*4 + k]
             end
           end
         end
         @data['CODON_USAGE'] = hash
       end
       @data['CODON_USAGE']
     end
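
The nested loops above map each codon to a position in a flat 64-element list, using the base order t, c, a, g and the index i*16 + j*4 + k. A small standalone sketch of that mapping follows; `CODON_BASES`, `codon_index` and `codon_table` are hypothetical names used only for this illustration, and the input list is dummy data.

```ruby
CODON_BASES = %w(t c a g) # same base order as in codon_usage

# Position of a codon in the flat 64-element list.
def codon_index(codon)
  i, j, k = codon.downcase.chars.map { |b| CODON_BASES.index(b) }
  i * 16 + j * 4 + k
end

# Build the codon => value Hash from a flat list, as codon_usage does.
def codon_table(list)
  hash = {}
  CODON_BASES.each_with_index do |x, i|
    CODON_BASES.each_with_index do |y, j|
      CODON_BASES.each_with_index do |z, k|
        hash["#{x}#{y}#{z}"] = list[i * 16 + j * 4 + k]
      end
    end
  end
  hash
end

table = codon_table((0...64).to_a) # dummy values: each value equals its index
puts codon_index("atg")
puts table["ggg"]
```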

Codon usage data described in the CODON_USAGE lines as an array.


Returns: Array

[Source]

     # File lib/bio/db/kegg/genes.rb, line 370
     def cu_list
       ary = []
       get('CODON_USAGE').sub(/.*/,'').each_line do |line| # cut 1st line
         line.chomp.sub(/^.{11}/, '').scan(/..../) do |cu|
           ary.push(cu.to_i)
         end
       end
       return ary
     end

dblinks()

Alias for dblinks_as_hash

Returns a Hash of the DB name and an Array of entry IDs in DBLINKS field.

[Source]

     # File lib/bio/db/kegg/genes.rb, line 97
     def dblinks_as_hash; super; end

Links to other databases described in the DBLINKS lines.


Returns: Array containing String objects

[Source]

     # File lib/bio/db/kegg/genes.rb, line 332
     def dblinks_as_strings
       lines_fetch('DBLINKS')
     end

Definition of the entry, described in the DEFINITION line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 199
     def definition
       field_fetch('DEFINITION')
     end

Division of the entry, described in the ENTRY line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 149
     def division
       entry['division']                   # CDS, tRNA etc.
     end

Enzyme's EC numbers shown in the DEFINITION line.

Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 206
     def eclinks
       unless defined? @eclinks
         ec_list =
           definition.slice(/\[EC\:([^\]]+)\]/, 1) ||
           definition.slice(/\(EC\:([^\)]+)\)/, 1)
         ary = ec_list ? ec_list.strip.split(/\s+/) : []
         @eclinks = ary
       end
       @eclinks
     end
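
To make the two regular expressions concrete, here is a standalone sketch that applies the same extraction to plain definition strings. `sketch_eclinks` is a hypothetical helper name and the sample definitions are made up for illustration; caching in @eclinks is left out.

```ruby
# Extract EC numbers from "[EC:...]" or, failing that, "(EC:...)".
def sketch_eclinks(definition)
  ec_list =
    definition.slice(/\[EC\:([^\]]+)\]/, 1) ||
    definition.slice(/\(EC\:([^\)]+)\)/, 1)
  ec_list ? ec_list.strip.split(/\s+/) : []
end

p sketch_eclinks("bifunctional aspartokinase [EC:2.7.2.4 1.1.1.3]")
p sketch_eclinks("hexokinase (EC:2.7.1.1)")
p sketch_eclinks("hypothetical protein")
```

A definition without an EC annotation yields an empty Array rather than nil.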

Returns the "ENTRY" line content as a Hash. For example,

  {"organism"=>"E.coli", "division"=>"CDS", "id"=>"b0356"}

Returns: Hash

[Source]

     # File lib/bio/db/kegg/genes.rb, line 125
     def entry
       unless @data['ENTRY']
         hash = Hash.new('')
         if get('ENTRY').length > 30
           e = get('ENTRY')
           hash['id']       = e[12..29].strip
           hash['division'] = e[30..39].strip
           hash['organism'] = e[40..80].strip
         end
         @data['ENTRY'] = hash
       end
       @data['ENTRY']
     end
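
The ENTRY line is parsed by fixed column ranges: 12..29 for the ID, 30..39 for the division, and 40..80 for the organism. The sketch below reproduces that slicing on a plain String. `sketch_entry` and `SAMPLE_ENTRY_LINE` are hypothetical names, the sample line is constructed for this example, and the `to_s` calls are a guard added here for lines shorter than the expected width, not part of the original method.

```ruby
# Parse an ENTRY line by fixed column ranges, as the entry method does.
def sketch_entry(line)
  hash = Hash.new('')
  if line.length > 30
    hash['id']       = line[12..29].to_s.strip
    hash['division'] = line[30..39].to_s.strip
    hash['organism'] = line[40..80].to_s.strip # to_s: guard added in this sketch
  end
  hash
end

# Hypothetical ENTRY line, padded so each value starts at its expected column.
SAMPLE_ENTRY_LINE = "ENTRY".ljust(12) + "b0356".ljust(18) + "CDS".ljust(10) + "E.coli"

p sketch_entry(SAMPLE_ENTRY_LINE)
p sketch_entry("ENTRY       b0356")['id']
```

Because the Hash is created with a default of '', a line of 30 characters or fewer produces empty strings for all three keys.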

ID of the entry, described in the ENTRY line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 142
     def entry_id
       entry['id']
     end

The position in the genome described in the POSITION line, as a string in GenBank feature table location format.

Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 278
     def gbposition
       position.sub(/.*?:/, '')
     end
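
The chromosome and gbposition methods split the same POSITION string at its first colon: the part before it is treated as the chromosome, the part after it as the location. A standalone sketch of both follows; `sketch_chromosome` and `sketch_gbposition` are hypothetical names, and the sample POSITION values are made up and assumed to be already stripped of whitespace, as the position method does.

```ruby
# Chromosome part of a POSITION string, or nil if it is only a location.
def sketch_chromosome(position)
  if position[/:/]
    position.sub(/:.*/, '')
  elsif !position[/\.\./]
    position
  else
    nil
  end
end

# Location part of a POSITION string (everything after the first colon).
def sketch_gbposition(position)
  position.sub(/.*?:/, '')
end

["X:complement(join(100..200,300..400))", "100..200", "plasmid1"].each do |pos|
  p [sketch_chromosome(pos), sketch_gbposition(pos)]
end
```

Note that a POSITION value with neither a colon nor a `..` range is returned unchanged by both methods.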

The method will be deprecated. Use entry.names.first instead.

Returns the first gene name described in the NAME line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 192
     def gene
       genes.first
     end

The method will be deprecated. Use Bio::KEGG::GENES#names.

Names of the entry as an Array, described in the NAME line.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 182
     def genes
       names_as_array
     end

Returns the CLASS field of the entry.

[Source]

     # File lib/bio/db/kegg/genes.rb, line 242
     def keggclass
       field_fetch('CLASS')
     end

Returns an Array of biological classes in the CLASS field.

[Source]

     # File lib/bio/db/kegg/genes.rb, line 247
     def keggclasses
       keggclass.gsub(/ \[[^\]]+/, '').split(/\] ?/)
     end
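
The one-line body of keggclasses does two things: it drops each bracketed annotation up to (but not including) its closing bracket, then splits on the remaining closing brackets. The sketch below runs the same expression on a made-up CLASS string; `sketch_keggclasses` and `SAMPLE_CLASS` are hypothetical names, and the assumption that the field arrives as a single space-joined String is mine.

```ruby
# Hypothetical CLASS field content, joined onto one line.
SAMPLE_CLASS = "Metabolism; Carbohydrate Metabolism; Glycolysis [PATH:eco00010] " \
               "Metabolism; Energy Metabolism; Oxidative phosphorylation [PATH:eco00190]"

# Same expression as keggclasses, applied to a plain String.
def sketch_keggclasses(keggclass)
  keggclass.gsub(/ \[[^\]]+/, '').split(/\] ?/)
end

sketch_keggclasses(SAMPLE_CLASS).each { |c| puts c }
```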

The position in the genome described in the POSITION line, as a Bio::Locations object.

Returns: Bio::Locations object

[Source]

     # File lib/bio/db/kegg/genes.rb, line 286
     def locations
       Bio::Locations.new(gbposition)
     end

The specification of the method will be changed in the future. Please use Bio::KEGG::GENES#motifs.

Motif information described in the MOTIF lines.


Returns: Hash

[Source]

     # File lib/bio/db/kegg/genes.rb, line 325
     def motif
       motifs
     end

motifs()

Alias for motifs_as_hash

Motif information described in the MOTIF lines.


Returns: Hash

[Source]

     # File lib/bio/db/kegg/genes.rb, line 300
     def motifs_as_hash
       unless @data['MOTIF']
         hash = {}
         db = nil
         motifs_as_strings.each do |line|
           if line[/^\S+:/]
             db, str = line.split(/:/, 2)
           else
             str = line
           end
           hash[db] ||= []
           hash[db] += str.strip.split(/\s+/)
         end
         @data['MOTIF'] = hash
       end
       @data['MOTIF']              # Hash of Array of IDs in MOTIF
     end
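
The loop above groups motif IDs by database name: a line that starts with `name:` opens a new database, and a line without that prefix is treated as a continuation of the previous one. Here is a standalone sketch of the same grouping on an Array of Strings. `sketch_motifs` is a hypothetical name and the sample lines are made up; the caching in @data is left out.

```ruby
# Group motif IDs by database name, following the motifs_as_hash loop.
def sketch_motifs(lines)
  hash = {}
  db = nil
  lines.each do |line|
    if line[/^\S+:/]
      db, str = line.split(/:/, 2)  # "Pfam: A B" => "Pfam", " A B"
    else
      str = line                    # continuation of the previous database
    end
    hash[db] ||= []
    hash[db] += str.strip.split(/\s+/)
  end
  hash
end

# Hypothetical MOTIF lines with one continuation line.
sample_lines = ["Pfam: AA_kinase ACT", "ACT_7 NAD_binding_3", "PROSITE: ASPARTOKINASE"]
p sketch_motifs(sample_lines)
```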

Motif information described in the MOTIF lines.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 293
     def motifs_as_strings
       lines_fetch('MOTIF')
     end

nalen()

Alias for ntlen

Returns the NAME line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 163
     def name
       field_fetch('NAME')
     end

names()

Alias for names_as_array

Names of the entry as an Array, described in the NAME line.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 171
     def names_as_array
       name.split(', ')
     end

naseq()

Alias for ntseq

Returns the nucleic acid sequence length.

Returns: Integer

[Source]

     # File lib/bio/db/kegg/genes.rb, line 411
     def ntlen
       fetch('NTSEQ')[/\d+/].to_i
     end

Returns the nucleic acid sequence described in the NTSEQ lines.

Returns: Bio::Sequence::NA object

[Source]

     # File lib/bio/db/kegg/genes.rb, line 400
     def ntseq
       unless @data['NTSEQ']
         @data['NTSEQ'] = Bio::Sequence::NA.new(fetch('NTSEQ').gsub(/\d+/, ''))
       end
       @data['NTSEQ']
     end

Organism name of the entry, described in the ENTRY line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 156
     def organism
       entry['organism']                   # H.sapiens etc.
     end

orthologs()

Alias for orthologs_as_hash

Returns a Hash of the orthology ID and definition in ORTHOLOGY field.

[Source]

     # File lib/bio/db/kegg/genes.rb, line 107
     def orthologs_as_hash; super; end

Orthologs described in the ORTHOLOGY lines.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 220
     def orthologs_as_strings
       lines_fetch('ORTHOLOGY')
     end

Returns the PATHWAY lines as a String.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 227
     def pathway
       unless defined? @pathway
         @pathway = fetch('PATHWAY')
       end
       @pathway
     end

pathways()

Alias for pathways_as_hash

Returns a Hash of the pathway ID and name in PATHWAY field.

[Source]

     # File lib/bio/db/kegg/genes.rb, line 102
     def pathways_as_hash; super; end

Pathways described in the PATHWAY lines.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 237
     def pathways_as_strings
       lines_fetch('PATHWAY')
     end

The position in the genome described in the POSITION line.


Returns: String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 254
     def position
       unless @data['POSITION']
         @data['POSITION'] = fetch('POSITION').gsub(/\s/, '')
       end
       @data['POSITION']
     end

Returns structure ID information described in the STRUCTURE lines.


Returns: Array containing String

[Source]

     # File lib/bio/db/kegg/genes.rb, line 339
     def structure
       unless @data['STRUCTURE']
         @data['STRUCTURE'] = fetch('STRUCTURE').sub(/(PDB: )*/,'').split(/\s+/)
       end
       @data['STRUCTURE'] # ['PDB:1A9X', ...]
     end

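
As a rough illustration of the expression used in structure, the sketch below applies the same `sub` and `split` calls to a plain String. `sketch_structure` is a hypothetical name, and the sample field value, including its `PDB: ` prefix with a trailing space, is an assumption about the input rather than something taken from a real entry.

```ruby
# Same expression as in structure, applied to a plain String.
def sketch_structure(field)
  field.sub(/(PDB: )*/, '').split(/\s+/)
end

p sketch_structure("PDB: 1A9X 1CS0 1M6V") # hypothetical STRUCTURE field value
```

For this sample input the leading `PDB: ` prefix is removed before the IDs are split on whitespace.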
structures()

Alias for structure
