Class Bio::NBRF
In: lib/bio/db/nbrf.rb
Parent: DB

Sequence data class for NBRF/PIR flatfile format.

Methods

aalen   aaseq   entry   length   nalen   naseq   new   seq   seq_class   to_nbrf   to_s  

Constants

DELIMITER = RS = "\n>"   Delimiter of each entry. Bio::FlatFile uses it.
DELIMITER_OVERRUN = 1   (Integer) excess read size included in DELIMITER.
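
As a rough illustration of how the "\n>" delimiter separates entries, the sketch below splits a multi-entry string in plain Ruby. It is not Bio::FlatFile's implementation (which reads from a stream and deals with the overrun character); the names NBRF_DELIMITER, SAMPLE_NBRF_TEXT and split_nbrf_entries, and the sample data, are assumptions made for this example.

```ruby
# Simplified stand-in for delimiter-based splitting; not Bio::FlatFile itself.
NBRF_DELIMITER = "\n>"

SAMPLE_NBRF_TEXT =
  ">P1;ABC123\nexample protein\nMKV*\n>DL;XYZ789\nexample DNA\nacgt*\n"

# Split the text on the delimiter, then restore the leading ">" that the
# delimiter consumed from every entry after the first.
def split_nbrf_entries(text)
  text.split(NBRF_DELIMITER).each_with_index.map do |chunk, i|
    i.zero? ? chunk : ">" + chunk
  end
end

# Print the header line of each entry.
split_nbrf_entries(SAMPLE_NBRF_TEXT).each { |entry| puts entry.lines.first }
```

Because the delimiter ends with ">", that character really belongs to the following entry, which is why a reader working this way has to hand one character back to the next entry.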

External Aliases

entry_id -> accession

Attributes

data  [RW]  Sequence data of the entry as raw text (the terminating '*' is removed when the entry is parsed).
definition  [RW]  Returns the description line of the NBRF/PIR formatted data.
entry_id  [RW]  Returns ID described in the entry.
entry_overrun  [R]  piece of next entry. Bio::FlatFile uses it.
seq_type  [RW]  Returns sequence type described in the entry.
 P1 (protein), F1 (protein fragment)
 DL (DNA linear), DC (DNA circular)
 RL (RNA linear), RC (RNA circular)
 N3 (tRNA), N1 (other functional RNA)

Public Class methods

Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.

[Source]

    # File lib/bio/db/nbrf.rb, line 45
    def initialize(str)
      str = str.sub(/\A[\r\n]+/, '') # remove first void lines
      line1, line2, rest = str.split(/^/, 3)

      rest = rest.to_s
      rest.sub!(/^>.*/m, '') # remove trailing entries for sure
      @entry_overrun = $&
      rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
      @data = rest

      @definition = line2.to_s.chomp
      if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
        @seq_type = $1
        @entry_id = $2
      end
    end
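
The parsing steps above can be followed without BioRuby installed. The sketch below applies the same regular expressions to a sample string and returns the pieces in a Struct; ParsedEntry, parse_nbrf and the sample text are hypothetical names and data used only for illustration, not part of the library.

```ruby
# Plain-Ruby sketch of the parsing steps; ParsedEntry and parse_nbrf are
# hypothetical names, not part of BioRuby.
ParsedEntry = Struct.new(:seq_type, :entry_id, :definition, :data, :overrun)

def parse_nbrf(str)
  str = str.sub(/\A[\r\n]+/, '')          # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)  # header line, description, remainder
  rest = rest.to_s.dup                    # work on a private, mutable copy
  overrun = rest[/^>.*/m]                 # text of any following entries (or nil)
  rest.sub!(/^>.*/m, '')                  # keep only the first entry
  rest.sub!(/\*\s*\z/, '')                # strip the terminating '*'
  seq_type = entry_id = nil
  if /^>?([A-Za-z0-9]{2});(.*)/ =~ line1.to_s
    seq_type = $1
    entry_id = $2
  end
  ParsedEntry.new(seq_type, entry_id, line2.to_s.chomp, rest, overrun)
end

text = "\n>P1;ABC123\nexample protein\nMKV\nLLA*\n>DL;XYZ789\nexample DNA\nacgt*\n"
entry = parse_nbrf(text)
puts entry.seq_type    # P1
puts entry.entry_id    # ABC123
puts entry.definition  # example protein
```

Only the first entry is parsed; the second one ends up in the overrun field, mirroring the role of entry_overrun.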

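Creates NBRF/PIR formatted text from a hash. The recognized keys are :seq_type, :seq, :width, :entry_id and :definition; any of them can be omitted. If :seq_type is omitted it is inferred from the class of :seq, and :width defaults to 70 (pass nil to disable line wrapping).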

[Source]

     # File lib/bio/db/nbrf.rb, line 167
     def self.to_nbrf(hash)
       seq_type = hash[:seq_type]
       seq = hash[:seq]
       unless seq_type
         if seq.is_a?(Bio::Sequence::AA) then
           seq_type = 'P1'
         elsif seq.is_a?(Bio::Sequence::NA) then
           seq_type = /u/i =~ seq ? 'RL' : 'DL'
         else
           seq_type = 'XX'
         end
       end
       width = hash.has_key?(:width) ? hash[:width] : 70
       if width then
         seq = seq.to_s + "*"
         seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
       else
         seq = seq.to_s + "*\n"
       end
       ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
     end
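
To see what the line-wrapping step produces, the following sketch isolates it in plain Ruby. The helper name wrap_with_terminator is an assumption for this example; it only reproduces the '*' termination and width handling shown above, not the header lines.

```ruby
# Mirrors the wrapping step of to_nbrf in plain Ruby; wrap_with_terminator
# is a hypothetical helper name.
def wrap_with_terminator(seq, width = 70)
  text = seq.to_s + "*"                 # '*' marks the end of the sequence
  return text + "\n" unless width       # nil or false width: one single line
  text.gsub(/.{1,#{width}}/, "\\0\n")   # newline after every `width` characters
end

puts wrap_with_terminator("MKVLAAGIVGL", 5)
# MKVLA
# AGIVG
# L*
```

Since the '*' is appended before wrapping, a sequence whose length is an exact multiple of the width ends with a line containing only "*".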

Public Instance methods

Returns the length of the protein (amino acid) sequence. Calling aalen on a nucleic acid sequence raises RuntimeError. Use this method only when you already know whether the sequence is NA or AA.

[Source]

     # File lib/bio/db/nbrf.rb, line 157
     def aalen
       aaseq.length
     end

Returns the protein (amino acid) sequence. Calling aaseq on a nucleic acid sequence raises RuntimeError. Use this method only when you already know whether the sequence is NA or AA.

[Source]

     # File lib/bio/db/nbrf.rb, line 143
     def aaseq
       if seq.is_a?(Bio::Sequence::NA) then
         # This branch is reached for a nucleic acid sequence, although the
         # message text reads the other way round.
         raise 'not nucleic but protein sequence'
       elsif seq.is_a?(Bio::Sequence::AA) then
         seq
       else
         Bio::Sequence::AA.new(seq)
       end
     end

Returns the stored entry as NBRF/PIR formatted text (same as to_s).

[Source]

    # File lib/bio/db/nbrf.rb, line 84
    def entry
      @entry = ">#{@seq_type or 'XX'};#{@entry_id}\n#{definition}\n#{@data}*\n"
    end

Returns sequence length.

[Source]

     # File lib/bio/db/nbrf.rb, line 115
     def length
       seq.length
     end

Returns the length of the nucleic acid sequence. Calling nalen on a protein sequence raises RuntimeError. Use this method only when you already know whether the sequence is NA or AA.

[Source]

     # File lib/bio/db/nbrf.rb, line 135
     def nalen
       naseq.length
     end

Returns the nucleic acid sequence. Calling naseq on a protein sequence raises RuntimeError. Use this method only when you already know whether the sequence is NA or AA.

[Source]

     # File lib/bio/db/nbrf.rb, line 122
     def naseq
       if seq.is_a?(Bio::Sequence::AA) then
         raise 'not nucleic but protein sequence'
       elsif seq.is_a?(Bio::Sequence::NA) then
         seq
       else
         Bio::Sequence::NA.new(seq)
       end
     end

Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.

[Source]

     # File lib/bio/db/nbrf.rb, line 107
     def seq
       unless defined?(@seq)
         @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
       end
       @seq
     end
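
The "lazy clean up" relies on String#tr with an empty replacement, which deletes every listed character. A minimal sketch of that step on its own follows; clean_sequence_text and the sample string are hypothetical, and no BioRuby classes are involved.

```ruby
# The same String#tr call applied to a hypothetical raw data string.
def clean_sequence_text(data)
  # Delete spaces, tabs, CR, LF and the digits 0-9 (e.g. position counters).
  data.tr(" \t\r\n0-9", '')
end

raw = "  1 mkvlaagivg lllaq\r\n 16 tsaew\n"
puts clean_sequence_text(raw)  # mkvlaagivglllaqtsaew
```

In the library method the cleaned string is then wrapped in the class chosen by seq_class and cached in @seq, so the clean-up runs only on first access.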

Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.

[Source]

     # File lib/bio/db/nbrf.rb, line 91
     def seq_class
       case @seq_type
       when /[PF]1/
         # protein
         Sequence::AA
       when /[DR][LC]/, /N[13]/
         # nucleic
         Sequence::NA
       else
         Sequence
       end
     end
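
The dispatch on the two-letter type code can be tried in isolation. This sketch uses the same case/when patterns but returns symbols instead of BioRuby classes so that it runs on its own; molecule_kind and the symbol names are assumptions for illustration.

```ruby
# Same case/when dispatch as seq_class, returning symbols so it runs
# without BioRuby; molecule_kind is a hypothetical name.
def molecule_kind(seq_type)
  case seq_type
  when /[PF]1/             then :protein   # P1, F1
  when /[DR][LC]/, /N[13]/ then :nucleic   # DL, DC, RL, RC, N1, N3
  else                          :generic   # unknown or missing type code
  end
end

%w[P1 F1 DL RC N3 XX].each { |t| puts "#{t} -> #{molecule_kind(t)}" }
```

A nil or unrecognized code falls through to the generic case, which corresponds to the plain Sequence class in the library method.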
to_s()

Alias for entry
