Class | Bio::NBRF |
In: | lib/bio/db/nbrf.rb |
Parent: | DB |
DELIMITER | = | RS = "\n>" | Delimiter of each entry. Bio::FlatFile uses it. | |
DELIMITER_OVERRUN | = | 1 | (Integer) excess read size included in DELIMITER. |
entry_id | -> | accession |
data | [RW] | Raw sequence data of the entry, stored as read; whitespace and digits are stripped later by seq. |
definition | [RW] | Returns the description line of the NBRF/PIR formatted data. |
entry_id | [RW] | Returns ID described in the entry. |
entry_overrun | [R] | piece of next entry. Bio::FlatFile uses it. |
seq_type | [RW] |
Returns sequence type described in the entry.
P1 (protein), F1 (protein fragment), DL (DNA linear), DC (DNA circular), RL (RNA linear), RC (RNA circular), N3 (tRNA), N1 (other functional RNA) |
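The DELIMITER and DELIMITER_OVERRUN constants above can be illustrated with a rough sketch of a reader that cuts a stream into entries. This is not Bio::FlatFile's implementation: split_nbrf_chunks and its keyword arguments are hypothetical names, and the sketch only assumes that reading up to "\n>" consumes one character (the ">") that belongs to the next entry.

```ruby
require 'stringio'

# Hypothetical helper: split an IO into NBRF entries using the "\n>" delimiter.
# Reading up to the delimiter swallows the '>' of the next entry (the overrun),
# so that character is carried over and prepended to the following chunk.
def split_nbrf_chunks(io, delimiter: "\n>", overrun: 1)
  chunks = []
  carry = ''
  while (chunk = io.gets(delimiter))
    chunk = carry + chunk
    carry = ''
    if chunk.end_with?(delimiter)
      carry = chunk.slice!(-overrun, overrun)  # give the '>' back to the next entry
    end
    chunks << chunk
  end
  chunks
end

io = StringIO.new(">P1;A\ndesc a\nAAA*\n>P1;B\ndesc b\nCCC*\n")
chunks = split_nbrf_chunks(io)
chunks.each { |c| puts c.inspect }
```

Each returned chunk starts with ">" and ends with the newline that closes the entry.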
Creates a new NBRF object. It stores the comment and sequence information from one entry of the NBRF/PIR format string. If the argument contains more than one entry, only the first entry is used.
# File lib/bio/db/nbrf.rb, line 45
def initialize(str)
  str = str.sub(/\A[\r\n]+/, '') # remove first void lines
  line1, line2, rest = str.split(/^/, 3)

  rest = rest.to_s
  rest.sub!(/^>.*/m, '') # remove trailing entries for sure
  @entry_overrun = $&
  rest.sub!(/\*\s*\z/, '') # remove last '*' and "\n"
  @data = rest

  @definition = line2.to_s.chomp
  if /^>?([A-Za-z0-9]{2})\;(.*)/ =~ line1.to_s then
    @seq_type = $1
    @entry_id = $2
  end
end
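To make the parsing steps easier to follow, here is a standalone sketch that applies the same regular expressions to a two-entry string without requiring BioRuby. parse_nbrf_entry is a hypothetical helper, the sample entry is made up for illustration, and String#slice! stands in for the sub!/$& pair used in the original.

```ruby
# Hypothetical helper mirroring the steps of initialize; returns a plain Hash.
def parse_nbrf_entry(str)
  str = str.sub(/\A[\r\n]+/, '')           # drop leading blank lines
  line1, line2, rest = str.split(/^/, 3)   # header line, description line, remainder
  rest = rest.to_s.dup
  overrun = rest.slice!(/^>.*/m)           # cut off any following entries
  rest.sub!(/\*\s*\z/, '')                 # drop the terminating '*' and trailing whitespace
  m = /^>?([A-Za-z0-9]{2});(.*)/.match(line1.to_s)
  {
    seq_type: m && m[1],
    entry_id: m && m[2],
    definition: line2.to_s.chomp,
    data: rest,
    entry_overrun: overrun
  }
end

# Illustrative input: one full entry followed by the start of another.
text = ">P1;CRAB_ANAPL\nALPHA CRYSTALLIN B CHAIN\n  MDITIHNPLI RRPLFSWLAP\n  SRIFDQIFGE*\n>P1;NEXT\nnext entry\nAAA*\n"
entry = parse_nbrf_entry(text)
entry.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Only the first entry is parsed; everything from the second ">" onward ends up in entry_overrun, which matches the "only the first entry is used" behaviour described above.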
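Creates NBRF/PIR formatted text from a hash of parameters (:seq_type, :seq, :entry_id, :definition, :width). Any parameter can be omitted: a missing :seq_type is inferred from the class of :seq, and :width defaults to 70 (pass nil to disable line wrapping).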
# File lib/bio/db/nbrf.rb, line 167
def self.to_nbrf(hash)
  seq_type = hash[:seq_type]
  seq = hash[:seq]
  unless seq_type
    if seq.is_a?(Bio::Sequence::AA) then
      seq_type = 'P1'
    elsif seq.is_a?(Bio::Sequence::NA) then
      seq_type = /u/i =~ seq ? 'RL' : 'DL'
    else
      seq_type = 'XX'
    end
  end
  width = hash.has_key?(:width) ? hash[:width] : 70
  if width then
    seq = seq.to_s + "*"
    seq.gsub!(Regexp.new(".{1,#{width}}"), "\\0\n")
  else
    seq = seq.to_s + "*\n"
  end
  ">#{seq_type};#{hash[:entry_id]}\n#{hash[:definition]}\n#{seq}"
end
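The output layout is easiest to see with plain strings. The following sketch reproduces only the wrapping and header assembly; format_nbrf is a hypothetical helper with keyword arguments, and the sequence-type detection based on Bio::Sequence classes is deliberately left out (the type falls back to 'XX' when not given).

```ruby
# Hypothetical helper: same layout as to_nbrf, without Bio::Sequence type detection.
def format_nbrf(seq_type: 'XX', entry_id: nil, definition: nil, seq: '', width: 70)
  body = seq.to_s + '*'                       # '*' terminates the sequence
  body = if width
           body.gsub(/.{1,#{width}}/, "\\0\n") # wrap into lines of at most `width` characters
         else
           body + "\n"                         # width: nil keeps a single line
         end
  ">#{seq_type};#{entry_id}\n#{definition}\n#{body}"
end

text = format_nbrf(seq_type: 'P1', entry_id: 'TEST1',
                   definition: 'example peptide',
                   seq: 'MKVLAAGIVG', width: 4)
puts text
```

Note that the terminating "*" is appended before wrapping, so it counts toward the line width.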
Returns the protein (amino acid) sequence. Calling aaseq on a nucleic acid sequence raises RuntimeError. Use this method when you already know whether the sequence is NA or AA.
# File lib/bio/db/nbrf.rb, line 143
def aaseq
  if seq.is_a?(Bio::Sequence::NA) then
    raise 'not protein but nucleic sequence'
  elsif seq.is_a?(Bio::Sequence::AA) then
    seq
  else
    Bio::Sequence::AA.new(seq)
  end
end
Returns the nucleic acid sequence. Calling naseq on a protein sequence raises RuntimeError. Use this method when you already know whether the sequence is NA or AA.
# File lib/bio/db/nbrf.rb, line 122
def naseq
  if seq.is_a?(Bio::Sequence::AA) then
    raise 'not nucleic but protein sequence'
  elsif seq.is_a?(Bio::Sequence::NA) then
    seq
  else
    Bio::Sequence::NA.new(seq)
  end
end
Returns sequence data. Returns Bio::Sequence::NA, Bio::Sequence::AA or Bio::Sequence, according to the sequence type.
# File lib/bio/db/nbrf.rb, line 107
def seq
  unless defined?(@seq)
    @seq = seq_class.new(@data.tr(" \t\r\n0-9", '')) # lazy clean up
  end
  @seq
end
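The "lazy clean up" is a single String#tr call with an empty replacement, which deletes the listed characters. A small sketch on an illustrative raw data string (the input is made up, not taken from a real entry):

```ruby
# Raw data as it might be stored in @data: indentation, spacing, position numbers, line ends.
raw = "  MDITIHNPLI RRPLFSWLAP 20\r\n  SRIFDQIFGE 30\n"

# tr with an empty second argument removes spaces, tabs, CR, LF and digits.
clean = raw.tr(" \t\r\n0-9", '')
puts clean
```

In the method above this result is wrapped in the class returned by seq_class and cached in @seq, so the cleanup runs only on the first call.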
Returns Bio::Sequence::AA, Bio::Sequence::NA, or Bio::Sequence, depending on sequence type.
# File lib/bio/db/nbrf.rb, line 91
def seq_class
  case @seq_type
  when /[PF]1/
    # protein
    Sequence::AA
  when /[DR][LC]/, /N[13]/
    # nucleic
    Sequence::NA
  else
    Sequence
  end
end
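The dispatch can be tried without BioRuby by substituting symbols for the sequence classes. In this sketch sequence_kind is a hypothetical name, and the symbols stand in for Sequence::AA, Sequence::NA, and Sequence respectively.

```ruby
# Hypothetical stand-in for seq_class: same patterns, symbols instead of classes.
def sequence_kind(seq_type)
  case seq_type
  when /[PF]1/
    :amino_acid      # P1, F1
  when /[DR][LC]/, /N[13]/
    :nucleic_acid    # DL, DC, RL, RC, N1, N3
  else
    :generic         # unknown or missing type code
  end
end

%w[P1 F1 DL RC N3 XX].each do |code|
  puts "#{code} -> #{sequence_kind(code)}"
end
```

A nil type, as left by an entry whose header line did not match, falls through to the generic branch.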