Module | Bio::Alignment::SiteMethods |
In: | lib/bio/alignment.rb |
Bio::Alignment::SiteMethods is a set of methods for Bio::Alignment::Site. It can also be used for extending an array of single-letter strings.
IUPAC_NUC | = | [ %w( t u ), %w( m a c ), %w( r a g ), %w( w a t u ), %w( s c g ), %w( y c t u ), %w( k g t u ), %w( v a c g m r s ), %w( h a c t u m w y ), %w( d a g t u r w k ), %w( b c g t u s y k ), %w( n a c g t u m r w s y k v h d b ) ] | IUPAC nucleotide groups. Internal use only. |
StrongConservationGroups | = | %w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort } |
Table of strongly conserved amino-acid groups.
The values in the table are taken from BioPerl (Bio/SimpleAlign.pm in BioPerl 1.0). The BioPerl documentation says that they are taken from the Clustalw documentation: "These are all the positively scoring groups that occur in the Gonnet Pam250 matrix. The strong and weak groups are defined as strong score >0.5 and weak score =<0.5 respectively." |
|
WeakConservationGroups | = | %w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort } |
Table of weakly conserved amino-acid groups.
Please refer to the StrongConservationGroups documentation for the origin of the table. |
Returns an IUPAC consensus base for the site. If a consensus is found, returns a single-letter string. If not, returns nil.
# File lib/bio/alignment.rb, line 218
def consensus_iupac
  a = self.collect { |x| x.downcase }.sort.uniq
  if a.size == 1 then
    case a[0]
    when 'a', 'c', 'g', 't'
      a[0]
    when 'u'
      't'
    else
      IUPAC_NUC.find { |x| a[0] == x[0] } ? a[0] : nil
    end
  elsif r = IUPAC_NUC.find { |x| (a - x).size <= 0 } then
    r[0]
  else
    nil
  end
end
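As a rough illustration of the lookup above, the sketch below copies the IUPAC_NUC table and the method body into a stand-in module, so it runs without the bio gem. The module name `IupacSketch` and the helper `iupac_consensus` are hypothetical names introduced only for this example. It extends plain arrays of single-letter strings, as the module description suggests is possible.

```ruby
# Hypothetical stand-in module (not part of BioRuby): it copies the table and
# the method documented above so the example is self-contained.
module IupacSketch
  IUPAC_NUC = [
    %w( t u ),
    %w( m a c ),
    %w( r a g ),
    %w( w a t u ),
    %w( s c g ),
    %w( y c t u ),
    %w( k g t u ),
    %w( v a c g m r s ),
    %w( h a c t u m w y ),
    %w( d a g t u r w k ),
    %w( b c g t u s y k ),
    %w( n a c g t u m r w s y k v h d b )
  ]

  def consensus_iupac
    a = collect { |x| x.downcase }.sort.uniq
    if a.size == 1
      case a[0]
      when 'a', 'c', 'g', 't'
        a[0]
      when 'u'
        't'
      else
        # A lone ambiguity code is returned unchanged if the table knows it.
        IUPAC_NUC.find { |x| a[0] == x[0] } ? a[0] : nil
      end
    elsif (r = IUPAC_NUC.find { |x| (a - x).size <= 0 })
      # First group that contains every observed letter; its head is the code.
      r[0]
    else
      nil
    end
  end
end

# Hypothetical helper: extend a copy of the array and ask for the consensus.
def iupac_consensus(site)
  site.dup.extend(IupacSketch).consensus_iupac
end

p iupac_consensus(%w(a g))      # => "r"
p iupac_consensus(%w(c t))      # => "y"
p iupac_consensus(%w(a c g t))  # => "n"
p iupac_consensus(%w(a -))      # => nil
```

Because the groups are searched in table order, the first (smallest) group that covers all observed letters wins. A gap character is not in any group, so a site containing one yields nil.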
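Returns the consensus character of the site. If a consensus is found, returns a single-letter string. If not, returns nil. The most frequent character counts as the consensus when its count is at least threshold times the number of elements in the site (the default threshold of 1.0 requires all elements to agree).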
# File lib/bio/alignment.rb, line 181
def consensus_string(threshold = 1.0)
  return nil if self.size <= 0
  return self[0] if self.sort.uniq.size == 1
  h = Hash.new(0)
  self.each { |x| h[x] += 1 }
  total = self.size
  b = h.to_a.sort do |x,y|
    z = (y[1] <=> x[1])
    z = (self.index(x[0]) <=> self.index(y[0])) if z == 0
    z
  end
  if total * threshold <= b[0][1] then
    b[0][0]
  else
    nil
  end
end
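The effect of the threshold argument can be seen in a small self-contained sketch. It copies the method body above into a stand-in module; `ConsensusSketch` and the helper `consensus` are hypothetical names used only here, not part of BioRuby.

```ruby
# Hypothetical stand-in module (not part of BioRuby) holding a copy of the
# method documented above, so the example runs without the bio gem.
module ConsensusSketch
  def consensus_string(threshold = 1.0)
    return nil if size <= 0
    return self[0] if sort.uniq.size == 1
    h = Hash.new(0)
    each { |x| h[x] += 1 }
    total = size
    # Sort by descending count; ties go to the character seen first.
    b = h.to_a.sort do |x, y|
      z = (y[1] <=> x[1])
      z = (index(x[0]) <=> index(y[0])) if z == 0
      z
    end
    total * threshold <= b[0][1] ? b[0][0] : nil
  end
end

# Hypothetical helper: extend a copy of the array and ask for the consensus.
def consensus(site, threshold = 1.0)
  site.dup.extend(ConsensusSketch).consensus_string(threshold)
end

p consensus(%w(a a a a))        # => "a"
p consensus(%w(a a a c))        # => nil  (3 of 4 is below 100%)
p consensus(%w(a a a c), 0.75)  # => "a"  (3 of 4 meets 75%)
p consensus(%w(c a a c), 0.5)   # => "c"  (tie; "c" occurs first)
```

Note that, unlike consensus_iupac, the comparison here is case-sensitive, since the method does not change the case of the elements.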
If there are gaps, returns true. Otherwise, returns false.
# File lib/bio/alignment.rb, line 164
def has_gap?
  (find { |x| is_gap?(x) }) ? true : false
end
Returns the match-line character for the site. This is the amino-acid version.
# File lib/bio/alignment.rb, line 258
def match_line_amino(opt = {})
  # opt[:match_line_char]   ==> 100% equal    default: '*'
  # opt[:strong_match_char] ==> strong match  default: ':'
  # opt[:weak_match_char]   ==> weak match    default: '.'
  # opt[:mismatch_char]     ==> mismatch      default: ' '
  mlc = (opt[:match_line_char]   or '*')
  smc = (opt[:strong_match_char] or ':')
  wmc = (opt[:weak_match_char]   or '.')
  mmc = (opt[:mismatch_char]     or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  elsif StrongConservationGroups.find { |x| (a - x).empty? } then
    smc
  elsif WeakConservationGroups.find { |x| (a - x).empty? } then
    wmc
  else
    mmc
  end
end
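A minimal sketch of how the four outcomes arise is given below. It copies the two conservation tables and the method into a stand-in module named `AminoMatchSketch` (a hypothetical name). The `is_gap?` method is not defined in this section, so the sketch assumes that any non-alphabetic character counts as a gap; the real definition may differ.

```ruby
# Hypothetical stand-in module (not part of BioRuby) so the example runs
# without the bio gem.
module AminoMatchSketch
  StrongConservationGroups =
    %w(STA NEQK NHQK NDEQ QHRK MILV MILF HY FYW).collect { |x| x.split('').sort }
  WeakConservationGroups =
    %w(CSA ATV SAG STNK STPA SGND SNDEQK NDEQHK NEQHRK FVLIM HFY).collect { |x| x.split('').sort }

  # ASSUMPTION: is_gap? is defined outside this section; here any
  # non-alphabetic character is treated as a gap.
  def is_gap?(s)
    (/[^a-zA-Z]/ =~ s) ? true : false
  end

  def has_gap?
    (find { |x| is_gap?(x) }) ? true : false
  end

  def match_line_amino(opt = {})
    mlc = opt[:match_line_char]   || '*'
    smc = opt[:strong_match_char] || ':'
    wmc = opt[:weak_match_char]   || '.'
    mmc = opt[:mismatch_char]     || ' '
    a = collect { |c| c.upcase }.sort.uniq
    a.extend(AminoMatchSketch)
    if a.has_gap?
      mmc
    elsif a.size == 1
      mlc
    elsif StrongConservationGroups.find { |x| (a - x).empty? }
      smc   # every residue falls inside one strong group
    elsif WeakConservationGroups.find { |x| (a - x).empty? }
      wmc   # every residue falls inside one weak group
    else
      mmc
    end
  end
end

# Hypothetical helper: extend a copy of the array and compute the character.
def amino_match(site, opt = {})
  site.dup.extend(AminoMatchSketch).match_line_amino(opt)
end

p amino_match(%w(L L L))  # => "*"
p amino_match(%w(I L V))  # => ":"  (all within MILV)
p amino_match(%w(C S))    # => "."  (all within CSA)
p amino_match(%w(W G))    # => " "
p amino_match(%w(L -))    # => " "
```

The strong table is consulted before the weak one, so a site such as S/A, which fits both STA and CSA, is reported as a strong match.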
Returns the match-line character for the site. This is the nucleic-acid version.
# File lib/bio/alignment.rb, line 284
def match_line_nuc(opt = {})
  # opt[:match_line_char] ==> 100% equal  default: '*'
  # opt[:mismatch_char]   ==> mismatch    default: ' '
  mlc = (opt[:match_line_char] or '*')
  mmc = (opt[:mismatch_char]   or ' ')
  a = self.collect { |c| c.upcase }.sort.uniq
  a.extend(SiteMethods)
  if a.has_gap? then
    mmc
  elsif a.size == 1 then
    mlc
  else
    mmc
  end
end
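Since the method works on a single site, a whole match line can be built by applying it to each column of an alignment. The sketch below shows one way to do that with plain strings. `NucMatchSketch` and `nuc_match_line` are hypothetical names, and `is_gap?` is again an assumed stand-in (any non-alphabetic character is treated as a gap), because it is defined outside this section.

```ruby
# Hypothetical stand-in module (not part of BioRuby) so the example runs
# without the bio gem.
module NucMatchSketch
  # ASSUMPTION: stand-in for is_gap?, which is defined outside this section.
  def is_gap?(s)
    (/[^a-zA-Z]/ =~ s) ? true : false
  end

  def has_gap?
    (find { |x| is_gap?(x) }) ? true : false
  end

  def match_line_nuc(opt = {})
    mlc = opt[:match_line_char] || '*'
    mmc = opt[:mismatch_char]   || ' '
    a = collect { |c| c.upcase }.sort.uniq
    a.extend(NucMatchSketch)
    if a.has_gap?
      mmc
    elsif a.size == 1
      mlc
    else
      mmc
    end
  end
end

# Hypothetical helper: split equal-length sequences into columns (sites)
# and join the per-site characters into one match line.
# Array#transpose raises IndexError if the sequences differ in length.
def nuc_match_line(seqs, opt = {})
  columns = seqs.map(&:chars).transpose
  columns.map { |site| site.extend(NucMatchSketch).match_line_nuc(opt) }.join
end

seqs = %w(ACG-T ACGAT aCTAT)
p nuc_match_line(seqs)                        # => "**  *"
p nuc_match_line(seqs, :mismatch_char => '.') # => "**..*"
```

Only fully identical, gap-free columns are marked. The comparison ignores case because each site is upcased first.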