nexus.rb

Path: lib/bio/db/nexus.rb
Last Update: Sun Dec 04 02:16:41 +0000 2011

bio/db/nexus.rb - Nexus Standard phylogenetic tree parser / formatter

Copyright: Copyright (C) 2006 Christian M Zmasek <cmzmasek@yahoo.com>
License: The Ruby License

$Id: nexus.rb,v 1.3 2007/04/05 23:35:40 trevor Exp $

Description

This file contains classes that implement a parser for NEXUS formatted data as well as objects to store, access, and write the parsed data.

Five block types are recognized and parsed: taxa, characters, distances, trees, and data.
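To make the notion of a block concrete, the following is a minimal sketch in plain Ruby that lists the block names found in a NEXUS string and separates the five recognized types from any others. It does not use Bio::Nexus; the sample input, the constants, and the `block_names` helper are hypothetical and exist only for illustration.

```ruby
# Hypothetical, made-up NEXUS input used only for illustration.
SAMPLE_NEXUS = <<~NEXUS
  #NEXUS
  Begin Taxa;
    Dimensions NTax=2;
    TaxLabels taxon_1 taxon_2;
  End;
  Begin Trees;
    Tree best = (taxon_1,taxon_2);
  End;
  Begin my_block;
    some private tokens;
  End;
NEXUS

# The five block types named in the description.
KNOWN_BLOCKS = %w[taxa characters distances trees data].freeze

# Returns the lowercased name of every "Begin <name>;" line, in file order.
# Illustrative helper only; not part of Bio::Nexus.
def block_names(nexus_text)
  nexus_text.scan(/^\s*begin\s+(\w+)\s*;/i).flatten.map(&:downcase)
end

names = block_names(SAMPLE_NEXUS)
known, other = names.partition { |n| KNOWN_BLOCKS.include?(n) }
puts "known: #{known.join(', ')}"
puts "other: #{other.join(', ')}"
```

In this sample, `my_block` is not one of the five recognized types, so it lands in the second group.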

The parser handles comments, including nested ones (delimited by square brackets), unless a comment appears inside a command or data item (e.g. "Dim[comment]ensions" or inside a matrix).
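Nested comments can be removed by tracking bracket depth. The sketch below shows the general idea in plain Ruby; it is not the parser's actual implementation, and `strip_nexus_comments` is a hypothetical helper. Because it works character by character, it would also strip a comment embedded in a command, which is a case the parser itself does not support.

```ruby
# Removes square-bracket comments, tracking nesting depth so that
# "[outer [inner] still outer]" is dropped as a single comment.
# Illustrative helper only; not part of Bio::Nexus.
def strip_nexus_comments(text)
  depth = 0
  out = +""
  text.each_char do |ch|
    if ch == "["
      depth += 1           # entering a (possibly nested) comment
    elsif ch == "]" && depth > 0
      depth -= 1           # leaving one level of comment
    elsif depth.zero?
      out << ch            # keep only text outside all comments
    end
  end
  out
end

puts strip_nexus_comments("Begin Taxa; [a [nested] comment] Dimensions NTax=2;")
```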

Single- or double-quoted TaxLabels have their quotes removed and their spaces replaced by underscores, for example: "mus musculus" -> mus_musculus
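The label transformation in the example can be sketched as follows. This is plain Ruby and not the library's code: `normalize_tax_label` is a hypothetical helper, and collapsing a run of several whitespace characters into a single underscore is an assumption that the one documented example does not settle.

```ruby
# Hypothetical helper: strips one pair of matching single or double quotes,
# then replaces each run of whitespace with one underscore (an assumption).
def normalize_tax_label(label)
  unquoted = label.strip.sub(/\A(["'])(.*)\1\z/m, '\2')
  unquoted.gsub(/\s+/, "_")
end

puts normalize_tax_label('"mus musculus"')
puts normalize_tax_label("'mus musculus'")
puts normalize_tax_label("taxon_2")
```

Labels without surrounding quotes, such as `taxon_2`, pass through unchanged.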

Usage

  require 'bio/db/nexus'

  # Create a new parser:
  nexus = Bio::Nexus.new( nexus_data_as_string )

  # Get first taxa block:
  taxa_block = nexus.get_taxa_blocks[ 0 ]
  # Get number of taxa:
  number_of_taxa = taxa_block.get_number_of_taxa.to_i
  # Get name of first taxon:
  first_taxon = taxa_block.get_taxa[ 0 ]

  # Get first data block:
  data_block = nexus.get_data_blocks[ 0 ]
  # Get name of first characters row:
  seq_name = data_block.get_row_name( 0 )
  # Get first characters row named "taxon_2" as Bio::Sequence sequence:
  seq_tax_2 = data_block.get_sequences_by_name( "taxon_2" )[ 0 ]
  # Get third characters row as Bio::Sequence sequence:
  seq_2 = data_block.get_sequence( 2 )
  # Get first characters row named "taxon_3" as String:
  string_tax_3 = data_block.get_characters_strings_by_name( "taxon_3" )[ 0 ]
  # Get name of first taxon:
  taxon_0 = data_block.get_taxa[ 0 ]
  # Get characters matrix as Bio::Nexus::NexusMatrix (names are in column 0)
  characters_matrix = data_block.get_matrix

  # Get first characters block (same methods as Nexus::DataBlock except
  # it lacks get_taxa method):
  characters_block = nexus.get_characters_blocks[ 0 ]

  # Get first trees block:
  trees_block = nexus.get_trees_blocks[ 0 ]
  # Get first tree named "best" as String:
  string_fish = trees_block.get_tree_strings_by_name( "best" )[ 0 ]
  # Get first tree named "best" as a tree object (parsed with Bio::Newick):
  tree_fish = trees_block.get_trees_by_name( "best" )[ 0 ]
  # Get first tree as a tree object (parsed with Bio::Newick):
  tree_first = trees_block.get_tree( 0 )

  # Get all distances blocks:
  distances_blocks = nexus.get_distances_blocks
  # Get matrix as Bio::Nexus::NexusMatrix object:
  matrix = distances_blocks[ 0 ].get_matrix
  # Get value at row 1, column 5 (column 0 holds the names):
  val = matrix.get_value( 1, 5 )

  # Get blocks for which no class exists (private blocks):
  private_blocks = nexus.get_blocks_by_name( "my_block" )
  # Get first block named "my_block":
  my_block_0 = private_blocks[ 0 ]
  # Get first token in first block named "my_block":
  first_token = my_block_0.get_tokens[ 0 ]

References

  • Maddison DR, Swofford DL, Maddison WP (1997). NEXUS: an extensible file format for systematic information. Syst Biol. 46(4):590-621.

Required files

bio/sequence   bio/tree   bio/db/newick  
