class Bio::Tree
This is the class for phylogenetic tree. It stores a phylogenetic tree.
Internally, it is based on Bio::Pathway
class. However, users cannot handle Bio::Pathway
object directly.
This is alpha version. Incompatible changes may be made frequently.
Constants
- DEFAULT_OPTIONS
default options
Attributes
tree options; mainly used for tree output
root node of this tree (even if unrooted tree, it is used by some methods)
Public Class Methods
Creates a new phylogenetic tree. When no arguments are given, it creates a new empty tree. When a Tree
object is given, it copies the tree. Note that the new tree shares Node
and Edge
objects with the given tree.
# File lib/bio/tree.rb 258 def initialize(tree = nil) 259 # creates an undirected adjacency list graph 260 @pathway = Bio::Pathway.new([], true) 261 @root = nil 262 @options = {} 263 _init_cache 264 self.concat(tree) if tree 265 end
Public Instance Methods
Adds a new edge to the tree. Returns the newly added edge. If the edge already exists, it is overwritten with new one.
# File lib/bio/tree.rb 380 def add_edge(source, target, edge = Edge.new) 381 _clear_cache 382 @pathway.append(Bio::Relation.new(source, target, edge)) 383 edge 384 end
Adds a node to the tree. Returns self. If the node already exists, it does nothing.
# File lib/bio/tree.rb 402 def add_node(node) 403 _clear_cache 404 @pathway.graph[node] ||= {} 405 self 406 end
Shows the adjacency matrix representation of the tree. It shows matrix only for given nodes. If nodes is nil or is ommitted, it acts the same as tree.adjacency_matrix(tree.nodes). If a block is given, for each edge, it yields source, target, and edge, and uses the returned value of the block. Without blocks, it uses edge. Returns a matrix object.
# File lib/bio/tree.rb 822 def adjacency_matrix(nodes = nil, 823 default_value = nil, 824 diagonal_value = nil) #:yields: source, target, edge 825 nodes ||= self.nodes 826 size = nodes.size 827 hash = {} 828 nodes.each_with_index { |x, i| hash[x] = i } 829 # prepares an matrix 830 matrix = Array.new(size, nil) 831 matrix.collect! { |x| Array.new(size, default_value) } 832 (0...size).each { |i| matrix[i][i] = diagonal_value } 833 # fills the matrix from each edge 834 self.each_edge do |source, target, edge| 835 i_source = hash[source] 836 i_target = hash[target] 837 if i_source and i_target then 838 val = block_given? ? (yield source, target, edge) : edge 839 matrix[i_source][i_target] = val 840 matrix[i_target][i_source] = val 841 end 842 end 843 Matrix.rows(matrix, false) 844 end
Returns an array of adjacent nodes of the given node.
# File lib/bio/tree.rb 331 def adjacent_nodes(node) 332 h = @pathway.graph[node] 333 h ? h.keys : [] 334 end
Gets all ancestral nodes of the node. If root isn’t specified or root is nil
, @root is used. Returns an array of Node
s. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 757 def ancestors(node, root = nil) 758 root ||= @root 759 (self.path(root, node) - [ node ]).reverse 760 end
Gets the adjacent children nodes of the node. If root isn’t specified or root is nil
, @root is used. Returns an array of Node
s. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 701 def children(node, root = nil) 702 root ||= @root 703 c = self.adjacent_nodes(node) 704 c.delete(self.parent(node, root)) 705 c 706 end
Clears all nodes and edges. Returns self. Note that options and root are also cleared.
# File lib/bio/tree.rb 289 def clear 290 initialize 291 self 292 end
Removes all edges connected with the node. Returns self. If the node does not exist, raises IndexError.
# File lib/bio/tree.rb 417 def clear_node(node) 418 unless self.include?(node) 419 raise IndexError, 'the node does not exist' 420 end 421 _clear_cache 422 @pathway.relations.delete_if do |rel| 423 rel.node.include?(node) 424 end 425 @pathway.graph[node].each_key do |k| 426 @pathway.graph[k].delete(node) 427 end 428 @pathway.graph[node].clear 429 self 430 end
Replaces each edge by each block’s return value. Returns self.
# File lib/bio/tree.rb 527 def collect_edge! #:yields: source, target, edge 528 _clear_cache 529 @pathway.relations.each do |rel| 530 newedge = yield rel.node[0], rel.node[1], rel.relation 531 rel.edge = newedge 532 @pathway.append(rel, false) 533 end 534 self 535 end
Replaces each node by each block’s return value. Returns self.
# File lib/bio/tree.rb 506 def collect_node! #:yields: node 507 _clear_cache 508 tr = {} 509 self.each_node do |node| 510 tr[node] = yield node 511 end 512 # replaces nodes in @pathway.relations 513 @pathway.relations.each do |rel| 514 rel.node.collect! { |node| tr[node] } 515 end 516 # re-generates @pathway from relations 517 @pathway.to_list 518 # adds orphan nodes 519 tr.each_value do |newnode| 520 @pathway.graph[newnode] ||= {} 521 end 522 self 523 end
Concatenates the other tree. If the same edge exists, the edge in other is used. Returns self. The result is unspecified if other isn’t a Tree
object. Note that the Node
and Edge
objects in the other tree are shared in the concatinated tree.
# File lib/bio/tree.rb 595 def concat(other) 596 #raise TypeError unless other.kind_of?(self.class) 597 _clear_cache 598 other.each_node do |node| 599 self.add_node(node) 600 end 601 other.each_edge do |node1, node2, edge| 602 self.add_edge(node1, node2, edge) 603 end 604 self 605 end
Gets all descendent nodes of the node. If root isn’t specified or root is nil
, @root is used. Returns an array of Node
s. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 712 def descendents(node, root = nil) 713 root ||= @root 714 distance, route = @pathway.breadth_first_search(root) 715 d = distance[node] 716 result = [] 717 distance.each do |key, val| 718 if val > d then 719 x = key 720 while x = route[x] 721 if x == node then 722 result << key 723 break 724 end 725 break if distance[x] <= d 726 end 727 end 728 end 729 result 730 end
Returns distance between node1 and node2. It would raise error if the edges didn’t contain distance values. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 640 def distance(node1, node2) 641 distance = 0 642 self.each_edge_in_path(node1, node2) do |source, target, edge| 643 distance += get_edge_distance(edge) 644 end 645 distance 646 end
Calculates distance matrix of given nodes. If nodes is nil, or is ommited, it acts the same as tree.distance_matrix(tree.leaves). Returns a matrix object. The result is unspecified for cyclic trees. Note 1: The diagonal values of the matrix are 0. Note 2: If the distance cannot be calculated, nil will be set.
# File lib/bio/tree.rb 793 def distance_matrix(nodes = nil) 794 nodes ||= self.leaves 795 matrix = [] 796 nodes.each_index do |i| 797 row = [] 798 nodes.each_index do |j| 799 if i == j then 800 distance = 0 801 elsif r = matrix[j] and val = r[i] then 802 distance = val 803 else 804 distance = (self.distance(nodes[i], nodes[j]) rescue nil) 805 end 806 row << distance 807 end 808 matrix << row 809 end 810 Matrix.rows(matrix, false) 811 end
Iterates over each edges of this tree.
# File lib/bio/tree.rb 311 def each_edge #:yields: source, target, edge 312 @pathway.relations.each do |rel| 313 yield rel.node[0], rel.node[1], rel.relation 314 end 315 self 316 end
Iterates over each edge from node1 to node2. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 626 def each_edge_in_path(node1, node2) 627 path = self.path(node1, node2) 628 source = path.shift 629 path.each do |target| 630 edge = self.get_edge(source, target) 631 yield source, target, edge 632 source = target 633 end 634 self 635 end
Iterates over each node of this tree.
# File lib/bio/tree.rb 305 def each_node(&x) #:yields: node 306 @pathway.graph.each_key(&x) 307 self 308 end
Iterates over each connected edges of the given node. Returns self.
The reason why the method name is “each_out_edge” is that it comes from the Boost Graph Library.
# File lib/bio/tree.rb 355 def each_out_edge(source) #:yields: source, target, edge 356 h = @pathway.graph[source] 357 h.each { |key, val| yield source, key, val } if h 358 self 359 end
Returns all edges an array of [ node0, node1, edge ]
# File lib/bio/tree.rb 319 def edges 320 @pathway.relations.collect do |rel| 321 [ rel.node[0], rel.node[1], rel.relation ] 322 end 323 end
Returns an edge from source to target. If source and target are not adjacent nodes, returns nil.
# File lib/bio/tree.rb 372 def get_edge(source, target) 373 h = @pathway.graph[source] 374 h ? h[target] : nil 375 end
Gets distance value from the given edge. Returns float or any other numeric value or nil.
# File lib/bio/tree.rb 100 def get_edge_distance(edge) 101 begin 102 dist = edge.distance 103 rescue NoMethodError 104 dist = edge 105 end 106 dist 107 end
Gets distance string from the given edge. Returns a string or nil.
# File lib/bio/tree.rb 111 def get_edge_distance_string(edge) 112 begin 113 dist = edge.distance_string 114 rescue NoMethodError 115 dist = (edge ? edge.to_s : nil) 116 end 117 dist 118 end
Returns edge1 + edge2
# File lib/bio/tree.rb 121 def get_edge_merged(edge1, edge2) 122 dist1 = get_edge_distance(edge1) 123 dist2 = get_edge_distance(edge2) 124 if dist1 and dist2 then 125 Edge.new(dist1 + dist2) 126 elsif dist1 then 127 Edge.new(dist1) 128 elsif dist2 then 129 Edge.new(dist2) 130 else 131 Edge.new 132 end 133 end
# File lib/bio/tree.rb 237 def get_node_bootstrap(node) 238 begin 239 node.bootstrap 240 rescue NoMethodError 241 nil 242 end 243 end
# File lib/bio/tree.rb 245 def get_node_bootstrap_string(node) 246 begin 247 node.bootstrap_string 248 rescue NoMethodError 249 nil 250 end 251 end
Finds a node in the tree by given name and returns the node. If the node does not found, returns nil. If multiple nodes with the same name exist, the result would be one of those (unspecified).
# File lib/bio/tree.rb 390 def get_node_by_name(str) 391 self.each_node do |node| 392 if get_node_name(node) == str 393 return node 394 end 395 end 396 nil 397 end
Gets node name
# File lib/bio/tree.rb 229 def get_node_name(node) 230 begin 231 node.name 232 rescue NoMethodError 233 node.to_s 234 end 235 end
If the node exists, returns true. Otherwise, returns false.
# File lib/bio/tree.rb 410 def include?(node) 411 @pathway.graph[node] ? true : false 412 end
Insert a new node between adjacent nodes node1 and node2. The old edge between node1 and node2 are changed to the edge between new_node and node2. The edge between node1 and new_node is newly created.
If new_distance is specified, the distance between node1 and new_node is set to new_distance, and distance between new_node and node2 is set to tree.get_edge(node1, node2).distance - new_distance
.
Returns self. If node1 and node2 are not adjacent, raises IndexError.
If new_node already exists in the tree, the tree would become circular. In addition, if the edge between new_node and node1 (or node2) already exists, it will be erased.
# File lib/bio/tree.rb 890 def insert_node(node1, node2, new_node, new_distance = nil) 891 unless edge = self.get_edge(node1, node2) then 892 raise IndexError, 'nodes not found or two nodes are not adjacent' 893 end 894 _clear_cache 895 new_edge = Edge.new(new_distance) 896 self.remove_edge(node1, node2) 897 self.add_edge(node1, new_node, new_edge) 898 if new_distance and old_distance = get_edge_distance(edge) then 899 old_distance -= new_distance 900 begin 901 edge.distance = old_distance 902 rescue NoMethodError 903 edge = old_distance 904 end 905 end 906 self.add_edge(new_node, node2, edge) 907 self 908 end
If node is nil, returns an array of all leaves (nodes connected with one edge). Otherwise, gets all descendent leaf nodes of the node. If root isn’t specified or root is nil
, @root is used. Returns an array of Node
s. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 738 def leaves(node = nil, root = nil) 739 unless node then 740 nodes = [] 741 self.each_node do |x| 742 nodes << x if self.out_degree(x) == 1 743 end 744 return nodes 745 else 746 root ||= @root 747 self.descendents(node, root).find_all do |x| 748 self.adjacent_nodes(x).size == 1 749 end 750 end 751 end
Gets the lowest common ancestor of the two nodes. If root isn’t specified or root is nil
, @root is used. Returns a Node
object or nil. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 766 def lowest_common_ancestor(node1, node2, root = nil) 767 root ||= @root 768 _, route = @pathway.breadth_first_search(root) 769 x = node1; r1 = [] 770 begin; r1 << x; end while x = route[x] 771 x = node2; r2 = [] 772 begin; r2 << x; end while x = route[x] 773 return (r1 & r2).first 774 end
Returns all nodes as an array.
# File lib/bio/tree.rb 295 def nodes 296 @pathway.graph.keys 297 end
Returns number of edges in the tree.
# File lib/bio/tree.rb 326 def number_of_edges 327 @pathway.relations.size 328 end
Number of nodes.
# File lib/bio/tree.rb 300 def number_of_nodes 301 @pathway.nodes 302 end
Returns number of edges in the given node.
The reason why the method name is “out_degree” is that it comes from the Boost Graph Library.
# File lib/bio/tree.rb 365 def out_degree(source) 366 h = @pathway.graph[source] 367 h ? h.size : 0 368 end
Returns all connected edges with adjacent nodes. Returns an array of the array [ source, target, edge ].
The reason why the method name is “out_edges” is that it comes from the Boost Graph Library.
# File lib/bio/tree.rb 341 def out_edges(source) 342 h = @pathway.graph[source] 343 if h 344 h.collect { |key, val| [ source, key, val ] } 345 else 346 [] 347 end 348 end
Returns formatted text (or something) of the tree Currently supported format is: :newick, :nhx
# File lib/bio/tree/output.rb 230 def output(format, *arg, &block) 231 case format 232 when :newick 233 output_newick(*arg, &block) 234 when :nhx 235 output_nhx(*arg, &block) 236 when :phylip_distance_matrix 237 output_phylip_distance_matrix(*arg, &block) 238 else 239 raise 'Unknown format' 240 end 241 end
Returns a newick formatted string. If block is given, the order of the node is sorted (as the same manner as Enumerable#sort).
Available options:
:indent
-
indent string; set false to disable (default: ‘ ’)
:bootstrap_style
-
:disabled
disables bootstrap representations.:traditional
for traditional style.:molphy
for Molphy style (default).
# File lib/bio/tree/output.rb 198 def output_newick(options = {}, &block) #:yields: node1, node2 199 root = @root 200 root ||= self.nodes.first 201 return '();' unless root 202 __to_newick([], root, 0, :__to_newick_format_leaf, options, &block) + 203 __to_newick_format_leaf(root, Edge.new, options) + 204 ";\n" 205 end
Returns a NHX (New Hampshire eXtended) formatted string. If block is given, the order of the node is sorted (as the same manner as Enumerable#sort).
Available options:
:indent
-
indent string; set false to disable (default: ‘ ’)
# File lib/bio/tree/output.rb 218 def output_nhx(options = {}, &block) #:yields: node1, node2 219 root = @root 220 root ||= self.nodes.first 221 return '();' unless root 222 __to_newick([], root, 0, 223 :__to_newick_format_leaf_NHX, options, &block) + 224 __to_newick_format_leaf_NHX(root, Edge.new, options) + 225 ";\n" 226 end
Generates phylip-style distance matrix as a string. if nodes is not given, all leaves in the tree are used. If the names of some of the given (or default) nodes are not defined or are empty, the names are automatically generated.
# File lib/bio/tree/output.rb 251 def output_phylip_distance_matrix(nodes = nil, options = {}) 252 nodes = self.leaves unless nodes 253 names = nodes.collect do |x| 254 y = get_node_name(x) 255 y = sprintf("%x", x.__id__.abs) if y.empty? 256 y 257 end 258 m = self.distance_matrix(nodes) 259 Bio::Phylip::DistanceMatrix.generate(m, names, options) 260 end
Gets the parent node of the node. If root isn’t specified or root is nil
, @root is used. Returns an Node
object or nil. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 687 def parent(node, root = nil) 688 root ||= @root 689 raise IndexError, 'can not get parent for unrooted tree' unless root 690 unless ret = _get_cached_parent(node, root) then 691 ret = self.path(root, node)[-2] 692 _cache_parent(node, ret, root) 693 end 694 ret 695 end
Gets path from node1 to node2. Returns an array of nodes, including node1 and node2. If node1 and/or node2 do not exist, IndexError is raised. If node1 and node2 are not connected, NoPathError
is raised. The result is unspecified for cyclic trees.
# File lib/bio/tree.rb 612 def path(node1, node2) 613 raise IndexError, 'node1 not found' unless @pathway.graph[node1] 614 raise IndexError, 'node2 not found' unless @pathway.graph[node2] 615 return [ node1 ] if node1 == node2 616 return [ node1, node2 ] if @pathway.graph[node1][node2] 617 _, path = @pathway.bfs_shortest_path(node1, node2) 618 unless path[0] == node1 and path[-1] == node2 then 619 raise NoPathError, 'node1 and node2 are not connected' 620 end 621 path 622 end
# Removes an edge between source and target. # Returns self. # If the edge does not exist, raises IndexError. +
# File lib/bio/tree.rb 465 def remove_edge(source, target) 466 unless self.get_edge(source, target) then 467 raise IndexError, 'edge not found' 468 end 469 _clear_cache 470 fwd = [ source, target ] 471 rev = [ target, source ] 472 @pathway.relations.delete_if do |rel| 473 rel.node == fwd or rel.node == rev 474 end 475 h = @pathway.graph[source] 476 h.delete(target) if h 477 h = @pathway.graph[target] 478 h.delete(source) if h 479 self 480 end
Removes each edge if the block returns not nil. Returns self.
# File lib/bio/tree.rb 484 def remove_edge_if #:yields: source, target, edge 485 _clear_cache 486 removed_rel = [] 487 @pathway.relations.delete_if do |rel| 488 if yield rel.node[0], rel.node[1], rel.edge then 489 removed_rel << rel 490 true 491 end 492 end 493 removed_rel.each do |rel| 494 source = rel.node[0] 495 target = rel.node[1] 496 h = @pathway.graph[source] 497 h.delete(target) if h 498 h = @pathway.graph[target] 499 h.delete(source) if h 500 end 501 self 502 end
Removes the given node from the tree. All edges connected with the node are also removed. Returns self. If the node does not exist, raises IndexError.
# File lib/bio/tree.rb 436 def remove_node(node) 437 #_clear_cache #done in clear_node(node) 438 self.clear_node(node) 439 @pathway.graph.delete(node) 440 self 441 end
Removes each node if the block returns not nil. All edges connected with the removed nodes are also removed. Returns self.
# File lib/bio/tree.rb 446 def remove_node_if 447 #_clear_cache #done in clear_node(node) 448 all = self.nodes 449 all.each do |node| 450 if yield node then 451 self.clear_node(node) 452 @pathway.graph.delete(node) 453 end 454 end 455 self 456 end
Removes all nodes that are not branches nor leaves. That is, removes nodes connected with exactly two edges. For each removed node, two adjacent edges are merged and a new edge are created. Returns removed nodes. Note that orphan nodes are still kept unchanged.
# File lib/bio/tree.rb 852 def remove_nonsense_nodes 853 _clear_cache 854 hash = {} 855 self.each_node do |node| 856 hash[node] = true if @pathway.graph[node].size == 2 857 end 858 hash.each_key do |node| 859 adjs = @pathway.graph[node].keys 860 edges = @pathway.graph[node].values 861 new_edge = get_edge_merged(edges[0], edges[1]) 862 @pathway.graph[adjs[0]].delete(node) 863 @pathway.graph[adjs[1]].delete(node) 864 @pathway.graph.delete(node) 865 @pathway.append(Bio::Relation.new(adjs[0], adjs[1], new_edge)) 866 end 867 #@pathway.to_relations 868 @pathway.relations.reject! do |rel| 869 hash[rel.node[0]] or hash[rel.node[1]] 870 end 871 return hash.keys 872 end
Gets the sub-tree consisted of given nodes. nodes must be an array of nodes. Nodes that do not exist in the original tree are ignored. Returns a Tree
object. Note that the sub-tree shares Node
and Edge
objects with the original tree.
# File lib/bio/tree.rb 543 def subtree(nodes) 544 nodes = nodes.find_all do |x| 545 @pathway.graph[x] 546 end 547 return self.class.new if nodes.empty? 548 # creates subtree 549 new_tree = self.class.new 550 nodes.each do |x| 551 new_tree.add_node(x) 552 end 553 self.each_edge do |node1, node2, edge| 554 if new_tree.include?(node1) and new_tree.include?(node2) then 555 new_tree.add_edge(node1, node2, edge) 556 end 557 end 558 return new_tree 559 end
Gets the sub-tree consisted of given nodes and all internal nodes connected between given nodes. nodes must be an array of nodes. Nodes that do not exist in the original tree are ignored. Returns a Tree
object. The result is unspecified for cyclic trees. Note that the sub-tree shares Node
and Edge
objects with the original tree.
# File lib/bio/tree.rb 569 def subtree_with_all_paths(nodes) 570 hash = {} 571 nodes.each { |x| hash[x] = true } 572 nodes.each_index do |i| 573 node1 = nodes[i] 574 (0...i).each do |j| 575 node2 = nodes[j] 576 unless node1 == node2 then 577 begin 578 path = self.path(node1, node2) 579 rescue IndexError, NoPathError 580 path = [] 581 end 582 path.each { |x| hash[x] = true } 583 end 584 end 585 end 586 self.subtree(hash.keys) 587 end
Returns total distance of all edges. It would raise error if some edges didn’t contain distance values.
# File lib/bio/tree.rb 778 def total_distance 779 distance = 0 780 self.each_edge do |source, target, edge| 781 distance += get_edge_distance(edge) 782 end 783 distance 784 end