class Bio::Pathway
Bio::Pathway
is a general graph object initially constructed by the list of the ((<Bio::Relation>)) objects. The basic concept of the Bio::Pathway
object is to store a graph as an adjacency list (in the instance variable @graph), and converting the list into an adjacency matrix by calling to_matrix
method on demand. However, in some cases, it is convenient to have the original list of the ((<Bio::Relation>))s, Bio::Pathway
object also stores the list (as the instance variable @relations) redundantly.
Note: you can clear the @relations list by calling clear_relations! method to reduce the memory usage, and the content of the @relations can be re-generated from the @graph by to_relations
method.
Attributes
Read-only accessor for the adjacency list of the graph.
Read-only accessor for the row/column index (@index) of the adjacency matrix. Contents of the hash @index is created by calling to_matrix
method.
Accessor for the hash of the label assigned to the each node. You can label some of the nodes in the graph by passing a hash to the label and select subgraphs which contain labeled nodes only by subgraph method.
hash = { 1 => 'red', 2 => 'green', 5 => 'black' } g.label = hash g.label g.subgraph # => new graph consists of the node 1, 2, 5 only
Read-only accessor for the internal list of the Bio::Relation
objects
Public Class Methods
Initial graph (adjacency list) generation from the list of Relation
.
Generate Bio::Pathway
object from the list of Bio::Relation
objects. If the second argument is true, undirected graph is generated.
r1 = Bio::Relation.new('a', 'b', 1) r2 = Bio::Relation.new('a', 'c', 5) r3 = Bio::Relation.new('b', 'c', 3) list = [ r1, r2, r3 ] g = Bio::Pathway.new(list, 'undirected')
# File lib/bio/pathway.rb 41 def initialize(relations, undirected = false) 42 @undirected = undirected 43 @relations = relations 44 @graph = {} # adjacency list expression of the graph 45 @index = {} # numbering each node in matrix 46 @label = {} # additional information on each node 47 self.to_list # generate adjacency list 48 end
Public Instance Methods
Add an Bio::Relation
object ‘rel’ to the @graph and @relations. If the second argument is false, @relations is not modified (only useful when genarating @graph from @relations internally).
# File lib/bio/pathway.rb 144 def append(rel, add_rel = true) 145 @relations.push(rel) if add_rel 146 if @graph[rel.from].nil? 147 @graph[rel.from] = {} 148 end 149 if @graph[rel.to].nil? 150 @graph[rel.to] = {} 151 end 152 @graph[rel.from][rel.to] = rel.relation 153 @graph[rel.to][rel.from] = rel.relation if @undirected 154 end
Bellman-Ford method for solving the single-source shortest-paths problem in the graph in which edge weights can be negative.
# File lib/bio/pathway.rb 592 def bellman_ford(root) 593 distance, predecessor = initialize_single_source(root) 594 (self.nodes - 1).times do 595 @graph.each_key do |u| 596 @graph[u].each do |v, w| 597 # relaxing procedure of root -> 'u' -> 'v' 598 if distance[v] > distance[u] + w 599 distance[v] = distance[u] + w 600 predecessor[v] = u 601 end 602 end 603 end 604 end 605 # negative cyclic loop check 606 @graph.each_key do |u| 607 @graph[u].each do |v, w| 608 if distance[v] > distance[u] + w 609 return false 610 end 611 end 612 end 613 return distance, predecessor 614 end
Calculates the shortest path between two nodes by using breadth_first_search
method and returns steps and the path as Array.
# File lib/bio/pathway.rb 450 def bfs_shortest_path(node1, node2) 451 distance, route = breadth_first_search(node1) 452 step = distance[node2] 453 node = node2 454 path = [ node2 ] 455 while node != node1 and route[node] 456 node = route[node] 457 path.unshift(node) 458 end 459 return step, path 460 end
Breadth first search solves steps and path to the each node and forms a tree contains all reachable vertices from the root node. This method returns the result in 2 hashes - 1st one shows the steps from root node and 2nd hash shows the structure of the tree.
The weight of the edges are not considered in this method.
# File lib/bio/pathway.rb 419 def breadth_first_search(root) 420 visited = {} 421 distance = {} 422 predecessor = {} 423 424 visited[root] = true 425 distance[root] = 0 426 predecessor[root] = nil 427 428 queue = [ root ] 429 430 while from = queue.shift 431 next unless @graph[from] 432 @graph[from].each_key do |to| 433 unless visited[to] 434 visited[to] = true 435 distance[to] = distance[from] + 1 436 predecessor[to] = from 437 queue.push(to) 438 end 439 end 440 end 441 return distance, predecessor 442 end
Clear @relations array to reduce the memory usage.
# File lib/bio/pathway.rb 114 def clear_relations! 115 @relations.clear 116 end
Not implemented yet.
# File lib/bio/pathway.rb 370 def clique 371 raise NotImplementedError 372 end
Returns completeness of the edge density among the surrounded nodes.
Calculates the value of cliquishness around the ‘node’. This value indicates completeness of the edge density among the surrounded nodes.
Note: cliquishness (clustering coefficient) for a directed graph is also calculated. Reference: en.wikipedia.org/wiki/Clustering_coefficient
Note: Cliquishness (clustering coefficient) for a node that has only one neighbor node is undefined. Currently, it returns NaN, but the behavior may be changed in the future.
# File lib/bio/pathway.rb 388 def cliquishness(node) 389 neighbors = @graph[node].keys 390 sg = subgraph(neighbors) 391 if sg.graph.size != 0 392 edges = sg.edges 393 nodes = neighbors.size 394 complete = (nodes * (nodes - 1)) 395 return edges.quo(complete) 396 else 397 return 0.0 398 end 399 end
Not implemented yet.
# File lib/bio/pathway.rb 364 def common_subgraph(graph) 365 raise NotImplementedError 366 end
Remove an edge indicated by the Bio::Relation
object ‘rel’ from the @graph and the @relations.
# File lib/bio/pathway.rb 158 def delete(rel) 159 @relations.delete_if do |x| 160 x === rel 161 end 162 @graph[rel.from].delete(rel.to) 163 @graph[rel.to].delete(rel.from) if @undirected 164 end
Depth first search yields much information about the structure of the graph especially on the classification of the edges. This method returns 5 hashes - 1st one shows the timestamps of each node containing the first discoverd time and the search finished time in an array. The 2nd, 3rd, 4th, and 5th hashes contain ‘tree edges’, ‘back edges’, ‘cross edges’, ‘forward edges’ respectively.
If $DEBUG is true (e.g. ruby -d), this method prints the progression of the search.
The weight of the edges are not considered in this method.
Note: The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this bahavior might be changed in the future.
# File lib/bio/pathway.rb 481 def depth_first_search 482 visited = {} 483 timestamp = {} 484 tree_edges = {} 485 back_edges = {} 486 cross_edges = {} 487 forward_edges = {} 488 count = 0 489 490 # begin workaround removing depencency to order of Hash#each 491 if @index.empty? then 492 preference_of_nodes = nil 493 else 494 preference_of_nodes = {}.merge(@index) 495 i = preference_of_nodes.values.max 496 @graph.each_key do |node0| 497 preference_of_nodes[node0] ||= (i += 1) 498 end 499 end 500 # end workaround removing depencency to order of Hash#each 501 502 dfs_visit = Proc.new { |from| 503 visited[from] = true 504 timestamp[from] = [count += 1] 505 ary = @graph[from].keys 506 # begin workaround removing depencency to order of Hash#each 507 if preference_of_nodes then 508 ary = ary.sort_by { |node0| preference_of_nodes[node0] } 509 end 510 # end workaround removing depencency to order of Hash#each 511 ary.each do |to| 512 if visited[to] 513 if timestamp[to].size > 1 514 if timestamp[from].first < timestamp[to].first 515 # forward edge (black) 516 p "#{from} -> #{to} : forward edge" if $DEBUG 517 forward_edges[from] = to 518 else 519 # cross edge (black) 520 p "#{from} -> #{to} : cross edge" if $DEBUG 521 cross_edges[from] = to 522 end 523 else 524 # back edge (gray) 525 p "#{from} -> #{to} : back edge" if $DEBUG 526 back_edges[from] = to 527 end 528 else 529 # tree edge (white) 530 p "#{from} -> #{to} : tree edge" if $DEBUG 531 tree_edges[to] = from 532 dfs_visit.call(to) 533 end 534 end 535 timestamp[from].push(count += 1) 536 } 537 538 ary = @graph.keys 539 # begin workaround removing depencency to order of Hash#each 540 if preference_of_nodes then 541 ary = ary.sort_by { |node0| preference_of_nodes[node0] } 542 end 543 # end workaround removing depencency to order of Hash#each 544 ary.each do |node| 545 unless visited[node] 546 dfs_visit.call(node) 547 end 548 end 549 return timestamp, tree_edges, back_edges, cross_edges, forward_edges 550 end
Topological sort of the directed acyclic graphs (“dags”) by using depth_first_search.
# File lib/bio/pathway.rb 558 def dfs_topological_sort 559 # sorted by finished time reversely and collect node names only 560 timestamp, = self.depth_first_search 561 timestamp.sort {|a,b| b[1][1] <=> a[1][1]}.collect {|x| x.first } 562 end
Dijkstra method to solve the shortest path problem in the weighted graph.
# File lib/bio/pathway.rb 566 def dijkstra(root) 567 distance, predecessor = initialize_single_source(root) 568 @graph[root].each do |k, v| 569 distance[k] = v 570 predecessor[k] = root 571 end 572 queue = distance.dup 573 queue.delete(root) 574 575 while queue.size != 0 576 min = queue.min {|a, b| a[1] <=> b[1]} 577 u = min[0] # extranct a node having minimal distance 578 @graph[u].each do |k, v| 579 # relaxing procedure of root -> 'u' -> 'k' 580 if distance[k] > distance[u] + v 581 distance[k] = distance[u] + v 582 predecessor[k] = u 583 end 584 end 585 queue.delete(u) 586 end 587 return distance, predecessor 588 end
Changes the internal state of the graph from ‘undirected’ to ‘directed’ and re-generate adjacency list. The undirected graph can be converted to directed graph, however, the edge between two nodes will be simply doubled to both ends.
Note: this method can not be used without the list of the Bio::Relation
objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations
first.
# File lib/bio/pathway.rb 92 def directed 93 if undirected? 94 @undirected = false 95 self.to_list 96 end 97 end
Returns true or false respond to the internal state of the graph.
# File lib/bio/pathway.rb 74 def directed? 75 @undirected ? false : true 76 end
Pretty printer of the adjacency list.
Useful when you want to check the internal state of the adjacency list (for debug purpose etc.) easily.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future.
# File lib/bio/pathway.rb 293 def dump_list 294 # begin workaround removing depencency to order of Hash#each 295 if @index.empty? then 296 pref = nil 297 enum = @graph 298 else 299 pref = {}.merge(@index) 300 i = pref.values.max 301 @graph.each_key do |node| 302 pref[node] ||= (i += 1) 303 end 304 graph_to_a = @graph.to_a 305 graph_to_a.sort! { |x, y| pref[x[0]] <=> pref[y[0]] } 306 enum = graph_to_a 307 end 308 # end workaround removing depencency to order of Hash#each 309 310 list = "" 311 enum.each do |from, hash| 312 list << "#{from} => " 313 # begin workaround removing depencency to order of Hash#each 314 if pref then 315 ary = hash.to_a 316 ary.sort! { |x,y| pref[x[0]] <=> pref[y[0]] } 317 hash = ary 318 end 319 # end workaround removing depencency to order of Hash#each 320 a = [] 321 hash.each do |to, relation| 322 a.push("#{to} (#{relation})") 323 end 324 list << a.join(", ") + "\n" 325 end 326 list 327 end
Pretty printer of the adjacency matrix.
The dump_matrix
method accepts the same arguments as to_matrix. Useful when you want to check the internal state of the matrix (for debug purpose etc.) easily.
This method internally calls to_matrix
method. Read documents of to_matrix
for important informations.
# File lib/bio/pathway.rb 274 def dump_matrix(*arg) 275 matrix = self.to_matrix(*arg) 276 sorted = @index.sort {|a,b| a[1] <=> b[1]} 277 "[# " + sorted.collect{|x| x[0]}.join(", ") + "\n" + 278 matrix.to_a.collect{|row| ' ' + row.inspect}.join(",\n") + "\n]" 279 end
Returns the number of the edges in the graph.
# File lib/bio/pathway.rb 172 def edges 173 edges = 0 174 @graph.each_value do |v| 175 edges += v.size 176 end 177 edges 178 end
Floyd-Wardshall alogrithm for solving the all-pairs shortest-paths problem on a directed graph G = (V, E).
# File lib/bio/pathway.rb 619 def floyd_warshall 620 inf = 1 / 0.0 621 622 m = self.to_matrix(inf, 0) 623 d = m.dup 624 n = self.nodes 625 for k in 0 .. n - 1 do 626 for i in 0 .. n - 1 do 627 for j in 0 .. n - 1 do 628 if d[i, j] > d[i, k] + d[k, j] 629 d[i, j] = d[i, k] + d[k, j] 630 end 631 end 632 end 633 end 634 return d 635 end
Kruskal method for finding minimam spaninng trees
# File lib/bio/pathway.rb 641 def kruskal 642 # initialize 643 rel = self.to_relations.sort{|a, b| a <=> b} 644 index = [] 645 for i in 0 .. (rel.size - 1) do 646 for j in (i + 1) .. (rel.size - 1) do 647 if rel[i] == rel[j] 648 index << j 649 end 650 end 651 end 652 index.sort{|x, y| y<=>x}.each do |idx| 653 rel[idx, 1] = [] 654 end 655 mst = [] 656 seen = Hash.new() 657 @graph.each_key do |x| 658 seen[x] = nil 659 end 660 i = 1 661 # initialize end 662 663 rel.each do |r| 664 if seen[r.node[0]] == nil 665 seen[r.node[0]] = 0 666 end 667 if seen[r.node[1]] == nil 668 seen[r.node[1]] = 0 669 end 670 if seen[r.node[0]] == seen[r.node[1]] && seen[r.node[0]] == 0 671 mst << r 672 seen[r.node[0]] = i 673 seen[r.node[1]] = i 674 elsif seen[r.node[0]] != seen[r.node[1]] 675 mst << r 676 v1 = seen[r.node[0]].dup 677 v2 = seen[r.node[1]].dup 678 seen.each do |k, v| 679 if v == v1 || v == v2 680 seen[k] = i 681 end 682 end 683 end 684 i += 1 685 end 686 return Pathway.new(mst) 687 end
Returns the number of the nodes in the graph.
# File lib/bio/pathway.rb 167 def nodes 168 @graph.keys.length 169 end
Returns frequency of the nodes having same number of edges as hash
Calculates the frequency of the nodes having the same number of edges and returns the value as Hash.
# File lib/bio/pathway.rb 405 def small_world 406 freq = Hash.new(0) 407 @graph.each_value do |v| 408 freq[v.size] += 1 409 end 410 return freq 411 end
Select labeled nodes and generate subgraph
This method select some nodes and returns new Bio::Pathway
object consists of selected nodes only. If the list of the nodes (as Array) is assigned as the argument, use the list to select the nodes from the graph. If no argument is assigned, internal property of the graph @label is used to select the nodes.
hash = { 'a' => 'secret', 'b' => 'important', 'c' => 'important' } g.label = hash g.subgraph list = [ 'a', 'b', 'c' ] g.subgraph(list)
# File lib/bio/pathway.rb 343 def subgraph(list = nil) 344 if list 345 @label.clear 346 list.each do |node| 347 @label[node] = true 348 end 349 end 350 sub_graph = Pathway.new([], @undirected) 351 @graph.each do |from, hash| 352 next unless @label[from] 353 sub_graph.graph[from] ||= {} 354 hash.each do |to, relation| 355 next unless @label[to] 356 sub_graph.append(Relation.new(from, to, relation)) 357 end 358 end 359 return sub_graph 360 end
Graph (adjacency list) generation from the Relations
Generate the adjcancecy list @graph from @relations (called by initialize and in some other cases when @relations has been changed).
# File lib/bio/pathway.rb 134 def to_list 135 @graph.clear 136 @relations.each do |rel| 137 append(rel, false) # append to @graph without push to @relations 138 end 139 end
Convert adjacency list to adjacency matrix
Returns the adjacency matrix expression of the graph as a Matrix object. If the first argument was assigned, the matrix will be filled with the given value. The second argument indicates the value of the diagonal constituents of the matrix besides the above.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future. Be careful that @index is overwritten by this method.
# File lib/bio/pathway.rb 196 def to_matrix(default_value = nil, diagonal_value = nil) 197 198 #-- 199 # Note: following code only fills the outer Array with the reference 200 # to the same inner Array object. 201 # 202 # matrix = Array.new(nodes, Array.new(nodes)) 203 # 204 # so create a new Array object for each row as follows: 205 #++ 206 207 matrix = Array.new 208 nodes.times do 209 matrix.push(Array.new(nodes, default_value)) 210 end 211 212 if diagonal_value 213 nodes.times do |i| 214 matrix[i][i] = diagonal_value 215 end 216 end 217 218 # assign index number 219 if @index.empty? then 220 # assign index number for each node 221 @graph.keys.each_with_index do |k, i| 222 @index[k] = i 223 end 224 else 225 # begin workaround removing depencency to order of Hash#each 226 # assign index number from the preset @index 227 indices = @index.to_a 228 indices.sort! { |i0, i1| i0[1] <=> i1[1] } 229 indices.collect! { |i0| i0[0] } 230 @index.clear 231 v = 0 232 indices.each do |k, i| 233 if @graph[k] and !@index[k] then 234 @index[k] = v; v += 1 235 end 236 end 237 @graph.each_key do |k| 238 unless @index[k] then 239 @index[k] = v; v += 1 240 end 241 end 242 # end workaround removing depencency to order of Hash#each 243 end 244 245 if @relations.empty? # only used after clear_relations! 246 @graph.each do |from, hash| 247 hash.each do |to, relation| 248 x = @index[from] 249 y = @index[to] 250 matrix[x][y] = relation 251 end 252 end 253 else 254 @relations.each do |rel| 255 x = @index[rel.from] 256 y = @index[rel.to] 257 matrix[x][y] = rel.relation 258 matrix[y][x] = rel.relation if @undirected 259 end 260 end 261 Matrix[*matrix] 262 end
Reconstruct @relations from the adjacency list @graph.
# File lib/bio/pathway.rb 119 def to_relations 120 @relations.clear 121 @graph.each_key do |from| 122 @graph[from].each do |to, w| 123 @relations << Relation.new(from, to, w) 124 end 125 end 126 return @relations 127 end
Changes the internal state of the graph from ‘directed’ to ‘undirected’ and re-generate adjacency list.
Note: this method can not be used without the list of the Bio::Relation
objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations
first.
# File lib/bio/pathway.rb 106 def undirected 107 if directed? 108 @undirected = true 109 self.to_list 110 end 111 end
Returns true or false respond to the internal state of the graph.
# File lib/bio/pathway.rb 79 def undirected? 80 @undirected ? true : false 81 end