class Bio::Pathway
Bio::Pathway
is a general graph object initially constructed by the list of the ((<Bio::Relation>)) objects. The basic concept of the Bio::Pathway
object is to store a graph as an adjacency list (in the instance variable @graph), and converting the list into an adjacency matrix by calling to_matrix
method on demand. However, in some cases, it is convenient to have the original list of the ((<Bio::Relation>))s, Bio::Pathway
object also stores the list (as the instance variable @relations) redundantly.
Note: you can clear the @relations list by calling clear_relations! method to reduce the memory usage, and the content of the @relations can be re-generated from the @graph by to_relations
method.
Attributes
Read-only accessor for the adjacency list of the graph.
Read-only accessor for the row/column index (@index) of the adjacency matrix. Contents of the hash @index is created by calling to_matrix
method.
Accessor for the hash of the label assigned to the each node. You can label some of the nodes in the graph by passing a hash to the label and select subgraphs which contain labeled nodes only by subgraph method.
hash = { 1 => 'red', 2 => 'green', 5 => 'black' } g.label = hash g.label g.subgraph # => new graph consists of the node 1, 2, 5 only
Read-only accessor for the internal list of the Bio::Relation
objects
Public Class Methods
Source
# File lib/bio/pathway.rb 42 def initialize(relations, undirected = false) 43 @undirected = undirected 44 @relations = relations 45 @graph = {} # adjacency list expression of the graph 46 @index = {} # numbering each node in matrix 47 @label = {} # additional information on each node 48 self.to_list # generate adjacency list 49 end
Initial graph (adjacency list) generation from the list of Relation
.
Generate Bio::Pathway
object from the list of Bio::Relation
objects. If the second argument is true, undirected graph is generated.
r1 = Bio::Relation.new('a', 'b', 1) r2 = Bio::Relation.new('a', 'c', 5) r3 = Bio::Relation.new('b', 'c', 3) list = [ r1, r2, r3 ] g = Bio::Pathway.new(list, 'undirected')
Public Instance Methods
Source
# File lib/bio/pathway.rb 145 def append(rel, add_rel = true) 146 @relations.push(rel) if add_rel 147 if @graph[rel.from].nil? 148 @graph[rel.from] = {} 149 end 150 if @graph[rel.to].nil? 151 @graph[rel.to] = {} 152 end 153 @graph[rel.from][rel.to] = rel.relation 154 @graph[rel.to][rel.from] = rel.relation if @undirected 155 end
Add an Bio::Relation
object ‘rel’ to the @graph and @relations. If the second argument is false, @relations is not modified (only useful when genarating @graph from @relations internally).
Source
# File lib/bio/pathway.rb 593 def bellman_ford(root) 594 distance, predecessor = initialize_single_source(root) 595 (self.nodes - 1).times do 596 @graph.each_key do |u| 597 @graph[u].each do |v, w| 598 # relaxing procedure of root -> 'u' -> 'v' 599 if distance[v] > distance[u] + w 600 distance[v] = distance[u] + w 601 predecessor[v] = u 602 end 603 end 604 end 605 end 606 # negative cyclic loop check 607 @graph.each_key do |u| 608 @graph[u].each do |v, w| 609 if distance[v] > distance[u] + w 610 return false 611 end 612 end 613 end 614 return distance, predecessor 615 end
Bellman-Ford method for solving the single-source shortest-paths problem in the graph in which edge weights can be negative.
Source
# File lib/bio/pathway.rb 451 def bfs_shortest_path(node1, node2) 452 distance, route = breadth_first_search(node1) 453 step = distance[node2] 454 node = node2 455 path = [ node2 ] 456 while node != node1 and route[node] 457 node = route[node] 458 path.unshift(node) 459 end 460 return step, path 461 end
Calculates the shortest path between two nodes by using breadth_first_search
method and returns steps and the path as Array.
Source
# File lib/bio/pathway.rb 420 def breadth_first_search(root) 421 visited = {} 422 distance = {} 423 predecessor = {} 424 425 visited[root] = true 426 distance[root] = 0 427 predecessor[root] = nil 428 429 queue = [ root ] 430 431 while from = queue.shift 432 next unless @graph[from] 433 @graph[from].each_key do |to| 434 unless visited[to] 435 visited[to] = true 436 distance[to] = distance[from] + 1 437 predecessor[to] = from 438 queue.push(to) 439 end 440 end 441 end 442 return distance, predecessor 443 end
Breadth first search solves steps and path to the each node and forms a tree contains all reachable vertices from the root node. This method returns the result in 2 hashes - 1st one shows the steps from root node and 2nd hash shows the structure of the tree.
The weight of the edges are not considered in this method.
Source
# File lib/bio/pathway.rb 115 def clear_relations! 116 @relations.clear 117 end
Clear @relations array to reduce the memory usage.
Source
# File lib/bio/pathway.rb 371 def clique 372 raise NotImplementedError 373 end
Not implemented yet.
Source
# File lib/bio/pathway.rb 389 def cliquishness(node) 390 neighbors = @graph[node].keys 391 sg = subgraph(neighbors) 392 if sg.graph.size != 0 393 edges = sg.edges 394 nodes = neighbors.size 395 complete = (nodes * (nodes - 1)) 396 return edges.quo(complete) 397 else 398 return 0.0 399 end 400 end
Returns completeness of the edge density among the surrounded nodes.
Calculates the value of cliquishness around the ‘node’. This value indicates completeness of the edge density among the surrounded nodes.
Note: cliquishness (clustering coefficient) for a directed graph is also calculated. Reference: en.wikipedia.org/wiki/Clustering_coefficient
Note: Cliquishness (clustering coefficient) for a node that has only one neighbor node is undefined. Currently, it returns NaN, but the behavior may be changed in the future.
Source
# File lib/bio/pathway.rb 365 def common_subgraph(graph) 366 raise NotImplementedError 367 end
Not implemented yet.
Source
# File lib/bio/pathway.rb 159 def delete(rel) 160 @relations.delete_if do |x| 161 x === rel 162 end 163 @graph[rel.from].delete(rel.to) 164 @graph[rel.to].delete(rel.from) if @undirected 165 end
Remove an edge indicated by the Bio::Relation
object ‘rel’ from the @graph and the @relations.
Source
# File lib/bio/pathway.rb 482 def depth_first_search 483 visited = {} 484 timestamp = {} 485 tree_edges = {} 486 back_edges = {} 487 cross_edges = {} 488 forward_edges = {} 489 count = 0 490 491 # begin workaround removing depencency to order of Hash#each 492 if @index.empty? then 493 preference_of_nodes = nil 494 else 495 preference_of_nodes = {}.merge(@index) 496 i = preference_of_nodes.values.max 497 @graph.each_key do |node0| 498 preference_of_nodes[node0] ||= (i += 1) 499 end 500 end 501 # end workaround removing depencency to order of Hash#each 502 503 dfs_visit = Proc.new { |from| 504 visited[from] = true 505 timestamp[from] = [count += 1] 506 ary = @graph[from].keys 507 # begin workaround removing depencency to order of Hash#each 508 if preference_of_nodes then 509 ary = ary.sort_by { |node0| preference_of_nodes[node0] } 510 end 511 # end workaround removing depencency to order of Hash#each 512 ary.each do |to| 513 if visited[to] 514 if timestamp[to].size > 1 515 if timestamp[from].first < timestamp[to].first 516 # forward edge (black) 517 p "#{from} -> #{to} : forward edge" if $DEBUG 518 forward_edges[from] = to 519 else 520 # cross edge (black) 521 p "#{from} -> #{to} : cross edge" if $DEBUG 522 cross_edges[from] = to 523 end 524 else 525 # back edge (gray) 526 p "#{from} -> #{to} : back edge" if $DEBUG 527 back_edges[from] = to 528 end 529 else 530 # tree edge (white) 531 p "#{from} -> #{to} : tree edge" if $DEBUG 532 tree_edges[to] = from 533 dfs_visit.call(to) 534 end 535 end 536 timestamp[from].push(count += 1) 537 } 538 539 ary = @graph.keys 540 # begin workaround removing depencency to order of Hash#each 541 if preference_of_nodes then 542 ary = ary.sort_by { |node0| preference_of_nodes[node0] } 543 end 544 # end workaround removing depencency to order of Hash#each 545 ary.each do |node| 546 unless visited[node] 547 dfs_visit.call(node) 548 end 549 end 550 return timestamp, tree_edges, back_edges, cross_edges, forward_edges 551 end
Depth first search yields much information about the structure of the graph especially on the classification of the edges. This method returns 5 hashes - 1st one shows the timestamps of each node containing the first discoverd time and the search finished time in an array. The 2nd, 3rd, 4th, and 5th hashes contain ‘tree edges’, ‘back edges’, ‘cross edges’, ‘forward edges’ respectively.
If $DEBUG is true (e.g. ruby -d), this method prints the progression of the search.
The weight of the edges are not considered in this method.
Note: The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this bahavior might be changed in the future.
Source
# File lib/bio/pathway.rb 559 def dfs_topological_sort 560 # sorted by finished time reversely and collect node names only 561 timestamp, = self.depth_first_search 562 timestamp.sort {|a,b| b[1][1] <=> a[1][1]}.collect {|x| x.first } 563 end
Topological sort of the directed acyclic graphs (“dags”) by using depth_first_search.
Source
# File lib/bio/pathway.rb 567 def dijkstra(root) 568 distance, predecessor = initialize_single_source(root) 569 @graph[root].each do |k, v| 570 distance[k] = v 571 predecessor[k] = root 572 end 573 queue = distance.dup 574 queue.delete(root) 575 576 while queue.size != 0 577 min = queue.min {|a, b| a[1] <=> b[1]} 578 u = min[0] # extranct a node having minimal distance 579 @graph[u].each do |k, v| 580 # relaxing procedure of root -> 'u' -> 'k' 581 if distance[k] > distance[u] + v 582 distance[k] = distance[u] + v 583 predecessor[k] = u 584 end 585 end 586 queue.delete(u) 587 end 588 return distance, predecessor 589 end
Dijkstra method to solve the shortest path problem in the weighted graph.
Source
# File lib/bio/pathway.rb 93 def directed 94 if undirected? 95 @undirected = false 96 self.to_list 97 end 98 end
Changes the internal state of the graph from ‘undirected’ to ‘directed’ and re-generate adjacency list. The undirected graph can be converted to directed graph, however, the edge between two nodes will be simply doubled to both ends.
Note: this method can not be used without the list of the Bio::Relation
objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations
first.
Source
# File lib/bio/pathway.rb 75 def directed? 76 @undirected ? false : true 77 end
Returns true or false respond to the internal state of the graph.
Source
# File lib/bio/pathway.rb 294 def dump_list 295 # begin workaround removing depencency to order of Hash#each 296 if @index.empty? then 297 pref = nil 298 enum = @graph 299 else 300 pref = {}.merge(@index) 301 i = pref.values.max 302 @graph.each_key do |node| 303 pref[node] ||= (i += 1) 304 end 305 graph_to_a = @graph.to_a 306 graph_to_a.sort! { |x, y| pref[x[0]] <=> pref[y[0]] } 307 enum = graph_to_a 308 end 309 # end workaround removing depencency to order of Hash#each 310 311 list = String.new 312 enum.each do |from, hash| 313 list << "#{from} => " 314 # begin workaround removing depencency to order of Hash#each 315 if pref then 316 ary = hash.to_a 317 ary.sort! { |x,y| pref[x[0]] <=> pref[y[0]] } 318 hash = ary 319 end 320 # end workaround removing depencency to order of Hash#each 321 a = [] 322 hash.each do |to, relation| 323 a.push("#{to} (#{relation})") 324 end 325 list << a.join(", ") + "\n" 326 end 327 list 328 end
Pretty printer of the adjacency list.
Useful when you want to check the internal state of the adjacency list (for debug purpose etc.) easily.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future.
Source
# File lib/bio/pathway.rb 275 def dump_matrix(*arg) 276 matrix = self.to_matrix(*arg) 277 sorted = @index.sort {|a,b| a[1] <=> b[1]} 278 "[# " + sorted.collect{|x| x[0]}.join(", ") + "\n" + 279 matrix.to_a.collect{|row| ' ' + row.inspect}.join(",\n") + "\n]" 280 end
Pretty printer of the adjacency matrix.
The dump_matrix
method accepts the same arguments as to_matrix. Useful when you want to check the internal state of the matrix (for debug purpose etc.) easily.
This method internally calls to_matrix
method. Read documents of to_matrix
for important informations.
Source
# File lib/bio/pathway.rb 173 def edges 174 edges = 0 175 @graph.each_value do |v| 176 edges += v.size 177 end 178 edges 179 end
Returns the number of the edges in the graph.
Source
# File lib/bio/pathway.rb 620 def floyd_warshall 621 inf = 1 / 0.0 622 623 m = self.to_matrix(inf, 0) 624 d = m.dup 625 n = self.nodes 626 for k in 0 .. n - 1 do 627 for i in 0 .. n - 1 do 628 for j in 0 .. n - 1 do 629 if d[i, j] > d[i, k] + d[k, j] 630 d[i, j] = d[i, k] + d[k, j] 631 end 632 end 633 end 634 end 635 return d 636 end
Floyd-Wardshall alogrithm for solving the all-pairs shortest-paths problem on a directed graph G = (V, E).
Source
# File lib/bio/pathway.rb 642 def kruskal 643 # initialize 644 rel = self.to_relations.sort{|a, b| a <=> b} 645 index = [] 646 for i in 0 .. (rel.size - 1) do 647 for j in (i + 1) .. (rel.size - 1) do 648 if rel[i] == rel[j] 649 index << j 650 end 651 end 652 end 653 index.sort{|x, y| y<=>x}.each do |idx| 654 rel[idx, 1] = [] 655 end 656 mst = [] 657 seen = Hash.new() 658 @graph.each_key do |x| 659 seen[x] = nil 660 end 661 i = 1 662 # initialize end 663 664 rel.each do |r| 665 if seen[r.node[0]] == nil 666 seen[r.node[0]] = 0 667 end 668 if seen[r.node[1]] == nil 669 seen[r.node[1]] = 0 670 end 671 if seen[r.node[0]] == seen[r.node[1]] && seen[r.node[0]] == 0 672 mst << r 673 seen[r.node[0]] = i 674 seen[r.node[1]] = i 675 elsif seen[r.node[0]] != seen[r.node[1]] 676 mst << r 677 v1 = seen[r.node[0]].dup 678 v2 = seen[r.node[1]].dup 679 seen.each do |k, v| 680 if v == v1 || v == v2 681 seen[k] = i 682 end 683 end 684 end 685 i += 1 686 end 687 return Pathway.new(mst) 688 end
Kruskal method for finding minimam spaninng trees
Source
# File lib/bio/pathway.rb 168 def nodes 169 @graph.keys.length 170 end
Returns the number of the nodes in the graph.
Source
# File lib/bio/pathway.rb 406 def small_world 407 freq = Hash.new(0) 408 @graph.each_value do |v| 409 freq[v.size] += 1 410 end 411 return freq 412 end
Returns frequency of the nodes having same number of edges as hash
Calculates the frequency of the nodes having the same number of edges and returns the value as Hash.
Source
# File lib/bio/pathway.rb 344 def subgraph(list = nil) 345 if list 346 @label.clear 347 list.each do |node| 348 @label[node] = true 349 end 350 end 351 sub_graph = Pathway.new([], @undirected) 352 @graph.each do |from, hash| 353 next unless @label[from] 354 sub_graph.graph[from] ||= {} 355 hash.each do |to, relation| 356 next unless @label[to] 357 sub_graph.append(Relation.new(from, to, relation)) 358 end 359 end 360 return sub_graph 361 end
Select labeled nodes and generate subgraph
This method select some nodes and returns new Bio::Pathway
object consists of selected nodes only. If the list of the nodes (as Array) is assigned as the argument, use the list to select the nodes from the graph. If no argument is assigned, internal property of the graph @label is used to select the nodes.
hash = { 'a' => 'secret', 'b' => 'important', 'c' => 'important' } g.label = hash g.subgraph list = [ 'a', 'b', 'c' ] g.subgraph(list)
Source
# File lib/bio/pathway.rb 135 def to_list 136 @graph.clear 137 @relations.each do |rel| 138 append(rel, false) # append to @graph without push to @relations 139 end 140 end
Graph (adjacency list) generation from the Relations
Generate the adjcancecy list @graph from @relations (called by initialize and in some other cases when @relations has been changed).
Source
# File lib/bio/pathway.rb 197 def to_matrix(default_value = nil, diagonal_value = nil) 198 199 #-- 200 # Note: following code only fills the outer Array with the reference 201 # to the same inner Array object. 202 # 203 # matrix = Array.new(nodes, Array.new(nodes)) 204 # 205 # so create a new Array object for each row as follows: 206 #++ 207 208 matrix = Array.new 209 nodes.times do 210 matrix.push(Array.new(nodes, default_value)) 211 end 212 213 if diagonal_value 214 nodes.times do |i| 215 matrix[i][i] = diagonal_value 216 end 217 end 218 219 # assign index number 220 if @index.empty? then 221 # assign index number for each node 222 @graph.keys.each_with_index do |k, i| 223 @index[k] = i 224 end 225 else 226 # begin workaround removing depencency to order of Hash#each 227 # assign index number from the preset @index 228 indices = @index.to_a 229 indices.sort! { |i0, i1| i0[1] <=> i1[1] } 230 indices.collect! { |i0| i0[0] } 231 @index.clear 232 v = 0 233 indices.each do |k, i| 234 if @graph[k] and !@index[k] then 235 @index[k] = v; v += 1 236 end 237 end 238 @graph.each_key do |k| 239 unless @index[k] then 240 @index[k] = v; v += 1 241 end 242 end 243 # end workaround removing depencency to order of Hash#each 244 end 245 246 if @relations.empty? # only used after clear_relations! 247 @graph.each do |from, hash| 248 hash.each do |to, relation| 249 x = @index[from] 250 y = @index[to] 251 matrix[x][y] = relation 252 end 253 end 254 else 255 @relations.each do |rel| 256 x = @index[rel.from] 257 y = @index[rel.to] 258 matrix[x][y] = rel.relation 259 matrix[y][x] = rel.relation if @undirected 260 end 261 end 262 Matrix[*matrix] 263 end
Convert adjacency list to adjacency matrix
Returns the adjacency matrix expression of the graph as a Matrix object. If the first argument was assigned, the matrix will be filled with the given value. The second argument indicates the value of the diagonal constituents of the matrix besides the above.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future. Be careful that @index is overwritten by this method.
Source
# File lib/bio/pathway.rb 120 def to_relations 121 @relations.clear 122 @graph.each_key do |from| 123 @graph[from].each do |to, w| 124 @relations << Relation.new(from, to, w) 125 end 126 end 127 return @relations 128 end
Reconstruct @relations from the adjacency list @graph.
Source
# File lib/bio/pathway.rb 107 def undirected 108 if directed? 109 @undirected = true 110 self.to_list 111 end 112 end
Changes the internal state of the graph from ‘directed’ to ‘undirected’ and re-generate adjacency list.
Note: this method can not be used without the list of the Bio::Relation
objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations
first.
Source
# File lib/bio/pathway.rb 80 def undirected? 81 @undirected ? true : false 82 end
Returns true or false respond to the internal state of the graph.