Class | Bio::Pathway |
In: |
lib/bio/pathway.rb
|
Parent: | Object |
Bio::Pathway is a general graph object initially constructed by the list of the ((<Bio::Relation>)) objects. The basic concept of the Bio::Pathway object is to store a graph as an adjacency list (in the instance variable @graph), and converting the list into an adjacency matrix by calling to_matrix method on demand. However, in some cases, it is convenient to have the original list of the ((<Bio::Relation>))s, Bio::Pathway object also stores the list (as the instance variable @relations) redundantly.
Note: you can clear the @relations list by calling clear_relations! method to reduce the memory usage, and the content of the @relations can be re-generated from the @graph by to_relations method.
graph | [R] | Read-only accessor for the adjacency list of the graph. |
index | [R] | Read-only accessor for the row/column index (@index) of the adjacency matrix. Contents of the hash @index is created by calling to_matrix method. |
label | [RW] |
Accessor for the hash of the label assigned to the each node. You can label
some of the nodes in the graph by
passing a hash to the label and select subgraphs which contain labeled nodes only by subgraph method.
hash = { 1 => 'red', 2 => 'green', 5 => 'black' } g.label = hash g.label g.subgraph # => new graph consists of the node 1, 2, 5 only |
relations | [R] | Read-only accessor for the internal list of the Bio::Relation objects |
Initial graph (adjacency list) generation from the list of Relation.
Generate Bio::Pathway object from the list of Bio::Relation objects. If the second argument is true, undirected graph is generated.
r1 = Bio::Relation.new('a', 'b', 1) r2 = Bio::Relation.new('a', 'c', 5) r3 = Bio::Relation.new('b', 'c', 3) list = [ r1, r2, r3 ] g = Bio::Pathway.new(list, 'undirected')
# File lib/bio/pathway.rb, line 41 41: def initialize(relations, undirected = false) 42: @undirected = undirected 43: @relations = relations 44: @graph = {} # adjacency list expression of the graph 45: @index = {} # numbering each node in matrix 46: @label = {} # additional information on each node 47: self.to_list # generate adjacency list 48: end
Add an Bio::Relation object ‘rel’ to the @graph and @relations. If the second argument is false, @relations is not modified (only useful when genarating @graph from @relations internally).
# File lib/bio/pathway.rb, line 144 144: def append(rel, add_rel = true) 145: @relations.push(rel) if add_rel 146: if @graph[rel.from].nil? 147: @graph[rel.from] = {} 148: end 149: if @graph[rel.to].nil? 150: @graph[rel.to] = {} 151: end 152: @graph[rel.from][rel.to] = rel.relation 153: @graph[rel.to][rel.from] = rel.relation if @undirected 154: end
Bellman-Ford method for solving the single-source shortest-paths problem in the graph in which edge weights can be negative.
# File lib/bio/pathway.rb, line 592 592: def bellman_ford(root) 593: distance, predecessor = initialize_single_source(root) 594: for i in 1 ..(self.nodes - 1) do 595: @graph.each_key do |u| 596: @graph[u].each do |v, w| 597: # relaxing procedure of root -> 'u' -> 'v' 598: if distance[v] > distance[u] + w 599: distance[v] = distance[u] + w 600: predecessor[v] = u 601: end 602: end
Calculates the shortest path between two nodes by using breadth_first_search method and returns steps and the path as Array.
# File lib/bio/pathway.rb, line 450 450: def bfs_shortest_path(node1, node2) 451: distance, route = breadth_first_search(node1) 452: step = distance[node2] 453: node = node2 454: path = [ node2 ] 455: while node != node1 and route[node] 456: node = route[node] 457: path.unshift(node) 458: end 459: return step, path 460: end
Breadth first search solves steps and path to the each node and forms a tree contains all reachable vertices from the root node. This method returns the result in 2 hashes - 1st one shows the steps from root node and 2nd hash shows the structure of the tree.
The weight of the edges are not considered in this method.
# File lib/bio/pathway.rb, line 419 419: def breadth_first_search(root) 420: visited = {} 421: distance = {} 422: predecessor = {} 423: 424: visited[root] = true 425: distance[root] = 0 426: predecessor[root] = nil 427: 428: queue = [ root ] 429: 430: while from = queue.shift 431: next unless @graph[from] 432: @graph[from].each_key do |to| 433: unless visited[to] 434: visited[to] = true 435: distance[to] = distance[from] + 1 436: predecessor[to] = from 437: queue.push(to) 438: end 439: end 440: end 441: return distance, predecessor 442: end
Clear @relations array to reduce the memory usage.
# File lib/bio/pathway.rb, line 114 114: def clear_relations! 115: @relations.clear 116: end
Not implemented yet.
# File lib/bio/pathway.rb, line 370 370: def clique 371: raise NotImplementedError 372: end
Returns completeness of the edge density among the surrounded nodes.
Calculates the value of cliquishness around the ‘node’. This value indicates completeness of the edge density among the surrounded nodes.
Note: cliquishness (clustering coefficient) for a directed graph is also calculated. Reference: en.wikipedia.org/wiki/Clustering_coefficient
Note: Cliquishness (clustering coefficient) for a node that has only one neighbor node is undefined. Currently, it returns NaN, but the behavior may be changed in the future.
# File lib/bio/pathway.rb, line 388 388: def cliquishness(node) 389: neighbors = @graph[node].keys 390: sg = subgraph(neighbors) 391: if sg.graph.size != 0 392: edges = sg.edges 393: nodes = neighbors.size 394: complete = (nodes * (nodes - 1)) 395: return edges.quo(complete) 396: else 397: return 0.0 398: end 399: end
Not implemented yet.
# File lib/bio/pathway.rb, line 364 364: def common_subgraph(graph) 365: raise NotImplementedError 366: end
Remove an edge indicated by the Bio::Relation object ‘rel’ from the @graph and the @relations.
# File lib/bio/pathway.rb, line 158 158: def delete(rel) 159: @relations.delete_if do |x| 160: x === rel 161: end 162: @graph[rel.from].delete(rel.to) 163: @graph[rel.to].delete(rel.from) if @undirected 164: end
Depth first search yields much information about the structure of the graph especially on the classification of the edges. This method returns 5 hashes - 1st one shows the timestamps of each node containing the first discoverd time and the search finished time in an array. The 2nd, 3rd, 4th, and 5th hashes contain ‘tree edges’, ‘back edges’, ‘cross edges’, ‘forward edges’ respectively.
If $DEBUG is true (e.g. ruby -d), this method prints the progression of the search.
The weight of the edges are not considered in this method.
Note: The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this bahavior might be changed in the future.
# File lib/bio/pathway.rb, line 481 481: def depth_first_search 482: visited = {} 483: timestamp = {} 484: tree_edges = {} 485: back_edges = {} 486: cross_edges = {} 487: forward_edges = {} 488: count = 0 489: 490: # begin workaround removing depencency to order of Hash#each 491: if @index.empty? then 492: preference_of_nodes = nil 493: else 494: preference_of_nodes = {}.merge(@index) 495: i = preference_of_nodes.values.max 496: @graph.each_key do |node0| 497: preference_of_nodes[node0] ||= (i += 1) 498: end 499: end 500: # end workaround removing depencency to order of Hash#each 501: 502: dfs_visit = Proc.new { |from| 503: visited[from] = true 504: timestamp[from] = [count += 1] 505: ary = @graph[from].keys 506: # begin workaround removing depencency to order of Hash#each 507: if preference_of_nodes then 508: ary = ary.sort_by { |node0| preference_of_nodes[node0] } 509: end 510: # end workaround removing depencency to order of Hash#each 511: ary.each do |to| 512: if visited[to] 513: if timestamp[to].size > 1 514: if timestamp[from].first < timestamp[to].first 515: # forward edge (black) 516: p "#{from} -> #{to} : forward edge" if $DEBUG 517: forward_edges[from] = to 518: else 519: # cross edge (black) 520: p "#{from} -> #{to} : cross edge" if $DEBUG 521: cross_edges[from] = to 522: end 523: else 524: # back edge (gray) 525: p "#{from} -> #{to} : back edge" if $DEBUG 526: back_edges[from] = to 527: end 528: else 529: # tree edge (white) 530: p "#{from} -> #{to} : tree edge" if $DEBUG 531: tree_edges[to] = from 532: dfs_visit.call(to) 533: end 534: end 535: timestamp[from].push(count += 1) 536: } 537: 538: ary = @graph.keys 539: # begin workaround removing depencency to order of Hash#each 540: if preference_of_nodes then 541: ary = ary.sort_by { |node0| preference_of_nodes[node0] } 542: end 543: # end workaround removing depencency to order of Hash#each 544: ary.each do |node| 545: unless visited[node] 546: dfs_visit.call(node) 547: end 548: end 549: return timestamp, tree_edges, back_edges, cross_edges, forward_edges 550: end
Topological sort of the directed acyclic graphs ("dags") by using depth_first_search.
# File lib/bio/pathway.rb, line 558 558: def dfs_topological_sort 559: # sorted by finished time reversely and collect node names only 560: timestamp, = self.depth_first_search 561: timestamp.sort {|a,b| b[1][1] <=> a[1][1]}.collect {|x| x.first } 562: end
Dijkstra method to solve the shortest path problem in the weighted graph.
# File lib/bio/pathway.rb, line 566 566: def dijkstra(root) 567: distance, predecessor = initialize_single_source(root) 568: @graph[root].each do |k, v| 569: distance[k] = v 570: predecessor[k] = root 571: end 572: queue = distance.dup 573: queue.delete(root) 574: 575: while queue.size != 0 576: min = queue.min {|a, b| a[1] <=> b[1]} 577: u = min[0] # extranct a node having minimal distance 578: @graph[u].each do |k, v| 579: # relaxing procedure of root -> 'u' -> 'k' 580: if distance[k] > distance[u] + v 581: distance[k] = distance[u] + v 582: predecessor[k] = u 583: end 584: end 585: queue.delete(u) 586: end 587: return distance, predecessor 588: end
Changes the internal state of the graph from ‘undirected’ to ‘directed’ and re-generate adjacency list. The undirected graph can be converted to directed graph, however, the edge between two nodes will be simply doubled to both ends.
Note: this method can not be used without the list of the Bio::Relation objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations first.
# File lib/bio/pathway.rb, line 92 92: def directed 93: if undirected? 94: @undirected = false 95: self.to_list 96: end 97: end
Returns true or false respond to the internal state of the graph.
# File lib/bio/pathway.rb, line 74 74: def directed? 75: @undirected ? false : true 76: end
Pretty printer of the adjacency list.
Useful when you want to check the internal state of the adjacency list (for debug purpose etc.) easily.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future.
# File lib/bio/pathway.rb, line 293 293: def dump_list 294: # begin workaround removing depencency to order of Hash#each 295: if @index.empty? then 296: pref = nil 297: enum = @graph 298: else 299: pref = {}.merge(@index) 300: i = pref.values.max 301: @graph.each_key do |node| 302: pref[node] ||= (i += 1) 303: end 304: graph_to_a = @graph.to_a 305: graph_to_a.sort! { |x, y| pref[x[0]] <=> pref[y[0]] } 306: enum = graph_to_a 307: end 308: # end workaround removing depencency to order of Hash#each 309: 310: list = "" 311: enum.each do |from, hash| 312: list << "#{from} => " 313: # begin workaround removing depencency to order of Hash#each 314: if pref then 315: ary = hash.to_a 316: ary.sort! { |x,y| pref[x[0]] <=> pref[y[0]] } 317: hash = ary 318: end 319: # end workaround removing depencency to order of Hash#each 320: a = [] 321: hash.each do |to, relation| 322: a.push("#{to} (#{relation})") 323: end 324: list << a.join(", ") + "\n" 325: end 326: list 327: end
Pretty printer of the adjacency matrix.
The dump_matrix method accepts the same arguments as to_matrix. Useful when you want to check the internal state of the matrix (for debug purpose etc.) easily.
This method internally calls to_matrix method. Read documents of to_matrix for important informations.
# File lib/bio/pathway.rb, line 274 274: def dump_matrix(*arg) 275: matrix = self.to_matrix(*arg) 276: sorted = @index.sort {|a,b| a[1] <=> b[1]} 277: "[# " + sorted.collect{|x| x[0]}.join(", ") + "\n" + 278: matrix.to_a.collect{|row| ' ' + row.inspect}.join(",\n") + "\n]" 279: end
Returns frequency of the nodes having same number of edges as hash
Calculates the frequency of the nodes having the same number of edges and returns the value as Hash.
# File lib/bio/pathway.rb, line 405 405: def small_world 406: freq = Hash.new(0) 407: @graph.each_value do |v| 408: freq[v.size] += 1 409: end 410: return freq 411: end
Select labeled nodes and generate subgraph
This method select some nodes and returns new Bio::Pathway object consists of selected nodes only. If the list of the nodes (as Array) is assigned as the argument, use the list to select the nodes from the graph. If no argument is assigned, internal property of the graph @label is used to select the nodes.
hash = { 'a' => 'secret', 'b' => 'important', 'c' => 'important' } g.label = hash g.subgraph list = [ 'a', 'b', 'c' ] g.subgraph(list)
# File lib/bio/pathway.rb, line 343 343: def subgraph(list = nil) 344: if list 345: @label.clear 346: list.each do |node| 347: @label[node] = true 348: end 349: end 350: sub_graph = Pathway.new([], @undirected) 351: @graph.each do |from, hash| 352: next unless @label[from] 353: sub_graph.graph[from] ||= {} 354: hash.each do |to, relation| 355: next unless @label[to] 356: sub_graph.append(Relation.new(from, to, relation)) 357: end 358: end 359: return sub_graph 360: end
Graph (adjacency list) generation from the Relations
Generate the adjcancecy list @graph from @relations (called by initialize and in some other cases when @relations has been changed).
# File lib/bio/pathway.rb, line 134 134: def to_list 135: @graph.clear 136: @relations.each do |rel| 137: append(rel, false) # append to @graph without push to @relations 138: end 139: end
Convert adjacency list to adjacency matrix
Returns the adjacency matrix expression of the graph as a Matrix object. If the first argument was assigned, the matrix will be filled with the given value. The second argument indicates the value of the diagonal constituents of the matrix besides the above.
The result of this method depends on the order of Hash#each (and each_key, etc.), which may be variable with Ruby version and Ruby interpreter variations (JRuby, etc.). For a workaround to remove such dependency, you can use @index to set order of Hash keys. Note that this behavior might be changed in the future. Be careful that @index is overwritten by this method.
# File lib/bio/pathway.rb, line 196 196: def to_matrix(default_value = nil, diagonal_value = nil) 197: 198: #-- 199: # Note: following code only fills the outer Array with the reference 200: # to the same inner Array object. 201: # 202: # matrix = Array.new(nodes, Array.new(nodes)) 203: # 204: # so create a new Array object for each row as follows: 205: #++ 206: 207: matrix = Array.new 208: nodes.times do 209: matrix.push(Array.new(nodes, default_value)) 210: end 211: 212: if diagonal_value 213: nodes.times do |i| 214: matrix[i][i] = diagonal_value 215: end 216: end 217: 218: # assign index number 219: if @index.empty? then 220: # assign index number for each node 221: @graph.keys.each_with_index do |k, i| 222: @index[k] = i 223: end 224: else 225: # begin workaround removing depencency to order of Hash#each 226: # assign index number from the preset @index 227: indices = @index.to_a 228: indices.sort! { |i0, i1| i0[1] <=> i1[1] } 229: indices.collect! { |i0| i0[0] } 230: @index.clear 231: v = 0 232: indices.each do |k, i| 233: if @graph[k] and !@index[k] then 234: @index[k] = v; v += 1 235: end 236: end 237: @graph.each_key do |k| 238: unless @index[k] then 239: @index[k] = v; v += 1 240: end 241: end 242: # end workaround removing depencency to order of Hash#each 243: end 244: 245: if @relations.empty? # only used after clear_relations! 246: @graph.each do |from, hash| 247: hash.each do |to, relation| 248: x = @index[from] 249: y = @index[to] 250: matrix[x][y] = relation 251: end 252: end 253: else 254: @relations.each do |rel| 255: x = @index[rel.from] 256: y = @index[rel.to] 257: matrix[x][y] = rel.relation 258: matrix[y][x] = rel.relation if @undirected 259: end 260: end 261: Matrix[*matrix] 262: end
Reconstruct @relations from the adjacency list @graph.
# File lib/bio/pathway.rb, line 119 119: def to_relations 120: @relations.clear 121: @graph.each_key do |from| 122: @graph[from].each do |to, w| 123: @relations << Relation.new(from, to, w) 124: end 125: end 126: return @relations 127: end
Changes the internal state of the graph from ‘directed’ to ‘undirected’ and re-generate adjacency list.
Note: this method can not be used without the list of the Bio::Relation objects (internally stored in @relations variable). Thus if you already called clear_relations! method, call to_relations first.
# File lib/bio/pathway.rb, line 106 106: def undirected 107: if directed? 108: @undirected = true 109: self.to_list 110: end 111: end