class Bio::KEGG::Keggtab
Description¶ ↑
Parse ‘keggtab’ KEGG
database definition file which also includes Taxonomic category of the KEGG
organisms.
References
¶ ↑
The ‘keggtab’ file is included in
Format¶ ↑
File format is something like
# KEGGTAB # # name type directory abbreviation # enzyme enzyme $BIOROOT/db/ideas/ligand ec ec alias enzyme (snip) # Human h.sapiens genes $BIOROOT/db/kegg/genes hsa H.sapiens alias h.sapiens hsa alias h.sapiens (snip) # # Taxonomy # (snip) animals alias hsa+mmu+rno+dre+dme+cel eukaryotes alias animals+plants+protists+fungi genes alias eubacteria+archaea+eukaryotes
Attributes
Returns a string of the BIOROOT path prefix.
Public Class Methods
Source
# File lib/bio/db/kegg/keggtab.rb 54 def initialize(file_path, bioroot = nil) 55 @bioroot = ENV['BIOROOT'] || bioroot 56 @db_names = Hash.new 57 @database = Hash.new 58 @taxonomy = Hash.new 59 File.open(file_path) do |f| 60 parse_keggtab(f.read) 61 end 62 end
Path for keggtab file and optionally set bioroot top directory. Environmental variable BIOROOT overrides bioroot.
Public Instance Methods
Source
# File lib/bio/db/kegg/keggtab.rb 141 def alias_list(db_name) 142 if @db_names[db_name] 143 @db_names[db_name].aliases 144 end 145 end
deprecated
Source
# File lib/bio/db/kegg/keggtab.rb 112 def aliases(db_abbrev) 113 if @database[db_abbrev] 114 @database[db_abbrev].aliases 115 end 116 end
Returns an Array containing all alias names for the database. (e.g. ‘hsa’ -> [“H.sapiens”, “hsa”], ‘hpj’ -> [“H.pylori_J99”, “hpj”])
Source
# File lib/bio/db/kegg/keggtab.rb 196 def child_nodes(node = 'genes') 197 return @taxonomy[node] 198 end
Source
# File lib/bio/db/kegg/keggtab.rb 102 def database(db_abbrev = nil) 103 if db_abbrev 104 @database[db_abbrev] 105 else 106 @database 107 end 108 end
Returns a hash containing DB
definition section of the keggtab file. If database name is given as an argument, returns a Keggtab::DB
object.
Source
# File lib/bio/db/kegg/keggtab.rb 157 def db_by_abbrev(db_abbrev) 158 @db_names.each do |k, db| 159 return db if db.abbrev == db_abbrev 160 end 161 return nil 162 end
deprecated
Source
# File lib/bio/db/kegg/keggtab.rb 148 def db_path(db_name) 149 if @bioroot 150 "#{@db_names[db_name].path.sub(/\$BIOROOT/,@bioroot)}/#{db_name}" 151 else 152 "#{@db_names[db_name].path}/#{db_name}" 153 end 154 end
deprecated
Source
# File lib/bio/db/kegg/keggtab.rb 170 def db_path_by_abbrev(db_abbrev) 171 db_name = name_by_abbrev(db_abbrev) 172 db_path(db_name) 173 end
deprecated
Source
# File lib/bio/db/kegg/keggtab.rb 225 def korg2taxo(keggorg) 226 tmp = Array.new 227 traverse = Proc.new {|keggorg| 228 @taxonomy.each do |k,v| 229 if v.include?(keggorg) 230 tmp.push(k) 231 traverse.call(k) 232 break 233 end 234 end 235 } 236 traverse.call(keggorg) 237 return tmp 238 end
Returns an array of taxonomy names the organism belongs. (e.g. ‘eco’ -> [‘proteogamma’,‘proteobacteria’,‘eubacteria’,‘genes’]) This method has aliases as keggorg2taxo, korg2taxonomy, keggorg2taxonomy.
Source
# File lib/bio/db/kegg/keggtab.rb 120 def name(db_abbrev) 121 if @database[db_abbrev] 122 @database[db_abbrev].name 123 end 124 end
Returns a canonical database name for the abbreviation. (e.g. ‘ec’ -> ‘enzyme’, ‘hsa’ -> ‘h.sapies’, …)
Source
# File lib/bio/db/kegg/keggtab.rb 165 def name_by_abbrev(db_abbrev) 166 db_by_abbrev(db_abbrev).name 167 end
deprecated
Source
# File lib/bio/db/kegg/keggtab.rb 128 def path(db_abbrev) 129 if @database[db_abbrev] 130 file = @database[db_abbrev].name 131 if @bioroot 132 "#{@database[db_abbrev].path.sub(/\$BIOROOT/,@bioroot)}/#{file}" 133 else 134 "#{@database[db_abbrev].path}/#{file}" 135 end 136 end 137 end
Returns an absolute path for the flat file database. (e.g. ‘/bio/db/kegg/genes’, …)
Source
# File lib/bio/db/kegg/keggtab.rb 192 def taxa_list 193 @taxonomy.keys.sort 194 end
List of all node labels from Taxonomy section. (e.g. [“actinobacteria”, “animals”, “archaea”, “bacillales”, …)
Source
# File lib/bio/db/kegg/keggtab.rb 203 def taxo2korgs(node = 'genes') 204 if node.length == 3 205 return node 206 else 207 if @taxonomy[node] 208 tmp = Array.new 209 @taxonomy[node].each do |x| 210 tmp.push(taxo2korgs(x)) 211 end 212 return tmp 213 else 214 return nil 215 end 216 end 217 end
Returns an array of organism names included in the specified taxon label. (e.g. ‘proteobeta’ -> [“nme”, “nma”, “rso”]) This method has taxo2keggorgs, taxon2korgs, and taxon2keggorgs aliases.
Source
# File lib/bio/db/kegg/keggtab.rb 182 def taxonomy(node = nil) 183 if node 184 @taxonomy[node] 185 else 186 @taxonomy 187 end 188 end
Returns a hash containing Taxonomy section of the keggtab file. If argument is given, returns a List of all child nodes belongs to the label node. (e.g. “eukaryotes” -> [“animals”, “plants”, “protists”, “fungi”], …)