class Bio::KEGG::GENES

Description

KEGG GENES entry parser.

Constants

DELIMITER
TAGSIZE

Public Class Methods

new(entry)

Creates a new Bio::KEGG::GENES object.


Arguments:

  • (required) entry: (String) single entry as a string

Returns

Bio::KEGG::GENES object

Calls superclass method Bio::NCBIDB::new
    # File lib/bio/db/kegg/genes.rb
    def initialize(entry)
      super(entry, TAGSIZE)
    end

Public Instance Methods

aalen()

Returns the length of the amino acid sequence described in the AASEQ lines.


Returns

Integer

    # File lib/bio/db/kegg/genes.rb
    def aalen
      fetch('AASEQ')[/\d+/].to_i
    end

aaseq()

Returns the amino acid sequence described in the AASEQ lines.


Returns

Bio::Sequence::AA object

    # File lib/bio/db/kegg/genes.rb
    def aaseq
      unless @data['AASEQ']
        @data['AASEQ'] = Bio::Sequence::AA.new(fetch('AASEQ').gsub(/\d+/, ''))
      end
      @data['AASEQ']
    end
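
As a rough illustration of aalen and aaseq, the sketch below applies the same regular expressions to a plain String, without BioRuby. The helper names `aalen_of` and `aaseq_of` and the sample field text are hypothetical, and a plain String stands in for Bio::Sequence::AA, so whitespace is removed explicitly here.

```ruby
# Hypothetical helpers mirroring the regular expressions used by aalen and
# aaseq. They operate on a plain String and do not require BioRuby.

# The first run of digits in the field is taken as the declared length.
def aalen_of(field)
  field[/\d+/].to_i
end

# Removing every run of digits leaves the residues. Whitespace is stripped
# as well, because this sketch returns a plain String rather than a
# Bio::Sequence::AA object.
def aaseq_of(field)
  field.gsub(/\d+/, '').gsub(/\s+/, '')
end

# Made-up field content: declared length first, then the residues.
field = "12 MKLVINGKTLKG"
puts aalen_of(field)   # the declared length
puts aaseq_of(field)   # the residues only
```

The same pattern applies to ntlen and ntseq, which read the NTSEQ lines instead.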

chromosome()

Chromosome described in the POSITION line.


Returns

String or nil

    # File lib/bio/db/kegg/genes.rb
    def chromosome
      if position[/:/]
        position.sub(/:.*/, '')
      elsif ! position[/\.\./]
        position
      else
        nil
      end
    end
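
The three branches can be tried on plain strings. This is a minimal sketch, assuming the POSITION value has already had its whitespace removed; `chromosome_of` and the sample values are hypothetical and not part of the library.

```ruby
# Hypothetical stand-alone version of the branching in chromosome.
# `position` is assumed to be a POSITION value with whitespace removed.
def chromosome_of(position)
  if position[/:/]
    position.sub(/:.*/, '')   # keep only the text before the first colon
  elsif !position[/\.\./]
    position                  # no colon and no range: return the whole value
  else
    nil                       # a bare range carries no chromosome name
  end
end

p chromosome_of("X:complement(100..200)")   # prefix before the colon
p chromosome_of("plasmid1")                 # value without a range
p chromosome_of("100..200")                 # bare range
```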

codon_usage(codon = nil)

Codon usage data described in the CODON_USAGE lines. (Deprecated: this field no longer exists.)


Returns

Hash

    # File lib/bio/db/kegg/genes.rb
    def codon_usage(codon = nil)
      unless @data['CODON_USAGE']
        hash = Hash.new
        list = cu_list
        base = %w(t c a g)
        base.each_with_index do |x, i|
          base.each_with_index do |y, j|
            base.each_with_index do |z, k|
              hash["#{x}#{y}#{z}"] = list[i*16 + j*4 + k]
            end
          end
        end
        @data['CODON_USAGE'] = hash
      end
      @data['CODON_USAGE']
    end
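
The index arithmetic in the listing above maps each codon to one slot of a 64-element list. The sketch below reproduces that mapping outside the class; `codon_table` is a hypothetical name, and the list of placeholder values 0 to 63 is an assumption chosen so that each codon shows its own index.

```ruby
# Sketch of the index arithmetic in codon_usage, assuming `list` holds
# 64 values ordered with the bases t, c, a, g at each codon position.
def codon_table(list)
  bases = %w(t c a g)
  table = {}
  bases.each_with_index do |x, i|
    bases.each_with_index do |y, j|
      bases.each_with_index do |z, k|
        # The first base advances in steps of 16, the second in steps of 4.
        table["#{x}#{y}#{z}"] = list[i * 16 + j * 4 + k]
      end
    end
  end
  table
end

# Placeholder values 0..63, so every codon maps to its own list index.
table = codon_table((0...64).to_a)
puts table["ttt"]   # first slot
puts table["atg"]   # a=2, t=0, g=3, so 2*16 + 0*4 + 3
puts table["ggg"]   # last slot
```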

cu_list()

Codon usage data described in the CODON_USAGE lines as an array.


Returns

Array

    # File lib/bio/db/kegg/genes.rb
    def cu_list
      ary = []
      get('CODON_USAGE').sub(/.*/,'').each_line do |line| # cut 1st line
        line.chomp.sub(/^.{11}/, '').scan(/..../) do |cu|
          ary.push(cu.to_i)
        end
      end
      return ary
    end
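
The fixed-width parsing can be illustrated on a plain String. In this sketch the 11-character prefix and the 4-character fields come from the listing above, while `parse_cu` and the sample text are hypothetical; the real CODON_USAGE layout is not shown in this document.

```ruby
# Sketch of the fixed-width parsing in cu_list, on a plain String.
def parse_cu(text)
  values = []
  # sub(/.*/, '') blanks out the first line, since "." stops at the newline.
  text.sub(/.*/, '').each_line do |line|
    # Drop the 11-character prefix, then read the rest in 4-character fields.
    line.chomp.sub(/^.{11}/, '').scan(/..../) do |cu|
      values.push(cu.to_i)   # leading spaces are ignored by to_i
    end
  end
  values
end

# Made-up input: one ignored first line, then an 11-space prefix
# followed by three 4-character fields.
sample = "first line is ignored\n" + (" " * 11) + "  12 345   6\n"
p parse_cu(sample)
```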

definition()

Definition of the entry, described in the DEFINITION line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def definition
      field_fetch('DEFINITION')
    end

diseases()
Alias for: diseases_as_hash

diseases_as_hash()

Returns a Hash mapping each disease ID to its definition.

    # File lib/bio/db/kegg/genes.rb
    def diseases_as_hash; super; end
Also aliased as: diseases

diseases_as_strings()

Diseases described in the DISEASE lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def diseases_as_strings
      lines_fetch('DISEASE')
    end

division()

Division of the entry, described in the ENTRY line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def division
      entry['division']                   # CDS, tRNA etc.
    end

drug_targets_as_strings()

Drug targets described in the DRUG_TARGET lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def drug_targets_as_strings
      lines_fetch('DRUG_TARGET')
    end

entry()

Returns the “ENTRY” line content as a Hash. For example,

{"organism"=>"E.coli", "division"=>"CDS", "id"=>"b0356"}

Returns

Hash

    # File lib/bio/db/kegg/genes.rb
    def entry
      unless @data['ENTRY']
        hash = Hash.new('')
        if get('ENTRY').length > 30
          e = get('ENTRY')
          hash['id']       = e[12..29].strip
          hash['division'] = e[30..39].strip
          hash['organism'] = e[40..80].strip
        end
        @data['ENTRY'] = hash
      end
      @data['ENTRY']
    end
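
The column slicing above can be tried on a plain String. This is a sketch under stated assumptions: the column ranges are taken from the listing, but `parse_entry_line` is a hypothetical helper and the sample line is built with `format` so that each value starts at the expected column, rather than copied from a real KEGG entry.

```ruby
# Sketch of the fixed-column slicing in entry, on a plain String.
def parse_entry_line(line)
  fields = Hash.new('')                 # missing keys read as ''
  if line.length > 30
    fields['id']       = line[12..29].strip
    fields['division'] = line[30..39].strip
    fields['organism'] = line[40..80].strip
  end
  fields
end

# Made-up line: a 12-column tag, an 18-column id, a 10-column division,
# then the organism name.
line = format('%-12s%-18s%-10s%s', 'ENTRY', 'b0356', 'CDS', 'E.coli')
fields = parse_entry_line(line)
puts fields['id']
puts fields['division']
puts fields['organism']
```

Because the Hash is created with a default of `''`, a line of 30 characters or fewer yields empty strings for all three keys instead of nil.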

entry_id()

ID of the entry, described in the ENTRY line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def entry_id
      entry['id']
    end

gbposition()

The position in the genome described in the POSITION line, as a string in GenBank feature table location format.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def gbposition
      position.sub(/.*?:/, '')
    end
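
A small sketch of the substitution on plain strings, assuming the POSITION value has had its whitespace removed. `gbposition_of` and the sample values are hypothetical.

```ruby
# Hypothetical stand-alone version of gbposition: the non-greedy pattern
# removes everything up to and including the first colon.
def gbposition_of(position)
  position.sub(/.*?:/, '')
end

puts gbposition_of("X:complement(100..200)")   # location part only
puts gbposition_of("100..200")                 # no colon, value unchanged
```

The remaining string is what locations passes to Bio::Locations.new.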

gene()

This method will be deprecated. Use entry.names.first instead.

Returns the first gene name described in the NAME line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def gene
      genes.first
    end

genes()

This method will be deprecated. Use Bio::KEGG::GENES#names instead.

Names of the entry as an Array, described in the NAME line.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def genes
      names_as_array
    end

keggclass()

Returns the CLASS field of the entry.

    # File lib/bio/db/kegg/genes.rb
    def keggclass
      field_fetch('CLASS')
    end

keggclasses()

Returns an Array of biological classes in the CLASS field.

    # File lib/bio/db/kegg/genes.rb
    def keggclasses
      keggclass.gsub(/ \[[^\]]+/, '').split(/\] ?/)
    end
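
The two regular expressions can be seen at work on a plain String. In this sketch `split_classes` is a hypothetical helper, and the sample CLASS text, including the bracketed identifiers, is made up to show the shape the expressions expect.

```ruby
# Sketch of the string handling in keggclasses.
def split_classes(class_field)
  # Step 1: drop each " [..." bracketed part, leaving only its closing "]".
  # Step 2: split on those closing brackets and an optional following space.
  class_field.gsub(/ \[[^\]]+/, '').split(/\] ?/)
end

# Made-up CLASS text with two classes, each followed by a bracketed ID.
sample = "Metabolism; Energy [PATH:xxx00001] Translation; Ribosome [PATH:xxx00002]"
p split_classes(sample)
```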

locations()

The position in the genome described in the POSITION line, as a Bio::Locations object.


Returns

Bio::Locations object

    # File lib/bio/db/kegg/genes.rb
    def locations
      Bio::Locations.new(gbposition)
    end

motif()

The specification of this method will change in the future. Please use Bio::KEGG::GENES#motifs instead.

Motif information described in the MOTIF lines.


Returns

Hash

    # File lib/bio/db/kegg/genes.rb
    def motif
      motifs
    end

motifs()
Alias for: motifs_as_hash

motifs_as_hash()

Motif information described in the MOTIF lines.


Returns

Hash

    # File lib/bio/db/kegg/genes.rb
    def motifs_as_hash
      unless @data['MOTIF']
        hash = {}
        db = nil
        motifs_as_strings.each do |line|
          if line[/^\S+:/]
            db, str = line.split(/:/, 2)
          else
            str = line
          end
          hash[db] ||= []
          hash[db] += str.strip.split(/\s+/)
        end
        @data['MOTIF'] = hash
      end
      @data['MOTIF']              # Hash of Array of IDs in MOTIF
    end
Also aliased as: motifs
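
The grouping logic of motifs_as_hash can be sketched on a plain Array of Strings. `motifs_hash`, the database names, and the IDs below are hypothetical; the input shape (a `name:` prefix starting a group, unprefixed lines continuing it) is an assumption read off the listing above.

```ruby
# Sketch of the grouping in motifs_as_hash, on a plain Array of Strings.
def motifs_hash(lines)
  hash = {}
  db = nil
  lines.each do |line|
    if line[/^\S+:/]
      # A leading "name:" starts a new database group.
      db, str = line.split(/:/, 2)
    else
      # Lines without a prefix continue the current group.
      str = line
    end
    hash[db] ||= []
    hash[db] += str.strip.split(/\s+/)
  end
  hash
end

# Made-up input: two databases, the first continued on a second line.
lines = ["DB1: M00001 M00002", "     M00003", "DB2: S00001"]
result = motifs_hash(lines)
p result["DB1"]
p result["DB2"]
```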

motifs_as_strings()

Motif information described in the MOTIF lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def motifs_as_strings
      lines_fetch('MOTIF')
    end

nalen()
Alias for: ntlen

name()

Returns the NAME line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def name
      field_fetch('NAME')
    end

names()
Alias for: names_as_array

names_as_array()

Names of the entry as an Array, described in the NAME line.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def names_as_array
      name.split(', ')
    end
Also aliased as: names

naseq()
Alias for: ntseq

networks_as_strings()

Networks described in the NETWORK lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def networks_as_strings
      lines_fetch('NETWORK')
    end

ntlen()

Returns the length of the nucleic acid sequence described in the NTSEQ lines.


Returns

Integer

    # File lib/bio/db/kegg/genes.rb
    def ntlen
      fetch('NTSEQ')[/\d+/].to_i
    end
Also aliased as: nalen

ntseq()

Returns the nucleic acid sequence described in the NTSEQ lines.


Returns

Bio::Sequence::NA object

    # File lib/bio/db/kegg/genes.rb
    def ntseq
      unless @data['NTSEQ']
        @data['NTSEQ'] = Bio::Sequence::NA.new(fetch('NTSEQ').gsub(/\d+/, ''))
      end
      @data['NTSEQ']
    end
Also aliased as: naseq

organism()

Organism name of the entry, described in the ENTRY line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def organism
      entry['organism']                   # H.sapiens etc.
    end

orthologs()
Alias for: orthologs_as_hash

orthologs_as_hash()

Returns a Hash mapping each orthology ID to its definition in the ORTHOLOGY field.

    # File lib/bio/db/kegg/genes.rb
    def orthologs_as_hash; super; end
Also aliased as: orthologs

orthologs_as_strings()

Orthologs described in the ORTHOLOGY lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def orthologs_as_strings
      lines_fetch('ORTHOLOGY')
    end

pathway()

Returns the PATHWAY lines as a String.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def pathway
      unless defined? @pathway
        @pathway = fetch('PATHWAY')
      end
      @pathway
    end

pathways()
Alias for: pathways_as_hash

pathways_as_hash()

Returns a Hash mapping each pathway ID to its name in the PATHWAY field.

    # File lib/bio/db/kegg/genes.rb
    def pathways_as_hash; super; end
Also aliased as: pathways

pathways_as_strings()

Pathways described in the PATHWAY lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def pathways_as_strings
      lines_fetch('PATHWAY')
    end

position()

The position in the genome described in the POSITION line.


Returns

String

    # File lib/bio/db/kegg/genes.rb
    def position
      unless @data['POSITION']
        @data['POSITION'] = fetch('POSITION').gsub(/\s/, '')
      end
      @data['POSITION']
    end

structure()

Returns structure ID information described in the STRUCTURE lines.


Returns

Array containing String

    # File lib/bio/db/kegg/genes.rb
    def structure
      unless @data['STRUCTURE']
        @data['STRUCTURE'] = fetch('STRUCTURE').sub(/(PDB: )*/,'').split(/\s+/)
      end
      @data['STRUCTURE'] # ['PDB:1A9X', ...]
    end
Also aliased as: structures
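
The effect of the `sub` and `split` calls in structure depends on how the field text is written. The sketch below runs the same expression on two made-up strings; `structure_ids` is a hypothetical helper, and which of the two forms a real entry uses is not stated in this document.

```ruby
# Sketch of the string handling in structure, on a plain String.
def structure_ids(field)
  # A leading "PDB: " (with the trailing space) is removed if present,
  # then the remainder is split on whitespace.
  field.sub(/(PDB: )*/, '').split(/\s+/)
end

# Prefix followed by a space: the prefix is removed once.
p structure_ids("PDB: 1A9X 1BXR")

# Prefix attached to each ID: the pattern matches nothing, so the
# IDs keep their prefix.
p structure_ids("PDB:1A9X PDB:1BXR")
```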
structures()
Alias for: structure