class Bio::KEGG::GENES

Description

KEGG GENES entry parser.

Constants

DELIMITER
TAGSIZE

Public Class Methods

new(entry)

Creates a new Bio::KEGG::GENES object.


Arguments:

  • (required) entry: (String) single entry as a string

Returns

Bio::KEGG::GENES object

Calls superclass method Bio::NCBIDB.new
# File lib/bio/db/kegg/genes.rb, line 115
def initialize(entry)
  super(entry, TAGSIZE)
end

Public Instance Methods

aalen()

Returns the length of the amino acid sequence described in the AASEQ lines.


Returns

Integer

# File lib/bio/db/kegg/genes.rb, line 393
def aalen
  fetch('AASEQ')[/\d+/].to_i
end
aaseq()

Returns the amino acid sequence described in the AASEQ lines.


Returns

Bio::Sequence::AA object

# File lib/bio/db/kegg/genes.rb, line 383
def aaseq
  unless @data['AASEQ']
    @data['AASEQ'] = Bio::Sequence::AA.new(fetch('AASEQ').gsub(/\d+/, ''))
  end
  @data['AASEQ']
end
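
The two AASEQ accessors read the same field in different ways: aalen takes the first run of digits, while aaseq deletes every digit. The following is a minimal sketch of that split in plain Ruby, not the library code itself. It assumes the fetched field holds the declared length followed by the residues; the sample string and the helper names declared_length and residues are hypothetical, and a plain String stands in for Bio::Sequence::AA.

```ruby
# Hypothetical stand-in for the fetched AASEQ field:
# the declared length, then the residues in whitespace-separated chunks.
SAMPLE_AASEQ = '12 MKTAYIAKQR QI'

# Same expression as aalen: the first run of digits, as an Integer.
def declared_length(field)
  field[/\d+/].to_i
end

# Same digit removal as aaseq. Whitespace is removed by hand here
# because this sketch returns a plain String.
def residues(field)
  field.gsub(/\d+/, '').gsub(/\s+/, '')
end

puts declared_length(SAMPLE_AASEQ)
puts residues(SAMPLE_AASEQ)
```

Comparing the two results is a cheap consistency check on an entry: the declared length should equal the number of residues.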
chromosome()

Chromosome described in the POSITION line.


Returns

String or nil

# File lib/bio/db/kegg/genes.rb, line 264
def chromosome
  if position[/:/]
    position.sub(/:.*/, '')
  elsif ! position[/\.\./]
    position
  else
    nil
  end
end
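
The three branches of chromosome can be tried in isolation. This sketch copies the branch logic into a standalone helper, chromosome_of, which is a hypothetical name and not part of the library. The POSITION strings are made-up examples with whitespace already removed.

```ruby
# Mirrors the three branches of chromosome for a whitespace-free
# POSITION string.
def chromosome_of(position)
  if position[/:/]
    position.sub(/:.*/, '')   # text before the first colon
  elsif !position[/\.\./]
    position                  # no range at all: the whole string is the name
  else
    nil                       # bare location, no chromosome given
  end
end

p chromosome_of('X:complement(100..200)')  # => "X"
p chromosome_of('MT')                      # => "MT"
p chromosome_of('100..200')                # => nil
```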
codon_usage(codon = nil)

Codon usage data described in the CODON_USAGE lines. (Deprecated: this field no longer exists.)


Returns

Hash

# File lib/bio/db/kegg/genes.rb, line 350
def codon_usage(codon = nil)
  unless @data['CODON_USAGE']
    hash = Hash.new
    list = cu_list
    base = %w(t c a g)
    base.each_with_index do |x, i|
      base.each_with_index do |y, j|
        base.each_with_index do |z, k|
          hash["#{x}#{y}#{z}"] = list[i*16 + j*4 + k]
        end
      end
    end
    @data['CODON_USAGE'] = hash
  end
  @data['CODON_USAGE']
end
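
The index arithmetic in the nested loops maps each codon to one slot of a flat 64-element list, with the bases ordered t, c, a, g at every position. The sketch below reproduces only that mapping. The helper name codon_table is hypothetical, and the input is a made-up list in which each value equals its own index, so the output shows which slot each codon reads. Note that the codon argument in the listing above is accepted but not used.

```ruby
# Builds the codon => count table from a flat list of 64 values,
# ordered with t, c, a, g at each of the three codon positions.
def codon_table(list)
  bases = %w[t c a g]
  table = {}
  bases.each_with_index do |x, i|
    bases.each_with_index do |y, j|
      bases.each_with_index do |z, k|
        table["#{x}#{y}#{z}"] = list[i * 16 + j * 4 + k]
      end
    end
  end
  table
end

# Hypothetical input: every value is its own index.
table = codon_table((0...64).to_a)
p table['ttt']  # => 0
p table['atg']  # => 35
p table['ggg']  # => 63
```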
cu_list()

Codon usage data described in the CODON_USAGE lines as an array.


Returns

Array

# File lib/bio/db/kegg/genes.rb, line 370
def cu_list
  ary = []
  get('CODON_USAGE').sub(/.*/,'').each_line do |line| # cut 1st line
    line.chomp.sub(/^.{11}/, '').scan(/..../) do |cu|
      ary.push(cu.to_i)
    end
  end
  return ary
end
definition()

Definition of the entry, described in the DEFINITION line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 199
def definition
  field_fetch('DEFINITION')
end
division()

Division of the entry, described in the ENTRY line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 149
def division
  entry['division']                   # CDS, tRNA etc.
end
entry()

Returns the “ENTRY” line content as a Hash. For example,

{"organism"=>"E.coli", "division"=>"CDS", "id"=>"b0356"}

Returns

Hash

# File lib/bio/db/kegg/genes.rb, line 125
def entry
  unless @data['ENTRY']
    hash = Hash.new('')
    if get('ENTRY').length > 30
      e = get('ENTRY')
      hash['id']       = e[12..29].strip
      hash['division'] = e[30..39].strip
      hash['organism'] = e[40..80].strip
    end
    @data['ENTRY'] = hash
  end
  @data['ENTRY']
end
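
The ENTRY line is read by fixed column ranges, not by splitting on whitespace. The sketch below applies the same ranges to a line built by hand. The helper name parse_entry_line is hypothetical, and the sample line is padded to fit those ranges as an assumption about the layout, not copied from a real KEGG record.

```ruby
# Slices an ENTRY line by the same column ranges that entry uses.
def parse_entry_line(line)
  fields = Hash.new('')   # missing keys read as an empty string
  if line.length > 30
    fields['id']       = line[12..29].strip
    fields['division'] = line[30..39].strip
    fields['organism'] = line[40..80].strip
  end
  fields
end

# Hypothetical line: tag padded to 12 columns, id to 18, division to 10.
line = 'ENTRY'.ljust(12) + 'b0356'.ljust(18) + 'CDS'.ljust(10) + 'E.coli'
p parse_entry_line(line)
p parse_entry_line('ENTRY')['id']  # => ""
```

Because the Hash is created with a default of '', a short or empty ENTRY line yields empty strings, not nil, for id, division and organism.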
entry_id()

ID of the entry, described in the ENTRY line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 142
def entry_id
  entry['id']
end
gbposition()

The position in the genome described in the POSITION line, as a string in GenBank feature table location format.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 278
def gbposition
  position.sub(/.*?:/, '')
end
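
The non-greedy pattern removes everything up to and including the first colon, which leaves only the location part. A short sketch with a hypothetical helper name, location_part, and made-up POSITION strings:

```ruby
# Drops an optional "chromosome:" prefix, as gbposition does.
def location_part(position)
  position.sub(/.*?:/, '')
end

p location_part('X:complement(100..200)')  # => "complement(100..200)"
p location_part('100..200')                # => "100..200"
```

A string without a colon passes through unchanged, because sub finds no match.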
gene()

This method will be deprecated. Use #names.first instead.

Returns the first gene name described in the NAME line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 192
def gene
  genes.first
end
genes()

This method will be deprecated. Use #names instead.

Names of the entry as an Array, described in the NAME line.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 182
def genes
  names_as_array
end
keggclass()

Returns the CLASS field of the entry.

# File lib/bio/db/kegg/genes.rb, line 242
def keggclass
  field_fetch('CLASS')
end
keggclasses()

Returns an Array of the biological classes in the CLASS field.

# File lib/bio/db/kegg/genes.rb, line 247
def keggclasses
  keggclass.gsub(/ \[[^\]]+/, '').split(/\] ?/)
end
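
The two regular expressions work as a pair: gsub removes each bracketed annotation up to, but not including, its closing bracket, and split then uses the remaining closing brackets as separators. The sketch below runs the same expressions on a hypothetical CLASS string. The helper name split_classes and the sample text are illustrative only.

```ruby
# Applies the same gsub and split as keggclasses to a CLASS string.
def split_classes(class_field)
  class_field.gsub(/ \[[^\]]+/, '').split(/\] ?/)
end

# Hypothetical CLASS field with two bracketed pathway annotations.
sample = 'Metabolism; Carbohydrate Metabolism; Glycolysis / Gluconeogenesis ' \
         '[PATH:eco00010] Metabolism; Carbohydrate Metabolism; ' \
         'Pentose phosphate pathway [PATH:eco00030]'

split_classes(sample).each { |c| puts c }
```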
locations()

The position in the genome described in the POSITION line, as a Bio::Locations object.


Returns

Bio::Locations object

# File lib/bio/db/kegg/genes.rb, line 286
def locations
  Bio::Locations.new(gbposition)
end
motif()

The specification of this method will change in the future. Use #motifs instead.

Motif information described in the MOTIF lines.


Returns

Hash

# File lib/bio/db/kegg/genes.rb, line 325
def motif
  motifs
end
motifs()
Alias for: motifs_as_hash
motifs_as_hash()

Motif information described in the MOTIF lines.


Returns

Hash

# File lib/bio/db/kegg/genes.rb, line 300
def motifs_as_hash
  unless @data['MOTIF']
    hash = {}
    db = nil
    motifs_as_strings.each do |line|
      if line[/^\S+:/]
        db, str = line.split(/:/, 2)
      else
        str = line
      end
      hash[db] ||= []
      hash[db] += str.strip.split(/\s+/)
    end
    @data['MOTIF'] = hash
  end
  @data['MOTIF']              # Hash of Array of IDs in MOTIF
end
Also aliased as: motifs
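
The grouping in motifs_as_hash depends on whether a line starts with a database prefix such as "Pfam:". A line without a prefix is added to the most recently seen database. The sketch below applies the same loop to an array of strings. The helper name group_motifs and the sample lines and IDs are hypothetical and only illustrate the shape of the input.

```ruby
# Groups motif IDs by database name, following the loop in motifs_as_hash.
def group_motifs(lines)
  grouped = {}
  db = nil
  lines.each do |line|
    if line[/^\S+:/]
      db, str = line.split(/:/, 2)   # "Pfam: A B" -> "Pfam", " A B"
    else
      str = line                     # continuation of the previous database
    end
    grouped[db] ||= []
    grouped[db] += str.strip.split(/\s+/)
  end
  grouped
end

# Hypothetical MOTIF lines.
lines = ['Pfam: PF00001 PF00002', 'PF00003', 'PROSITE: PS00001']
p group_motifs(lines)
```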
motifs_as_strings()

Motif information described in the MOTIF lines.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 293
def motifs_as_strings
  lines_fetch('MOTIF')
end
nalen()
Alias for: ntlen
name()

Returns the NAME line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 163
def name
  field_fetch('NAME')
end
names()
Alias for: names_as_array
names_as_array()

Names of the entry as an Array, described in the NAME line.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 171
def names_as_array
  name.split(', ')
end
Also aliased as: names
naseq()
Alias for: ntseq
ntlen()

Returns the length of the nucleic acid sequence described in the NTSEQ lines.


Returns

Integer

# File lib/bio/db/kegg/genes.rb, line 411
def ntlen
  fetch('NTSEQ')[/\d+/].to_i
end
Also aliased as: nalen
ntseq()

Returns the nucleic acid sequence described in the NTSEQ lines.


Returns

Bio::Sequence::NA object

# File lib/bio/db/kegg/genes.rb, line 400
def ntseq
  unless @data['NTSEQ']
    @data['NTSEQ'] = Bio::Sequence::NA.new(fetch('NTSEQ').gsub(/\d+/, ''))
  end
  @data['NTSEQ']
end
Also aliased as: naseq
organism()

Organism name of the entry, described in the ENTRY line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 156
def organism
  entry['organism']                   # H.sapiens etc.
end
orthologs()
Alias for: orthologs_as_hash
orthologs_as_hash()

Returns a Hash of the orthology IDs and definitions in the ORTHOLOGY field.

# File lib/bio/db/kegg/genes.rb, line 107
def orthologs_as_hash; super; end
Also aliased as: orthologs
orthologs_as_strings()

Orthologs described in the ORTHOLOGY lines.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 220
def orthologs_as_strings
  lines_fetch('ORTHOLOGY')
end
pathway()

Returns the PATHWAY lines as a String.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 227
def pathway
  unless defined? @pathway
    @pathway = fetch('PATHWAY')
  end
  @pathway
end
pathways()
Alias for: pathways_as_hash
pathways_as_hash()

Returns a Hash of the pathway IDs and names in the PATHWAY field.

# File lib/bio/db/kegg/genes.rb, line 102
def pathways_as_hash; super; end
Also aliased as: pathways
pathways_as_strings()

Pathways described in the PATHWAY lines.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 237
def pathways_as_strings
  lines_fetch('PATHWAY')
end
position()

The position in the genome described in the POSITION line.


Returns

String

# File lib/bio/db/kegg/genes.rb, line 254
def position
  unless @data['POSITION']
    @data['POSITION'] = fetch('POSITION').gsub(/\s/, '')
  end
  @data['POSITION']
end
structure()

Returns structure ID information described in the STRUCTURE lines.


Returns

Array containing String

# File lib/bio/db/kegg/genes.rb, line 339
def structure
  unless @data['STRUCTURE']
    @data['STRUCTURE'] = fetch('STRUCTURE').sub(/(PDB: )*/,'').split(/\s+/)
  end
  @data['STRUCTURE'] # ['PDB:1A9X', ...]
end
Also aliased as: structures
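
To see what the sub and split calls in structure produce, the same expressions can be run on a plain string. This is a sketch under the assumption that the fetched STRUCTURE field looks like the sample below. The helper name structure_ids and the sample value are hypothetical. With this input the leading "PDB: " prefix is removed, so the IDs come back without it.

```ruby
# Applies the same sub and split as structure to a STRUCTURE field.
def structure_ids(field)
  field.sub(/(PDB: )*/, '').split(/\s+/)
end

p structure_ids('PDB: 1A9X 1C30')  # => ["1A9X", "1C30"]
p structure_ids('1A9X')            # => ["1A9X"]
```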
structures()
Alias for: structure