class Bio::Blat::Report::Hit

Hit class for the BLAT result parser. It is similar to Bio::Blast::Report::Hit but lacks many of its methods. A Hit object may contain several Bio::Blat::Report::SegmentPair objects.

Attributes

data[R]

Raw data of the hit, as an array of field strings. (Note that position numbers are kept as they appear in the BLAT output; 1 is not added to them.)

Public Class Methods

new(str)

Creates a new Hit object from a piece of BLAT result text. It is designed to be called internally from a Bio::Blat::Report object. Users should not call it directly.

# File lib/bio/appl/blat/report.rb, line 293
def initialize(str)
  @data = str.chomp.split(/\t/)
end
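
To make the field layout concrete, the following sketch repeats the split performed by the constructor on a made-up PSL record. The sample values and the variable names (`fields`, `line`, `data`) are illustrative assumptions, not real BLAT output.

```ruby
# Hypothetical PSL record; every value below is invented for illustration.
fields = %w[95 5 0 0 1 10 1 200 + query1 120 0 110 chr1 10000 1000 1300 2 60,40, 0,70, 1000,1260,]
line = fields.join("\t") + "\n"

# The same split that the constructor applies to its argument.
data = line.chomp.split(/\t/)

puts data.size      # number of columns in this record
puts data[0].to_i   # match count (column 0)
puts data[8]        # strand (column 8)
puts data[17].to_i  # block count (column 17)
```

Every element of `data` is still a String at this point, which is why the accessor methods below call `to_i` on the columns they read.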

Public Instance Methods

block_count()

Number of blocks (exons, segment pairs).

# File lib/bio/appl/blat/report.rb, line 350
def block_count; @data[17].to_i; end

block_sizes()

Sizes of all blocks (exons, segment pairs). Returns an array of integers.

# File lib/bio/appl/blat/report.rb, line 354
def block_sizes
  unless defined?(@block_sizes) then
    @block_sizes = split_comma(@data[18]).collect { |x| x.to_i }
  end
  @block_sizes
end
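
The helper `split_comma` is not shown in this section. As a rough sketch, assuming it simply splits a comma-separated field and ignores the trailing comma, the conversion and the caching idiom look like this. `BlockSizesSketch` is a hypothetical stand-in class, not part of the library.

```ruby
# Hypothetical stand-in for the relevant part of Hit; not the library class.
class BlockSizesSketch
  def initialize(field)
    @field = field   # e.g. the raw block sizes column, "60,40,"
  end

  # Assumed behaviour of split_comma: split on commas.
  # String#split drops trailing empty strings, so the final comma is ignored.
  def split_comma(str)
    str.to_s.split(',')
  end

  # Parsed on first use, then cached in an instance variable.
  def block_sizes
    unless defined?(@block_sizes)
      @block_sizes = split_comma(@field).collect { |x| x.to_i }
    end
    @block_sizes
  end
end

sizes = BlockSizesSketch.new('60,40,')
p sizes.block_sizes
```

Because of the `defined?` guard, repeated calls return the same array object instead of parsing the field again.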

blocks()

Returns the blocks (exons, segment pairs) of the hit as an array of Bio::Blat::Report::SegmentPair objects.

# File lib/bio/appl/blat/report.rb, line 363
def blocks
  unless defined?(@blocks)
    bs    = block_sizes
    qst   = query.starts
    tst   = target.starts
    qseqs = query.seqs
    tseqs = target.seqs
    pflag = self.protein?
    @blocks = (0...block_count).collect do |i|
      SegmentPair.new(query.size, target.size, strand, bs[i],
                      qst[i], tst[i], qseqs[i], tseqs[i],
                      pflag)
    end
  end
  @blocks
end
Also aliased as: exons, hsps
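
As a rough illustration of how the per-block arrays line up, the sketch below pairs each block size with its query and target start. It uses plain hashes instead of SegmentPair objects, assumes a plus-strand nucleotide hit with zero-based, half-open coordinates, and all values are made up.

```ruby
# Made-up per-block data for a hit with two blocks.
block_sizes   = [60, 40]
query_starts  = [0, 70]
target_starts = [1000, 1260]

# One entry per block: index i of every array describes the same block.
blocks = (0...block_sizes.size).collect do |i|
  {
    size:         block_sizes[i],
    query_range:  query_starts[i]...(query_starts[i] + block_sizes[i]),
    target_range: target_starts[i]...(target_starts[i] + block_sizes[i])
  }
end

blocks.each { |b| p b }
```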
definition()
Alias for: target_def

each() { |segmentpair| ... }

Iterates over each block (exon, segment pair) of the hit.

Yields a Bio::Blat::Report::SegmentPair object.

# File lib/bio/appl/blat/report.rb, line 404
def each(&x) #:yields: segmentpair
  exons.each(&x)
end
exons()
Alias for: blocks
hsps()
Alias for: blocks
len()
Alias for: target_len

match()

Number of matching nucleotides.

# File lib/bio/appl/blat/report.rb, line 332
def match;       @data[0].to_i;  end

milli_bad()

Calculates the pslCalcMilliBad value defined in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4), using the algorithm given there. The result is an integer in parts per thousand; 0 is returned when the aligned length is not positive or the total base count is zero.

# File lib/bio/appl/blat/report.rb, line 418
def milli_bad
  w = (self.protein? ? 3 : 1)
  qalen = w * (self.query.end - self.query.start)
  talen = self.target.end - self.target.start
  alen = (if qalen < talen then qalen; else talen; end)
  return 0 if alen <= 0
  d = qalen - talen
  d = 0 if d < 0
  total = w * (self.match + self.rep_match + self.mismatch)
  return 0 if total == 0
  return (1000 * (self.mismatch * w + self.query.gap_count +
                    (3 * Math.log(1 + d)).round) / total)
end
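
The following sketch restates the same arithmetic as a standalone function so that it can be tried without a parsed report. The function name, the keyword arguments, and the sample numbers are all hypothetical; it is an illustration of the calculation above, not the library method.

```ruby
# Standalone restatement of the calculation above.
# Keyword names are hypothetical and the sample numbers are invented.
def milli_bad_sketch(match:, rep_match:, mismatch:, q_start:, q_end:,
                     t_start:, t_end:, q_gap_count:, protein: false)
  w = protein ? 3 : 1                  # protein query: 3 target bases per residue
  qalen = w * (q_end - q_start)        # aligned span on the query
  talen = t_end - t_start              # aligned span on the target
  alen = [qalen, talen].min
  return 0 if alen <= 0
  d = [qalen - talen, 0].max           # only a longer query span is penalised
  total = w * (match + rep_match + mismatch)
  return 0 if total == 0
  1000 * (mismatch * w + q_gap_count + (3 * Math.log(1 + d)).round) / total
end

value = milli_bad_sketch(match: 95, rep_match: 0, mismatch: 5,
                         q_start: 0, q_end: 110,
                         t_start: 1000, t_end: 1300,
                         q_gap_count: 1)
puts value   # 1000 * (5 + 1 + 0) / 100
```

In this sample the target span is longer than the query span, so the size-difference term is zero and only the mismatches and the single query gap contribute.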

mismatch()

Number of mismatching nucleotides.

# File lib/bio/appl/blat/report.rb, line 334
def mismatch;    @data[1].to_i;  end

n_s()

“N's”. Number of 'N' bases.

# File lib/bio/appl/blat/report.rb, line 342
def n_s;         @data[3].to_i;  end

percent_identity()

Calculates the percent identity compatible with the BLAT web server, using the algorithm described in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4). The value is derived from milli_bad.

# File lib/bio/appl/blat/report.rb, line 438
def percent_identity
  100.0 - self.milli_bad * 0.1
end
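
As a small worked example of the conversion, the sketch below applies the same formula to a given milli_bad value. The function name and the input values are hypothetical.

```ruby
# milli_bad is in parts per thousand, so multiplying by 0.1 gives percent.
def percent_identity_sketch(milli_bad)
  100.0 - milli_bad * 0.1
end

puts percent_identity_sketch(0)    # no penalty at all
puts percent_identity_sketch(60)   # 60 per thousand is 6 percent
```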

protein?()

Returns true when the output data comes from a protein query and false otherwise (nucleotide query). Returns nil if this cannot be determined, which includes the case where the strand field has no second character.

The algorithm is taken from the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).

Note: It appears to return true only for a protein query against a translated nucleotide database (blat options: -q=prot -t=dnax).

# File lib/bio/appl/blat/report.rb, line 451
def protein?
  return nil if self.block_sizes.empty?
  case self.strand[1,1]
  when '+'
    if self.target.end == self.target.starts[-1] +
        3 * self.block_sizes[-1] then
      true
    else
      false
    end
  when '-'
    if self.target.start == self.target.size -
        self.target.starts[-1] - 3 * self.block_sizes[-1] then
      true
    else
      false
    end
  else
    nil
  end
end
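
The three possible results can be seen with a standalone restatement of the check. The function name, the keyword arguments, and the coordinates below are hypothetical and chosen only to exercise each branch.

```ruby
# Standalone restatement of the check above; argument names are hypothetical.
def protein_sketch(strand:, t_start:, t_end:, t_size:, t_starts:, block_sizes:)
  return nil if block_sizes.empty?
  last = 3 * block_sizes[-1]   # last block length, counted in target bases
  case strand[1, 1]            # second strand character
  when '+' then t_end == t_starts[-1] + last
  when '-' then t_start == t_size - t_starts[-1] - last
  end                          # any other value falls through to nil
end

# Protein query, target on the plus strand: block sizes count residues.
p protein_sketch(strand: '++', t_start: 1000, t_end: 1150, t_size: 10_000,
                 t_starts: [1000], block_sizes: [50])

# Two-character strand, but block sizes already count bases.
p protein_sketch(strand: '++', t_start: 1000, t_end: 1150, t_size: 10_000,
                 t_starts: [1000], block_sizes: [150])

# One-character strand: the second character is missing.
p protein_sketch(strand: '+', t_start: 1000, t_end: 1150, t_size: 10_000,
                 t_starts: [1000], block_sizes: [150])
```

The test works because, for a protein query, the last block size must be tripled to reach the end of the aligned target region.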

query()

Returns sequence information for the query as a Bio::Blat::Report::SeqDesc object. This method is specific to Bio::Blat.

# File lib/bio/appl/blat/report.rb, line 310
def query
  unless defined?(@query)
    d = @data
    @query = SeqDesc.new(d[4], d[5], d[9], d[10], d[11], d[12],
                         split_comma(d[19]), split_comma(d[21]))
  end
  @query
end
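
To show which columns feed the query description, the sketch below builds a plain Hash from a made-up record. The key names mirror the accessors used elsewhere in this class (`gap_count`, `name`, `size`, `start`, `end`, `starts`); the Hash itself and the integer conversions are assumptions for illustration, since SeqDesc is not shown in this section.

```ruby
# Hypothetical PSL record; every value below is invented for illustration.
fields = %w[95 5 0 0 1 10 1 200 + query1 120 0 110 chr1 10000 1000 1300 2 60,40, 0,70, 1000,1260,]

# Column indices follow the arguments passed to SeqDesc.new above.
query = {
  gap_count: fields[4].to_i,
  gap_bases: fields[5].to_i,
  name:      fields[9],
  size:      fields[10].to_i,
  start:     fields[11].to_i,
  end:       fields[12].to_i,
  starts:    fields[19].split(',').map(&:to_i)
}

p query
```

The target description is built the same way from columns 6, 7, 13 to 16, and 20. This sample has no sequence columns (index 21 and 22), so they are left out here.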

query_def()

Returns the name of the query sequence.

# File lib/bio/appl/blat/report.rb, line 390
def query_def;  query.name;  end
Also aliased as: query_id
query_id()
Alias for: query_def

query_len()

Returns the length of the query sequence.

# File lib/bio/appl/blat/report.rb, line 387
def query_len;  query.size;  end

rep_match()

“rep. match”. Number of bases that match but are part of repeats. Note that the current version of BLAT always sets this to 0.

# File lib/bio/appl/blat/report.rb, line 339
def rep_match;   @data[2].to_i;  end

score()

Calculates the score compatible with the BLAT web server, using the algorithm described in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).

# File lib/bio/appl/blat/report.rb, line 479
def score
  w = (self.protein? ? 3 : 1)
  w * (self.match + (self.rep_match >> 1)) -
    w * self.mismatch - self.query.gap_count - self.target.gap_count
end
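
A short worked example may help when checking scores by hand. The sketch below restates the formula as a standalone function; the function name, keyword arguments, and sample numbers are hypothetical.

```ruby
# Standalone restatement of the formula above.
# Keyword names are hypothetical and the sample numbers are invented.
def score_sketch(match:, rep_match:, mismatch:, q_gap_count:, t_gap_count:,
                 protein: false)
  w = protein ? 3 : 1
  # Repeat matches count half (integer shift); each gap costs one point.
  w * (match + (rep_match >> 1)) - w * mismatch - q_gap_count - t_gap_count
end

puts score_sketch(match: 95, rep_match: 0, mismatch: 5,
                  q_gap_count: 1, t_gap_count: 1)   # 95 - 5 - 1 - 1
```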

strand()

Returns strand information of the hit, typically '+' or '-'. The field may hold a second character; protein? inspects it when present. This method is specific to Bio::Blat.

# File lib/bio/appl/blat/report.rb, line 347
def strand;      @data[8];       end

target()

Returns sequence information for the target (hit) as a Bio::Blat::Report::SeqDesc object. This method is specific to Bio::Blat.

# File lib/bio/appl/blat/report.rb, line 322
def target
  unless defined?(@target)
    d = @data
    @target = SeqDesc.new(d[6], d[7], d[13], d[14], d[15], d[16],
                          split_comma(d[20]), split_comma(d[22]))
  end
  @target
end

target_def()

Returns the name of the target (subject) sequence.

# File lib/bio/appl/blat/report.rb, line 398
def target_def; target.name; end
Also aliased as: target_id, definition
target_id()
Alias for: target_def

target_len()

Returns the length of the target (subject) sequence.

# File lib/bio/appl/blat/report.rb, line 394
def target_len; target.size; end
Also aliased as: len