class Bio::Blat::Report::Hit
Hit class for the BLAT result parser. Similar to Bio::Blast::Report::Hit but lacks many of its methods. A Hit object may contain several Bio::Blat::Report::SegmentPair objects.
Attributes
Raw data of the hit. (Note that position numbers are kept as they appear in the BLAT output; 1 is not added to them.)
Public Class Methods
Creates a new Hit object from a piece of BLAT result text. It is designed to be called internally by a Bio::Blat::Report object. Users should not call it directly.
# File lib/bio/appl/blat/report.rb, line 293
def initialize(str)
  @data = str.chomp.split(/\t/)
end
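The constructor only splits one tab-separated line of BLAT output into fields, which the accessor methods then read by index. A minimal sketch of that step follows; the sample line and its values are made up for illustration and are not real BLAT output.

```ruby
# A made-up, tab-separated result line (21 columns); values are illustrative only.
line = %w[95 5 0 0 0 0 0 0 + query1 100 0 100 chr1 5000 1000 1100 1 100, 0, 1000,].join("\t") + "\n"

# Same operation as Hit#initialize: drop the trailing newline, split on tabs.
data = line.chomp.split(/\t/)

match       = data[0].to_i   # index read by Hit#match
mismatch    = data[1].to_i   # index read by Hit#mismatch
strand      = data[8]        # index read by Hit#strand
block_count = data[17].to_i  # index read by Hit#block_count
```

All fields stay as strings in the array; each accessor converts its own field when it is called.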
Public Instance Methods
Number of blocks (exons, segment pairs).
# File lib/bio/appl/blat/report.rb, line 350
def block_count; @data[17].to_i; end
Sizes of all blocks (exons, segment pairs). Returns an array of integers.
# File lib/bio/appl/blat/report.rb, line 354
def block_sizes
  unless defined?(@block_sizes) then
    @block_sizes = split_comma(@data[18]).collect { |x| x.to_i }
  end
  @block_sizes
end
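The block sizes arrive as one comma-separated field, which is split and converted to integers. The sketch below illustrates the idea; `split_comma_sketch` is a hypothetical stand-in for the private `split_comma` helper, whose real implementation is not shown on this page, and the input string is made up.

```ruby
# Hypothetical stand-in for the private split_comma helper.
# Splits on commas and discards empty pieces (e.g. from a trailing comma).
def split_comma_sketch(str)
  str.to_s.split(',').map(&:strip).reject(&:empty?)
end

raw = "30,45,25,"  # made-up block sizes field with a trailing comma
block_sizes = split_comma_sketch(raw).map { |x| x.to_i }
```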
Returns the blocks (exons, segment pairs) of the hit as an array of Bio::Blat::Report::SegmentPair objects.
# File lib/bio/appl/blat/report.rb, line 363
def blocks
  unless defined?(@blocks)
    bs    = block_sizes
    qst   = query.starts
    tst   = target.starts
    qseqs = query.seqs
    tseqs = target.seqs
    pflag = self.protein?
    @blocks = (0...block_count).collect do |i|
      SegmentPair.new(query.size, target.size, strand, bs[i],
                      qst[i], tst[i], qseqs[i], tseqs[i],
                      pflag)
    end
  end
  @blocks
end
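The method walks several parallel arrays by index and builds one object per block. A simplified sketch of that pattern is shown here; `BlockSketch` is a hypothetical stand-in for SegmentPair that models only three fields, and all values are made up.

```ruby
# Hypothetical stand-in for SegmentPair; only a few fields are modelled.
BlockSketch = Struct.new(:size, :query_start, :target_start)

block_sizes   = [30, 45, 25]        # made-up values
query_starts  = [0, 30, 75]
target_starts = [1000, 1200, 1500]

# One object per block, taking the i-th element of each parallel array.
blocks = (0...block_sizes.size).map do |i|
  BlockSketch.new(block_sizes[i], query_starts[i], target_starts[i])
end
```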
Iterates over each block (exon, segment pair) of the hit. Yields a Bio::Blat::Report::SegmentPair object.
# File lib/bio/appl/blat/report.rb, line 404
def each(&x) #:yields: segmentpair
  exons.each(&x)
end
Number of matching nucleotides.
# File lib/bio/appl/blat/report.rb, line 332
def match; @data[0].to_i; end
Calculates the pslCalcMilliBad value, using the algorithm given in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).
# File lib/bio/appl/blat/report.rb, line 418
def milli_bad
  w = (self.protein? ? 3 : 1)
  qalen = w * (self.query.end - self.query.start)
  talen = self.target.end - self.target.start
  alen = (if qalen < talen then qalen; else talen; end)
  return 0 if alen <= 0
  d = qalen - talen
  d = 0 if d < 0
  total = w * (self.match + self.rep_match + self.mismatch)
  return 0 if total == 0
  return (1000 * (self.mismatch * w + self.query.gap_count +
                    (3 * Math.log(1 + d)).round) / total)
end
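To see the arithmetic in isolation, the calculation can be restated as a standalone function over plain numbers. This is only a sketch mirroring the method above; the function name, the keyword parameters, and the sample values are assumptions made for illustration.

```ruby
# Standalone restatement of the milli_bad arithmetic (hypothetical helper).
def milli_bad_sketch(match:, rep_match:, mismatch:, q_start:, q_end:,
                     t_start:, t_end:, q_gap_count:, protein: false)
  w = protein ? 3 : 1                      # protein residues span 3 bases
  qalen = w * (q_end - q_start)            # aligned query span
  talen = t_end - t_start                  # aligned target span
  return 0 if [qalen, talen].min <= 0
  d = [qalen - talen, 0].max               # excess of query span over target span
  total = w * (match + rep_match + mismatch)
  return 0 if total.zero?
  # Integer division, as in the original method.
  1000 * (mismatch * w + q_gap_count + (3 * Math.log(1 + d)).round) / total
end

# Made-up hit: 95 matches and 5 mismatches over equal 100-base spans, no gaps.
milli_bad_sketch(match: 95, rep_match: 0, mismatch: 5,
                 q_start: 0, q_end: 100, t_start: 1000, t_end: 1100,
                 q_gap_count: 0)  # => 50
```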
Number of mismatching nucleotides.
# File lib/bio/appl/blat/report.rb, line 334
def mismatch; @data[1].to_i; end
“N’s”. Number of ‘N’ bases.
# File lib/bio/appl/blat/report.rb, line 342
def n_s; @data[3].to_i; end
Calculates the percent identity compatible with the BLAT web server, using the algorithm given in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).
# File lib/bio/appl/blat/report.rb, line 438
def percent_identity
  100.0 - self.milli_bad * 0.1
end
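Since milli_bad is expressed in parts per thousand, multiplying it by 0.1 gives a percentage, which is then subtracted from 100. A tiny sketch of that conversion; the function name is hypothetical.

```ruby
# Hypothetical helper: converts a milli_bad value (parts per thousand)
# into a percent identity, as the method above does.
def percent_identity_from(milli_bad)
  100.0 - milli_bad * 0.1
end

percent_identity_from(50)  # a milli_bad of 50 corresponds to about 95% identity
```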
Returns true when the output data comes from a protein query, and false otherwise (nucleotide query). Returns nil if this cannot be determined.
The algorithm is taken from the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).
Note: it appears to return true only for a protein query against a nucleotide database (blat options: -q=prot -t=dnax).
# File lib/bio/appl/blat/report.rb, line 451
def protein?
  return nil if self.block_sizes.empty?
  case self.strand[1,1]
  when '+'
    if self.target.end == self.target.starts[-1] +
        3 * self.block_sizes[-1] then
      true
    else
      false
    end
  when '-'
    if self.target.start == self.target.size -
        self.target.starts[-1] - 3 * self.block_sizes[-1] then
      true
    else
      false
    end
  else
    nil
  end
end
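The test relies on the fact that, for a protein query, the last block covers three target bases per residue. The following sketch restates the check over plain values; the function name, parameters, and sample coordinates are assumptions for illustration, and the empty-block-list guard of the original is omitted.

```ruby
# Standalone restatement of the protein? check (hypothetical helper).
def protein_sketch(strand:, t_size:, t_start:, t_end:,
                   last_t_start:, last_block_size:)
  case strand[1, 1]   # second strand character, present for translated hits
  when '+'
    # Last block must end exactly at the target end when scaled by 3.
    t_end == last_t_start + 3 * last_block_size
  when '-'
    # Same test in reverse-strand coordinates.
    t_start == t_size - last_t_start - 3 * last_block_size
  else
    nil               # single-character strand: cannot be determined
  end
end

# Made-up coordinates: the last block of 100 residues spans 300 target bases.
protein_sketch(strand: '++', t_size: 5000, t_start: 1000, t_end: 1300,
               last_t_start: 1000, last_block_size: 100)  # => true
```

With a single-character strand such as '+', `strand[1, 1]` is an empty string, so the sketch falls through to nil.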
Returns sequence information for the query as a Bio::Blat::Report::SeqDesc object. This is a Bio::Blat-specific method.
# File lib/bio/appl/blat/report.rb, line 310
def query
  unless defined?(@query)
    d = @data
    @query = SeqDesc.new(d[4], d[5], d[9], d[10], d[11], d[12],
                         split_comma(d[19]), split_comma(d[21]))
  end
  @query
end
Returns the name of the query sequence.
# File lib/bio/appl/blat/report.rb, line 390
def query_def; query.name; end
Returns the length of the query sequence.
# File lib/bio/appl/blat/report.rb, line 387
def query_len; query.size; end
“rep. match”. Number of bases that match but are part of repeats. Note that the current version of BLAT always sets this to 0.
# File lib/bio/appl/blat/report.rb, line 339
def rep_match; @data[2].to_i; end
Calculates the score compatible with the BLAT web server, using the algorithm given in the BLAT FAQ (genome.ucsc.edu/FAQ/FAQblat#blat4).
# File lib/bio/appl/blat/report.rb, line 479
def score
  w = (self.protein? ? 3 : 1)
  w * (self.match + (self.rep_match >> 1)) -
    w * self.mismatch - self.query.gap_count - self.target.gap_count
end
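The score rewards matches (with repeat matches counted at half weight), and penalizes mismatches and each gap in either sequence. A standalone sketch of the same arithmetic, with a hypothetical function name and made-up input values:

```ruby
# Standalone restatement of the score arithmetic (hypothetical helper).
def score_sketch(match:, rep_match:, mismatch:, q_gap_count:, t_gap_count:,
                 protein: false)
  w = protein ? 3 : 1  # protein residues are weighted as 3 bases
  w * (match + (rep_match >> 1)) - w * mismatch - q_gap_count - t_gap_count
end

# Made-up nucleotide hit: 95 matches, 5 mismatches, 1 query gap, 2 target gaps.
score_sketch(match: 95, rep_match: 0, mismatch: 5,
             q_gap_count: 1, t_gap_count: 2)  # => 87
```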
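Returns strand information of the hit. Returns ‘+’ or ‘-’. For translated alignments the value has two characters (for example ‘++’ or ‘+-’), and protein? inspects the second one. This is a Bio::Blat-specific method.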
# File lib/bio/appl/blat/report.rb, line 347
def strand; @data[8]; end
Returns sequence information for the target (hit) as a Bio::Blat::Report::SeqDesc object. This is a Bio::Blat-specific method.
# File lib/bio/appl/blat/report.rb, line 322
def target
  unless defined?(@target)
    d = @data
    @target = SeqDesc.new(d[6], d[7], d[13], d[14], d[15], d[16],
                          split_comma(d[20]), split_comma(d[22]))
  end
  @target
end
Returns the name of the target (subject) sequence.
# File lib/bio/appl/blat/report.rb, line 398
def target_def; target.name; end
Returns the length of the target (subject) sequence.
# File lib/bio/appl/blat/report.rb, line 394
def target_len; target.size; end