class Bio::Locations
Description
The Bio::Locations class is a container for Bio::Location objects: creating a Bio::Locations object from a GenBank-style position string builds an array of Bio::Location objects.
Usage
  locations = Bio::Locations.new('join(complement(500..550), 600..625)')
  locations.each do |loc|
    puts "class = " + loc.class.to_s
    puts "range = #{loc.from}..#{loc.to} (strand = #{loc.strand})"
  end
  # Output would be:
  #   class = Bio::Location
  #   range = 500..550 (strand = -1)
  #   class = Bio::Location
  #   range = 600..625 (strand = 1)

  # For the following three location strings, print the span and range
  ['one-of(898,900)..983',
   'one-of(5971..6308,5971..6309)',
   '8050..one-of(10731,10758,10905,11242)'].each do |loc|
    location = Bio::Locations.new(loc)
    puts location.span
    puts location.range
  end
GenBank location descriptor classification
Definition of the position notation of the GenBank location format
According to the GenBank manual 'gbrel.txt', position notations are classified into 10 patterns, (A) to (J).
  3.4.12.2 Feature Location

  The second column of the feature descriptor line designates the location of the feature in the sequence. The location descriptor begins at position 22. Several conventions are used to indicate sequence location.

  Base numbers in location descriptors refer to numbering in the entry, which is not necessarily the same as the numbering scheme used in the published report. The first base in the presented sequence is numbered base 1. Sequences are presented in the 5' to 3' direction.

  Location descriptors can be one of the following:

  (A) 1. A single base;
  (B) 2. A contiguous span of bases;
  (C) 3. A site between two bases;
  (D) 4. A single base chosen from a range of bases;
  (E) 5. A single base chosen from among two or more specified bases;
  (F) 6. A joining of sequence spans;
  (G) 7. A reference to an entry other than the one to which the feature belongs (i.e., a remote entry), followed by a location descriptor referring to the remote sequence;
  (H) 8. A literal sequence (a string of bases enclosed in quotation marks).
Description commented with pattern IDs
  (C) A site between two residues, such as an endonuclease cleavage site, is indicated by listing the two bases separated by a carat (e.g., 23^24).

  (D) A single residue chosen from a range of residues is indicated by the number of the first and last bases in the range separated by a single period (e.g., 23.79). The symbols < and > indicate that the end point (I) of the range is beyond the specified base number.

  (B) A contiguous span of bases is indicated by the number of the first and last bases in the range separated by two periods (e.g., 23..79). The (I) symbols < and > indicate that the end point of the range is beyond the specified base number. Starting and ending positions can be indicated by base number or by one of the operators described below.

  Operators are prefixes that specify what must be done to the indicated sequence to locate the feature. The following are the operators available, along with their most common format and a description.

  (J) complement (location): The feature is complementary to the location indicated. Complementary strands are read 5' to 3'.
  (F) join (location, location, .. location): The indicated elements should be placed end to end to form one contiguous sequence.
  (F) order (location, location, .. location): The elements are found in the specified order in the 5' to 3' direction, but nothing is implied about the rationality of joining them.
  (F) group (location, location, .. location): The elements are related and should be grouped together, but no order is implied.
  (E) one-of (location, location, .. location): The element can be any one, but only one, of the items listed.
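The notations above can be told apart with a few regular expressions. The following is a minimal sketch in plain Ruby and is not how BioRuby parses locations; the method name classify_descriptor is hypothetical, and it only recognizes the bare patterns (A) to (D) and descriptors that are wrapped entirely in a single operator.

```ruby
# Hypothetical helper, not part of BioRuby: labels a GenBank location
# descriptor with the pattern ID used in the classification above.
def classify_descriptor(str)
  s = str.delete(' ')
  case s
  when /\Acomplement\(.+\)\z/         then :J  # complement operator
  when /\A(join|order|group)\(.+\)\z/ then :F  # joining of sequence spans
  when /\Aone-of\(.+\)\z/             then :E  # one of the listed items
  when /\A\d+\z/                      then :A  # single base
  when /\A<?\d+\.\.>?\d+\z/           then :B  # contiguous span, optional < and >
  when /\A\d+\^\d+\z/                 then :C  # site between two bases
  when /\A\d+\.\d+\z/                 then :D  # single base chosen from a range
  else :unknown
  end
end

['467', '23..79', '<23..>79', '23^24', '23.79',
 'join(complement(500..550), 600..625)'].each do |descriptor|
  puts "#{descriptor} -> #{classify_descriptor(descriptor)}"
end
```

Mixed forms such as 'one-of(898,900)..983' fall through to :unknown in this sketch, which is one reason a real parser needs more than a flat pattern table.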
Reduction strategy of the position notations
Attributes
locations
(Array) An Array of Bio::Location objects.
operator
(Symbol or nil) The operator: nil (meaning :join), :order, or :group (obsolete).
Public Class Methods
new(position)
Parses a GenBank-style position string and returns a Bio::Locations object, which contains a list of Bio::Location objects. If an Array is given instead of a string, it is used directly as the list of locations.
  locations = Bio::Locations.new('join(complement(500..550), 600..625)')
Arguments:
- (required) position: GenBank-style position string
Returns:
- Bio::Locations object
  # File lib/bio/location.rb
  def initialize(position)
    @operator = nil
    if position.is_a? Array
      @locations = position
    else
      position = gbl_cleanup(position)       # preprocessing
      @locations = gbl_pos2loc(position)     # create an Array of Bio::Location objects
    end
  end
Public Instance Methods
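==(other)
Returns true if other is equal to self; otherwise returns false. Two objects are equal when they are the same class and both their locations and their operator match.
Arguments:
- (required) other: any object
Returns:
- true or false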
  # File lib/bio/location.rb
  def ==(other)
    return true if super(other)
    return false unless other.instance_of?(self.class)
    if self.locations == other.locations and
        self.operator == other.operator then
      true
    else
      false
    end
  end
[](n)
Returns the nth Bio::Location object.
  # File lib/bio/location.rb
  def [](n)
    @locations[n]
  end
absolute(n, type = nil)
Converts a relative position in the locus to a position in the whole DNA sequence.
This method can, for example, be used to relate positions in RNA to those in a DNA sequence. With the optional :aa flag, the given position is interpreted as an amino-acid position rather than a nucleotide position, and the result is the position of the first base of that codon.
  loc = Bio::Locations.new('complement(12838..13533)')
  puts loc.absolute(10)        # => 13524
  puts loc.absolute(10, :aa)   # => 13506
Arguments:
- (required) n: nucleotide position within the locus
- :aa: flag to be used if the position is an amino-acid position rather than a nucleotide position
Returns:
- position within the whole sequence
  # File lib/bio/location.rb
  def absolute(n, type = nil)
    case type
    when :location
      ;
    when :aa
      n = (n - 1) * 3 + 1
      rel2abs(n)
    else
      rel2abs(n)
    end
  end
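The arithmetic behind the example values can be checked by hand. The sketch below is plain Ruby and does not call BioRuby: rel_to_abs is a hypothetical stand-in for the private rel2abs helper, restricted to a single location on the complement strand and inferred from the documented results rather than copied from the library. Returning nil outside the locus is an assumption.

```ruby
# Hypothetical stand-in for rel2abs, valid only for ONE complement location.
# On the complement strand, relative position 1 is the highest coordinate.
def rel_to_abs(from, to, n)
  return nil if n < 1 || n > to - from + 1  # outside the locus (assumed behaviour)
  to - n + 1
end

# Same conversion as the :aa branch of absolute: first base of codon n.
def aa_to_first_base(n)
  (n - 1) * 3 + 1
end

from, to = 12838, 13533
puts rel_to_abs(from, to, 10)                    # => 13524
puts aa_to_first_base(10)                        # => 28
puts rel_to_abs(from, to, aa_to_first_base(10))  # => 13506
```

Amino acid 10 starts at relative base 28, which is why the :aa result lies 18 bases further along the complement strand than the plain nucleotide result.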
each
Iterates over each Bio::Location object.
  # File lib/bio/location.rb
  def each
    @locations.each do |x|
      yield(x)
    end
  end
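equals?(other)
Evaluates the equality of two Bio::Locations objects by comparing their sorted locations, so the order of the locations does not matter. Returns nil if other is not a Bio::Locations object.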
  # File lib/bio/location.rb
  def equals?(other)
    if ! other.kind_of?(Bio::Locations)
      return nil
    end
    if self.sort == other.sort
      return true
    else
      return false
    end
  end
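The listings for == and equals? differ in whether the order of the locations matters. The sketch below illustrates only that difference in plain Ruby; each location is modelled as a [from, to, strand] array, which is a stand-in and not how Bio::Location is implemented, and both method names are hypothetical.

```ruby
# Stand-in comparison in the spirit of ==: element-by-element, order matters.
def same_in_order?(a, b)
  a == b
end

# Stand-in comparison in the spirit of equals?: sort first, then compare.
def same_ignoring_order?(a, b)
  a.sort == b.sort
end

a = [[500, 550, -1], [600, 625, 1]]
b = a.reverse
puts same_in_order?(a, b)        # => false
puts same_ignoring_order?(a, b)  # => true
```

Note that == in the listing also compares the operator, which this sketch leaves out.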
first
Returns the first Bio::Location object.
  # File lib/bio/location.rb
  def first
    @locations.first
  end
last
Returns the last Bio::Location object.
  # File lib/bio/location.rb
  def last
    @locations.last
  end
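length
Returns the length of the spliced RNA. A location that carries a replacement sequence contributes the length of that sequence instead of its coordinate range.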
  # File lib/bio/location.rb
  def length
    len = 0
    @locations.each do |x|
      if x.sequence
        len += x.sequence.size
      else
        len += (x.to - x.from + 1)
      end
    end
    len
  end
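As a rough illustration of how the total is accumulated, the sketch below repeats the same rule on stand-in objects. LenLoc is a hypothetical Struct with only the three fields the listing reads, and spliced_length is a hypothetical name; neither is part of BioRuby.

```ruby
# Stand-in for Bio::Location carrying only the fields used here (assumption).
LenLoc = Struct.new(:from, :to, :sequence)

# Same rule as the listing: a replacement sequence wins over the coordinates.
def spliced_length(locs)
  locs.sum { |x| x.sequence ? x.sequence.size : x.to - x.from + 1 }
end

plain    = [LenLoc.new(500, 550), LenLoc.new(600, 625)]
replaced = [LenLoc.new(500, 550), LenLoc.new(600, 625, 'acgt')]
puts spliced_length(plain)     # => 77  (51 + 26)
puts spliced_length(replaced)  # => 55  (51 + 4)
```

The gap between the two segments (551..599) is never counted, so the length is generally smaller than the width of the overall span.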
range
Similar to span, but returns a Range object min..max.
  # File lib/bio/location.rb
  def range
    min, max = span
    min..max
  end
relative(n, type = nil)
Converts an absolute position in the whole DNA sequence to a relative position in the locus.
This method can, for example, be used to relate positions in a DNA sequence to those in RNA. With the optional :aa flag, it returns the position of the associated amino acid rather than the nucleotide.
  loc = Bio::Locations.new('complement(12838..13533)')
  puts loc.relative(13524)        # => 10
  puts loc.relative(13506, :aa)   # => 10
Arguments:
- (required) n: nucleotide position within the whole sequence
- :aa: flag that makes the method return the position in amino-acid coordinates
Returns:
- position within the location
  # File lib/bio/location.rb
  def relative(n, type = nil)
    case type
    when :location
      ;
    when :aa
      if n = abs2rel(n)
        (n - 1) / 3 + 1
      else
        nil
      end
    else
      abs2rel(n)
    end
  end
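The :aa branch relies on integer division, so all three bases of a codon map to the same amino-acid index. The sketch below works through this in plain Ruby for a single complement location; abs_to_rel is a hypothetical stand-in for the private abs2rel helper, inferred from the documented nucleotide result rather than copied from the library.

```ruby
# Hypothetical stand-in for abs2rel, valid only for ONE complement location.
def abs_to_rel(from, to, pos)
  return nil if pos < from || pos > to  # outside the locus
  to - pos + 1
end

# Same conversion as the :aa branch of relative: integer division maps the
# three bases of a codon to one amino-acid index.
def base_to_aa(n)
  (n - 1) / 3 + 1
end

from, to = 12838, 13533
puts abs_to_rel(from, to, 13524)               # => 10
puts abs_to_rel(from, to, 13506)               # => 28
puts base_to_aa(abs_to_rel(from, to, 13506))   # => 10
puts [28, 29, 30].map { |n| base_to_aa(n) }.uniq.inspect  # => [10]
```

Under these assumptions, absolute position 13506 is relative base 28, the first base of codon 10.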
span
Returns an Array [min, max] containing the overall minimum and maximum positions of this Bio::Locations object.
  # File lib/bio/location.rb
  def span
    span_min = @locations.min { |a,b| a.from <=> b.from }
    span_max = @locations.max { |a,b| a.to <=> b.to }
    return span_min.from, span_max.to
  end
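The result does not depend on the order in which the locations are stored, because the minimum start and the maximum end are searched independently. A minimal sketch on stand-in objects follows; SpanLoc and span_of are hypothetical names and not part of BioRuby.

```ruby
# Stand-in for Bio::Location carrying only the fields used here (assumption).
SpanLoc = Struct.new(:from, :to)

# Smallest start and largest end over all locations, as in the listing.
def span_of(locs)
  [locs.map(&:from).min, locs.map(&:to).max]
end

locs = [SpanLoc.new(600, 625), SpanLoc.new(500, 550)]
min, max = span_of(locs)
p([min, max])  # => [500, 625]
p(min..max)    # => 500..625
```

The last line corresponds to what range builds from the same two values.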
to_s
Returns the String representation.
Note: In some cases, it cannot tell whether the original string was "complement(join(…))" or "join(complement(..))", or whether it was "complement(order(…))" or "order(complement(..))".
Returns:
- String
  # File lib/bio/location.rb
  def to_s
    return '' if @locations.empty?
    complement_join = false
    locs = @locations
    if locs.size >= 2 and locs.inject(true) do |flag, loc|
        # check if each location is complement
        (flag && (loc.strand == -1) && !loc.xref_id)
      end and locs.inject(locs[0].from) do |pos, loc|
        if pos then
          (pos >= loc.from) ? loc.from : false
        else
          false
        end
      end then
      locs = locs.reverse
      complement_join = true
    end
    locs = locs.collect do |loc|
      lt = loc.lt ? '<' : ''
      gt = loc.gt ? '>' : ''
      str = if loc.from == loc.to then
              "#{lt}#{gt}#{loc.from.to_i}"
            elsif loc.carat then
              "#{lt}#{loc.from.to_i}^#{gt}#{loc.to.to_i}"
            else
              "#{lt}#{loc.from.to_i}..#{gt}#{loc.to.to_i}"
            end
      if loc.xref_id and !loc.xref_id.empty? then
        str = "#{loc.xref_id}:#{str}"
      end
      if loc.strand == -1 and !complement_join then
        str = "complement(#{str})"
      end
      if loc.sequence then
        str = "replace(#{str},\"#{loc.sequence}\")"
      end
      str
    end
    if locs.size >= 2 then
      op = (self.operator || 'join').to_s
      result = "#{op}(#{locs.join(',')})"
    else
      result = locs[0]
    end
    if complement_join then
      result = "complement(#{result})"
    end
    result
  end
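The first condition in the listing decides between the two output forms mentioned in the note. The sketch below restates that condition on stand-in objects so it can be tried in isolation; ToSLoc and complement_join? are hypothetical names, and the Struct carries only the fields the check reads.

```ruby
# Stand-in for Bio::Location carrying only the fields used here (assumption).
ToSLoc = Struct.new(:from, :to, :strand, :xref_id)

# Restates the check at the top of to_s: at least two locations, all on the
# complement strand without a remote reference, and start positions that
# never increase from one location to the next.
def complement_join?(locs)
  return false if locs.size < 2
  all_complement = locs.all? { |loc| loc.strand == -1 && !loc.xref_id }
  descending = locs.each_cons(2).all? { |a, b| a.from >= b.from }
  all_complement && descending
end

descending = [ToSLoc.new(600, 625, -1), ToSLoc.new(500, 550, -1)]
ascending  = descending.reverse
puts complement_join?(descending)  # => true
puts complement_join?(ascending)   # => false
```

Following the rest of the listing, the descending list would be written as complement(join(500..550,600..625)) and the ascending one as join(complement(500..550),complement(600..625)). The choice therefore depends only on the stored order and strands, not on how the original string was written.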