class Bio::RestrictionEnzyme::Range::SequenceRange

A defined range over a nucleotide sequence.

This class accommodates cuts defined on a sequence and returns the fragments produced by those cuts.

Constants

Bin

A Bio::RestrictionEnzyme::Range::SequenceRange::Bin holds an Array of indexes for the primary and complement strands (p and c accessors).

Example hash with Bin values:

{0=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[0, 1], p=[0]>,
 2=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[], p=[1, 2]>,
 3=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[2, 3], p=[]>,
 4=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[4, 5], p=[3, 4, 5]>}

Note that the bin cannot be easily stored as a range since there may be nucleotides excised in the middle of a range.

TODO: Perhaps store the bins as one-or-many ranges since missing nucleotides due to enzyme cutting is a special case.
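
To make the shape of a Bin concrete, here is a minimal standalone sketch. It does not load BioRuby: SketchBin and contiguous? are hypothetical names, and plain Arrays stand in for whatever array class the library uses internally. It illustrates why a single Range cannot always describe a bin, since the indexes stop being contiguous once a nucleotide is excised.

```ruby
# Hypothetical stand-in for SequenceRange::Bin: one index list per strand.
SketchBin = Struct.new(:c, :p)

# Returns true when the indexes form one unbroken ascending run,
# i.e. when they could be stored as a single Range.
def contiguous?(indexes)
  indexes.each_cons(2).all? { |a, b| b == a + 1 }
end

intact  = SketchBin.new([4, 5], [3, 4, 5])
excised = SketchBin.new([4, 5], [3, 5])   # index 4 missing on the primary strand

contiguous?(intact.p)   # => true
contiguous?(excised.p)  # => false
```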

Attributes

c_left[R]

Left-most index of complementary strand

c_right[R]

Right-most index of complementary strand

cut_ranges[R]

left[R]

Left-most index of DNA sequence

p_left[R]

Left-most index of primary strand

p_right[R]

Right-most index of primary strand

right[R]

Right-most index of DNA sequence

size[R]

Size of DNA sequence

Public Class Methods

new( p_left = nil, p_right = nil, c_left = nil, c_right = nil )
   # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
   def initialize( p_left = nil, p_right = nil, c_left = nil, c_right = nil )
     raise ArgumentError if p_left == nil and c_left == nil
     raise ArgumentError if p_right == nil and c_right == nil
     (raise ArgumentError unless p_left <= p_right) unless p_left == nil or p_right == nil
     (raise ArgumentError unless c_left <= c_right) unless c_left == nil or c_right == nil

     @p_left, @p_right, @c_left, @c_right = p_left, p_right, c_left, c_right
     @left = [p_left, c_left].compact.sort.first
     @right = [p_right, c_right].compact.sort.last
     @size = (@right - @left) + 1 unless @left == nil or @right == nil
     @cut_ranges = CutRanges.new
     @__fragments_current = false
   end
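
The way the constructor derives left, right, and size from the per-strand bounds can be sketched on its own. This is a standalone illustration, not the library code: overall_bounds is a hypothetical helper, and it leaves out the ArgumentError checks shown above.

```ruby
# Standalone sketch: derive the overall bounds from per-strand bounds.
# nil means "this strand supplies no index on that side".
def overall_bounds(p_left, p_right, c_left, c_right)
  left  = [p_left, c_left].compact.min
  right = [p_right, c_right].compact.max
  size  = (right - left) + 1 unless left.nil? || right.nil?
  [left, right, size]
end

overall_bounds(0, 5, 2, 7)     # => [0, 7, 8]
overall_bounds(nil, 5, 2, nil) # => [2, 5, 4]
```

The size is inclusive of both ends, which is why one is added to the difference.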

Public Instance Methods

add_cut_range( p_cut_left=nil, p_cut_right=nil, c_cut_left=nil, c_cut_right=nil )

If the first argument is a CutRange object (HorizontalCutRange or VerticalCutRange), it is added to the SequenceRange as-is. Otherwise this method builds a VerticalCutRange object from the four arguments and adds it to the SequenceRange.

Note: Cut occurs immediately after the index supplied. For example, a cut at ‘0’ would mean a cut occurs between bases 0 and 1.
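
The "immediately after the index" convention can be illustrated with a single strand represented as a String. This is a standalone sketch with a hypothetical helper, split_after; it ignores the complementary strand and does not use the library.

```ruby
# Standalone sketch: split a string immediately after each cut index.
def split_after(sequence, cut_indexes)
  pieces = []
  start = 0
  cut_indexes.sort.each do |cut|
    pieces << sequence[start..cut]   # the base at the cut index stays on the left
    start = cut + 1
  end
  pieces << sequence[start..-1]      # whatever remains after the last cut
  pieces
end

split_after("gattaca", [0])    # => ["g", "attaca"]
split_after("gattaca", [2, 4]) # => ["gat", "ta", "ca"]
```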


Arguments

  • p_cut_left: (optional) Left-most cut on the primary strand or a CutRange object. nil to skip

  • p_cut_right: (optional) Right-most cut on the primary strand. nil to skip

  • c_cut_left: (optional) Left-most cut on the complementary strand. nil to skip

  • c_cut_right: (optional) Right-most cut on the complementary strand. nil to skip

Returns

nothing

   # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
   def add_cut_range( p_cut_left=nil, p_cut_right=nil, c_cut_left=nil, c_cut_right=nil )
     @__fragments_current = false
     if p_cut_left.kind_of? CutRange # shortcut
       @cut_ranges << p_cut_left
     else
       [p_cut_left, p_cut_right, c_cut_left, c_cut_right].each { |n| (raise IndexError unless n >= @left and n <= @right) unless n == nil }
       @cut_ranges << VerticalCutRange.new( p_cut_left, p_cut_right, c_cut_left, c_cut_right )
     end
   end

add_cut_ranges(*cut_ranges)

Add a series of CutRange objects (HorizontalCutRange or VerticalCutRange).


Arguments

  • cut_ranges: A series of CutRange objects

Returns

nothing
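
Because the method takes a splat and flattens it, cut ranges can presumably be passed either as separate arguments or as an array. The sketch below shows that pattern in isolation. It is an assumption-laden stand-in: SketchCutRange, SketchVerticalCutRange, and collect_cut_ranges are hypothetical names, and nothing here loads the real classes.

```ruby
# Hypothetical stand-ins; the real classes live in Bio::RestrictionEnzyme::Range.
class SketchCutRange; end
class SketchVerticalCutRange < SketchCutRange; end

# Accepts cut ranges as separate arguments or as (nested) arrays,
# and rejects anything that is not a cut range.
def collect_cut_ranges(*cut_ranges)
  cut_ranges.flatten.each do |cut_range|
    raise TypeError, "Not of type CutRange" unless cut_range.kind_of?(SketchCutRange)
  end
end

a = SketchVerticalCutRange.new
b = SketchVerticalCutRange.new

collect_cut_ranges(a, b).size    # => 2
collect_cut_ranges([a, b]).size  # => 2

begin
  collect_cut_ranges(a, 3)
rescue TypeError => e
  e.message                      # => "Not of type CutRange"
end
```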

    # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
    def add_cut_ranges(*cut_ranges)
      cut_ranges.flatten.each do |cut_range|
        raise TypeError, "Not of type CutRange" unless cut_range.kind_of? CutRange
        self.add_cut_range( cut_range )
      end
    end

add_horizontal_cut_range( left, right=left )

Builds a HorizontalCutRange object and adds it to the SequenceRange.


Arguments

  • left: Left-most cut

  • right: (optional) Right-most cut. Defaults to left; using the default is recommended.

Returns

nothing

    # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
    def add_horizontal_cut_range( left, right=left )
      @__fragments_current = false
      @cut_ranges << HorizontalCutRange.new( left, right )
    end

fragments()

Calculates the fragments over this sequence range, as defined by the cuts added with add_cut_range, add_cut_ranges, and/or add_horizontal_cut_range.

Example return value:

[#<Bio::RestrictionEnzyme::Range::SequenceRange::Fragment:0x277bdc
  @complement_bin=[0, 1],
  @primary_bin=[0]>,
 #<Bio::RestrictionEnzyme::Range::SequenceRange::Fragment:0x277bc8
  @complement_bin=[],
  @primary_bin=[1, 2]>,
 #<Bio::RestrictionEnzyme::Range::SequenceRange::Fragment:0x277bb4
  @complement_bin=[2, 3],
  @primary_bin=[]>,
 #<Bio::RestrictionEnzyme::Range::SequenceRange::Fragment:0x277ba0
  @complement_bin=[4, 5],
  @primary_bin=[3, 4, 5]>]

Arguments

  • none

Returns

Bio::RestrictionEnzyme::Range::SequenceRange::Fragments

    # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
    def fragments
      return @__fragments if @__fragments_current == true
      @__fragments_current = true

      num_txt = '0123456789'
      num_txt_repeat = (num_txt * ( @size.div(num_txt.size) + 1))[0..@size-1]
      fragments = Fragments.new(num_txt_repeat, num_txt_repeat)

      cc = Bio::RestrictionEnzyme::Range::SequenceRange::CalculatedCuts.new(@size)
      cc.add_cuts_from_cut_ranges(@cut_ranges)
      cc.remove_incomplete_cuts

      create_bins(cc).sort.each { |k, bin| fragments << Fragment.new( bin.p, bin.c ) }
      @__fragments = fragments
      return fragments
    end
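
The listing above caches its result and recomputes only after a cut has been added. That "dirty flag" pattern can be sketched in isolation as follows. CachedPieces is a hypothetical class written for this illustration, and its pieces method performs a trivial computation in place of the real fragment calculation.

```ruby
# Standalone sketch of dirty-flag caching, as used by a method like fragments.
class CachedPieces
  attr_reader :computations

  def initialize
    @cuts = []
    @current = false      # false means the cached result is stale
    @computations = 0     # counts how often the expensive step ran
  end

  def add_cut(index)
    @current = false      # any new cut invalidates the cache
    @cuts << index
  end

  def pieces
    return @pieces if @current
    @current = true
    @computations += 1
    @pieces = @cuts.sort.uniq
  end
end

cache = CachedPieces.new
cache.add_cut(3)
cache.pieces          # computed
cache.pieces          # served from the cache
cache.computations    # => 1
cache.add_cut(1)
cache.pieces          # => [1, 3]
cache.computations    # => 2
```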

Protected Instance Methods

create_bins(cc)

Example:

cc = Bio::RestrictionEnzyme::Range::SequenceRange::CalculatedCuts.new(@size)
cc.add_cuts_from_cut_ranges(@cut_ranges)
cc.remove_incomplete_cuts
bins = create_bins(cc)

Example return value:

{0=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[0, 1], p=[0]>,
 2=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[], p=[1, 2]>,
 3=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[2, 3], p=[]>,
 4=>#<struct Bio::RestrictionEnzyme::Range::SequenceRange::Bin c=[4, 5], p=[3, 4, 5]>}

Arguments

  • cc: Bio::RestrictionEnzyme::Range::SequenceRange::CalculatedCuts

Returns

Hash. Keys are unique; values are Bio::RestrictionEnzyme::Range::SequenceRange::Bin objects filled with the indexes of the sequence locations they represent.
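
The binning for the non-circular case can be sketched without the library. The version below is a simplified standalone reading of the approach: DemoBin and bin_indexes are hypothetical names, plain Arrays replace the library's array class, and the cut positions are passed in directly instead of coming from a CalculatedCuts object. The inputs in the usage lines are one set of cuts that yields the example hash shown above; they are chosen for illustration only.

```ruby
# Hypothetical stand-in for SequenceRange::Bin.
DemoBin = Struct.new(:c, :p)

# p_cut / c_cut: indexes after which the primary / complement strand is cut.
# h_cut: indexes at which the two strands are separated from each other.
def bin_indexes(size, p_cut, c_cut, h_cut)
  p_cut = [-1] | p_cut          # sentinel cut before the first base
  c_cut = [-1] | c_cut
  unique_id = -1
  p_bin_id = c_bin_id = unique_id
  bins = { unique_id => DemoBin.new([], []) }

  -1.upto(size - 1) do |idx|
    # Strands drifted into different bins but are attached here: merge them.
    if p_bin_id != c_bin_id && !h_cut.include?(idx)
      min_id, max_id = [p_bin_id, c_bin_id].minmax
      bins.delete(max_id)
      p_bin_id = c_bin_id = min_id
    end

    bins[p_bin_id].p << idx
    bins[c_bin_id].c << idx

    if p_cut.include?(idx)      # a cut after idx starts a new primary bin
      p_bin_id = (unique_id += 1)
      bins[p_bin_id] = DemoBin.new([], [])
    end
    if c_cut.include?(idx)      # likewise for the complement strand
      c_bin_id = (unique_id += 1)
      bins[c_bin_id] = DemoBin.new([], [])
    end
  end

  bins.delete(-1)               # drop the sentinel bin
  bins
end

bins = bin_indexes(6, [0, 2], [1, 3], [1, 2, 3])
bins.keys   # => [0, 2, 3, 4]
bins[0].p   # => [0]
bins[0].c   # => [0, 1]
bins[4].p   # => [3, 4, 5]
```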

    # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
    def create_bins(cc)
      p_cut = cc.vc_primary_as_original_class
      c_cut = cc.vc_complement_as_original_class
      h_cut = cc.hc_between_strands_as_original_class

      if (defined? @circular) && @circular
        # NOTE
        # if it's circular we should start at the beginning of a cut for orientation
        # scan for it, hack off the first set of hcuts and move them to the back

        unique_id = 0
      else
        p_cut.unshift(-1) unless p_cut.include?(-1)
        c_cut.unshift(-1) unless c_cut.include?(-1)
        unique_id = -1
      end

      p_bin_id = c_bin_id = unique_id
      bins = {}
      setup_new_bin(bins, unique_id)

      -1.upto(@size-1) do |idx| # NOTE - circular, for the future - should '-1' be replaced with 'unique_id'?

        # if bin_ids are out of sync but the strands are attached
        if (p_bin_id != c_bin_id) and !h_cut.include?(idx)
          min_id, max_id = [p_bin_id, c_bin_id].sort
          bins.delete(max_id)
          p_bin_id = c_bin_id = min_id
        end

        bins[ p_bin_id ].p << idx
        bins[ c_bin_id ].c << idx

        if p_cut.include? idx
          p_bin_id = (unique_id += 1)
          setup_new_bin(bins, p_bin_id)
        end

        if c_cut.include? idx             # repetition
          c_bin_id = (unique_id += 1)     # repetition
          setup_new_bin(bins, c_bin_id)   # repetition
        end                               # repetition

      end

      # Bin "-1" is an easy way to indicate the start of a strand just in case
      # there is a horizontal cut at position 0
      bins.delete(-1) unless ((defined? @circular) && @circular)
      bins
    end

setup_new_bin(bins, bin_id)

Modifies bins in place by creating a new element with key bin_id and initializing the bin.

    # File lib/bio/util/restriction_enzyme/range/sequence_range.rb
    def setup_new_bin(bins, bin_id)
      bins[ bin_id ] = Bin.new
      bins[ bin_id ].p = DenseIntArray[] #could be replaced by SortedNumArray[]
      bins[ bin_id ].c = DenseIntArray[] #could be replaced by SortedNumArray[]
    end