class Bio::RestrictionEnzyme::Analysis

Public Class Methods

cut( sequence, *args )

See cut instance method

# File lib/bio/util/restriction_enzyme/analysis.rb
def self.cut( sequence, *args )
  self.new.cut( sequence, *args )
end

cut_without_permutations( sequence, *args )

See cut_without_permutations instance method

# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def self.cut_without_permutations( sequence, *args )
  self.new.cut_without_permutations( sequence, *args )
end

Public Instance Methods

cut( sequence, *args )

See main documentation for Bio::RestrictionEnzyme

cut accounts for competition between enzymes. When enzymes compete for a cut site or bind site on a sequence, the result depends on which enzyme acts first, so cut evaluates the permutations of the competing cuts.

Example:

FIXME add output

Bio::RestrictionEnzyme::Analysis.cut('gaattc', 'EcoRI')

same as:

Bio::RestrictionEnzyme::Analysis.cut('gaattc', 'g^aattc')

Arguments

  • sequence: String-like object that will be used as a nucleic acid sequence.

  • args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects.

Returns

A Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects (unrelated to Bio::RestrictionEnzyme::Range::SequenceRange::Fragments), or a Symbol containing an error code.
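The cut-mark form of an enzyme argument can be read as a recognition sequence plus a cut position. The following is a minimal sketch in plain Ruby, not BioRuby's own parser: the helper name parse_cut_pattern is hypothetical, it assumes a single ^ on the primary strand, and it follows the convention noted in the source comments that the cut lies to the immediate right of the indexed base (at^gc cuts at index 1).

```ruby
# Hypothetical helper, not part of BioRuby: splits a pattern such as
# 'g^aattc' into the recognition sequence and the index of the base that
# sits immediately to the left of the cut mark.
def parse_cut_pattern(pattern)
  left, right = pattern.split('^', 2)
  raise ArgumentError, "no cut mark in #{pattern.inspect}" if right.nil?
  { recognition: left + right, cut_index: left.size - 1 }
end

site = parse_cut_pattern('g^aattc')
puts site[:recognition]  # gaattc
puts site[:cut_index]    # 0
```

A pattern with no bases before the mark would yield a cut_index of -1 in this sketch; real enzyme definitions may need richer handling, including cuts on the complement strand.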

# File lib/bio/util/restriction_enzyme/analysis.rb
def cut( sequence, *args )
  view_ranges = false

  args.select { |i| i.class == Hash }.each do |hsh|
    hsh.each do |key, value|
      if key == :view_ranges
        unless ( value.kind_of?(TrueClass) or value.kind_of?(FalseClass) )
          raise ArgumentError, "view_ranges must be set to true or false, currently #{value.inspect}."
        end
        view_ranges = value
      end
    end
  end

  res = cut_and_return_by_permutations( sequence, *args )
  return res if res.class == Symbol
  # Format the fragments for the user
  fragments_for_display( res, view_ranges )
end

cut_without_permutations( sequence, *args )

See main documentation for Bio::RestrictionEnzyme

Bio::RestrictionEnzyme.cut is preferred over this!

USE AT YOUR OWN RISK

This is a simpler version of cut. Where cut evaluates the permutations that arise when enzymes compete for a cut site or bind site on a sequence, this method applies every cut without considering competition. It is therefore faster, but less likely to be accurate.

This code is mainly included as an academic example without having to wade through the extra layer of complexity added by the permutations.

Example:

FIXME add output

Bio::RestrictionEnzyme::Analysis.cut_without_permutations('gaattc', 'EcoRI')

same as:

Bio::RestrictionEnzyme::Analysis.cut_without_permutations('gaattc', 'g^aattc')

Arguments

  • sequence: String-like object that will be used as a nucleic acid sequence.

  • args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects.

Returns

A Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects (unrelated to Bio::RestrictionEnzyme::Range::SequenceRange::Fragments).

# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def cut_without_permutations( sequence, *args )
  return fragments_for_display( {} ) if !sequence.kind_of?(String) or sequence.empty?
  sequence = Bio::Sequence::NA.new( sequence )

  # create_enzyme_actions returns two separate array elements, they're not
  # needed separated here so we put them into one array
  enzyme_actions = create_enzyme_actions( sequence, *args ).flatten
  return fragments_for_display( {} ) if enzyme_actions.empty?

  # Primary and complement strands are both measured from '0' to 'sequence.size-1' here
  sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )

  # Add the cuts to the sequence_range from each enzyme_action
  enzyme_actions.each do |enzyme_action|
    enzyme_action.cut_ranges.each do |cut_range|
      sequence_range.add_cut_range(cut_range)
    end
  end

  # Fill in the source sequence for sequence_range so it knows what bases
  # to use
  sequence_range.fragments.primary = sequence
  sequence_range.fragments.complement = sequence.forward_complement

  # Format the fragments for the user
  fragments_for_display( {0 => sequence_range} )
end

Protected Instance Methods

create_enzyme_actions( sequence, *args )

Creates an array of EnzymeActions based on the DNA sequence and supplied enzymes.


Arguments

  • sequence: The string of DNA to match the enzyme recognition sites against.

  • args: The enzymes to use.

Returns

A two-element Array. The first element (sometimes_cut) is an array of EnzymeAction objects that are subject to competition and therefore only sometimes cut. The second element (always_cut) is an array of EnzymeAction objects that are not subject to competition and always cut.
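The split into competing and non-competing actions can be illustrated without BioRuby. This is a simplified sketch under stated assumptions: Action is a hypothetical stand-in for EnzymeAction, partition_by_competition is a hypothetical name, and the sketch omits the source's shortcuts for actions with no cuts or already flagged actions. It keeps the same pairwise test as the listing below, comparing the earlier action's bind site with the later action's vertical cut range.

```ruby
require 'set'

# Hypothetical stand-in for an EnzymeAction: the bind site spans
# left..right and the vertical cuts span min_vertical..max_vertical.
Action = Struct.new(:name, :left, :right, :min_vertical, :max_vertical)

# Returns [sometimes_cut, always_cut], flagging both members of any pair
# in which the earlier bind site overlaps the later action's cut range.
def partition_by_competition(actions)
  competing = Set.new
  actions.each_with_index do |current, i|
    actions.each_with_index do |other, j|
      next unless j > i
      clear = current.right <= other.min_vertical ||
              current.left > other.max_vertical
      competing << i << j unless clear
    end
  end
  sometimes_cut = actions.values_at(*competing.sort)
  always_cut = actions - sometimes_cut
  [sometimes_cut, always_cut]
end

# Made-up coordinates: A and B overlap, C lies far away from both.
actions = [
  Action.new('A', 0, 7, 2, 5),
  Action.new('B', 4, 9, 6, 8),
  Action.new('C', 20, 25, 21, 23)
]
sometimes_cut, always_cut = partition_by_competition(actions)
p sometimes_cut.map(&:name)  # => ["A", "B"]
p always_cut.map(&:name)     # => ["C"]
```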

# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def create_enzyme_actions( sequence, *args )
  all_enzyme_actions = []

  args.each do |enzyme|
    enzyme = Bio::RestrictionEnzyme.new(enzyme) unless enzyme.class == Bio::RestrictionEnzyme::DoubleStranded

    # make sure pattern is the proper size
    # for more info see the internal documentation of
    # Bio::RestrictionEnzyme::DoubleStranded.create_action_at
    pattern = Bio::Sequence::NA.new(
      Bio::RestrictionEnzyme::DoubleStranded::AlignedStrands.align(
        enzyme.primary, enzyme.complement
      ).primary
    ).to_re

    find_match_locations( sequence, pattern ).each do |offset|
      all_enzyme_actions << enzyme.create_action_at( offset )
    end
  end

  # FIXME VerticalCutRange should really be called VerticalAndHorizontalCutRange

  # * all_enzyme_actions is now full of EnzymeActions at specific locations across
  #   the sequence.
  # * all_enzyme_actions will now be examined to see if any EnzymeActions may
  #   conflict with one another, and if they do they'll be made note of in
  #   competition_indexes.  They will then be removed (FIXME)
  # * a conflict occurs if another enzyme's bind site is compromised due
  #   to another enzyme's cut.  Enzymes' bind sites may overlap and not be
  #   competitive, however neither bind site may be part of the other
  #   enzyme's cut or else they do become competitive.
  #
  # Take current EnzymeAction's entire bind site and compare it to all other
  # EnzymeAction's cut ranges.  Only look for vertical cuts as boundaries
  # since trailing horizontal cuts would have no influence on the bind site.
  #
  # If example Enzyme A makes this cut pattern (cut range 2..5):
  #
  # 0 1 2|3 4 5 6 7
  #      +-----+
  # 0 1 2 3 4 5|6 7
  #
  # Then the bind site (and EnzymeAction range) for Enzyme B would need its
  # right side to be at index 2 or less, or its left side to be 6 or greater.

  competition_indexes = Set.new

  all_enzyme_actions[0..-2].each_with_index do |current_enzyme_action, i|
    next if competition_indexes.include? i
    next if current_enzyme_action.cut_ranges.empty?  # no cuts, some enzymes are like this (ex. CjuI)

    all_enzyme_actions[i+1..-1].each_with_index do |comparison_enzyme_action, j|
      j += (i + 1)
      next if competition_indexes.include? j
      next if comparison_enzyme_action.cut_ranges.empty?  # no cuts

      if (current_enzyme_action.right <= comparison_enzyme_action.cut_ranges.min_vertical) or
         (current_enzyme_action.left > comparison_enzyme_action.cut_ranges.max_vertical)
        # no conflict
      else
        competition_indexes += [i, j] # merge both indexes into the flat set
      end
    end
  end

  sometimes_cut = all_enzyme_actions.values_at( *competition_indexes )
  always_cut = all_enzyme_actions
  always_cut.delete_if {|x| sometimes_cut.include? x }

  [sometimes_cut, always_cut]
end

cut_and_return_by_permutations( sequence, *args )

See cut instance method


Arguments

  • sequence: String-like object that will be used as a nucleic acid sequence.

  • args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects.

A Hash with the key :max_permutations may also be supplied to specify how many permutations are allowed. A value of 0 indicates that no permutations are allowed.

Returns

A Hash whose keys are permutation IDs and whose values are SequenceRange objects that have cuts applied.

May also return the Symbol :sequence_empty, :no_cuts_found, or :too_many_permutations.
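The option handling and the permutation limit described above can be sketched in plain Ruby. This is an illustration only: split_options and too_many_permutations? are hypothetical names, not BioRuby methods, and the limit check follows the source's convention of counting one fewer permutation than there are competing enzyme actions.

```ruby
# Hypothetical helper: separates option Hashes from the enzyme arguments
# and extracts the permutation limit, rejecting unknown keys.
def split_options(args)
  hashes, enzymes = args.partition { |a| a.is_a?(Hash) }
  max = nil
  hashes.each do |hsh|
    hsh.each do |key, value|
      case key
      when :max_permutations, 'max_permutations',
           :maximum_permutations, 'maximum_permutations'
        max = value.to_i unless value.nil?
      when :view_ranges
        # handled by cut; ignored here
      else
        raise ArgumentError, "unknown key #{key.inspect}"
      end
    end
  end
  [enzymes, max]
end

# Hypothetical helper: true when the number of competing actions implies
# more permutations than the limit allows (no limit means never true).
def too_many_permutations?(competing_count, max)
  !max.nil? && competing_count > 1 && (competing_count - 1) > max
end

enzymes, max = split_options(['EcoRI', { max_permutations: 0 }, 'BamHI'])
p enzymes                          # => ["EcoRI", "BamHI"]
p max                              # => 0
p too_many_permutations?(2, max)   # => true
p too_many_permutations?(1, max)   # => false
```

A caller of the real method would check for a Symbol result before treating the return value as a Hash.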

# File lib/bio/util/restriction_enzyme/analysis.rb
def cut_and_return_by_permutations( sequence, *args )
  my_hash = {}
  maximum_permutations = nil

  hashes_in_args = args.select { |i| i.class == Hash }
  args.delete_if { |i| i.class == Hash }
  hashes_in_args.each do |hsh|
    hsh.each do |key, value|
      case key
      when :max_permutations, 'max_permutations', :maximum_permutations, 'maximum_permutations'
        maximum_permutations = value.to_i unless value == nil
      when :view_ranges
      else
        raise ArgumentError, "Received key #{key.inspect} in argument - I only know the key ':max_permutations' and ':view_ranges' currently.  Hash passed: #{hsh.inspect}"
      end
    end
  end

  if !sequence.kind_of?(String) or sequence.empty?
    logger.warn "The supplied sequence is empty." if defined?(logger)
    return :sequence_empty
  end
  sequence = Bio::Sequence::NA.new( sequence )

  enzyme_actions, initial_cuts = create_enzyme_actions( sequence, *args )

  if enzyme_actions.empty? and initial_cuts.empty?
    logger.warn "This enzyme does not make any cuts on this sequence." if defined?(logger)
    return :no_cuts_found
  end

  # * When enzyme_actions.size is equal to '1' that means there are no permutations.
  # * If enzyme_actions.size is equal to '2' there is one
  #   permutation ("[0, 1]")
  # * If enzyme_actions.size is equal to '3' there are two
  #   permutations ("[0, 1, 2]")
  # * and so on..
  if maximum_permutations and enzyme_actions.size > 1
    if (enzyme_actions.size - 1) > maximum_permutations.to_i
      logger.warn "More permutations than maximum, skipping.  Found: #{enzyme_actions.size-1}  Max: #{maximum_permutations.to_i}" if defined?(logger)
      return :too_many_permutations
    end
  end

  if enzyme_actions.size > 1
    permutations = permute(enzyme_actions.size)

    permutations.each do |permutation|
      previous_cut_ranges = []
      # Primary and complement strands are both measured from '0' to 'sequence.size-1' here
      sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )

      # Add the cuts to the sequence_range from each enzyme_action contained
      # in initial_cuts.  These are the cuts that have no competition so are
      # not subject to permutations.
      initial_cuts.each do |enzyme_action|
        enzyme_action.cut_ranges.each do |cut_range|
          sequence_range.add_cut_range(cut_range)
        end
      end

      permutation.each do |id|
        enzyme_action = enzyme_actions[id]

        # conflict is false if the current enzyme action may cut in its range.
        # conflict is true if it cannot due to a previous enzyme action making
        # a cut where this enzyme action needs a whole recognition site.
        conflict = false

        # If current size of enzyme_action overlaps with previous cut_range, don't cut
        # note that the enzyme action may fall in the middle of a previous enzyme action
        # so all cut locations must be checked that would fall underneath.
        previous_cut_ranges.each do |cut_range|
          next unless cut_range.class == Bio::RestrictionEnzyme::Range::VerticalCutRange  # we aren't concerned with horizontal cuts
          previous_cut_left = cut_range.range.first
          previous_cut_right = cut_range.range.last

          # Keep in mind:
          # * The cut location is to the immediate right of the base located at the index.
          #   ex: at^gc -- the cut location is at index 1
          # * The enzyme action location is located at the base of the index.
          #   ex: atgc -- 0 => 'a', 1 => 't', 2 => 'g', 3 => 'c'
          # method create_enzyme_actions has similar commentary if interested
          if (enzyme_action.right <= previous_cut_left) or
             (enzyme_action.left > previous_cut_right) or
             (enzyme_action.left > previous_cut_left and enzyme_action.right <= previous_cut_right) # in between cuts
            # no conflict
          else
            conflict = true
          end
        end

        next if conflict == true
        enzyme_action.cut_ranges.each { |cut_range| sequence_range.add_cut_range(cut_range) }
        previous_cut_ranges += enzyme_action.cut_ranges
      end # permutation.each

      # Fill in the source sequence for sequence_range so it knows what bases
      # to use
      sequence_range.fragments.primary = sequence
      sequence_range.fragments.complement = sequence.forward_complement
      my_hash[permutation] = sequence_range
    end # permutations.each

  else # if enzyme_actions.size == 1
    # no permutations, just do it
    sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )
    initial_cuts.each { |enzyme_action| enzyme_action.cut_ranges.each { |cut_range| sequence_range.add_cut_range(cut_range) } }
    sequence_range.fragments.primary = sequence
    sequence_range.fragments.complement = sequence.forward_complement
    my_hash[0] = sequence_range
  end

  my_hash
end

find_match_locations( string, re )

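Returns an Array of the indices at which a Regexp matches a string. Overlapping matches are included, because each search resumes one character after the previous match.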

Example:

find_match_locations('abccdefeg', /[ce]/) # => [2,3,5,7]

Arguments

  • string: The string to scan

  • re: The Regexp to match

Returns

Array with the indices of the match locations
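The same result can be obtained with String#index and a start offset. This is an alternative sketch for illustration, not the library's implementation, and match_locations is a hypothetical name. Unlike String#scan, which skips past each match, this approach restarts one character after the previous hit and so also reports overlapping matches.

```ruby
# Alternative sketch: collect every index at which re matches, restarting
# the search one character after each hit so overlaps are found too.
def match_locations(string, re)
  locations = []
  pos = 0
  while (found = string.index(re, pos))
    locations << found
    pos = found + 1
  end
  locations
end

p match_locations('abccdefeg', /[ce]/)  # => [2, 3, 5, 7]
p match_locations('aaaa', /aa/)         # => [0, 1, 2]
```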

# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def find_match_locations( string, re )
  md = string.match( re )
  locations = []
  counter = 0
  while md
    # save the match index relative to the original string
    locations << (counter += md.begin(0))
    # find the next match
    md = string[ (counter += 1)..-1 ].match( re )
  end
  locations
end

fragments_for_display( hsh, view_ranges=false )

Takes the fragments from SequenceRange objects that have had cuts applied with add_cut_range and returns them as a Bio::RestrictionEnzyme::Fragments object. Duplicate fragments are removed unless view_ranges is true.


Arguments

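  • hsh: Hash whose keys are permutation IDs, if any, and whose values are SequenceRange objects that have cuts applied.

  • view_ranges: When true, each fragment also carries its left and right positions on the primary and complement strands, and duplicates are kept. Defaults to false.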

Returns

A Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects.
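The de-duplication across permutations can be pictured with plain Ruby values. This is a sketch only: Frag is a hypothetical stand-in for a fragment, and the strand strings are made up for illustration rather than taken from a real digest.

```ruby
# Hypothetical stand-in for a fragment: just the two strand strings.
Frag = Struct.new(:primary, :complement)

# Two permutations, keyed by permutation ID, that share one fragment.
by_permutation = {
  [0, 1] => [Frag.new('aaa', 'ttt'), Frag.new('ccc', 'ggg')],
  [1, 0] => [Frag.new('aaa', 'ttt'), Frag.new('cccc', 'gggg')]
}

# Gather the fragments from every permutation, then drop duplicates.
# Struct instances compare by value, so Array#uniq removes the repeat.
all_fragments = by_permutation.values.flatten(1)
unique_fragments = all_fragments.uniq

puts all_fragments.size     # 4
puts unique_fragments.size  # 3
```

With view_ranges set to true the source skips the uniq step, so the full list would be returned.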

# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def fragments_for_display( hsh, view_ranges=false )
  ary = Fragments.new
  return ary unless hsh

  hsh.each do |permutation_id, sequence_range|
    sequence_range.fragments.for_display.each do |fragment|
      if view_ranges
        ary << Bio::RestrictionEnzyme::Fragment.new(fragment.primary, fragment.complement, fragment.p_left, fragment.p_right, fragment.c_left, fragment.c_right)
      else
        ary << Bio::RestrictionEnzyme::Fragment.new(fragment.primary, fragment.complement)
      end
    end
  end

  ary.uniq! unless view_ranges

  ary
end

permute(count, permutations = [[0]])

Returns permutation orders for a given number of elements.

Examples:

permute(0) # => [[0]]
permute(1) # => [[0]]
permute(2) # => [[1, 0], [0, 1]]
permute(3) # => [[2, 1, 0], [2, 0, 1], [1, 2, 0], [0, 2, 1], [1, 0, 2], [0, 1, 2]]
permute(4) # => [[3, 2, 1, 0],
                 [3, 2, 0, 1],
                 [3, 1, 2, 0],
                 [3, 0, 2, 1],
                 [3, 1, 0, 2],
                 [3, 0, 1, 2],
                 [2, 3, 1, 0],
                 [2, 3, 0, 1],
                 [1, 3, 2, 0],
                 [0, 3, 2, 1],
                 [1, 3, 0, 2],
                 [0, 3, 1, 2],
                 [2, 1, 3, 0],
                 [2, 0, 3, 1],
                 [1, 2, 3, 0],
                 [0, 2, 3, 1],
                 [1, 0, 3, 2],
                 [0, 1, 3, 2],
                 [2, 1, 0, 3],
                 [2, 0, 1, 3],
                 [1, 2, 0, 3],
                 [0, 2, 1, 3],
                 [1, 0, 2, 3],
                 [0, 1, 2, 3]]

Arguments

  • count: Number of different elements to be permuted

  • permutations: Internal accumulator for the recursive algorithm; callers should not supply it.

Returns

Array of Array objects with different possible permutation orders. See examples.
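For comparison, Ruby's core Array#permutation produces the same set of orderings, although in a different order from the examples above. The sketch below is not how the library computes them, and orderings is a hypothetical name. Note one difference: for a count of 0 this sketch returns [[]], whereas permute(0) returns [[0]].

```ruby
# Sketch using Ruby's built-in Array#permutation: every ordering of the
# indexes 0...count, as an Array of Arrays.
def orderings(count)
  (0...count).to_a.permutation.to_a
end

p orderings(2)       # => [[0, 1], [1, 0]]
p orderings(3).size  # => 6
```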

# File lib/bio/util/restriction_enzyme/analysis.rb
def permute(count, permutations = [[0]])
  return permutations if count <= 1
  new_arrays = []
  new_array = []

  (permutations[0].size + 1).times do |n|
    new_array.clear
    permutations.each { |a| new_array << a.dup }
    new_array.each { |e| e.insert(n, permutations[0].size) }
    new_arrays += new_array
  end

  permute(count-1, new_arrays)
end