class Bio::RestrictionEnzyme::Analysis
Public Class Methods
See cut instance method
# File lib/bio/util/restriction_enzyme/analysis.rb
def self.cut( sequence, *args )
  self.new.cut( sequence, *args )
end
See cut_without_permutations instance method
# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def self.cut_without_permutations( sequence, *args )
  self.new.cut_without_permutations( sequence, *args )
end
Public Instance Methods
See main documentation for Bio::RestrictionEnzyme
cut takes into account permutations of cut variations, based on the competitiveness of enzymes for an enzyme cut site or enzyme bind site on a sequence.
Example:
FIXME add output
Bio::RestrictionEnzyme::Analysis.cut('gaattc', 'EcoRI')
_same as:_
Bio::RestrictionEnzyme::Analysis.cut('gaattc', 'g^aattc')
Arguments
- sequence: String kind of object that will be used as a nucleic acid sequence.
- args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects.
Returns
- Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects (note: unrelated to Bio::RestrictionEnzyme::Range::SequenceRange::Fragments), or a Symbol containing an error code.
# File lib/bio/util/restriction_enzyme/analysis.rb
def cut( sequence, *args )
  view_ranges = false

  args.select { |i| i.class == Hash }.each do |hsh|
    hsh.each do |key, value|
      if key == :view_ranges
        unless ( value.kind_of?(TrueClass) or value.kind_of?(FalseClass) )
          raise ArgumentError, "view_ranges must be set to true or false, currently #{value.inspect}."
        end
        view_ranges = value
      end
    end
  end

  res = cut_and_return_by_permutations( sequence, *args )
  return res if res.class == Symbol
  # Format the fragments for the user
  fragments_for_display( res, view_ranges )
end
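The option handling in cut can be tried in isolation. The following is a minimal sketch in plain Ruby, not BioRuby code: extract_view_ranges is a hypothetical helper name, and it uses grep(Hash), which also matches Hash subclasses, whereas the listing above compares i.class == Hash exactly.

```ruby
# Hypothetical helper (not part of BioRuby): pulls the :view_ranges flag
# out of a mixed argument list, validating it the same way cut does.
def extract_view_ranges(args)
  view_ranges = false
  args.grep(Hash).each do |opts|
    next unless opts.key?(:view_ranges)
    value = opts[:view_ranges]
    unless value == true || value == false
      raise ArgumentError,
            "view_ranges must be set to true or false, currently #{value.inspect}."
    end
    view_ranges = value
  end
  view_ranges
end

p extract_view_ranges(['EcoRI'])
p extract_view_ranges(['EcoRI', { view_ranges: true }])
```

Enzyme names and other non-Hash arguments are ignored by the helper, so the same argument list can be passed on unchanged afterwards.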
See main documentation for Bio::RestrictionEnzyme
Bio::RestrictionEnzyme.cut is preferred over this!
USE AT YOUR OWN RISK
This is a simpler version of method cut. cut takes into account permutations of cut variations based on the competitiveness of enzymes for an enzyme cut site or enzyme bind site on a sequence. This method does not take those possibilities into account and is therefore faster, but less likely to be accurate.
This code is mainly included as an academic example that can be read without wading through the extra layer of complexity added by the permutations.
Example:
FIXME add output
Bio::RestrictionEnzyme::Analysis.cut_without_permutations('gaattc', 'EcoRI')
_same as:_
Bio::RestrictionEnzyme::Analysis.cut_without_permutations('gaattc', 'g^aattc')
Arguments
- sequence: String kind of object that will be used as a nucleic acid sequence.
- args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects.
Returns
- Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects (note: unrelated to Bio::RestrictionEnzyme::Range::SequenceRange::Fragments).
# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def cut_without_permutations( sequence, *args )
  return fragments_for_display( {} ) if !sequence.kind_of?(String) or sequence.empty?
  sequence = Bio::Sequence::NA.new( sequence )

  # create_enzyme_actions returns two separate array elements, they're not
  # needed separated here so we put them into one array
  enzyme_actions = create_enzyme_actions( sequence, *args ).flatten
  return fragments_for_display( {} ) if enzyme_actions.empty?

  # Primary and complement strands are both measured from '0' to 'sequence.size-1' here
  sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )

  # Add the cuts to the sequence_range from each enzyme_action
  enzyme_actions.each do |enzyme_action|
    enzyme_action.cut_ranges.each do |cut_range|
      sequence_range.add_cut_range(cut_range)
    end
  end

  # Fill in the source sequence for sequence_range so it knows what bases
  # to use
  sequence_range.fragments.primary = sequence
  sequence_range.fragments.complement = sequence.forward_complement

  # Format the fragments for the user
  fragments_for_display( {0 => sequence_range} )
end
Protected Instance Methods
Creates an array of EnzymeActions based on the DNA sequence and supplied enzymes.
Arguments
- sequence: The string of DNA to match the enzyme recognition sites against.
- args: The enzymes to use.
Returns
- Array with the first element being an array of EnzymeAction objects that sometimes_cut, and are subject to competition. The second is an array of EnzymeAction objects that always_cut and are not subject to competition.
# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def create_enzyme_actions( sequence, *args )
  all_enzyme_actions = []

  args.each do |enzyme|
    enzyme = Bio::RestrictionEnzyme.new(enzyme) unless enzyme.class == Bio::RestrictionEnzyme::DoubleStranded

    # make sure pattern is the proper size
    # for more info see the internal documentation of
    # Bio::RestrictionEnzyme::DoubleStranded.create_action_at
    pattern = Bio::Sequence::NA.new(
      Bio::RestrictionEnzyme::DoubleStranded::AlignedStrands.align(
        enzyme.primary, enzyme.complement
      ).primary
    ).to_re

    find_match_locations( sequence, pattern ).each do |offset|
      all_enzyme_actions << enzyme.create_action_at( offset )
    end
  end

  # FIXME VerticalCutRange should really be called VerticalAndHorizontalCutRange

  # * all_enzyme_actions is now full of EnzymeActions at specific locations across
  #   the sequence.
  # * all_enzyme_actions will now be examined to see if any EnzymeActions may
  #   conflict with one another, and if they do they'll be made note of in
  #   indicies_of_sometimes_cut. They will then be removed FIXME
  # * a conflict occurs if another enzyme's bind site is compromised due
  #   to another enzyme's cut. Enzymes' bind sites may overlap and not be
  #   competitive, however neither bind site may be part of the other
  #   enzyme's cut or else they do become competitive.
  #
  # Take current EnzymeAction's entire bind site and compare it to all other
  # EnzymeAction's cut ranges. Only look for vertical cuts as boundaries
  # since trailing horizontal cuts would have no influence on the bind site.
  #
  # If example Enzyme A makes this cut pattern (cut range 2..5):
  #
  # 0 1 2|3 4 5 6 7
  #      +-----+
  # 0 1 2 3 4 5|6 7
  #
  # Then the bind site (and EnzymeAction range) for Enzyme B would need its
  # right side to be at index 2 or less, or its left side to be 6 or greater.

  competition_indexes = Set.new

  all_enzyme_actions[0..-2].each_with_index do |current_enzyme_action, i|
    next if competition_indexes.include? i
    next if current_enzyme_action.cut_ranges.empty? # no cuts, some enzymes are like this (ex. CjuI)

    all_enzyme_actions[i+1..-1].each_with_index do |comparison_enzyme_action, j|
      j += (i + 1)
      next if competition_indexes.include? j
      next if comparison_enzyme_action.cut_ranges.empty? # no cuts

      if (current_enzyme_action.right <= comparison_enzyme_action.cut_ranges.min_vertical) or
         (current_enzyme_action.left > comparison_enzyme_action.cut_ranges.max_vertical)
        # no conflict
      else
        competition_indexes += [i, j] # merge both indexes into the flat set
      end
    end
  end

  sometimes_cut = all_enzyme_actions.values_at( *competition_indexes )
  always_cut = all_enzyme_actions
  always_cut.delete_if {|x| sometimes_cut.include? x }

  [sometimes_cut, always_cut]
end
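The competition test in the comments above can be checked with plain integers. This is a minimal sketch, assuming a hypothetical BindSite struct and competes? helper that stand in for EnzymeAction and its cut_ranges; only the comparison itself is taken from the listing.

```ruby
# Hypothetical stand-in for an EnzymeAction's bind site (not a BioRuby class).
BindSite = Struct.new(:left, :right)

# A site is free of competition when it lies entirely at or left of the
# leftmost vertical cut, or entirely right of the rightmost vertical cut.
def competes?(site, min_vertical, max_vertical)
  clear = site.right <= min_vertical || site.left > max_vertical
  !clear
end

# Enzyme A from the comment: vertical cuts span indexes 2..5.
p competes?(BindSite.new(0, 2), 2, 5)  # right side at index 2
p competes?(BindSite.new(6, 9), 2, 5)  # left side at index 6
p competes?(BindSite.new(1, 3), 2, 5)  # straddles the first cut
```

The first two sites match the boundary cases named in the comment (right side at 2 or less, left side at 6 or greater); the third overlaps the cut range and would be flagged.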
See cut instance method
Arguments
- sequence: String kind of object that will be used as a nucleic acid sequence.
- args: Series of enzyme names, enzyme sequences with cut marks, or RestrictionEnzyme objects. May also supply a Hash with the key ":max_permutations" to specify how many permutations are allowed; a value of 0 indicates no permutations are allowed.
Returns
- Hash. Keys are a permutation ID, values are SequenceRange objects that have cuts applied. May also return the Symbol ":sequence_empty", ":no_cuts_found", or ":too_many_permutations".
# File lib/bio/util/restriction_enzyme/analysis.rb
def cut_and_return_by_permutations( sequence, *args )
  my_hash = {}
  maximum_permutations = nil

  hashes_in_args = args.select { |i| i.class == Hash }
  args.delete_if { |i| i.class == Hash }
  hashes_in_args.each do |hsh|
    hsh.each do |key, value|
      case key
      when :max_permutations, 'max_permutations', :maximum_permutations, 'maximum_permutations'
        maximum_permutations = value.to_i unless value == nil
      when :view_ranges
      else
        raise ArgumentError, "Received key #{key.inspect} in argument - I only know the key ':max_permutations' and ':view_ranges' currently. Hash passed: #{hsh.inspect}"
      end
    end
  end

  if !sequence.kind_of?(String) or sequence.empty?
    logger.warn "The supplied sequence is empty." if defined?(logger)
    return :sequence_empty
  end
  sequence = Bio::Sequence::NA.new( sequence )

  enzyme_actions, initial_cuts = create_enzyme_actions( sequence, *args )

  if enzyme_actions.empty? and initial_cuts.empty?
    logger.warn "This enzyme does not make any cuts on this sequence." if defined?(logger)
    return :no_cuts_found
  end

  # * When enzyme_actions.size is equal to '1' that means there are no permutations.
  # * If enzyme_actions.size is equal to '2' there is one
  #   permutation ("[0, 1]")
  # * If enzyme_actions.size is equal to '3' there are two
  #   permutations ("[0, 1, 2]")
  # * and so on..
  if maximum_permutations and enzyme_actions.size > 1
    if (enzyme_actions.size - 1) > maximum_permutations.to_i
      logger.warn "More permutations than maximum, skipping. Found: #{enzyme_actions.size-1} Max: #{maximum_permutations.to_i}" if defined?(logger)
      return :too_many_permutations
    end
  end

  if enzyme_actions.size > 1
    permutations = permute(enzyme_actions.size)

    permutations.each do |permutation|
      previous_cut_ranges = []
      # Primary and complement strands are both measured from '0' to 'sequence.size-1' here
      sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )

      # Add the cuts to the sequence_range from each enzyme_action contained
      # in initial_cuts. These are the cuts that have no competition so are
      # not subject to permutations.
      initial_cuts.each do |enzyme_action|
        enzyme_action.cut_ranges.each do |cut_range|
          sequence_range.add_cut_range(cut_range)
        end
      end

      permutation.each do |id|
        enzyme_action = enzyme_actions[id]

        # conflict is false if the current enzyme action may cut in its range.
        # conflict is true if it cannot due to a previous enzyme action making
        # a cut where this enzyme action needs a whole recognition site.
        conflict = false

        # If current size of enzyme_action overlaps with previous cut_range, don't cut
        # note that the enzyme action may fall in the middle of a previous enzyme action
        # so all cut locations must be checked that would fall underneath.
        previous_cut_ranges.each do |cut_range|
          next unless cut_range.class == Bio::RestrictionEnzyme::Range::VerticalCutRange # we aren't concerned with horizontal cuts
          previous_cut_left = cut_range.range.first
          previous_cut_right = cut_range.range.last

          # Keep in mind:
          # * The cut location is to the immediate right of the base located at the index.
          #   ex: at^gc -- the cut location is at index 1
          # * The enzyme action location is located at the base of the index.
          #   ex: atgc -- 0 => 'a', 1 => 't', 2 => 'g', 3 => 'c'
          # method create_enzyme_actions has similar commentary if interested
          if (enzyme_action.right <= previous_cut_left) or
             (enzyme_action.left > previous_cut_right) or
             (enzyme_action.left > previous_cut_left and enzyme_action.right <= previous_cut_right) # in between cuts
            # no conflict
          else
            conflict = true
          end
        end

        next if conflict == true
        enzyme_action.cut_ranges.each { |cut_range| sequence_range.add_cut_range(cut_range) }
        previous_cut_ranges += enzyme_action.cut_ranges
      end # permutation.each

      # Fill in the source sequence for sequence_range so it knows what bases
      # to use
      sequence_range.fragments.primary = sequence
      sequence_range.fragments.complement = sequence.forward_complement
      my_hash[permutation] = sequence_range
    end # permutations.each

  else # if enzyme_actions.size == 1
    # no permutations, just do it
    sequence_range = Bio::RestrictionEnzyme::Range::SequenceRange.new( 0, 0, sequence.size-1, sequence.size-1 )
    initial_cuts.each { |enzyme_action| enzyme_action.cut_ranges.each { |cut_range| sequence_range.add_cut_range(cut_range) } }
    sequence_range.fragments.primary = sequence
    sequence_range.fragments.complement = sequence.forward_complement
    my_hash[0] = sequence_range
  end

  my_hash
end
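Why the order of a permutation matters can be seen with a toy model. The sketch below is an illustration under stated assumptions, not BioRuby code: ToyAction, blocked? and surviving_actions are hypothetical names, each action carries a single vertical cut range, and the coordinates are invented. Only the three-way "no conflict" comparison is copied from the listing above.

```ruby
# Hypothetical stand-in for an EnzymeAction with one vertical cut range.
ToyAction = Struct.new(:name, :left, :right, :cut_first, :cut_last)

# True when an earlier cut falls inside the bind site this action needs.
def blocked?(action, cut_first, cut_last)
  clear = action.right <= cut_first ||
          action.left > cut_last ||
          (action.left > cut_first && action.right <= cut_last) # in between cuts
  !clear
end

# Apply actions in the given order, skipping any whose site was already cut.
def surviving_actions(actions, order)
  applied = []
  order.each do |id|
    candidate = actions[id]
    next if applied.any? { |a| blocked?(candidate, a.cut_first, a.cut_last) }
    applied << candidate
  end
  applied.map(&:name)
end

# Two overlapping sites: whichever enzyme cuts first destroys the other's site.
actions = [ToyAction.new('A', 0, 5, 2, 2), ToyAction.new('B', 2, 7, 4, 4)]
p surviving_actions(actions, [0, 1])
p surviving_actions(actions, [1, 0])
```

With these invented coordinates each ordering leaves a different enzyme as the only one that cuts, which is the situation the per-permutation SequenceRange entries in the returned Hash are meant to capture.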
Returns an Array of the match indices of a Regexp in a string.
Example:
find_match_locations('abccdefeg', /[ce]/) # => [2,3,5,7]
Arguments
- string: The string to scan
- re: A Regexp to use
Returns
- Array with indices of match locations
# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def find_match_locations( string, re )
  md = string.match( re )
  locations = []
  counter = 0
  while md
    # save the match index relative to the original string
    locations << (counter += md.begin(0))
    # find the next match
    md = string[ (counter += 1)..-1 ].match( re )
  end
  locations
end
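The same scan, including overlapping matches, can be written with String#index and a start offset instead of slicing the string on each pass. This is a sketch of an alternative, not the library's implementation; match_offsets is a hypothetical name, and patterns with anchors or lookbehind may behave differently from the slicing version.

```ruby
# Hypothetical alternative to find_match_locations (not part of BioRuby).
# Restarting one position after each hit lets overlapping matches be found.
def match_offsets(string, re)
  offsets = []
  pos = 0
  while (idx = string.index(re, pos))
    offsets << idx
    pos = idx + 1
  end
  offsets
end

p match_offsets('abccdefeg', /[ce]/)  # => [2, 3, 5, 7]
p match_offsets('aaaa', /aa/)         # => [0, 1, 2]
```

Advancing by a single position rather than by the match length matters for restriction analysis, because recognition sites can overlap on the sequence.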
Takes the fragments from SequenceRange objects generated from add_cut_range and returns unique results as a Bio::RestrictionEnzyme::Fragments object.
Arguments
- hsh: Hash. Keys are a permutation ID, if any. Values are SequenceRange objects that have cuts applied.
- view_ranges: true or false (default false). When true, each fragment also carries its left and right positions on both strands, and duplicate fragments are not removed.
Returns
- Bio::RestrictionEnzyme::Fragments object populated with Bio::RestrictionEnzyme::Fragment objects.
# File lib/bio/util/restriction_enzyme/analysis_basic.rb
def fragments_for_display( hsh, view_ranges=false )
  ary = Fragments.new
  return ary unless hsh

  hsh.each do |permutation_id, sequence_range|
    sequence_range.fragments.for_display.each do |fragment|
      if view_ranges
        ary << Bio::RestrictionEnzyme::Fragment.new(fragment.primary, fragment.complement, fragment.p_left, fragment.p_right, fragment.c_left, fragment.c_right)
      else
        ary << Bio::RestrictionEnzyme::Fragment.new(fragment.primary, fragment.complement)
      end
    end
  end

  ary.uniq! unless view_ranges

  ary
end
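The effect of view_ranges on de-duplication can be shown with plain structs. This is a minimal sketch, assuming a hypothetical ToyFragment struct and collect_fragments helper in place of the BioRuby classes; it relies on Struct comparing by member values, so uniq! treats equal fragments from different permutations as duplicates.

```ruby
# Hypothetical stand-in for a displayed fragment (not a BioRuby class).
ToyFragment = Struct.new(:primary, :complement, :p_left, :p_right)

# per_permutation maps a permutation ID to the fragments it produced.
def collect_fragments(per_permutation, view_ranges = false)
  out = []
  per_permutation.each_value do |fragments|
    fragments.each do |f|
      # Without view_ranges only the strand text is kept.
      out << (view_ranges ? f : ToyFragment.new(f.primary, f.complement))
    end
  end
  out.uniq! unless view_ranges
  out
end

cuts = [ToyFragment.new('g', 'cttaa', 0, 0), ToyFragment.new('aattc', 'g', 1, 5)]
results = { [0, 1] => cuts, [1, 0] => cuts }

p collect_fragments(results).size
p collect_fragments(results, true).size
```

Two permutations that produce the same fragments collapse to one copy of each by default, while view_ranges keeps every entry.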
Returns permutation orders for a given number of elements.
Examples:
permute(0) # => [[0]]
permute(1) # => [[0]]
permute(2) # => [[1, 0], [0, 1]]
permute(3) # => [[2, 1, 0], [2, 0, 1], [1, 2, 0], [0, 2, 1], [1, 0, 2], [0, 1, 2]]
permute(4) # => [[3, 2, 1, 0], [3, 2, 0, 1], [3, 1, 2, 0], [3, 0, 2, 1], [3, 1, 0, 2], [3, 0, 1, 2], [2, 3, 1, 0], [2, 3, 0, 1], [1, 3, 2, 0], [0, 3, 2, 1], [1, 3, 0, 2], [0, 3, 1, 2], [2, 1, 3, 0], [2, 0, 3, 1], [1, 2, 3, 0], [0, 2, 3, 1], [1, 0, 3, 2], [0, 1, 3, 2], [2, 1, 0, 3], [2, 0, 1, 3], [1, 2, 0, 3], [0, 2, 1, 3], [1, 0, 2, 3], [0, 1, 2, 3]]
Arguments
- count: Number of different elements to be permuted
- permutations: ignore - for the recursive algorithm
Returns
- Array of Array objects with different possible permutation orders. See examples.
# File lib/bio/util/restriction_enzyme/analysis.rb
def permute(count, permutations = [[0]])
  return permutations if count <= 1
  new_arrays = []
  new_array = []

  (permutations[0].size + 1).times do |n|
    new_array.clear
    permutations.each { |a| new_array << a.dup }
    new_array.each { |e| e.insert(n, permutations[0].size) }
    new_arrays += new_array
  end

  permute(count-1, new_arrays)
end
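For comparison, Ruby's built-in Array#permutation produces the same set of orderings. The sketch below is an alternative, not the library's method: permutation_orders is a hypothetical name, and the orderings may come back in a different sequence than permute returns them, so compare results after sorting.

```ruby
# Hypothetical equivalent of permute built on Array#permutation.
# Counts of 0 and 1 both yield [[0]], matching the documented examples.
def permutation_orders(count)
  return [[0]] if count <= 1
  (0...count).to_a.permutation.to_a
end

p permutation_orders(2).sort
p permutation_orders(3).size
p permutation_orders(4).size
```

The number of orderings grows as count!, which is why cut_and_return_by_permutations accepts a :max_permutations limit.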